SAS has one of the highest percentages of women working in the technology industry. And yet, a persistent gender gap in technology is cause for concern as the number of women seeking degrees in computing continues to shrink. Why is that? Kicking off day three of SAS Global Forum, SAS [...]
In a previous post, I looked at promotion from SAS 9.4 to Viya. In this post, I will look at promotion within SAS Viya. I will look at what can be promoted, the tools that support promotion, and some details about how the process works and what happens to your content. If you are used to promotion using the import export wizards in SAS 9.4, I will point out some of the current differences in promotion within Viya.
Firstly, you must be an Administrator in Viya to be able to export and import content. This is currently (as of Viya 3.4) something that cannot be changed. The two main tools you can use for promoting content between Viya environments are the SAS Environment Manager import/export wizards and the sas-admin command-line interface.
For a lot of Viya content, promotion is supported using the transfer plug-in of the sas-admin command-line interface. The transfer plug-in and SAS Environment Manager both use the transfer service under the covers. This post will focus on the content supported by the transfer service. The list of Viya content supported by the transfer service has increased with each Viya release. The table below shows the supported resources for export by Viya release.
When performing an export/import, the transfer service coordinates the export process and the creation of the package. However, it calls other related services which deal with their specific content. For example, services related to Visual Analytics will deal with reports, and Model Manager with models, etc.
The result of the export process is a Viya promotion package, which is a JSON file containing a collection of transfer objects describing the content that has been exported. The transfer service's package will include the objects you select for export and the following related platform objects:
There is no mechanism in Viya, like there was in 9.4, to automatically include all dependent objects in a package. To see what is included in a package, let's look at an example. In this example, we will use a Visual Analytics report, but this could apply to any supported content type.
In scenario 1, if we select the report and export it, the package will contain the report and the folders that are included in its path /gelcontent/GELCorp/Sales/Reports. What about authorization settings? Currently, the two interfaces behave slightly differently. The transfer plug-in will always include authorization settings in the package. However, exporting from SAS Environment Manager does not include authorization settings. In terms of what authorization is included, directly set authorization rules are included for objects that are explicitly included in the package. In the export example above, that means we would only get the authorization rules applied directly to the report. To include authorization rules for a folder, we would need to select the folder or one of its parent folders for export.
In scenario 2, if we select the GELCORP folder and do an export, we will get all sub-folders and content below that folder, including any authorization rules applied directly to those objects. In Viya 3.4, you cannot export the complete folder tree. There is no way in the CLI or SAS Environment Manager to select the root of the folder tree. To export the complete folder tree, you need to export each root folder separately. A tool (exportfoldertree.py) has been added to the pyviyatools that can help with this issue. It loops over the root folders and exports each one to a package in a directory.
Viya content is uniquely identified by its Uniform Resource Identifier (URI). When importing to Viya, objects in the package are matched to objects in the target based on the URI. When matching on URI during an import, if:
- no match occurs, then a new object is created.
- a match does occur, then the object is overwritten.
The match on URI is an important concept. It can have some results that you might not expect if you don’t understand it. For example, if a report is renamed, a subsequent import may rename the report back based on the name of the report in the package.
In the example below, a report, identified by the URI /reports/reports/c99s5a2-ccb-4552-b1a5-d8b0e3cb1afo, has been moved to a different folder than the same report in the package being imported.
You might expect in this scenario that a new report will be created in the original folder that the report was moved from. However, since the import matches on URI, the location in the folder structure is not relevant. The report is not added to the folder location stored in the package but is overwritten in its new location. The import process will issue a clear warning that this has happened.
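The matching behavior described above can be sketched as a toy model in Python. This is not the transfer service API: the function name, the dictionary layout, the "name" and "folder" fields, and the example URIs are all assumptions made for illustration. The sketch only encodes the rules stated in this post: no match on URI creates a new object, a match overwrites the object, and a matched object stays in its current folder with a warning.

```python
def import_package(target, package):
    """Toy model of an import that matches objects on URI.

    Both arguments map a URI to a dict with "name" and "folder" keys
    (hypothetical fields chosen for illustration).
    Returns a list of warning strings.
    """
    messages = []
    for uri, incoming in package.items():
        existing = target.get(uri)
        if existing is None:
            # No match on URI: a new object is created as described in the package.
            target[uri] = dict(incoming)
            continue
        # Match on URI: the object is overwritten where it currently lives.
        if existing["folder"] != incoming["folder"]:
            messages.append(
                f"{uri} overwritten in {existing['folder']}, "
                f"not in package location {incoming['folder']}"
            )
        target[uri] = {**incoming, "folder": existing["folder"]}
    return messages


# A report that was renamed and moved in the target after the package was exported.
target = {
    "/reports/reports/example-1": {"name": "Sales (renamed)", "folder": "/Archive"},
}
package = {
    "/reports/reports/example-1": {
        "name": "Sales",
        "folder": "/gelcontent/GELCorp/Sales/Reports",
    },
    "/reports/reports/example-2": {
        "name": "Costs",
        "folder": "/gelcontent/GELCorp/Sales/Reports",
    },
}
messages = import_package(target, package)
```

In this run the first report gets its packaged name back but stays in /Archive, and the second report is created because its URI has no match in the target.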
How is authorization dealt with during import? In general, when importing a resource that already exists in the target environment, the authorization settings will be merged with the target resource authorization. During the merge, if the rule (by URI of the rule):
- already exists, then it may be updated.
- does not exist, then a rule may be created.
Authorization is not synchronized during an import; it is merged. A rule will never be deleted during an import.
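The difference between a merge and a synchronization can be illustrated with a small Python sketch. It is a simplified model, not Viya code: the rule URIs and rule strings are invented, and the sketch always updates or creates a rule, whereas the post only says a rule "may" be updated or created.

```python
def merge_authorization(target_rules, package_rules):
    """Merge package rules into target rules, keyed by rule URI.

    Existing rules are updated, missing rules are created, and rules that
    exist only in the target are left alone (nothing is ever deleted).
    """
    merged = dict(target_rules)
    for rule_uri, rule in package_rules.items():
        merged[rule_uri] = rule
    return merged


# Hypothetical rule URIs and rule descriptions, for illustration only.
target_rules = {
    "/authorization/rules/a": "Sales group: Read",
    "/authorization/rules/b": "Administrators: Read, Update",
}
package_rules = {
    "/authorization/rules/a": "Sales group: Read, Update",
    "/authorization/rules/c": "Finance group: Read",
}
merged = merge_authorization(target_rules, package_rules)
```

Rule b is absent from the package yet survives the import. A true synchronization would have removed it.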
Finally, there is some functionality during import that you may be used to in SAS 9.4 that is not available in Viya yet. When importing a package to Viya you cannot:
- Subset the content from the package during import.
- Specify a new location in the target folder tree for imported objects.
I hope this helps you gain a better understanding of the features of promotion within SAS Viya and how they work. Here are some related resources that may also help:
- SAS® Viya® 3.4 Administration: Promotion (Import and Export)
- SAS Viya 3.4 Promotion Workshop
- Promotion of Viya Resources that are not stored in folders
- Promoting caslibs between Viya environments
- Saving and reloading Viya Configuration
- Creating Viya authorization from 9.4 metadata
- New functionality for transitioning from Visual Analytics on 9.4 to Viya
Phil Simon weighs in on using data to make the most of AI.
The post Which data sources will organizations use to make AI as effective as possible? appeared first on The Data Roundtable.
Artificial intelligence is the attention-grabbing, overhyped, shiny object that every organization is searching to make use of. Yes, it is overhyped, but it’s also very real and very powerful. “We do not want to add to the hype. We do not want to add to the confusion. We want to [...]
Did you know that SAS provides built-in support for working with probability distributions that are finite mixtures of normal distributions? This article shows examples of using the "NormalMix" distribution in SAS and describes a trick that enables you to easily work with distributions that have many components.
As with all probability distributions, there are four essential functions that you need to know: The PDF, CDF, QUANTILE, and RAND functions.
What is a normal mixture distribution?
A finite mixture distribution is a weighted sum of component distributions. When all of the components are normal, the distribution is called a mixture of normals. If the ith component has parameters (μi, σi), then you can write the probability density function (PDF) of the normal mixture as
f(x) = Σi wi φ(x; μi, σi)
where φ is the normal PDF and the positive constants wi are the mixing weights. The mixing weights must sum to 1.
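As a cross-check of the formula outside SAS, here is a minimal Python sketch that evaluates the weighted sum directly. The function names are my own. The weights and means are the ones used for the graph in this article, and the standard deviations (0.5, 0.6, 2.5) are taken from the SAS program later in the article.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, w, mu, sigma):
    """f(x) = sum over i of w[i] * phi(x; mu[i], sigma[i])."""
    if abs(sum(w) - 1.0) > 1e-12:
        raise ValueError("mixing weights must sum to 1")
    return sum(wi * normal_pdf(x, mi, si) for wi, mi, si in zip(w, mu, sigma))


w = [0.1, 0.3, 0.6]        # mixing weights
mu = [-6.0, 3.0, 8.0]      # component means
sigma = [0.5, 0.6, 2.5]    # component standard deviations

density_at_8 = mixture_pdf(8.0, w, mu, sigma)
```

At x = 8 the first two components contribute essentially nothing, so the density is very close to 0.6 times the peak height of the third component.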
The adjacent graph shows the density function for a three-component mixture of normal distributions. The means of the components are -6, 3, and 8, respectively, and are indicated by vertical reference lines. The mixing weights are 0.1, 0.3, and 0.6. The SAS program to create the graph is in the next section.
The "NormalMix" distribution in SAS
The PDF and CDF functions in Base SAS support the "NormalMix" distribution. The syntax is a little unusual because the function needs to support an arbitrary number of components. If there are k components, the PDF and CDF functions require 3k + 3 parameters:
- The first parameter is the name of the distribution: "NormalMix".
- The second parameter is the value, x, at which to evaluate the density function.
- The third parameter is the number of component distributions, k > 1.
- The next k parameters specify the mixing weights, w1, w2, ..., wk.
- The next k parameters specify the component means, μ1, μ2, ..., μk.
- The next k parameters specify the component standard deviations, σ1, σ2, ..., σk.
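To make the ordering concrete, the following Python sketch builds the flat argument list in the order given above and confirms the count of 3k + 3. It does not call SAS. The helper name is hypothetical and exists only to show the layout.

```python
def normalmix_args(x, w, mu, sigma):
    """Return the arguments in the order the list above describes."""
    k = len(w)
    if not (len(mu) == len(sigma) == k):
        raise ValueError("w, mu, and sigma must have the same length")
    # name, x, k, then k weights, k means, k standard deviations
    return ["NormalMix", x, k, *w, *mu, *sigma]


args = normalmix_args(0.0, [0.1, 0.3, 0.6], [-6, 3, 8], [0.5, 0.6, 2.5])
```

For three components the list has 12 entries, which is why listing every parameter by hand becomes tedious as k grows.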
If you are using a model that has many components, it is tedious to explicitly list every parameter in every function call. Fortunately, there is a simple trick that prevents you from having to list the parameters. You can put the parameters into arrays and use the OF operator (sometimes called the OF keyword) to reference the parameter values in the array. This is shown in the next section.
The PDF and CDF of a normal mixture
The following example demonstrates how to compute the PDF and CDF for a three-component mixture-of-normals distribution. The DATA step shows two tricks:
- The parameters (weights, means, and standard deviations) are stored in arrays. In the calls to the PDF and CDF functions, syntax such as OF w[*] enables you to specify the parameter values with minimal typing.
- A normal density is extremely tiny outside of the interval [μ - 5*σ, μ + 5*σ]. You can use this fact to compute the effective domain for the PDF and CDF functions.
/* PDF and CDF of the normal mixture distribution. This example specifies three components. */
data NormalMix;
array w[3]     _temporary_ ( 0.1, 0.3, 0.6);   /* mixing weights */
array mu[3]    _temporary_ (-6,   3,   8);     /* mean for each component */
array sigma[3] _temporary_ (0.5, 0.6, 2.5);    /* standard deviation for each component */
/* For each component, the range [mu-5*sigma, mu+5*sigma] is the effective support. */
minX = 1e308; maxX = -1e308;                   /* initialize to extreme values */
do i = 1 to dim(mu);                           /* find largest interval where density > 1E-6 */
   minX = min(minX, mu[i] - 5*sigma[i]);
   maxX = max(maxX, mu[i] + 5*sigma[i]);
end;
/* Visualize the functions on the effective support. Use arrays and the OF operator
   to specify the parameters. An alternative syntax is to list the arguments, as follows:
   cdf = CDF('normalmix', x, 3, 0.1, 0.3, 0.6, -6, 3, 8, 0.5, 0.6, 2.5); */
dx = (maxX - minX)/200;
do x = minX to maxX by dx;
   pdf = pdf('normalmix', x, dim(mu), of w[*], of mu[*], of sigma[*]);
   cdf = cdf('normalmix', x, dim(mu), of w[*], of mu[*], of sigma[*]);
   output;
end;
keep x pdf cdf;
run;
As shown in the program, the OF operator greatly simplifies and clarifies the function calls. The alternative syntax, which is shown in the comments, is unwieldy.
The following statements create graphs of the PDF and CDF functions. The PDF function is shown at the top of this article. The CDF function, along with a few reference lines, is shown below.
title "PDF function for Normal Mixture Distribution";
title2 "Vertical Lines at Component Means";
proc sgplot data=NormalMix;
   refline -6 3 8 / axis=x;
   series x=x y=pdf;
run;

title "CDF function for Normal Mixture Distribution";
proc sgplot data=NormalMix;
   xaxis grid;
   yaxis min=0 grid;
   refline 0.1 0.5 0.7 0.9;
   series x=x y=cdf;
run;
The quantiles of a normal mixture
The quantile function for a continuous distribution is the inverse of the CDF function. The graph of the CDF function for a mixture of normals can have flat regions when the component means are far apart relative to their standard deviations. Technically, these regions are not completely flat because the normal distribution has infinite support, but computationally they can be very flat. Because finding a quantile is equivalent to finding the root of a shifted CDF, you might encounter computational problems if you try to compute the quantile that corresponds to an extremely flat region, such as the 0.1 quantile in the previous graph.
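The root-finding view can be sketched in Python with a plain bisection on CDF(x) - p. This is an illustration only: the article does not say which algorithm the SAS QUANTILE function uses, so bisection is my stand-in, and the function names and the bracket of ten standard deviations around each mean are assumptions.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def mixture_cdf(x, w, mu, sigma):
    """CDF of the normal mixture: weighted sum of the component CDFs."""
    return sum(wi * normal_cdf(x, mi, si) for wi, mi, si in zip(w, mu, sigma))


def mixture_quantile(p, w, mu, sigma, iterations=200):
    """Find x such that mixture_cdf(x) is approximately p, by bisection."""
    lo = min(m - 10.0 * s for m, s in zip(mu, sigma))
    hi = max(m + 10.0 * s for m, s in zip(mu, sigma))
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, w, mu, sigma) < p:
            lo = mid   # the root lies to the right of mid
        else:
            hi = mid   # the root lies at or to the left of mid
    return 0.5 * (lo + hi)


w = [0.1, 0.3, 0.6]
mu = [-6.0, 3.0, 8.0]
sigma = [0.5, 0.6, 2.5]

q50 = mixture_quantile(0.5, w, mu, sigma)
q70 = mixture_quantile(0.7, w, mu, sigma)
```

Bisection is slow but does not rely on the slope of the CDF, which is why it is a convenient choice for a sketch. In a nearly flat region, many x values give almost the same CDF value, so any root finder returns an answer that is sensitive to rounding there.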
The following DATA step computes the 0.1, 0.5, 0.7, and 0.9 quantiles for the normal mixture distribution. Notice that you can use arrays and the OF operator for the QUANTILE function:
data Q;
array w[3]     _temporary_ ( 0.1, 0.3, 0.6);   /* mixing weights */
array mu[3]    _temporary_ (-6,   3,   8);     /* mean for each component */
array sigma[3] _temporary_ (0.5, 0.6, 2.5);    /* standard deviation for each component */
array p[4] (0.1, 0.5, 0.7, 0.9);               /* find quantiles for these probabilities */
do i = 1 to dim(p);
   prob = p[i];
   qntl = quantile('normalmix', prob, dim(mu), of w[*], of mu[*], of sigma[*]);
   output;
end;
keep qntl prob;
run;

proc print; run;
The table tells you that 10% of the density of the normal mixture is less than x=-3.824. That is essentially the result of the first component, which has weight 0.1 and therefore is responsible for 10% of the total density. Half of the density is less than x=5.58. Fully 70% of the density lies to the left of x=8, which is the mean of the third component. That result makes sense when you look at the mixing weights.
Random values from a normal mixture
The RAND function does not explicitly support the "NormalMix" distribution. However, as I have shown in a previous article, you can simulate from an arbitrary mixture of distributions by using the "Table" distribution in conjunction with the component distributions. For the three-component mixture distribution, the following DATA step simulates a random sample:
/* random sample from a mixture distribution */
%let N = 1000;
data RandMix(drop=i);
call streaminit(12345);
array w[3]     _temporary_ ( 0.1, 0.3, 0.6);   /* mixing weights */
array mu[3]    _temporary_ (-6,   3,   8);     /* mean for each component */
array sigma[3] _temporary_ (0.5, 0.6, 2.5);    /* standard deviation for each component */
do obsNum = 1 to &N;
   i = rand("Table", of w[*]);            /* choose the component by using the mixing weights */
   x = rand("Normal", mu[i], sigma[i]);   /* sample from that component */
   output;
end;
run;

title "Random Sample for Normal Mixture Distribution";
proc sgplot data=RandMix;
   histogram x;
   refline -6 3 8 / axis=x;   /* means of component distributions */
run;
The histogram of a random sample looks similar to the graph of the PDF function, as it should.
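The same two-step idea (pick a component according to the mixing weights, then draw from that component) can be sketched in Python with the standard library. The function name is my own, and the seed does not reproduce the SAS random number stream, so the individual values will differ from the SAS sample.

```python
import random


def sample_mixture(n, w, mu, sigma, seed=12345):
    """Draw n values from a normal mixture by two-step sampling."""
    rng = random.Random(seed)
    components = list(range(len(w)))
    sample = []
    for _ in range(n):
        i = rng.choices(components, weights=w)[0]   # choose the component
        sample.append(rng.gauss(mu[i], sigma[i]))   # sample from that component
    return sample


w = [0.1, 0.3, 0.6]
mu = [-6.0, 3.0, 8.0]
sigma = [0.5, 0.6, 2.5]

sample = sample_mixture(20000, w, mu, sigma)
sample_mean = sum(sample) / len(sample)
```

A quick sanity check is that the sample mean should be close to the weighted average of the component means, 0.1*(-6) + 0.3*3 + 0.6*8 = 5.1.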
In summary, SAS provides built-in support for working with the density (PDF), cumulative probability (CDF), and quantiles (QUANTILE) of a normal mixture distribution. You can use arrays and the OF operator to call these Base SAS functions without having to list every parameter. Although the RAND function does not natively support the "NormalMix" distribution, you can use the "Table" distribution to select a component according to the mixing weights and use the RAND("Normal") function to simulate from the selected normal component.
Remember when it seemed like the only way to explain analytics to a layperson was to reference "Moneyball"? My how things have changed. Analytics and big data went mainstream and, more recently, AI and algorithms grace the headlines of national news pieces.
As analytics has moved from the backroom to front page, the related careers and learning options have exploded. I don’t need to tell readers of this blog about the high demand for analytics and data science talent.
I have worked in the training and education groups at SAS for 22 years. For SAS, a stalwart in higher education and the commercial world, the last decade has been a time of change. With so many choices for statistics, programming and analytics, we introduced many free options for learning and using SAS.
On April 28, we announced our latest investments in analytics education, headlined by SAS Viya for Learners, which offers free access to AI and machine learning software for higher education teaching and learning.
Introducing SAS Viya for Learners
SAS Viya for Learners is a full suite of cloud-based software that supports the entire analytics life cycle – from data, to discovery, to deployment. It makes it easy for professors to incorporate AI and machine learning into coursework, including the ability to integrate R & Python with SAS through Jupyter notebooks.
People with expertise in an industry standard like SAS, plus open source skills, will stand out in such a competitive job market.
SAS Viya for Learners provides support tools like online chat, web tutorials, e-learning opportunities, documentation, communities and technical support, freeing educators to teach creative applications of analytics, and critical thinking skills. To support the successful use of SAS Viya for Learners at academic institutions, we offer free educator workshops and teaching materials.
Students learn to explore data, discover insights and deploy AI and machine learning models. They gain real-world experience through true business use cases and showcase their skills with badges and certification opportunities.
Professors can apply for access to SAS Viya for Learners via its home page. Students sign up through their professors.
SAS Viya for Learners is also available to those who enroll in a new SAS machine learning course, available now, for just $79 for three months of access. Learners can also soon gain AI and machine learning skills via two new Coursera courses that will offer access to SAS Viya for Learners.
SAS Viya for Learners is just the latest free offering to help people teach and learn SAS.
- SAS University Edition is a download option for anyone, anywhere to learn programming and statistics, for free.
- SAS OnDemand for Academics provides educators and students free online access to more advanced analytics.
- Numerous free e-learning and tutorial options are also available.
I also encourage educators to check out Cortex, a new analytics simulation game co-developed by SAS and Canadian business school HEC Montreal. Cortex teaches analytics and predictive modeling skills through a competitive game. Educators can bring real-world experience into the classroom by having students compete to create the best model to support a fictional charitable foundation’s fundraising efforts. The game provides students with information on the nonprofit and a data set of potential donors, as well as access to SAS data mining tools. Students are ranked on a leaderboard based on the quality of their model and its results.
You DO need stinking badges!
I know, I’m dating myself with that reference, but it’s critical that professionals and students be able to stand out from the pack. Digital credentials that validate expertise enhance degrees and carry significant weight with savvy employers seeking people who can get the job done.
An AI, big data, advanced analytics or data science credential fosters lucrative opportunities across industries. The SAS Global Certification program has long been the standard for industries like banking and life sciences, having awarded more than 142,000 SAS credentials to individuals in 112 countries.
This week, we launched three new specialist-level SAS certifications in machine learning, natural language and computer vision, and forecasting and optimization. The learners who pursue the certification automatically earn the professional-level credential, SAS Certified Professional: AI and Machine Learning. Two formats are available: an immersive two-week classroom experience and a flexible online option taken over 12 months. Both options include certification exams.
In addition, we partnered with Acclaim to create digital badges for SAS credentials. Professionals can add badges to online resumes, social media and email signatures to showcase expertise in a variety of analytical skills.
These new programs were announced at SAS Global Forum 2019. Like every year, the event is an amazing gathering of thousands of SAS users which gives educators and students their time to shine. We hope the attendees and SAS users around the world are as excited about these new offerings as we are. We look forward to helping more people learn, grow and succeed.
New AI offerings highlight many free ways to learn SAS was published on SAS Users.
A persistent analytics talent gap creates big opportunities for people who can wield analytics to help organizations make better decisions. Innovative analytics users and students who are rushing to fill that gap, and those who teach them, are being honored this week at SAS Global Forum. A special Sunday event [...]
SAS celebrates analytics talent, and those who shape it was published on SAS Voices by Trent Smith
A record-breaking crowd of more than 5,500 analytics enthusiasts received a Texas-sized welcome from CEO Jim Goodnight as he opened SAS® Global Forum 2019. This is the fourth time the forum has been held in Dallas, and this year, the evening started with a look back at one of the [...]
One small step for man, one giant leap for analytics was published on SAS Voices by Shannon Heath
"Practical AI" might seem like an oxymoron to some. But that’s only if you view artificial intelligence as a futuristic and unrealistic pursuit. Kirk Borne, PhD, decidedly does not. Borne is the Principal Data Scientist and an Executive Advisor at global technology and consulting firm Booz Allen Hamilton. In this [...]
As a SAS programmer, you are asked to do many things with your data -- reading, writing, calculating, building interfaces, and occasionally sending data outside of SAS. One of the most popular outputs you may be tasked with creating is likely a Microsoft Excel workbook. Have you ever heard, “just send me the spreadsheet”?
For an internal project, the task is easy: open the SAS ODS EXCEL destination, run PROC PRINT, and close the destination, and the workbook is ready. But if the workbook is to be delivered somewhere else, you may need to spruce it up a bit. Of course, you can manually change virtually everything on the spreadsheet, but that takes a lot of employee time. And if the spreadsheet is delivered on a periodic basis, the manual changes may not be applied the same way every time.
Saving you time and money
Suppose you run a PROC PRINT with a “BY” statement and produce a Microsoft Excel workbook with 100 pages. If each of those pages needs to be printed and distributed to 100 clients by mail, do you want to be the person who changes each one to print in landscape orientation? The SAS ODS Excel destination has over 125 options and sub-options that can perform various tasks while the workbook is being written. One such task sets the worksheet to print in “landscape” format.
As a programmer, I know that when I want to start a new project or learn new software, I look to two places in a book: the index and the table of contents. If I can think of a key word that might help me, I look to the index. But when searching general topics, I use the Table of Contents (TOC). It always frustrates me if the TOC is in alphabetical order, so I decided to write my TOC as groups of options and SAS commands that impacted similar parts or features of the Excel Workbook.
To see this in action, the bullet points I have listed below identify the major topic sections of the book. These are, in fact, chapter titles presented after the introduction:
• ODS Tagset versus Destination
• ODS Excel Destination Actions
• Setting Excel Document Property Values
• Options That Affect the Workbook
• Arguments that Affect Output Features
• Options That Affect Worksheet Features
• Options That Affect Print Features
• Column, Row, and Cell Features
Take a look inside
Allow me to describe each of these topics in a few words:
ODS Tagset versus Destination
Many people have used the SAS ODS Tagset EXCELXP and will find many parts of the SAS ODS EXCEL destination to be very similar in both syntax and function. A tagset is PROC TEMPLATE code that can be changed by the user, while a SAS ODS destination is a built-in feature of SAS, much like a PROC or FUNCTION, that cannot be changed by users. This section also describes the ID feature of ODS, which allows you to write more than one EXCEL workbook at a time.
ODS Excel Destination Actions
This area describes ODS features that may not be exclusive to ODS EXCEL but are useful in finding and choosing SAS outputs to be processed.
Setting Excel Document Property Values
Here you are shown how to change the comments, keywords, author, title, and other parts of the Excel Property sheet.
Options That Affect the Workbook
This section of the book shows you how to name the workbook, create blank worksheets, create a table of contents or index of worksheets within the workbook, change worksheet tab colors, and other options.
Arguments that Affect Output Features
The output features described here include changing the output style (coloration of the worksheet sections), finding and using stylesheet anchors, building and using Cascading Style Sheets, changing the Dots Per Inch (DPI) of the output data and/or graphs, and adding text to the worksheet.
Options That Affect Worksheet Features
SAS has many options that describe the output data. These include titles, footnotes, and byline text. Additionally, SAS can group output worksheets in many ways including by page, by proc, by table, by by-group, or even no separation at all. Data can also be set to “FITTOPAGE” or the height or width can be selected along with adding sheet names or labels to the EXCEL output worksheets.
Options That Affect Print Features
Excel has many print features like printing in “black and white” only, centering horizontally or vertically, landscape or portrait, draft quality or standard, selecting the print order of the data, selecting the area to print, EXCEL headers and EXCEL footnotes, and others, all of which SAS can adjust as the workbook is being written.
Column, Row, and Cell Features
Finally, SAS can adjust column and row features like adding filters, changing the widths and heights of rows and columns, hiding rows or columns, inserting formulas, and even placing the data somewhere other than row one column one.
Ready to see the full Table of Contents? Click here.
Ready, set, go
My book Exchanging Data From SAS® to Excel: The ODS Excel Destination expands upon the SAS documentation by giving full descriptions and examples including SAS code and EXCEL output for nearly every option and sub-option of the SAS ODS EXCEL software. In addition to this blog, check out a free chapter of my book to get started making your worksheets beautifully formatted. Get ready to follow the money and make your reports come out perfect for publication in no time!