sas books

3月 192020
 

At SAS Press, we agree with the saying “The best things in life are free.” And one of the best things in life is knowledge. That’s why we offer free e-books to help you learn SAS or improve your skills. In this blog post, we will introduce you to one of our amazing titles that is absolutely free.

SAS Programming for R Users

Many data scientists today need to know multiple programming languages including SAS, R, and Python. If you already know basic statistical concepts and how to program in R but want to learn SAS, then SAS Programming for R Users by Jordan Bakerman was designed specifically for you! This free e-book explains how to write programs in SAS that replicate familiar functions and capabilities in R. This book covers a wide range of topics including the basics of the SAS programming language, how to import data, how to create new variables, random number generation, linear modeling, Interactive Matrix Language (IML), and many other SAS procedures. This book also explains how to write R code directly in the SAS code editor for seamless integration between the two tools.

The book is based on the free, 14-hour course of the same name offered by SAS Education available here. Keep reading to learn more about the differences between SAS and R.

SAS versus R

R is an object-oriented programming language. Results of a function are stored in an object and desired results are pulled from the object as needed. SAS revolves around the data table and uses procedures to create and print output. Results can be saved to a new data table.

Let’s briefly compare SAS and R in a general way. Look at the following table, which outlines some of the major differences between SAS and R.

Here are a few other things about SAS to note:

  • SAS has the flexibility to interact with objects. (However, the book focuses on procedural methods.)
  • SAS does not have a command line. Code must be run in order to return results.

SAS Programs

A SAS program is a sequence of one or more steps. A step is a sequence of SAS statements. There are only two types of steps in SAS: DATA and PROC steps.

  • DATA steps read from an input source and create a SAS data set.
  • PROC steps read and process a SAS data set, often generating an output report. Procedures can be called an umbrella term. They are what carry out the global analysis. Think of a PROC step as a function in R.

Every step has a beginning and ending boundary. SAS steps begin with either of the following statements:

  • a DATA statement
  • a PROC statement

After a DATA or PROC statement, there can be additional SAS statements that contain keywords that request SAS perform an operation or they can give information to the system. Think of them as additional arguments to a procedure. Statements always end with a semicolon!

SAS options are additional arguments and they are specific to SAS statements. Unfortunately, there is no rule to say what is a statement versus what is an option. Understanding the difference comes with a little bit of experience. Options can be used to do the following:

  • generate additional output like results and plots
  • save output to a SAS data table
  • alter the analytical method

SAS detects the end of a step when it encounters one of the following statements:

  • a RUN statement (for most steps)
  • a QUIT statement (for some procedures)

Most SAS steps end with a RUN statement. Think of the RUN statement as the right parentheses of an R function. The following table shows an example of a SAS program that has a DATA step and a PROC step. You can see that both SAS statements end with RUN statements, while the R functions begin and end with parentheses.

If you want to learn more about this book or any other free e-books from SAS Press, visit https://support.sas.com/en/books/free-books.html. Subscribe to our newsletter to get the latest information on new books.

Free e-book: SAS Programming for R Users was published on SAS Users.

3月 052020
 

Have you heard that SAS offers a collection of new, high-performance CAS procedures that are compatible with a multi-threaded approach? The free e-book Exploring SAS® Viya®: Data Mining and Machine Learning is a great resource to learn more about these procedures and the features of SAS® Visual Data Mining and Machine Learning. Download it today and keep reading for an excerpt from this free e-book!

In SAS Studio, you can access tasks that help automate your programming so that you do not have to manually write your code. However, there are three options for manually writing your programs in SAS® Viya®:

  1. SAS Studio provides a SAS programming environment for developing and submitting programs to the server.
  2. Batch submission is also still an option.
  3. Open-source languages such as Python, Lua, and Java can submit code to the CAS server.

In this blog post, you will learn the syntax for two of the new, advanced data mining and machine learning procedures: PROC TEXTMINE and PROCTMSCORE.

Overview

The TEXTMINE and TMSCORE procedures integrate the functionalities from both natural language processing and statistical analysis to provide essential functionalities for text mining. The procedures support essential natural language processing (NLP) features such as tokenizing, stemming, part-of-speech tagging, entity recognition, customized stop list, and so on. They also support dimensionality reduction and topic discovery through Singular Value Decomposition.

In this example, you will learn about some of the essential functionalities of PROC TEXTMINE and PROC TMSCORE by using a text data set containing 1,830 Amazon reviews of electronic gaming systems. The data set is named Amazon. You can find similar data sets of Amazon reviews at http://jmcauley.ucsd.edu/data/amazon/.

PROC TEXTMINE

The Amazon data set has already been loaded into CAS. The review content is stored in the variable ReviewBody, and we generate a unique review ID for each review. In the proc call shown in Program 1 we ask PROC TEXTMINE to do three tasks:

  1. parse the documents in table reviews and generate the term by document matrix
  2. perform dimensionality reduction via Singular Value Decomposition
  3. perform topic discovery based on Singular Value Decomposition results

Program 1: PROC TEXTMINE

data mycaslib.amazon;
    set mylib.amazon;
run;

data mycaslib.engstop;
    set mylib.engstop;
run;

proc textmine data=mycaslib.amazon;
    doc_id id;
    var reviewbody;

 /*(1)*/  parse reducef=2 entities=std stoplist=mycaslib.engstop 
          outterms=mycaslib.terms outparent=mycaslib.parent
          outconfig=mycaslib.config;

 /*(2)*/  svd k=10 svdu=mycaslib.svdu outdocpro=mycaslib.docpro
          outtopics=mycaslib.topics;

run;

(1) The first task (parsing) is specified in the PARSE statement. Parameter “reducef” specifies the minimum number of times a term needs to appear in the text to be included in the analysis. Parameter “stop” specifies a list of terms to be excluded from the analysis, such as “the”, “this”, and “that”. Outparent is the output table that stores the term by document matrix, and Outterms is the output table that stores the information of terms that are included in the term by document matrix. Outconfig is the output table that stores configuration information for future scoring.

(2) Tasks 2 and 3 (dimensionality reduction and topic discovery) are specified in the SVD statement. Parameter K specifies the desired number of dimensions and number of topics. Parameter SVDU is the output table that stores the U matrix from SVD calculations, which is needed in future scoring. Parameter OutDocPro is the output table that stores the new matrix with reduced dimensions. Parameter OutTopics specifies the output table that stores the topics discovered.

Click the Run shortcut button or press F3 to run Program 1. The terms table shown in Output 1 stores the tagging, stemming, and entity recognition results. It also stores the number of times each term appears in the text data.

Output 1: Results from Program 1

PROC TMSCORE

PROC TEXTMINE is used with large training data sets. When you have new documents coming in, you do not need to re-run all the parsing and SVD computations with PROC TEXTMINE. Instead, you can use PROC TMSCORE to score new text data. The scoring procedure parses the new document(s) and projects the text data into the same dimensions using the SVD weights derived from the original training data.

In order to use PROC TMSCORE to generate results consistent with PROC TEXTMINE, you need to provide the following tables generated by PROC TEXTMINE:

  • SVDU table – provides the required information for projection into the same dimensions.
  • Config table – provides parameter values for parsing.
  • Terms table – provides the terms that should be included in the analysis.

Program 2 shows an example of TMSCORE. It uses the same input data layout used for PROC TEXTMINE code, so it will generate the same docpro and parent output tables, as shown in Output 2.

Program 2: PROC TMSCORE

Proc tmscore data=mycaslib.amazon svdu=mycaslib.svdu
        config=mycaslib.config terms=mycaslib.terms
        svddocpro=mycaslib.score_docpro outparent=mycaslib.score_parent;
    var reviewbody;
    doc_id id;
run;

 

Output 2: Results from Program 2

To learn more about advanced data mining and machine learning procedures available in SAS Viya, including PROC FACTMAC, PROC TEXTMINE, and PROC NETWORK, you can download the free e-book, Exploring SAS® Viya®: Data Mining and Machine Learning. Exploring SAS® Viya® is a series of e-books that are based on content from SAS® Viya® Enablement, a free course available from SAS Education. You can follow along with examples in real time by watching the videos.

 

Learn about new data mining and machine learning procedures in SAS Viya was published on SAS Users.

2月 262020
 

Do you wish you could predict the likelihood that one of your customers will open your marketing email? Or what if you could tell whether a new medical treatment for a patient will have a better outcome than the standard treatment? If you are familiar with propensity modeling, then you know such predictions about future behavior are possible! Propensity models generate a propensity score, which is the probability that a future behavior will occur. Propensity models are used often in machine learning and predictive data analytics, particularly in the fields of marketing, economics, business, and healthcare. These models can detect and remove bias in analysis of real-world, observational data where there is no control group.

SAS provides several approaches for calculating propensity scores. This excerpt from the new book, Real World Health Care Data Analysis: Causal Methods and Implementation Using SAS®, discusses one approach for estimating propensity scores and provides associated SAS code. The example code and data used in the examples is available to download here.

A priori logistic regression model

One approach to estimating a propensity score is to fit a logistic regression model a priori, that is, identify the covariates in the model and fix the model before estimating the propensity score. The main advantage of an a priori model is that it allows researchers to incorporate knowledge external to the data into the model building. For example, if there is evidence that a covariate is correlated to the treatment assignment, then this covariate should be included in the model even if the association between this covariate and the treatment is not strong in the current data. In addition, the a priori model is easy to interpret. The directed acyclic graph approach could be very informative in building a logistic propensity score model a priori, as it clearly points out the relationship between covariates and interventions. The correlation structure between each covariate and the intervention selection is pre-specified and in a fixed form. However, one main challenge of the a priori modeling approach is that it might not provide the optimal balance between treatment and control groups.

Building an a priori model

To build an a priori model for propensity score estimation in SAS, we can use either PROC PSMATCH or PROC LOGISTIC as shown in Program 1. In both cases, the input data set is a one observation per patient data set containing the treatment and baseline covariates from the simulated REFLECTIONS study. Also, in both cases the code will produce an output data set containing the original data set with the additional estimated propensity score for each patient (_ps_).

Program 1: Propensity score estimation: a priori logistic regression

PROC PSMATCH DATA=REFL2 REGION=ALLOBS;
  CLASS COHORT GENDER RACE DR_RHEUM DR_PRIMCARE;
  PSMODEL COHORT(TREATED='OPIOID')= GENDER RACE AGE BMI_B BPIINTERF_B BPIPAIN_B
             CPFQ_B FIQ_B GAD7_B ISIX_B PHQ8_B PHYSICALSYMP_B SDS_B DR_RHEUM
             DR_PRIMCARE;
  OUTPUT OUT=PS PS=_PS_;
RUN;

PROC LOGISTIC DATA=REFL2;
  CLASS COHORT GENDER RACE DR_RHEUM DR_PRIMCARE;
  MODEL COHORT = GENDER RACE AGE BMI_B BPIINTERF_B BPIPAIN_B CPFQ_B FIQ_B GAD7_B
           ISIX_B PHQ8_B PHYSICALSYMP_B SDS_B DR_RHEUM DR_PRIMCARE;
  OUTPUT OUT=PS PREDICTED=PS;
RUN;

Before building a logistic model in SAS, we suggest examining the distribution of the intervention indicator at each level of the categorical variable to rule out the possibility of “complete separation” (or “perfect prediction”), which means that for subjects at some level of some categorical variable, they would all receive one intervention but not the other. Complete separation can occur for several reasons and one common example is when using several categorical variables whose categories are coded by indicators. When the logistic regression model is fit, the estimate of the regression coefficients βs is based on the maximum likelihood estimation, and MLEs under logistic regression modeling do not have a closed form. In other words, the MLE β̂ cannot be written as a function of Xi and Ti. Thus, the MLE of βs are obtained using some numerical analysis algorithms such as the Newton-Raphson method. However, if there is a covariate X that can completely separate the interventions, then the procedure will not converge in SAS. If PROC LOGISTIC was used, the following warning message will be issued.

WARNING: There is a complete separation of data points. The maximum likelihood estimate does not exist.

WARNING: The LOGISTIC procedure continues in spite of the above warning. Results shown are based on the last maximum likelihood iteration. Validity of the model fit is questionable.

Notice that SAS will continue to finish the computation despite issuing warning messages. However, the estimate of such βs are incorrect, and so are the estimated propensity scores. If after examining the intervention distribution at each level of the categorical variables complete separation is found, then efforts should be made to address this issue. One possible solution is to collapse the categorical variable causing the problem. That is, combine the different outcome categories such that the complete separation no longer exists.

Firth logistic regression

Another possible solution is to use Firth logistic regression. It uses a penalized likelihood estimation method. Firth bias-correction is considered an ideal solution to the separation issue for logistic regression (Heinze and Schemper, 2002). In PROC LOGISTIC, we can add an option to run the Firth logistic regression as shown in Program 2.

Program 2: Firth logistic regression

PROC LOGISTIC DATA=REFL2;
  CLASS COHORT GENDER RACE DR_RHEUM DR_PRIMCARE;
  MODEL COHORT = GENDER RACE DR_RHEUM DR_PRIMCARE BPIInterf_B BPIPain_B 
        CPFQ_B FIQ_B GAD7_B ISIX_B PHQ8_B PhysicalSymp_B SDS_B / FIRTH;
  OUTPUT OUT=PS PREDICTED=PS;
RUN;

 

References

Heinze G, Schemper M (2002). A solution to the problem of separation in logistic regression. Statistics in Medicine 21.16: 2409-2419.

Propensity Score Estimation with PROC PSMATCH and PROC LOGISTIC was published on SAS Users.

2月 182020
 

In case you missed the news, there is a new edition of The Little SAS Book! Last fall, we completed the sixth edition of our book, and even though it is actually a few pages shorter than the fifth edition, we managed to add many more topics to the book. See if you can answer this question.

The answer is D – all of the above! We also added new sections on subsetting, summarizing, and creating macro variables using PROC SQL, new sections on the XLSX LIBNAME engine and ODS EXCEL, more on iterative DO statements, a new section on %DO, and more. For a summary of all the changes, see our blog post “The Little SAS Book 6.0: The best-selling SAS book gets even better."

Updating The Little SAS Book meant updating its companion book, Exercises and Projects for The Little SAS Book, as well. The exercises and projects book contains multiple choice and short answer questions as well as programming exercises that cover the same topics that are in The Little SAS Book. The exercises and projects book can be used in a classroom setting, or for anyone wanting to test their SAS knowledge and practice what they have learned.

Here are examples of the types of questions you might find in the exercises and projects book.

Multiple Choice

Short Answer

Programming Exercise

Solutions

In the book, we provide solutions for odd-numbered multiple choice and short answer questions and hints for the programming exercises.

  1. B
  2. Hint: New variables (columns) can be specified in the SELECT clause. Also, see our blog post “Expand your SAS Knowledge by Learning PROC SQL.”

While we don’t provide solutions for even-numbered questions, we can tell you that the iterative DO statement is covered in Section 3.12 of The Little SAS Book, Sixth Edition, “Using Iterative DO, DO WHILE, and DO UNTIL Statements.” The %DO statement is covered in Section 7.7, “Using %DO Loops in Macros.”

For more information about these books, explore the following links to the SAS website:

The Little SAS Book, Sixth Edition

Exercises and Projects for The Little SAS Book, Sixth Edition

Test your SAS skills with the newest edition of Exercises and Projects for The Little SAS Book was published on SAS Users.

2月 142020
 

In honor of Valentine’s day, we thought it would be fitting to present an excerpt from a paper about the LIKE operator because when you like something a lot, it may lead to love! If you want more, you can read the full paper “Like, Learn to Love SAS® Like” by Louise Hadden, which won best paper at WUSS 2019.

Introduction

SAS provides numerous time- and angst-saving techniques to make the SAS programmer’s life easier. Among those techniques are the ability to search and select data using SAS functions and operators in the data step and PROC SQL, as well as the ability to join data sets based on matches at various levels. This paper explores how LIKE is featured in each one of these techniques and is suitable for all SAS practitioners. I hope that LIKE will become part of your SAS toolbox, too.

Smooth Operators

SAS operators are used to perform a number of functions: arithmetic calculations, comparing or selecting variable values, or logical operations. Operators are loosely grouped as “prefix” (for example a sign before a variable) or “infix” which generally perform an operation BETWEEN two variables. Arithmetic operations using SAS operators may include exponentiation (**), multiplication (*), and addition (+), among others. Comparison operators may include greater than (>, GT) and equals (=, EQ), among others. Logical, or Boolean, operators include such operands as || or !!, AND, and OR, and serve the purpose of grouping SAS operations. Some operations that are performed by SAS operators have been formalized in functions. A good example of this is the concatenation operators (|| and !!) and the more powerful CAT functions which perform similar, but not identical, operations. LIKE operators are most frequently utilized in the DATA step and PROC SQL via a DATA step.

There is a category of SAS operators that act as comparison operators under special circumstances, generally in where statements in PROC SQL and the data step (and DS2) and subsetting if statements in the data step. These operators include the LIKE operator and the SOUNDS LIKE operator, as well as the CONTAINS and the SAME-AND operators. It is beyond the scope of this short paper to discuss all the smooth operators, but they are definitely worth a look.

LIKE Operator

Character operators are frequently used for “pattern matching,” that is, evaluating whether a variable value equals, does not equal, or sounds like a specified value or pattern. The LIKE operator is a case-sensitive character operator that employs two special “wildcard” characters to specify a pattern: the percent sign (%) indicates any number of characters in a pattern, while the underscore (_) indicates the presence of a single character per underscore in a pattern. The LIKE operator is akin to the GREP utility available on Unix/Linux systems in terms of its ability to search strings.

The LIKE operator also includes an escape routine in case you need to use a string that includes a comparison operator such as the carat, the underscore, or the percent sign, etc. An example of the escape routine syntax, when looking for a string containing a percent sign, is:

where yourvar like ‘100%’ escape ‘%’;

Additionally, SAS practitioners can use the NOT LIKE operator to select variables WITHOUT a given pattern. Please note that the LIKE statement is case-sensitive. You can use the UPCASE, LOWCASE, or PROPCASE functions to adjust input strings prior to using the LIKE statement. You may string multiple LIKE statements together with the AND or OR operators.

SOUNDS LIKE Operator

The LIKE operator, described above, searches the actual spelling of operands to make a comparison. The SOUNDS LIKE operator uses phonetic values to determine whether character strings match a given pattern. As with the LIKE operator, the SOUNDS LIKE operator is useful for when there are misspellings and similar sounding names in strings to be compared. The SOUNDS LIKE operator is denoted with a short cut ‘-*’. SOUNDS LIKE is based on SAS’s SOUNDEX algorithm. Strings are encoded by retaining the original first column, stripping all letters that are or act as vowels (A, E, H, I, O, U, W, Y), and then assigning numbers to groups: 1 includes B, F, P, and V; 2 includes C, G, J, K, Q, S, X, Z; 3 includes D and T; 4 includes L; 5 includes M and N; and 6 includes R. “Tristn” therefore becomes T6235, as does Tristan, Tristen, Tristian, and Tristin.

For more on the SOUNDS LIKE operator, please read the documentation.

Joins with the LIKE Operator

It is possible to select records with the LIKE operator in PROC SQL with a WHERE statement, including with joins. For example, the code below selects records from the SASHELP.ZIPCODE file that are in the state of Massachusetts and are for a city that begins with “SPR”.

proc sql;
    CREATE TABLE TEMP1 AS
    select
        a.City ,
        a.countynm  , a.city2 ,
         a.statename , a.statename2
    from sashelp.zipcode as a
    where upcase(a.city) like 'SPR%' and 
upcase(a.statename)='MASSACHUSETTS' ; 
quit;

The test print of table TEMP1 shows only cases for Springfield, Massachusetts.

The code below joins SASHELP.ZIPCODE and a copy of the same file with a renamed key column (city --> geocity), again selecting records for the join that are in the state of Massachusetts and are for a city that begins with “SPR”.

proc sql;
    CREATE TABLE TEMP2 AS
    select
        a.City , b.geocity, 
        a.countynm  ,
        a.statename , b.statecode, 
        a.x, a.y
    from sashelp.zipcode as a, zipcode2 as b
    where a.city = b.geocity and upcase(a.city) like 'SPR%' and b.statecode
= 'MA' ;
quit;

The test print of table TEMP2 shows only cases for Springfield, Massachusetts with additional variables from the joined file.

The LIKE “Condition”

The LIKE operator is sometimes referred to as a “condition,” generally in reference to character comparisons where the prefix of a string is specified in a search. LIKE “conditions” are restricted to the DATA step because the colon modifier is not supported in PROC SQL. The syntax for the LIKE “condition” is:

where firstname=: ‘Tr’;

This statement would select all first names in Table 2 above. To accomplish the same goal in PROC SQL, the LIKE operator can be used with a trailing % in a where statement.

Conclusion

SAS provides practitioners with several useful techniques using LIKE statements including the smooth LIKE operator/condition in both the DATA step and PROC SQL. There’s definitely reason to like LIKE in SAS programming.

To learn more about SAS Press, check out our up-and-coming titles, and to receive exclusive discounts make sure to subscribe to our newsletter.

References

    Gilsen, Bruce. September 2001. “SAS® Program Efficiency for Beginners.” Proceedings of the Northeast SAS Users Group Conference, Baltimore, MD.

    Roesch, Amanda. September 2011. “Matching Data Using Sounds-Like Operators and SAS® Compare Functions.” Proceedings of the Northeast SAS Users Group Conference, Portland, ME.

    Shankar, Charu. June 2019. “The Shape of SAS® Code.” Proceedings of PharmaSUG 2019 Conference, Philadelphia, PA.

Learn to Love SAS LIKE was published on SAS Users.

2月 052020
 

One of the first and most important steps in analyzing data, whether for descriptive or inferential statistical tasks, is to check for possible errors in your data. In my book, Cody's Data Cleaning Techniques Using SAS, Third Edition, I describe a macro called %Auto_Outliers. This macro allows you to search for possible data errors in one or more variables with a simple macro call.

Example Statistics

To demonstrate how useful and necessary it is to check your data before starting your analysis, take a look at the statistics on heart rate from a data set called Patients (in the Clean library) that contains an ID variable (Patno) and another variable representing heart rate (HR). This is one of the data sets I used in my book to demonstrate data cleaning techniques. Here is output from PROC MEANS:

The mean of 79 seems a bit high for normal adults, but the standard deviation is clearly too large. As you will see later in the example, there was one person with a heart rate of 90.0 but the value was entered as 900 by mistake (shown as the maximum value in the output). A severe outlier can have a strong effect on the mean but an even stronger effect on the standard deviation. If you recall, one step in computing a standard deviation is to subtract each value from the mean and square that difference. This causes an outlier to have a huge effect on the standard deviation.

Macro

Let's run the %Auto_Outliers macro on this data set to check for possible outliers (that may or may not be errors).

Here is the call:

%Auto_Outliers(Dsn=Clean.Patients,
               Id=Patno,
               Var_List=HR SBP DBP,
               Trim=.1,
               N_Sd=2.5)

This macro call is looking for possible errors in three variables (HR, SBP, and DBP); however, we will only look at HR for this example. Setting the value of Trim equal to .1 specifies that you want to remove the top and bottom 10% of the data values before computing the mean and standard deviation. The value of N_Sd (number of standard deviations) specifies that you want to list any heart rate beyond 2.5 trimmed standard deviations from the mean.

Result

Here is the result:

After checking every value, it turned out that every value except the one for patient 003 (HR = 56) was a data error. Let's see the mean and standard deviation after these data points are removed.

Notice the Mean is now 71.3 and the standard deviation is 11.5. You can see why it so important to check your data before performing any analysis.

You can download this macro and all the other macros in my data cleaning book by going to support.sas.com/cody. Scroll down to Cody's Data Cleaning Techniques Using SAS, and click on the link named "Example Code and Data." This will download a file containing all the programs, macros, and data files from the book.  By the way, you can do this with any of my books published by SAS Press, and it is FREE!

Let me know if you have questions in the comments section, and may your data always be clean! To learn more about SAS Press, check out up-and-coming titles, and to receive exclusive discounts make sure to subscribe to the newsletter.

Finding Possible Data Errors Using the %Auto_Outliers Macro was published on SAS Users.

1月 132020
 

Are you ready to get a jump start on the new year? If you’ve been wanting to brush up your SAS skills or learn something new, there’s no time like a new decade to start! SAS Press is releasing several new books in the upcoming months to help you stay on top of the latest trends and updates. Whether you are a beginner who is just starting to learn SAS or a seasoned professional, we have plenty of content to keep you at the top of your game.

Here is a sneak peek at what’s coming next from SAS Press.
 
 

For students and beginners

For beginners, we have Exercises and Projects for The Little SAS® Book: A Primer, Sixth Edition, the best-selling workbook companion to The Little SAS Book by Rebecca Ottesen, Lora Delwiche, and Susan Slaughter. Exercises and Projects for The Little SAS® Book, Sixth Edition will be updated to match the updates to the new The Little SAS® Book: A Primer, Sixth Edition. This hands-on workbook is designed to hone your SAS skills whether you are a student or a professional.

 

 

For data explorers of all levels

This free e-book explores the features of SAS® Visual Data Mining and Machine Learning, powered by SAS® Viya®. Users of all skill levels can visually explore data on their own while drawing on powerful in-memory technologies for faster analytic computations and discoveries. You can manually program with custom code or use the features in SAS® Studio, Model Studio, and SAS® Visual Analytics to automate your data manipulation and modeling. These programs offer a flexible, easy-to-use, self-service environment that can scale on an enterprise-wide level. This book introduces some of the many features of SAS Visual Data Mining and Machine Learning including: programming in the Python interface; new, advanced data mining and machine learning procedures; pipeline building in Model Studio, and model building and comparison in SAS® Visual Analytics

 

 

For health care data analytics professionals

If you work with real world health care data, you know that it is common and growing in use from sources like observational studies, pragmatic trials, patient registries, and databases. Real World Health Care Data Analysis: Causal Methods and Implementation in SAS® by Doug Faries et al. brings together best practices for causal-based comparative effectiveness analyses based on real world data in a single location. Example SAS code is provided to make the analyses relatively easy and efficient. The book also presents several emerging topics of interest, including algorithms for personalized medicine, methods that address the complexities of time varying confounding, extensions of propensity scoring to comparisons between more than two interventions, sensitivity analyses for unmeasured confounding, and implementation of model averaging.

 

For those at the cutting edge

Are you ready to take your understanding of IoT to the next level? Intelligence at the Edge: Using SAS® with the Internet of Things edited by Michael Harvey begins with a brief description of the Internet of Things, how it has evolved over time, and the importance of SAS’s role in the IoT space. The book will continue with a collection of chapters showcasing SAS’s expertise in IoT analytics. Topics include Using SAS Event Stream Processing to process real world events, connectivity, using the ESP Geofence window, applying analytics to streaming data, using SAS Event Stream Processing in a typical IoT reference architecture, the role of SAS Event Stream Manager in managing ESP deployments in an IoT ecosystem, how to use deep learning with Your IoT Digital, accounting for data quality variability in streaming GPS data for location-based analytics, and more!

 

 

 

Keep an eye out for these titles releasing in the next two months! We hope this list will help in your search for a SAS book that will get you to the next step in updating your SAS skills. To learn more about SAS Press, check out our up-and-coming titles, and to receive exclusive discounts make sure to subscribe to our newsletter.

Foresight is 2020! New books to take your skills to the next level was published on SAS Users.

1月 082020
 

Did I trick you into seeing what this blog is about with its mysterious title? I am going to talk about how to use the FIND function to search text values.

The FIND function searches for substrings in character values. For example, you might want to extract all email addresses ending in .edu from a list of email addresses. If you are a slightly older SAS programmer like me, you may be more familiar with the INDEX function. If you use only two arguments in the FIND function, the first being the string you are searching and the second being the substring you are looking for, the FIND function is identical to the INDEX function. Both of these functions will searczh the string (first argument) for the substring (second argument) and return the position where the substring starts. If the substring is not found, the function returns a zero.

The newer FIND function has several advantages over the older INDEX function. These advantages are realized by the optional third and fourth arguments to the FIND function. These two arguments allow you to specify a starting position for the search and modifiers that allow you to ignore case. You can use either of these two arguments, or both, and the order doesn't matter! How is this possible? The value for the starting position is always a numeric value and the value for the modifier is always a character value. Thus, SAS can always figure out if a value is a starting position or a modifier.

Let's look at an example

Suppose you have a SAS data set called Emails, and each observation in the data set contains a name and an email address.

Here is a listing of the SAS data set Emails:

You want to select all observations where the variable Email_Address contains .edu (ignoring case).

The program below does just that:

*Searching for .edu;
data Education;
   set Emails;
   if find(Email_Address,'.edu','i') then output;
run;
title "Listing of Data Set Education";
proc print data=Education noobs;
run;

The 'i' modifier is an instruction to ignore case. In the listing of Education below, notice that all the .edu addresses are listed, regardless of case.

Not only is the FIND function more flexible than the older INDEX function, the ignore case modifier is really handy.

For more tips on writing code and how to get started in SAS Studio, check out my book, Learning SAS by Example: A Programmer’s Guide, Second Edition. You can also download a free book excerpt. To also learn more about SAS Press, check out the up-and-coming titles and receive exclusive discounts, make sure to subscribe to the SAS Books newsletter.

Adventures of a SAS detective and the fantastic FIND function was published on SAS Users.

12月 172019
 

The next time you pick up a book, you might want to pause and think about the work that has gone into producing it – and not just from the authors!

The authors of the SAS classic, The Little SAS Book, Sixth Edition, did just that. The Acknowledgement section in the front of a book is usually short – just a few lines to thank family and significant others. In their sixth edition, authors Lora D. Delwiche and Susan J. Slaughter, took it to the next level and produced The Little SAS Book Family Tree. The authors explained:

“Over the years, many people have helped to make this book a reality. We are grateful to everyone who has contributed both to this edition, and to editions in the past. It takes a family to produce a book including reviewers, copyeditors, designers, publishing specialists, marketing specialists, and of course our editors.”

So what happens after you sign a book contract?

First you will be assigned a development editor (DE) who will answer questions and be with you every step of the way – from writing your sample chapter to publication. Your DE will discuss schedules and milestones as well as give you an authoring template, style guidelines, and any software you need to write your book.

Once you have all the resources you need, you'll write your sample chapter. This will help your DE evaluate your writing style, quality of graphics and output, structure, and any potential production issues.

The next step is submitting a draft manuscript for technical review. You'll get feedback from internal and external subject-matter experts, and then you can revise the manuscript based on this feedback. When you and your editor are satisfied with the technical content, your DE will perform a substantive edit on your manuscript, taking particular care with the structure and flow of your writing.

Once in production, your manuscript will be copy edited, and a production specialist will lay out the pages and make sure everything looks good. A graphic designer will work with you to create a cover that encompasses both our branding and your suggestions. Your book will be published in print and e-book versions and made ready for sale.

Finally, after the book is published, a marketing specialist will start promoting your book through our social media channels and other campaigns.

So the next time you pick up a book, spare a thought for the many people who have worked to make it a reality!

For more information about publishing with SAS or for our full catalogue, visit our online bookstore.

How many people does it take to publish a book? was published on SAS Users.

12月 042019
 

If you’re like me, you struggle to buy gifts. Most folks in my inner circle already have everything they need and most of what they want. Most folks, that is, except the tech-lovers. That’s because there’s always something new on the horizon. There’s always a new gadget or program. Or a new way to learn those things. If you know folks who fall into this category, and who love SAS, boy do I have some gift ideas for you:

Try SAS for free

If you know someone who is interested in SAS but doesn’t work with it on a daily basis, or someone looking to experiment in a different area of SAS, let them know about our free software trials. Who doesn’t love free? And you’ll look like a rock star when you point out that they can explore areas like SAS Data Preparation, SAS Visual Forecasting, SAS Visual Statistics and lots more, for free.

SAS University Edition

Let’s keep things in the vein of free, shall we? SAS University Edition includes SAS Studio, Base SAS, SAS/STAT, SAS/IML, SAS/ACCESS, and several time series forecasting procedures from SAS/ETS. It’s the same software used by sites around the world; that means it’s the most up-to-date statistical and quantitative methods. And it’s free for academic, noncommercial use.

Registration to SAS Global Forum 2020

Oh my gosh, you would be your loved one’s favorite person! I’ve been to many SAS Global Forums and I love seeing the excitement on someone’s face as they explore the booths, sit in on sessions, and network to their heart’s content. SAS Global Forum is the place to be for SAS professionals, thought leaders, decision makers, partners, students, and academics. There truly is something for everyone at this event for analytics enthusiasts. And who wouldn’t want to be in Washington, D.C. in the Spring? It’s cherry blossom time! Be sure to register before Jan. 29, 2020 to get early bird pricing and save $600 off the on-site registration fee.

SAS books

I worked for many years in SAS Press and saw, firsthand, how excited folks get over a new book. I mean, who doesn’t want the 6th edition of the Little SAS Book or the latest from Ron Cody? Browse more than 100 titles and find the perfect gift. Don’t forget, SAS Press is offering 25% off everything in the bookstore for the month of December! Use promo code HOLIDAYS25 at checkout when placing your order. Offer expires at midnight Tuesday, December 31, 2019 (US).

A SAS Learning subscription

Give the gift of unlimited learning with access to our selection of e-learning and video tutorials. You can give a monthly or yearly subscription. Help the special people in your life get a head start on a new career. The gift of education is one of the best gifts you can give.

SAS OnDemand for Academics

For those who don't want to install anything, but run SAS in the cloud, we offer SAS OnDemand for Academics. Users get free access to powerful SAS software for statistical analysis, data mining and forecasting. Point-and-click functionality means there’s no need to program. But if you like to program, you do can do that, too! It’s really the best of both worlds.

SAS blogs

I began this post with something free, and I’ll end it with something free: learning from our SAS blogs. Pick from areas such as customer intelligence, operations research, data science and more. You can also read content from specific regions such as Korea, Latin America, Japan and others. To get you started, here are a few posts from three of our most popular blog authors:

Chris Hemedinger, author of The SAS Dummy:

Rick Wicklin, author of The DO Loop:

Robert Allison, major contributor to Graphically Speaking:

Happy holidays and happy learning. Now go help someone geek out. And I won’t tell if you purchase a few of these things for yourself!

Gifts to give the SAS fan in your life was published on SAS Users.