tips & techniques

1月 072022
 

Welcome to the sixth installment in my series Getting Started with Python Integration to SAS Viya. In previous posts, I discussed how to connect to the CAS serverhow to execute CAS actions, and how to work with the results. Now it's time to generate simple descriptive statistics of a CAS table.

Let's begin by confirming the cars table is loaded into memory. With a connection to CAS established, execute the tableInfo action to view available in-memory tables. If necessary, you can execute the following code in SAS Studio to load the sashelp.cars table into memory.

conn.tableinfo(caslib="casuser")

The results show the cars table is loaded into memory and available for processing. Next, reference the cars table in the variable tbl. Then use the print function to show the value of the variable.

tbl = conn.CASTable('cars', caslib='casuser')
print(tbl)
CASTable('cars', caslib='casuser')

The results show that the tbl variable references the cars table in the CAS server.

Preview the CAS Table

First things first. Remember, the SWAT package blends the world of Pandas and CAS into one. So you can begin with the traditional head method to preview the CAS table.

tbl.head()

The SWAT head method returns five rows from the CAS server to the client as expected.

The Describe Method

Next, let's retrieve descriptive statistics of all numeric columns by using the familiar describe method on the CAS table.

tbl.describe()

The SWAT describe method returns the same descriptive statistics as the Pandas describe method. The only difference is that the SWAT version uses the CAS API to convert the describe method into CAS actions behind the scenes to process the data on the distributed CAS server. CAS processes the data and returns summarized results back to the client as a SASDataFrame, which is a subclass of the Pandas DataFrame. You can now work with the results as you would a Pandas DataFrame.

Summary CAS Action

Instead of using the familiar describe method, let's use a CAS action to do something similar. Here I'll use the summary CAS action.

tbl.summary()

Summary CAS Action

The results of the summary action return a CASResults object (Python dictionary) to the client. The CASResults object contains a single key named Summary with a SASDataFrame as the value. The SASDataFrame shows a variety of descriptive statistics.  While the summary action does not return exactly the same statistics as the describe method, it can provide additional insights into your data.

What if we don't want all the statistics for all of the data?

Selecting Columns and Summary Statistics with the Summary Action

Let's add additional parameters to the summary action. I'll add the inputs parameter to specify the columns to analyze in the CAS server.

tbl.summary(inputs = ['MPG_City','MPG_Highway'])

The results show only the MPG_City and MPG_Highway columns were analyzed.

Next, I'll use the subSet parameter to specify the summary statistics to produce. Here I'll obtain the MEAN, MIN and MAX.

tbl.summary(inputs = ['MPG_City','MPG_Highway'],
                       subSet = ['mean','min','max'])

The results processed only the MPG_City and MPG_Highway columns, and returned only the specified summary statistics to the client.

Creating a Calculated Column

Lastly, let's create a calculated column within the summary action. There are a variety of ways to do this. I like to add it as a parameter to the CASTable object. You can do that by specifying the tbl object, then computedVarsProgram parameter. Within computedVarsProgram you can use SAS assignment statements with most SAS functions. Here we will create a new column name MPG_Avg that takes the mean of MPG_City and MPG_Highway. Lastly, add the new column to the inputs parameter.

tbl.computedVarsProgram = 'MPG_Avg = mean(MPG_City, MPG_Highway);'
tbl.summary(inputs = ['MPG_City','MPG_Highway', 'MPG_Avg'],
                       subSet = ['mean','min','max'])

In the results I see the calculated column and requested summary statistics.

Summary

The SWAT package blends the world of Pandas and CAS. You can use many of the familiar Pandas methods within the SWAT package, or the flexible, highly optimized CAS actions like summary to easily obtain summary statistics of your data in the massively parallel processing CAS engine.

Additional and related resources

Getting Started with Python Integration to SAS® Viya® - Index
SWAT API Reference
CAS Action Documentation
SAS® Cloud Analytic Services: Fundamentals
SAS Scripting Wrapper for Analytics Transfer (SWAT)
CAS Action! - a series on fundamentals
Execute the following code in SAS Studio to load the sashelp.cars table into memory

Getting Started with Python Integration to SAS® Viya® - Part 6 - Descriptive Statistics was published on SAS Users.

12月 072021
 

This post is written in the hopes of easing the SAS Viya deployment process for novices like me. Firstly, deploying SAS Viya, like most enterprise software packages, isn't a skill we're innately born with. We're going to need a little help, some good documentation, and time to absorb the intricoes of the task.

There are many parts and pieces to standing up SAS Viya, depending on what you’re trying to accomplish and how you’d like to go about doing it. Know that the documentation and process can seem colossal and overwhelming, so take your time and don’t rush things. You got this.

Scope of the post

What this blog is and is not

This post will not walk you through the entirety of a deployment. Instead, it’ll point you to the right resources, guide you away from pitfalls, and show you how to accomplish certain tasks the documentation may not entirely cover. Many of these nuances were hard-earned lessons either by me or by people who have been kind enough to show me the way.

Please note the following

  • my experience is limited, and mostly pertains to AWS and Azure
  • the information is current at the time of this writing (December, 2021)

Please feel free to reach out to me if you have any suggestions, comments, or spot any mistakes. Many thanks!

Santa’s Workshop

Deploying SAS Viya is akin to creating toy trains in Santa’s workshop.

At its core, each toy train requires an engine, several cars, and a track. Likewise, each SAS Viya deployment requires a CAS engine that lives on a Kubernetes cluster together with several other servers (e.g., Compute, Connect, Stateful/Stateless), and storage.

Each toy train can be modified in numerous ways depending on the person’s preferences, whether it’s a steam locomotive or a bullet train. Or maybe it’s something more trivial, like merely the color of the train. Regardless of the need, Santa’s workshop contains a plethora of tools, materials, and plenty of knowledgeable elves who have different expertise and insights to customize the pipelines and trains.

Once again, each instance of SAS Viya can also be modified greatly depending on the customer’s needs. There are many hosts, flavors of servers, storage options, and common customizations. A SAS Viya deployment has its own kitchen sink full of tools, pipelines, and methods. And just like in Santa’s workshop, there are plenty of people who are experienced with deploying SAS Viya (and have specialties in different aspects of the deployment) who will assist if you run into issues.

Links Galore

There’s never a shortage of links required to complete deployments. I find myself with multiple windows filled with tabs (for referencing info) while I’m deploying so here’s a list of some I have found helpful.

      1. Setting up SAS Viya Monitoring for K8S
      2. Azure
      3. AWS
      4. GCP
      5. SAS Viya 4 Resource Guide

      Setting Up Your Environment

      There are several required tools for deployment. These include but are not necessarily limited to:

      • kubectl v1.19.9
      • kustomize 3.7.0
      • Docker

      Ensure your environment is set up precisely the way the docs recommend. For example, if you’re going the Terraform route from Viya4-IaC-AWS, you’re going to need this:

      • Terraform v1.0.0
      • kubectl v1.19.9
      • jq v1.6
      • AWS CLI v2.1.29

      The documentation is rather specific in terms of the required version, so please read carefully.

      Starting Off

      To start off, there are a few required readings to get a better understanding of SAS Viya architecture and requirements. Please review these webpages as often as you’d like to ensure understanding and avoid missing any steps.

      1. Getting Started portion of the Viya Operations documentation (linked previously).
      2. In the System Requirements section, please pay special attention to the “Kubernetes Client Machine Requirements” (under Virtual Infrastructure Requirements) to ensure you have the right tools and versions installed.
      3. When you’re done reading the above, it’s time to set up the IaC.

      Choose the corresponding link for “Help with Cluster Setup” (under Virtual Infrastructure Requirements) based on your cloud host of choice.

      IaC

      IaC stands for Infrastructure as Code. These are essentially scripts allowing you to build your cloud infrastructure and provision them through code instead of through the GUI. Several things to note here:

      1. I prefer cloning the IaC repo alongside the other folders, not within them so they’re better organized. It looks something like this:
        Viya4 <– Parent directory
        |– IaC
        |– Deploy
      2. Grab a sample .tfvars file under /examples and paste it into the root IaC directory. I recommend the “sample-input-minimal.tfvars” file if you’re just practicing.
        • Rename this file to “terraform.tfvars” (or preferred name, just be aware that the doc’s instructions assume that you have named it “terraform.tfvars”)
        • This file has several important values to keep in mind / input.
          1. This file contains the cluster configuration and details what all will be created
          2. “prefix” is essentially the name given to all your resources
          3. “default_public_access_cidrs” are CIDRs that you’d like to allow access to your cluster.
          4. “tags” you should include are {“resourceowner”=”your_Email”} (this is to ensure that people will be able to tell who owns the resource. Also, note that the preferred syntax is dependent on the cloud provider, please check the docs to be sure)
          5. “postgres_servers” should only be uncommented if you require an external db server (more expensive), if you don’t and you’re just practicing, leave it commented and it should create an internal one
      3. I highly recommend going the Docker route instead of Terraform (I have personally run into fewer problems through Docker, especially the tearing down process as compared to Terraform).
      4. It takes a while to create the cloud resources so have patience (takes about 15 mins at most).
      5. Once the resources exist, ensure you copy the [prefix]-eks-kubeconfig.conf file into your $(pwd) as well as your ~/.kube/config file if you’d like to keep it. The command to copy the conf file to your ~/.kube location is cp &lt;.conf file&gt; ~/.kube/config
      6. After you’re done with the above, make sure you run export KUBECONFIG=&lt;.conf file&gt;
      7. Test that your deployment is actually up: kubectl get nodes

      Post-IaC

      The next section covers additional SAS Viya requirements for the cluster after standing it up. There are a few things I’d recommend building after ensuring the deployment is up.

      • Ingress Controller
        kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.44.0/deploy/static/provider/cloud/deploy.yaml
      • Cert Manager
        kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.2.0/cert-manager.yaml
      • Helm/nfs-provisioner (this part is specifically for AWS)
        What’s happening here is that we’re getting the elastic load balancer URL from the ingress-nginx, the EFS ID, and installing the NFS server provisioner

        kubectl get service -n ingress-nginx
         
        ELBURL=$(kubectl get svc -n ingress-nginx ingress-nginx-controller --output jsonpath='{.status.loadBalancer.ingress[0].hostname}')
        echo $ELBURL
         
        EFSFSID=$(aws efs describe-file-systems --region $AWS_DEFAULT_REGION --query "FileSystems[*].FileSystemId" --region $AWS_DEFAULT_REGION --output text)
        echo $EFSFSID
         
        helm repo add stable  --force-update
        helm install stable/nfs-server-provisioner
         
        kubectl get storageclass # to check if the NFS server is up
      • Create a namespace where your SAS Viya deployment lives in the cluster – kubectl create ns . It is critical to go through the System Requirements entirely to ensure you don’t miss any steps (Just be sure that you’re following the portions meant for your cloud host). Examples in the Hardware and Resource Requirements page:
        • Azure – There’s an “Additional PVC Requirements for Microsoft Azure”: a link for “Specify PersistentVolumeClaims to Use ReadWriteMany StorageClass” where you’re required to add a file in the /site-config directory and an additional portion under “transformers” in the kustomization.yaml file
        • AWS – Under “File System and Shared Storage Requirements” refer to the notes on installing a provisioner for the EBS volumes. (The instructions are in the code block above)

      Installation

      This section sets up the parameters and additional customizations to included in the $deploy folder. It falls specifically under the Deployment tab of the SAS Viya Operations documentation.
      After retrieving the required files (under the desired version!), the certificates, license, and all assets and untarring them, take a good look at the section named "Directory Structure" so you have an understanding your desired file structure.

      Under “Installation -> Initial kustomization.yaml file”, once you’ve created your kustomization.yaml file, there are a few things of note here to change:

      1. {{ NAME-OF-NAMESPACE }}
        • If you haven’t already created the namespace where SAS Viya will live, do so now (instructions above #4)
        • Once you have a namespace, replace the entire thing including the {{}} with the name you have chosen.
        • You can always check what namespaces your cluster has by running kubectl get ns
      2. {{ NAME-OF-INGRESS-HOST }} and {{ PORT }} (note that there are multiple references in the kustomization.yaml file)
        • Use kubectl get service -n ingress-nginx and use the external-ip of the output
        • port is 80

      There are plenty of instructions beneath the kustomization.yaml file example, be sure to read through them and follow their instructions thoroughly.

      Additionally, Configure TLS

    1. kustomize build -o site.yaml
      kubectl apply --kubeconfig=kubeconfig-file --selector="sas.com/admin=cluster-wide" -f site.yaml
      kubectl wait --kubeconfig=kubeconfig-file --for condition=established --timeout=60s -l "sas.com/
      admin=cluster-wide" crd
      kubectl apply --kubeconfig=kubeconfig-file --selector="sas.com/admin=cluster-local" -f site.yaml --prune
      kubectl apply --kubeconfig=namespace-kubeconfig-file --selector="sas.com/admin=namespace" -f site.yaml --prune

OR
kustomize build . | kubectl apply -f -
(Note that this is the shortcut of building and piping the results to be applied in kubectl. It does not output a site.yaml file.)

There are a few false-positive errors that may appear during the process (the documentation outlines them pretty clearly).

Post-Deployment

You may run the readiness service to check for when your deployment is ready. Note that this process is lengthy and the fastest I’ve seen a deployment go up is about 15-20 mins. (Now’s a good time to go for a walk or get a cup of coffee).

I highly recommend using Lens to visualize the deployment process and to take a look at the pods and their logs (mini section below).

While all of these steps are possible in Lens, it’s good to know the commands required to inspect and manipulate pods.

kubectl get pods -n  # Take a look at all the pods, add a -W flag to watch them as they update
kubectl describe pod  -n  # To describe specific pods
kubectl logs  -n  # To see the logs of specific pods
kubectl delete pods  -grace-period=0 --force # To force deletion of pods, pods will automatically restart after being deleted.

Important pods to look at:

  • Logon
  • Consul
  • Cache

These pods are pre-requisites for many other pods to come up. If they’re stuck, go ahead and delete them to initiate a restart. This seems to work frequently.

If the pods look good, try going to this website: www.name-of-ingress-host:port/SASDrive. You should see a blue background and a SAS login screen.

Hooray! Now you just have to follow the Sign In as the sasboot User instructions and complete other post deployment tasks (Post-Installation Tasks, Validating the Deployment, etc.” that are pertinent to your use case.

Quick aside: Lens

K8s Lens is an incredibly useful IDE to visualize what is going on in your Kubernetes cluster.
Here are two quick screenshots to help you get situated when you’re looking at pods.

First, you need your .conf file to connect to your cluster. Upon entry, click on Workloads -> Pods to look at the pods. Also click on your namespace for all of the pods for the SAS Viya Deployment to show up.

There are times where you’ll see a yellow triangle with an exclamation mark. While this is technically a warning, it may be an indicator of an error your pod is suffering from. (If you see a HTTP 503 Readiness Probe error, it may just mean that the pod is starting up)

Click on the pod and the lines on the top right in order to see the logs for the chosen pod.

Conclusion

Hopefully this post was helpful for your start in deploying SAS Viya.

Please remember there’s a lot more to it than is covered here. Don’t be disheartened if this wasn’t particularly easy, it certainly wasn’t for me.
Know there are plenty of customizations as well as a constant stream of changes (updates, product related etc.), new methods, and places to deploy.
So there’s always plenty to learn.

Please feel free to reach out and let me know if you have any questions or suggestions for this post.

Acknowledgements

Many thanks to my colleagues Ali Aiello and Jacob Braswell for answering my incessant questions and helping me on this journey!

A Novice Perspective on SAS Viya Deployment was published on SAS Users.

12月 072021
 

This post is written in the hopes of easing the SAS Viya deployment process for novices like me. Firstly, deploying SAS Viya, like most enterprise software packages, isn't a skill we're innately born with. We're going to need a little help, some good documentation, and time to absorb the intricoes of the task.

There are many parts and pieces to standing up SAS Viya, depending on what you’re trying to accomplish and how you’d like to go about doing it. Know that the documentation and process can seem colossal and overwhelming, so take your time and don’t rush things. You got this.

Scope of the post

What this blog is and is not

This post will not walk you through the entirety of a deployment. Instead, it’ll point you to the right resources, guide you away from pitfalls, and show you how to accomplish certain tasks the documentation may not entirely cover. Many of these nuances were hard-earned lessons either by me or by people who have been kind enough to show me the way.

Please note the following

  • my experience is limited, and mostly pertains to AWS and Azure
  • the information is current at the time of this writing (December, 2021)

Please feel free to reach out to me if you have any suggestions, comments, or spot any mistakes. Many thanks!

Santa’s Workshop

Deploying SAS Viya is akin to creating toy trains in Santa’s workshop.

At its core, each toy train requires an engine, several cars, and a track. Likewise, each SAS Viya deployment requires a CAS engine that lives on a Kubernetes cluster together with several other servers (e.g., Compute, Connect, Stateful/Stateless), and storage.

Each toy train can be modified in numerous ways depending on the person’s preferences, whether it’s a steam locomotive or a bullet train. Or maybe it’s something more trivial, like merely the color of the train. Regardless of the need, Santa’s workshop contains a plethora of tools, materials, and plenty of knowledgeable elves who have different expertise and insights to customize the pipelines and trains.

Once again, each instance of SAS Viya can also be modified greatly depending on the customer’s needs. There are many hosts, flavors of servers, storage options, and common customizations. A SAS Viya deployment has its own kitchen sink full of tools, pipelines, and methods. And just like in Santa’s workshop, there are plenty of people who are experienced with deploying SAS Viya (and have specialties in different aspects of the deployment) who will assist if you run into issues.

Links Galore

There’s never a shortage of links required to complete deployments. I find myself with multiple windows filled with tabs (for referencing info) while I’m deploying so here’s a list of some I have found helpful.

      1. Setting up SAS Viya Monitoring for K8S
      2. Azure
      3. AWS
      4. GCP
      5. SAS Viya 4 Resource Guide

      Setting Up Your Environment

      There are several required tools for deployment. These include but are not necessarily limited to:

      • kubectl v1.19.9
      • kustomize 3.7.0
      • Docker

      Ensure your environment is set up precisely the way the docs recommend. For example, if you’re going the Terraform route from Viya4-IaC-AWS, you’re going to need this:

      • Terraform v1.0.0
      • kubectl v1.19.9
      • jq v1.6
      • AWS CLI v2.1.29

      The documentation is rather specific in terms of the required version, so please read carefully.

      Starting Off

      To start off, there are a few required readings to get a better understanding of SAS Viya architecture and requirements. Please review these webpages as often as you’d like to ensure understanding and avoid missing any steps.

      1. Getting Started portion of the Viya Operations documentation (linked previously).
      2. In the System Requirements section, please pay special attention to the “Kubernetes Client Machine Requirements” (under Virtual Infrastructure Requirements) to ensure you have the right tools and versions installed.
      3. When you’re done reading the above, it’s time to set up the IaC.

      Choose the corresponding link for “Help with Cluster Setup” (under Virtual Infrastructure Requirements) based on your cloud host of choice.

      IaC

      IaC stands for Infrastructure as Code. These are essentially scripts allowing you to build your cloud infrastructure and provision them through code instead of through the GUI. Several things to note here:

      1. I prefer cloning the IaC repo alongside the other folders, not within them so they’re better organized. It looks something like this:
        Viya4 <– Parent directory
        |– IaC
        |– Deploy
      2. Grab a sample .tfvars file under /examples and paste it into the root IaC directory. I recommend the “sample-input-minimal.tfvars” file if you’re just practicing.
        • Rename this file to “terraform.tfvars” (or preferred name, just be aware that the doc’s instructions assume that you have named it “terraform.tfvars”)
        • This file has several important values to keep in mind / input.
          1. This file contains the cluster configuration and details what all will be created
          2. “prefix” is essentially the name given to all your resources
          3. “default_public_access_cidrs” are CIDRs that you’d like to allow access to your cluster.
          4. “tags” you should include are {“resourceowner”=”your_Email”} (this is to ensure that people will be able to tell who owns the resource. Also, note that the preferred syntax is dependent on the cloud provider, please check the docs to be sure)
          5. “postgres_servers” should only be uncommented if you require an external db server (more expensive), if you don’t and you’re just practicing, leave it commented and it should create an internal one
      3. I highly recommend going the Docker route instead of Terraform (I have personally run into fewer problems through Docker, especially the tearing down process as compared to Terraform).
      4. It takes a while to create the cloud resources so have patience (takes about 15 mins at most).
      5. Once the resources exist, ensure you copy the [prefix]-eks-kubeconfig.conf file into your $(pwd) as well as your ~/.kube/config file if you’d like to keep it. The command to copy the conf file to your ~/.kube location is cp &lt;.conf file&gt; ~/.kube/config
      6. After you’re done with the above, make sure you run export KUBECONFIG=&lt;.conf file&gt;
      7. Test that your deployment is actually up: kubectl get nodes

      Post-IaC

      The next section covers additional SAS Viya requirements for the cluster after standing it up. There are a few things I’d recommend building after ensuring the deployment is up.

      • Ingress Controller
        kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.44.0/deploy/static/provider/cloud/deploy.yaml
      • Cert Manager
        kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.2.0/cert-manager.yaml
      • Helm/nfs-provisioner (this part is specifically for AWS)
        What’s happening here is that we’re getting the elastic load balancer URL from the ingress-nginx, the EFS ID, and installing the NFS server provisioner

        kubectl get service -n ingress-nginx
         
        ELBURL=$(kubectl get svc -n ingress-nginx ingress-nginx-controller --output jsonpath='{.status.loadBalancer.ingress[0].hostname}')
        echo $ELBURL
         
        EFSFSID=$(aws efs describe-file-systems --region $AWS_DEFAULT_REGION --query "FileSystems[*].FileSystemId" --region $AWS_DEFAULT_REGION --output text)
        echo $EFSFSID
         
        helm repo add stable  --force-update
        helm install stable/nfs-server-provisioner
         
        kubectl get storageclass # to check if the NFS server is up
      • Create a namespace where your SAS Viya deployment lives in the cluster – kubectl create ns . It is critical to go through the System Requirements entirely to ensure you don’t miss any steps (Just be sure that you’re following the portions meant for your cloud host). Examples in the Hardware and Resource Requirements page:
        • Azure – There’s an “Additional PVC Requirements for Microsoft Azure”: a link for “Specify PersistentVolumeClaims to Use ReadWriteMany StorageClass” where you’re required to add a file in the /site-config directory and an additional portion under “transformers” in the kustomization.yaml file
        • AWS – Under “File System and Shared Storage Requirements” refer to the notes on installing a provisioner for the EBS volumes. (The instructions are in the code block above)

      Installation

      This section sets up the parameters and additional customizations to included in the $deploy folder. It falls specifically under the Deployment tab of the SAS Viya Operations documentation.
      After retrieving the required files (under the desired version!), the certificates, license, and all assets and untarring them, take a good look at the section named "Directory Structure" so you have an understanding your desired file structure.

      Under “Installation -> Initial kustomization.yaml file”, once you’ve created your kustomization.yaml file, there are a few things of note here to change:

      1. {{ NAME-OF-NAMESPACE }}
        • If you haven’t already created the namespace where SAS Viya will live, do so now (instructions above #4)
        • Once you have a namespace, replace the entire thing including the {{}} with the name you have chosen.
        • You can always check what namespaces your cluster has by running kubectl get ns
      2. {{ NAME-OF-INGRESS-HOST }} and {{ PORT }} (note that there are multiple references in the kustomization.yaml file)
        • Use kubectl get service -n ingress-nginx and use the external-ip of the output
        • port is 80

      There are plenty of instructions beneath the kustomization.yaml file example, be sure to read through them and follow their instructions thoroughly.

      Additionally, Configure TLS

    1. kustomize build -o site.yaml
      kubectl apply --kubeconfig=kubeconfig-file --selector="sas.com/admin=cluster-wide" -f site.yaml
      kubectl wait --kubeconfig=kubeconfig-file --for condition=established --timeout=60s -l "sas.com/
      admin=cluster-wide" crd
      kubectl apply --kubeconfig=kubeconfig-file --selector="sas.com/admin=cluster-local" -f site.yaml --prune
      kubectl apply --kubeconfig=namespace-kubeconfig-file --selector="sas.com/admin=namespace" -f site.yaml --prune

OR
kustomize build . | kubectl apply -f -
(Note that this is the shortcut of building and piping the results to be applied in kubectl. It does not output a site.yaml file.)

There are a few false-positive errors that may appear during the process (the documentation outlines them pretty clearly).

Post-Deployment

You may run the readiness service to check for when your deployment is ready. Note that this process is lengthy and the fastest I’ve seen a deployment go up is about 15-20 mins. (Now’s a good time to go for a walk or get a cup of coffee).

I highly recommend using Lens to visualize the deployment process and to take a look at the pods and their logs (mini section below).

While all of these steps are possible in Lens, it’s good to know the commands required to inspect and manipulate pods.

kubectl get pods -n  # Take a look at all the pods, add a -W flag to watch them as they update
kubectl describe pod  -n  # To describe specific pods
kubectl logs  -n  # To see the logs of specific pods
kubectl delete pods  -grace-period=0 --force # To force deletion of pods, pods will automatically restart after being deleted.

Important pods to look at:

  • Logon
  • Consul
  • Cache

These pods are pre-requisites for many other pods to come up. If they’re stuck, go ahead and delete them to initiate a restart. This seems to work frequently.

If the pods look good, try going to this website: www.name-of-ingress-host:port/SASDrive. You should see a blue background and a SAS login screen.

Hooray! Now you just have to follow the Sign In as the sasboot User instructions and complete other post deployment tasks (Post-Installation Tasks, Validating the Deployment, etc.” that are pertinent to your use case.

Quick aside: Lens

K8s Lens is an incredibly useful IDE to visualize what is going on in your Kubernetes cluster.
Here are two quick screenshots to help you get situated when you’re looking at pods.

First, you need your .conf file to connect to your cluster. Upon entry, click on Workloads -> Pods to look at the pods. Also click on your namespace for all of the pods for the SAS Viya Deployment to show up.

There are times where you’ll see a yellow triangle with an exclamation mark. While this is technically a warning, it may be an indicator of an error your pod is suffering from. (If you see a HTTP 503 Readiness Probe error, it may just mean that the pod is starting up)

Click on the pod and the lines on the top right in order to see the logs for the chosen pod.

Conclusion

Hopefully this post was helpful for your start in deploying SAS Viya.

Please remember there’s a lot more to it than is covered here. Don’t be disheartened if this wasn’t particularly easy, it certainly wasn’t for me.
Know there are plenty of customizations as well as a constant stream of changes (updates, product related etc.), new methods, and places to deploy.
So there’s always plenty to learn.

Please feel free to reach out and let me know if you have any questions or suggestions for this post.

Acknowledgements

Many thanks to my colleagues Ali Aiello and Jacob Braswell for answering my incessant questions and helping me on this journey!

A Novice Perspective on SAS Viya Deployment was published on SAS Users.

10月 142021
 

Trimming strings left and right

I am pretty sure you have never heard of the TRIMS function, and I would be genuinely surprised if you told me otherwise. This is because this function does not exist (at least at the time of this writing).

But don’t worry, the difference between "nonexistence" and "existence" is only a matter of time, and from now it is less than a blog away. Let me explain. Recently, I published two complementary blog posts:

[1] Removing leading characters from SAS strings

[2] Removing trailing characters from SAS strings

While working on these pieces and researching “prior art” I stumbled upon a multipurpose function in the SAS FedSQL Language that alone does either one or both of these things – remove leading or/and trailing characters from SAS strings.

FedSQL Language and Proc FedSQL

The FedSQL language is the SAS proprietary implementation of the ANSI SQL:1999 core standard. Expectedly, the FedSQL language is implemented in SAS by means of the FedSQL procedure (PROC FEDSQL). This procedure enables you to submit FedSQL language statements from a Base SAS session, and it is supported in both SAS 9.4 and SAS Viya.

Using the FEDSQL procedure, you can submit FedSQL language statements to SAS and third-party data sources that are accessed with SAS and SAS/ACCESS library engines. Or, if you have SAS Cloud Analytic Services (CAS) configured, you can submit FedSQL language statements to the CAS server.

FedSQL TRIM function

FedSQL language has its own vast FedSQL Functions library with hundreds of functions many of which replicate SAS 9.4 Functions. Many, but not all. Deep inside this FedSQL functions library, there is a unique treasure modestly called TRIM Function which is quite different from the BASE SAS Language TRIM() function.

While SAS 9.4 BASE TRIM() function capabilities are quite limited - it removes just trailing blanks from a character string, the FedSQL TRIM() function is way much more powerful. This triple-action function can remove not just trailing blanks, but also leading blanks, as well as both, leading and trailing blanks. On top of it, it can remove not just blanks, but any characters (although one character at a time). See for yourself, this function has the following pretty self-explanatory syntax:

TRIM( [BOTH | LEADING | TRAILING] [trim-character] FROM column)

Here trim-character specifies one character (in single quotations marks) to remove from column. If trim-character is not specified, the function removes blanks.

While being called a function, it does not look like a regular SAS function where arguments are separated by commas.  It looks more like an SQL statement (which it understandably is – it is part of the FedSQL language). However, this function is available only in PROC FEDSQL; it’s not available in SAS DATA steps or other PROC steps. Still, it gives us pretty good idea of what such a universal function may look like.

User-defined function TRIMS to remove leading or/and trailing characters in SAS strings

Let’s build such a function by means of the PROC FCMP for the outside the FedSQL usage (it is worth noticing that the FCMP procedure is not supported for FedSQL). To avoid confusion with the existing TRIM function we will call our new function TRIMS (with an ‘S’ at the end) which suits our purpose quite well denoting its plural purpose. First, we define what we are going to create.

User-defined TRIMS function

TRIMS Function

Removes leading characters, trailing characters, or both from a character string.

Syntax

TRIMS(function-modifier, string, trim-list, trim-list-modifier)

Required Arguments

  • function-modifier is a case-insensitive character constant, variable, or expression that specifies one of three possible operations:
    'L' or 'l' – removes leading characters.
    'T' or 't' – removes trailing characters.
    'B' or 'b' – removes both, leading and trailing characters.
  • string is a case-sensitive character constant, variable, or expression that specifies the character string to be trimmed.
  • trim-list is a case-sensitive character constant, variable, or expression that specifies character(s) to remove from the string.
  • trim-list-modifier is a case-insensitive character constant variable, or expression that supplements the trim-list.
    The valid values are those modifiers of the FINDC function that “add” groups of characters (e.g. 'a' or 'A', 'c' or 'C', 'd' or 'D', etc.) to the trim-list.

The following user-defined function implementation is based on the coding techniques described in the two previous posts, [1] and [2] that I mentioned above. Here goes.

 
libname funclib 'c:\projects\functions';
 
/* delete previous function definition during debugging */
options cmplib=funclib.userfuncs;
proc fcmp outlib=funclib.userfuncs.package1;
   deletefunc trims;
run;
 
/* new function defintion */
proc fcmp outlib=funclib.userfuncs.package1;
   function trims(f $, str $, clist $, mod $) $32767;
      from = 1;
      last = length(str);
      if upcase(f) in ('L', 'B') then from = findc(str, clist, 'K'||mod);
      if from=0 then return('');
      if upcase(f) in ('T', 'B') then last = findc(str, clist, 'K'||mod, -last); 
      if last=0 then return('');
      return(substr(str, from, last-from+1));      
   endfunc; 
run;

Code highlights

  • In the function definition, we first assign initial values of the target substring positions as from=1 and last=length(str).
  • Then for Leading or Both character removal, we calculate an adjusted value of from as a position of the first character in str that is not listed in clist and not defined by the mod
  • If from=0 then we return blank and stop further calculations as this means that ALL characters are to be removed.
  • Then for Trailing or Both character removal, we calculate an adjusted value of last as a position of the last character in str that is not listed in clist and not defined by the mod
  • If last=0 then we return blank and stop further calculations as this means that ALL characters are to be removed.
  • And finally, we return a substring of str starting at the from position and ending at the last position, that is with the length of last-from+1.

TRIMS function usage

Let’s define SAS data set SOURCE as follows:

data SOURCE;
   input X $ 1-30;
   datalines;
*00It's done*2*1**-
*--*1****9*55
94*Clean record-*00
;

In the following DATA step, we will create three new variables with removed leading (variable XL), trailing (variable XT) and both - leading and trailing (variable XB) characters '*' and '-' as well as any digits:

options cmplib=funclib.userfuncs;
data TARGET;
   set SOURCE;
   length XB XL XT $30;
   XB = trims('b', X, '*-', 'd');
   XL = trims('L', X, '*-', 'd');
   XT = trims('t', X, '*-', 'd');
run;

In this code we use the TRIM function three times, each time with a different first argument to illustrate how this affects the outcome.

Arguments usage highlights

  • The first argument of the TRIMS function specifies whether we remove characters from both leading and trailing positions ('b'), from leading positions only ('L'), or from trailing positions only ('t'). This argument is case-insensitive. (I prefer using capital 'L' for clarity since lowercase 'l' looks like digit '1').
  • The second argument specifies the name of the variable (X) that we are going to remove characters from (variable X is coming from the dataset SOURCE).
  • The third argument '*-' specifies which character (or characters) to remove. In our example we are removing '*' and '-'. If you do not need to explicitly specify any character here, you still must supply a null value ('') since it is a required argument. In this case, the fourth argument (trim-list-modifier) will determine the set of characters to be removed.
  • And finally, the fourth argument (case-insensitive) of the TRIMS function specifies the FINDC function modifier(s) to remove certain characters in bulk (in our example 'd' will remove all digits). If such modifier is not needed, you still must supply a null value ('') since all four arguments of the TRIMS function are positional and required.

Here is the output data table TARGRET showing the original string X and the resulting strings XB (Both leading and trailing characters removed), XL (Leading characters removed) and XT (Trailing characters removed) side by side:

Result of leading and trailing characters trimming

Conclusion

The new TRIMS function presented in this blog post goes ways further the ubiquitous LEFT and TRIM functions that remove the leading (LEFT) or trailing (TRIM) blanks. The TRIMS function handles ANY characters, not just blanks. It also expands the character deletion functionality of the powerful  FedSQL TRIM function beyond just removing any single leading and/or trailing character. The TRIMS function single-handedly removes any number of explicitly specified characters from either leading, trailing or both (leading and trailing) positions. Plus, it removes in bulk many implicitly specified characters. For example 'd' modifier removes all digits, 'du' modifier removes all digits ('d') and all uppercase letters ('u'), 'dup' modifier removes all digits ('d'), all uppercase letters ('u') and all punctuation marks ('p'); and so on as described by the FINDC function modifiers. The order in which modifier characters are listed does not matter.

Additional resources

Questions? Thoughts? Comments?

Do you find this post useful? Please share your thoughts with us below.

Introducing TRIMS function to remove any leading and/or trailing characters from SAS strings was published on SAS Users.

10月 092021
 

Just because you are using CAS actions doesn't mean you can forget about the powerful SAS DATA step. The dataStep.runCode CAS action is here!

Welcome back to my SAS Users blog series CAS Action! - a series on fundamentals. I've broken the series into logical, consumable parts. If you'd like to start by learning a little more about what CAS Actions are, please see CAS Actions and Action Sets - a brief intro. Or if you'd like to see other topics in the series, see the overview page.

In this example, I will use the CAS procedure to execute the dataStep.runCode CAS action. Be aware, instead of using the CAS procedure, I could execute the action with Python, R, or even a REST API with some slight changes to the syntax for the specific language.

Why use the DATA Step?

It's pretty simple, the DATA step is a powerful way to process your data. It gives you full control of each row and column, ability to easily create multiple output tables, and provides a variety of statements to pretty much do anything you need.

In this example, I will use the DATA step to quickly create three CAS tables based on the value of a column.  Before we execute the DATA step, let's view the frequency values of the Origin column in the cars table. To do that, I'll use the simple.freq action.

proc cas;
    simple.freq / 
          table={name='cars', caslib='casuser'},
           input='Origin';
quit;

The result of the freq action shows that the Origin column in the cars CAS table has three distinct values: Asia, Europe and USA. I can use that information to create three CAS tables based off these unique values using the SAS DATA step.

Execute DATA Step in SAS Viya's CAS Server

One way to execute the DATA step directly in CAS is to use the runCode action with the code parameter. In the code parameter just specify the DATA step as a string. That's it!

In this example, I'll add the DATA step within a SOURCE block. The SOURCE block stores the code as variable. The DATA step code is stored in the variable originTables. This DATA step will create three CAS tables, one table for each unique value of the Origin column in the cars table.

proc cas;
    source originTables;
        data casuser.Asia
             casuser.Europe
             casuser.USA;
            set casuser.cars;
            if Origin='Asia' then output casuser.Asia;
            else if Origin='Europe' then output casuser.Europe;
            else if Origin='USA' then output casuser.USA;
        run;
    endsource;
 
    dataStep.runCode / code=originTables;
quit;

The runCode action executes the DATA step in the distributed CAS environment and returns information about the input and output tables. Notice three CAS tables were created: Asia, Europe and USA.

DATA Step in CAS has Limitations

Now, one thing to be aware of is not all functionality of the DATA step is available in CAS. If you are using the runCode action with an unsupported statement or function in CAS, you will receive an error. Let's look at an example using the first function, which gets the first letter of a string, and is not supported in CAS.

proc cas;
    source originTables;
        data casuser.bad;
            set casuser.cars;
            NewCol=first(Model);
        run;
    endsource;
    dataStep.runCode / code=originTables;
quit;

 

The results of the runCode action return an error. The error occurs because the FIRST function is unknown or cannot be accessed. In situations like this you will need to find a CAS supported method to complete the task. (HINT: Here instead of the first function you can use the substr function).

For more information visit Restrictions and Supported Language Elements. Be sure to find the version of your SAS Viya environment.

Summary

In SAS Viya, the runCode action provides an easy way to execute most of the traditional DATA step in CAS in any language, from the CAS Language (CASL), to Python, R, Lua, Java and more.

Additional Resources

runCode Action
DATA Step Action Set: Details
Restrictions and Supported Language Elements
SOURCE statement
SAS® Cloud Analytic Services: Fundamentals
Code

CAS-Action! Executing the SAS DATA Step in SAS Viya was published on SAS Users.

10月 092021
 

Just because you are using CAS actions doesn't mean you can forget about the powerful SAS DATA step. The dataStep.runCode CAS action is here!

Welcome back to my SAS Users blog series CAS Action! - a series on fundamentals. I've broken the series into logical, consumable parts. If you'd like to start by learning a little more about what CAS Actions are, please see CAS Actions and Action Sets - a brief intro. Or if you'd like to see other topics in the series, see the overview page.

In this example, I will use the CAS procedure to execute the dataStep.runCode CAS action. Be aware, instead of using the CAS procedure, I could execute the action with Python, R, or even a REST API with some slight changes to the syntax for the specific language.

Why use the DATA Step?

It's pretty simple, the DATA step is a powerful way to process your data. It gives you full control of each row and column, ability to easily create multiple output tables, and provides a variety of statements to pretty much do anything you need.

In this example, I will use the DATA step to quickly create three CAS tables based on the value of a column.  Before we execute the DATA step, let's view the frequency values of the Origin column in the cars table. To do that, I'll use the simple.freq action.

proc cas;
    simple.freq / 
          table={name='cars', caslib='casuser'},
           input='Origin';
quit;

The result of the freq action shows that the Origin column in the cars CAS table has three distinct values: Asia, Europe and USA. I can use that information to create three CAS tables based off these unique values using the SAS DATA step.

Execute DATA Step in SAS Viya's CAS Server

One way to execute the DATA step directly in CAS is to use the runCode action with the code parameter. In the code parameter just specify the DATA step as a string. That's it!

In this example, I'll add the DATA step within a SOURCE block. The SOURCE block stores the code as variable. The DATA step code is stored in the variable originTables. This DATA step will create three CAS tables, one table for each unique value of the Origin column in the cars table.

proc cas;
    source originTables;
        data casuser.Asia
             casuser.Europe
             casuser.USA;
            set casuser.cars;
            if Origin='Asia' then output casuser.Asia;
            else if Origin='Europe' then output casuser.Europe;
            else if Origin='USA' then output casuser.USA;
        run;
    endsource;
 
    dataStep.runCode / code=originTables;
quit;

The runCode action executes the DATA step in the distributed CAS environment and returns information about the input and output tables. Notice three CAS tables were created: Asia, Europe and USA.

DATA Step in CAS has Limitations

Now, one thing to be aware of is not all functionality of the DATA step is available in CAS. If you are using the runCode action with an unsupported statement or function in CAS, you will receive an error. Let's look at an example using the first function, which gets the first letter of a string, and is not supported in CAS.

proc cas;
    source originTables;
        data casuser.bad;
            set casuser.cars;
            NewCol=first(Model);
        run;
    endsource;
    dataStep.runCode / code=originTables;
quit;

 

The results of the runCode action return an error. The error occurs because the FIRST function is unknown or cannot be accessed. In situations like this you will need to find a CAS supported method to complete the task. (HINT: Here instead of the first function you can use the substr function).

For more information visit Restrictions and Supported Language Elements. Be sure to find the version of your SAS Viya environment.

Summary

In SAS Viya, the runCode action provides an easy way to execute most of the traditional DATA step in CAS in any language, from the CAS Language (CASL), to Python, R, Lua, Java and more.

Additional Resources

runCode Action
DATA Step Action Set: Details
Restrictions and Supported Language Elements
SOURCE statement
SAS® Cloud Analytic Services: Fundamentals
Code

CAS-Action! Executing the SAS DATA Step in SAS Viya was published on SAS Users.

9月 022021
 

Deleting any trailing characters in SAS stringsIn my previous post, we solved the task of removing specified leading characters from SAS strings. In this post, we tackle the complementary task of removing trailing characters.

While removing trailing blanks is well covered in SAS by the TRIM() and TRIMN() functions, removing non-blank trailing characters remains a bit of a mystery that can pop up during text string processing.

For example, you may need to clean up the following strings by removing all trailing x's from them:

012345x
012345xxx
012345xx

These extra characters can result from data entry errors, prior data manipulations, etc. No matter how you get them in, you want them out.

How to remove any trailing characters

For instance, let’s remove all occurrences of the arbitrary trailing character '*'. The following diagram illustrates what we are going to achieve and how:
Diagram: Deleting trailing characters
In order to remove a specified character (in this example '*') from all trailing positions in a string, we need to search our string from right to left starting from the rightmost non-blank character position and find the position p of the first character in that string that is not equal to the specified character. Note, that despite our right-to-left direction of search the position p=10 is still enumerated from left-to-right. Then we can extract the substring starting from position 1 with the length equal to the found position p.

Unlike in our leading characters removal solution, out of two contenders for our search functionality, VERIFY and FINDC, the VERIFY function has to be dropped from the competition as it does not provide right-to-left search functionality. However, the FINDC function stays on track. Here is a possible solution using the FINDC() function.

Using FINDC() function with negative start-position

The FINDC(X, C, ’K’, -LENGTH(X)) function searches string X from right to left starting from the last non-blank character position determined by the optional start-position argument equal to LENGTH(X), and returns the position P of the first character that does not appear in C.

Here we use the K modifier that switches the default behavior of searching for any character that appears in C to searching for any character that does not appear in C.

The direction of search is defined by the minus sign in front of the start-position (a negative start-position argument translates into searching from right to left.)

Then we can apply the SUBSTR(X, 1, P) function that extracts a substring of X starting from position 1 with a length of P which is effectively a substring of the first P characters in X.

Special considerations

Before we proceed to the code implementation of the outlined solution let’s consider the following edge case.

If our string X consists of all '*' characters and nothing else the FINDC() function will find no character (and therefore no position) that is not equal to '*'. In this case it will return 0. However, 0 is not a valid third argument value for the SUBSTR(X, 1, P) function. Valid values are 1 . . . through VLENGTH(X) – the length attribute of X. Having a 0 value for the third argument will trigger the automatic data step variable _ERROR_=1 and the following note generated in the SAS log:

NOTE: Invalid third argument to function SUBSTR at line ## column #.

Therefore, we need to handle this special case separately, conditionally using SUBSTR(X, 1, P) for P>0 and assigning blank ('') otherwise.

Code implementation for removing trailing characters

Now we are ready to put everything together. First, let’s create a test data table:

data TEST;
   input X $ 1-20;
   datalines;
*It's done***
*********
**01234*ABC**
No trailing *'s
;

Then we apply the logic described above. The following DATA step illustrates our coding solution for deleting trailing characters:

data CLEAN (keep=X Y);
   set TEST;
   C = '*'; *<- trailing character(s) to be removed;
 
   P = findc(X, C, 'K', -length(X));
   if P then Y = substr(X, 1, P); 
        else Y = '';
 
   put _n_= / X= / P= / Y= /;
run;

The SAS log will show interim and final results by the DATA step iterations:

_N_=1
X=*It's done***
P=10
Y=*It's done
 
_N_=2
X=*********
P=0
Y=
 
_N_=3
X=**01234*ABC**
P=11
Y=**01234*ABC
 
_N_=4
X=No trailing *'s
P=15
Y=No trailing *'s

Here is the output data table CLEAN showing the original and the resulting strings X and Y side by side:
Removing any trailing characters in SAS strings

Conclusion

The solution presented in this blog post expands trailing character deletion functionality beyond solely blanks (which are handled by the TRIM and TRIMN functions). Moreover, using this coding technique, we can simultaneously remove a variety of trailing characters. For example, if we have a string X='012345xxx.%' and specify C = 'x.%' (the order of characters listed within the value of C does not matter), then all three characters 'x', '.', and '%' will be removed from all trailing positions of X. The resulting string will be Y='012345'.

In addition, numerous modifiers of the FINDC() function allow specifying many characters in bulk, without explicitly listing them one by one. For example, we may augment a list of characters being removed by adding the D modifier as in P = FINDC(X, C, 'KD', -LENGTH(X)) which will remove all trailing digits in addition to those characters specified in C. Similarly, we may throw in the U modifier as in P = FINDC(X, C, 'KDU', -LENGTH(X)) which adds all uppercase letters to the list of trailing characters to be removed. And so on.

Additional resources

Questions? Thoughts? Comments?

Do you find this post useful? Do you have questions, concerns, comments? Please share with us below.

Removing trailing characters from SAS strings was published on SAS Users.

8月 232021
 

Illustration for trimming leading characters in SAS stringsAs in many other programming languages, there is a very useful SAS function that removes leading blanks in character strings. It is the ubiquitous LEFT function.

The LEFT(x) function left-aligns a character string x, which effectively removes leading blanks.

However, in many SAS applications we need a similar but more versatile data cleansing functionality allowing for removal of other leading characters, not just blanks. For example, consider some bank account numbers that are stored as the following character strings:

123456789
0123456789
000123456789

These strings represent the same account number recorded with either no, one, or several leading zeros. One way of standardizing this data is by removing the leading 0's. And while we're at it, why don’t we address the leading character removal functionality for any leading characters, not just zeros.

How to remove any leading characters

For example, let’s remove all occurrences of the arbitrary leading character '*'. The following diagram illustrates what we are going to achieve and how:

In order to remove a specified character (in this example '*') from all leading positions in a string, we need to search our string from left to right and find the position of the first character in that string that is not equal to the specified character. In this case, it’s a blank character in position 4. Then we can extract a substring starting from that position till the end of the string.

I can see two possible solutions.

Solution 1: Using VERIFY() function

The VERIFY (X, C) function searches string X from left to right and returns the position P of the first character that does not appear in the value of C.

Then we can apply the SUBSTR(X,P) function that extracts a substring of X starting from position P till the end of the string X.

Solution 2: Using FINDC() function


The FINDC(X, C, ‘K’) function also searches string X from left to right and returns the position P of the first character that does not appear in C. (The modifier ‘K’ switches the default behavior of searching for any character that appears in C to searching for any character that does not appear in C.)

Then, as with the VERIFY() function, we can apply the SUBSTR(X,P) function that extracts a substring of X starting from position P till the end of the string X.

Special considerations

So far so good, and everything will be just hunky-dory, right? Not really - unless we cover our bases by handling edge cases.

Have we thought of what would happen if our string X consisted of all '*' characters and nothing else? In this special case, both the verify() function and findc() function will find no position of the character that is not equal to '*' and thus return 0.

However, 0 is not a valid second argument value for the SUBSTR(X,P) function. Valid values are 1 . . . through length(X). Having a 0 value for the second argument will trigger the automatic data step variable _ERROR_=1 and the following note generated in the SAS log:

NOTE: Invalid second argument to function SUBSTR at line ## column #.

Therefore, we need to handle this special case separately, conditionally using SUBSTR(X,P) for P>0 and assigning blank ('') otherwise.

Code implementation for removing leading characters

Let’s put everything together. First, we'll create a test data table:

data TEST;
   input X $ 1-20;
   datalines;
*** It's done*
*********
**01234*ABC**
No leading *'s
;

Then we apply the logic described above. The following DATA step illustrates our two implemented coding solutions for removing leading characters:

data CLEAN (keep=X Y Z);
   set TEST;
   C = '*'; *<- leading character(s) to be removed;
 
   P1 = verify(X,C); *<- Solution 1;
   if P1 then Y = substr(X, P1);    
         else Y = '';
 
   P2 = findc(X,C,'K'); *<- Solution 2;
   if P2 then Z = substr(X, P2); 
         else Z = '';
 
   put _n_= / X= / P1= / Y= / P2= / Z= /;
run;

Alternatively, we can replace the IF-THEN-ELSE construct with this IFC() function one-liner:

data CLEAN (keep=X Y Z);
   set TEST;
   C='*'; *<- leading character(s) to be removed;
 
   P1 = verify(X,C); *<- Solution 1;
   Y = ifc(P1, substr(X, P1), '');
 
   P2 = findc(X,C,'K'); *<- Solution 2;
   Z = ifc(P2, substr(X, P2), '');
 
   put _n_= / X= / P1= / Y= / P2= / Z= /;
run;

The SAS log will show interim and final results by the DATA step iterations:

_N_=1
X=*** It's done*
P1=4
Y=It's done*
P2=4
Z=It's done*
 
_N_=2
X=*********
P1=10
Y=
P2=10
Z=
 
_N_=3
X=**01234*ABC**
P1=3
Y=01234*ABC**
P2=3
Z=01234*ABC**
 
_N_=4
X=No leading *'s
P1=1
Y=No leading *'s
P2=1
Z=No leading *'s

Here is the output data table CLEAN showing the original string X, and resulting strings Y (solution 1) and Z (solution 2) side by side:
Removing any leading characters in SAS strings
As you can see, both solutions (1 & 2) produce identical results.

Conclusion

Compared to the LEFT() function, the solution presented in this blog post not only expands leading character removal/cleansing functionality beyond the blank character exclusively. Using this coding technique we can simultaneously remove a variety of leading characters (including but not limited to blank). For example, if we have a string X=' 0.000 12345' and specify C = ' 0.' (the order of characters listed within the value of C does not matter), then all three characters ' ', '0', and '.' will be removed from all leading positions of X. The resulting string will be Y='12345'.

Additional resources

Questions? Thoughts? Comments?

Do you find this post useful? Do you have questions, concerns, comments? Please share with us below.

Removing leading characters from SAS strings was published on SAS Users.

8月 192021
 

In Part 1 of my series fetch CAS, fetch!, I executed the fetch CAS action to return rows from a CAS table. That was great, but what can you do with the results? Maybe you want to create a visualization that includes the top five cars by MSRP for all Toyota vehicles? How can we accomplish this task? We'll cover this question and provide several other examples in this post.

Save the results of a CAS action as a SAS data set

First, execute the table.fetch CAS action on the CARS in-memory table to filter for Toyota cars, return the Make, Model and MSRP columns, and sort the results by MSRP. Then save the results of the action in a variable using the results option. The results of an action return a dictionary to the client. The fetch action returns a dictionary with a single key, and the result table as the value. In this example, I'll name the variable toyota.

proc cas;
    table.fetch result=toyota / 
          table={name="cars", caslib="casuser",
                 where="Make='Toyota'",
                 vars={"Make","Model","MSRP"}
          },
          sortBy={
                 {name="MSRP", order="DESCENDING"}
          },
          index=FALSE,
          to=5;
...

After executing the code, the results of the action are stored in the variable toyota and not shown in the output.

Next, use the SAVERESULT statement to save the result table stored in the toyota variable. Since the variable is a dictionary, specify the variable name toyota, a dot, then the fetch key. This will access the result table from the dictionary. Finally, specify the DATAOUT= option with the name of the SAS data set to create.

proc cas;
    table.fetch result=toyota / 
          table={name="cars", caslib="casuser",
                 where="Make='Toyota'",
                 vars={"Make","Model","MSRP"}
          },
          sortBy={
                 {name="MSRP", order="DESCENDING"}
          },
          index=FALSE,
          to=5;
 
     saveresult toyota.fetch dataout=work.top5;
quit;

After executing the code, the result table is saved as a SAS data set. The SAS data set is named top5 and saved to the WORK library.

 

 

Wondering what else can we do? Let's take a look.

Visualize the SAS data set

Now that the result table is saved as a SAS data set, you can use the SGPLOT procedure to create a bar chart! Consider the code below.

title justify=left height=14pt "Top 5 Toyota Cars by MSRP";
proc sgplot data=work.top5
         noborder nowall;
    vbar Model / 
          response=MSRP 
          categoryorder=respdesc
          nooutline
          fillattrs=(color=cx0379cd);
    label MSRP="MSRP";
quit;
title;

There it is! We processed our data in CAS using the fetch action, returned a smaller subset of results back to the client, then used traditional SAS programming techniques on the smaller table. This method will work similarly in other languages like Python and R. Then you can then use the native visualization packages of the language!

You can now use your imagination on what else to do with the raw data from the CARS table or from the top5 results table we produced with the table.fetch action. Feel free to get creative.

Summary

CAS actions are optimized to run in a distributed environment on extremely large tables. Your CAS table can contain millions or even billions of rows. Since the data in CAS can be extremely large, the goal is to process and subset the table on the CAS server, then return a smaller amount of data to the client for additional processing, visualization or modeling.

Additional resources

fetch Action
SAVERESULT Statement
SAS® Cloud Analytic Services: Fundamentals
Plotting a Cloud Analytic Services (CAS) In-Memory Table
Getting started with SGPLOT - Index
Code used in this post

CAS-Action! fetch CAS, fetch! - Part 2 was published on SAS Users.

7月 152021
 

I am a long-time SAS 9 Administrator, I feel very confident in my understanding of SAS 9 administration. I will admit I don’t know everything, but I have been administering SAS since the days of SAS 9.1.3. I often tell my students I am a general practitioner when it comes to SAS 9. I know a little about a lot of things. Even at that, I still feel I can handle most questions when it comes to SAS 9 administration. I will always yield to someone with more knowledge, but I feel I can hold my own.

I am comfortable with SAS 9 administration. I’m confident in my understanding of the SAS Servers, metadata and all the other parts that make up the SAS 9 environment. I was great deploying SAS 9 in various environments that mostly meant some form of on-premise. It was the norm for me to expect to be able to go to the air-conditioned room, with the raised floor, and marvel at my server farm. There was some downside to this that we all knew: the upkeep of physical servers, the up-front cost of those servers and scalability just to mention a few factors. Even with those barriers, I still embraced the tried-and-true work associated with being a SAS 9 administrator.

Sometimes things change and bring us out of our comfort zone

As I was snuggled up to SAS 9 administration like a certain character and his blanket, SAS embraced microservices, cloud and Kubernetes. I’ll admit that I had a brief OMG moment. As an instructor I should, in theory, know more than the student. Well, I knew this was one time that might not be the case. I consider myself technically savvy, but I was concerned because I didn’t really know where to begin and it all seemed daunting to me.

Then I had to realize that though SAS has transformed, some things remain the same. I won’t say my thoughts are the thoughts of anyone else at SAS, but it’s my way to proverbially eat the elephant. (No elephants were harmed in erasing my confusion, by the way. 😊)

If you want an amazing introduction to SAS Viya and Kubernetes, check out Vasilij Nevlev’s SAS Global Forum Paper titled, Introduction to Azure Kubernetes Service and SAS® Viya® 2020. Vasilij’s analysis is a useful tool, but I needed a bit more to be able to feel like “I got it".

Making associations with things I already knew

Although there was a dramatic shift in the overall architecture, the tool for managing the latest release of SAS Viya is very similar to the tool used to manage SAS Viya 3.5. I was excited to see that SAS Environment Manager remained much the same. You can still use SAS Environment Manager to manage data, server configuration, content, and security. I am comforted in that.

This association thing was going well, so I decided to continue the process. The next similarity I noticed was in the configuration. I was pleased to know that SAS Viya still relied on a YAML file for configuration options. That told me that although things had changed, much had remained the same.

The final point I will touch on here is the commands used with Kubernetes. Now I know what you might be thinking, “Raymond those kubectl commands are the stumbling block for me!” I get what you are saying. It is another “language” for us to learn. For me, I’m already used to navigating commands in a Linux environment. In addition, the kubeclt command line is support by the Microsoft, Amazon and Google platforms. This is not something you might have to worry about as an administrator, but as an instructor this is wonderful news. SAS Viya is supported on those platforms and now I just have to learn one command line interface.

All of this gets me excited about administration of SAS Viya and what is happening with the software. I hope you share in my excitement. And I hope to meet you in one of our SAS Viya Administration courses soon.

Going from SAS 9 to SAS Viya and I can hardly contain myself was published on SAS Users.