SAS architecture

April 13, 2018
 

As a follow-on from my previous blog post, where we looked at the different use cases for using Kerberos in SAS Viya 3.3, in this post we will delve into more details on the requirements for use case 4, where we use Kerberos authentication throughout both the SAS 9.4 and SAS Viya 3.3 environments. We won't cover the configuration of this setup, as that is too broad a topic for a single blog post.

As a reminder, the use case we are considering is shown here:

SAS Viya 3.3 Kerberos Delegation

Here the SAS 9.4 Workspace Server is launched with Kerberos credentials, and the Service Principal for the SAS 9.4 Object Spawner will need to be trusted for delegation. This means that a Kerberos credential for the end-user is available to the SAS 9.4 Workspace Server. The SAS 9.4 Workspace Server can use this end-user Kerberos credential to request a Service Ticket for the connection to SAS Cloud Analytic Services. SAS Cloud Analytic Services, in turn, is provided with a Kerberos keytab and principal it can use to validate this Service Ticket. Validating the Service Ticket authenticates the SAS 9.4 end-user to SAS Cloud Analytic Services. The principal for SAS Cloud Analytic Services must also be trusted for delegation, since we need the SAS Cloud Analytic Services session to have access to the Kerberos credentials of the SAS 9.4 end-user.

These Kerberos credentials made available to SAS Cloud Analytic Services are used for two purposes. First, they are used to make a Kerberized connection to SAS Logon Manager to obtain the SAS Viya internal OAuth token; as a result, SAS Logon Manager must be configured to accept Kerberos connections. Second, the Kerberos credentials of the SAS 9.4 end-user are used to connect to the Secured Hadoop environment.

In this case, since all the various principals are trusted for delegation, our SAS 9.4 end-user can perform multiple authentication hops using Kerberos with each component. This means that, through the use of Kerberos authentication, the SAS 9.4 end-user is authenticated into SAS Cloud Analytic Services and out to the Secured Hadoop environment.

Reasons for doing it…

To start with, why would we look to use this use case? Of all the use cases we considered in the previous blog post, this one provides the strongest authentication between SAS 9.4 Maintenance 5 and SAS Viya 3.3. At no point does a username/password combination pass between the SAS 9.4 environment and the SAS Viya 3.3 environment. In fact, the only credential set (username/password) sent over the network in the whole environment is the one used by the Identities microservice to fetch user and group information for SAS Viya 3.3, and even that could be eliminated if the LDAP provider supported anonymous binds for fetching user details.

Also, this use case provides true single sign-on from SAS 9.4 Maintenance 5 to SAS Viya 3.3 and all the way out to the Secured Hadoop environment. Each operating system run-time process is launched as the end-user, and no cached or stored username/password combination is required.

High-Level Requirements

At a high-level, we need to have both sides configured for Kerberos delegated authentication. This means both the SAS 9.4 Maintenance 5 and the SAS Viya 3.3 environments must be configured for Kerberos authentication.

The following SAS components and tiers need to be configured:

  • SAS 9.4 Middle-Tier
  • SAS 9.4 Metadata Tier
  • SAS 9.4 Compute Tier
  • SAS Viya 3.3 SAS Logon Manager
  • SAS Viya 3.3 SAS Cloud Analytic Services

Detailed Requirements

First, let's talk about Service Principal Names. We need to have a Service Principal Name (SPN) registered for each of the components/tiers in our list above. Specifically, we need an SPN registered for:

  • HTTP/<HOSTNAME> for the SAS 9.4 Middle-Tier
  • SAS/<HOSTNAME> for the SAS 9.4 Metadata Tier
  • SAS/<HOSTNAME> for the SAS 9.4 Compute Tier
  • HTTP/<HOSTNAME> for the SAS Viya 3.3 SAS Logon Manager
  • sascas/<HOSTNAME> for the SAS Viya 3.3 SAS Cloud Analytic Services

The <HOSTNAME> part should be the fully qualified hostname of the machine where the component is running. This means that some of these SPNs might be combined; for example, if the SAS 9.4 Metadata Tier and Compute Tier run on the same host, we will have only one SPN for both. Conversely, we might require more SPNs if, for example, we are running a SAS 9.4 Metadata Cluster.

The SPN needs to be registered against something. Since our aim is to support single sign-on from the end-user's desktop, we'll probably be registering the SPNs in Active Directory. In Active Directory we can register an SPN against either a user or a computer object. For both the SAS 9.4 Metadata and Compute Tiers, the registration can be performed automatically, against the computer object, if the processes run as the local system account on a Microsoft Windows host. Otherwise, and for the other tiers and components, the SPN must be registered manually. While you can register multiple SPNs against a single object, we recommend that you register each SPN against a separate object.
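As a concrete sketch, SPNs can be registered in Active Directory with the setspn utility. The hostnames and service accounts below are hypothetical; substitute your own:

setspn -S SAS/sasmeta.example.com svc-sas94meta
setspn -S sascas/cascontroller.example.com svc-cascontroller

The -S option checks for duplicate SPNs before adding the registration, and setspn -L <account> lists the SPNs currently registered against an account.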

Since the entire aim of this configuration is to delegate the Kerberos authentication from one tier/component to the next, we need to ensure the objects, namely the user or computer objects, are trusted for delegation. The SAS 9.4 Middle-Tier will only support unconstrained delegation, whereas the other tiers and components support Microsoft's constrained delegation. If you choose to go down the path of constrained delegation, you need to specify each and every Kerberos service to which the object is trusted to delegate authentication.
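If you have the Active Directory PowerShell module available, one way to review the delegation settings on a service account is to query the relevant attributes (the account name below is hypothetical):

Get-ADUser svc-cascontroller -Properties TrustedForDelegation,"msDS-AllowedToDelegateTo"

TrustedForDelegation being true indicates unconstrained delegation; entries in msDS-AllowedToDelegateTo list the services the account may delegate to under constrained delegation.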

Finally, we need to provide a Kerberos keytab for the majority of the tiers/components. The Kerberos keytab contains the long-term keys for the object the SPN is registered against. The only exceptions are the SAS 9.4 Metadata and Compute Tiers when these are running on Windows hosts.
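As a sketch of keytab creation against Active Directory, the ktpass utility can map a principal to an account and write the long-term keys to a keytab file (the principal, account, and realm below are hypothetical):

ktpass /princ sascas/cascontroller.example.com@EXAMPLE.COM /mapuser svc-cascontroller /crypto AES256-SHA1 /ptype KRB5_NT_PRINCIPAL /pass * /out sascas.keytab

You can then verify the keytab contents on the target host with the MIT Kerberos tools:

klist -kte sascas.keytab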

Conclusion

You can now enable Kerberos delegation across the SAS Platform, using a single strong authentication mechanism from end to end. As always with configuring Kerberos authentication, the prerequisites, in terms of Service Principal Names, service accounts, delegation settings, and keytabs, are important for success.

SAS Viya 3.3 Kerberos Delegation from SAS 9.4M5 was published on SAS Users.

February 15, 2018
 

In this article, I will set out clear principles for how SAS Viya 3.3 will interoperate with Kerberos. My aim is to present some overview concepts for how we can use Kerberos authentication with SAS Viya 3.3. We will look at both SAS Viya 3.3 clients and SAS 9.4M5 clients. In future blog posts, we'll examine some of these use cases in more detail.

With SAS Viya 3.3 clients we have different use cases for how we can use Kerberos with the environment. In the first case, we use Kerberos delegation throughout the environment.

Use Case 1 – SAS Viya 3.3

The diagram below illustrates the use case where Kerberos delegation is used into, within, and out from the environment.

How SAS Viya 3.3 will interoperate with Kerberos

In this diagram, we show the end-user relying on Kerberos or Integrated Windows Authentication to log onto SAS Logon Manager as part of their access to the visual interfaces. SAS Logon Manager is provided with a Kerberos keytab and HTTP principal to enable the Kerberos connection. In addition, the HTTP principal is flagged as “trusted for delegation” so that the credentials sent by the client include the delegated or forwardable Ticket-Granting Ticket (TGT). The configuration of SAS Logon Manager with SAS Viya 3.3 includes a new option to store this delegated credential. The delegated credential is stored in the credentials microservice and secured so that only the end-user to whom the credential belongs can access it.

When the end-user accesses SAS CAS from the visual interfaces the initial authentication takes place with the standard internal OAuth token. However, since the end-user stored a delegated credential when accessing the SAS Logon Manager an additional Origin attribute is set on the token of “Kerberos.” The internal OAuth token also contains the groups the end-user is a member of within the Claims. Since we want this end-user to run the SAS CAS session as themselves they must have been added to a custom group with the ID=CASHostAccountRequired. When the SAS CAS Controller receives the OAuth token with the additional Kerberos Origin, it requests the visual interface to make a second Kerberized connection. So, the visual interface retrieves the delegated credential from the credentials microservice and uses this to request a Service Ticket to connect to SAS CAS.

SAS CAS has been provided with a Kerberos keytab and a sascas principal to enable the Kerberos connection. Since the sascas principal is flagged as “trusted for delegation,” the credentials sent by the visual interfaces include a delegated or forwardable Ticket-Granting Ticket (TGT). SAS CAS validates the Service Ticket, which in turn authenticates the end-user. The SAS CAS Controller then launches the session as the end-user and constructs a Kerberos ticket cache containing the delegated TGT. Now, within their SAS CAS session the end-user can connect to the Secured Hadoop environment as themselves since the SAS CAS session has access to a TGT for the end-user.

This means in this first use case all access to, within, and out from the SAS Viya 3.3 environment leverages strong Kerberos authentication. This is our “gold-standard” for authenticating the end-user to each part of the environment.

But it is strictly dependent on the end-user being a member of the custom group with ID=CASHostAccountRequired, and on the two principals (HTTP and sascas) being trusted for delegation. Without both, the Kerberos delegation will not take place.
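A quick way to confirm from a client that the KDC will issue the forwardable tickets this delegation depends on is to request one explicitly and inspect the ticket flags (the user and realm below are hypothetical):

kinit -f sasdemo@EXAMPLE.COM
klist -f

The F flag in the klist output indicates a forwardable ticket.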

Use Case 1a – SAS Viya 3.3

The diagram below illustrates a slight deviation on the first use case.

Here, either through choice or by omission, the end-user is not a member of the custom group with the ID=CASHostAccountRequired. Now, even though the end-user connects with Kerberos, and irrespective of whether SAS Logon Manager is configured to store delegated credentials, the second connection to SAS CAS is not made using Kerberos. The SAS CAS session instead runs as the account that launched the SAS CAS controller, cas by default. Since the session is not running as the end-user and SAS CAS did not receive a Kerberos connection, the Kerberos ticket cache that is generated for the session does not contain the credentials of the end-user. Instead, the Kerberos keytab and principal supplied to SAS CAS are used to establish the credentials in the Kerberos ticket cache.

This means that even though Kerberos was used to connect to SAS Logon Manager, the connection to the Secured Hadoop environment is as the sascas principal and not the end-user.

The same situation could be arrived at if the HTTP principal for SAS Logon Manager is not trusted for delegation.

Use Case 1b – SAS Viya 3.3

A final deviation to the initial use case is shown in the following diagram.

In this case, the end-user connects to SAS Logon Manager with some other form of authentication. This could be the default LDAP authentication, external OAuth, or external SAML authentication. Just as in use case 1a, this means that the connection to SAS CAS from the visual interfaces only uses the internal OAuth token. Again, since no delegated credentials are used to connect to SAS CAS, the session runs as the account that launched the SAS CAS controller. Also, the ticket cache created by the SAS Cloud Analytic Services Controller contains the credentials from the Kerberos keytab, i.e. the sascas principal. This means that access to the Secured Hadoop environment is as the sascas principal and not the end-user.

Use Case 2 – SAS Viya 3.3

Our second use case covers those users entering the environment via the programming interfaces, for example, SAS Studio. In this case, the end-users have entered a username and password, a credential set, into SAS Studio. This credential set is used to start their individual SAS Workspace Session and to connect to SAS CAS from the SAS Workspace Server. This is illustrated in the following figure.

Since the end-users are providing their username and password to SAS CAS, it behaves differently. SAS CAS uses its own Pluggable Authentication Modules (PAM) configuration to validate the end-user's credentials and hence launch the SAS CAS session process running as the end-user. In addition, the SAS CAS Controller uses the username and password to obtain an OAuth token from SAS Logon Manager, with which it can obtain any access control information from the SAS Viya 3.3 microservices. Obtaining the OAuth token from SAS Logon Manager ensures any restrictions or global caslibs defined in the visual interfaces are observed in the programming interfaces.

With the SAS CAS session running as the end-user and any access controls validated, the SAS CAS session can access the Secured Hadoop cluster. Since the SAS CAS session was launched using the PAM configuration, the Kerberos credentials used to access Hadoop are those of the end-user. This means the PAM configuration on the machines hosting SAS CAS must be linked to Kerberos. The PAM configuration then ensures the Kerberos Ticket-Granting Ticket is available to the SAS CAS session as it is launched.
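As a minimal sketch of what "linked to Kerberos" can look like, a PAM service file might include the pam_krb5 module for authentication and session setup. The service file name (shown here as a hypothetical /etc/pam.d/cas) and the exact module options vary by distribution and deployment, so treat this purely as an illustration:

# /etc/pam.d/cas -- hypothetical service file name
auth     sufficient   pam_krb5.so
auth     required     pam_unix.so
account  required     pam_unix.so
session  optional     pam_krb5.so

The session entry for pam_krb5 is what causes a Kerberos ticket cache to be created when the SAS CAS session process is launched.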

Next, we consider three further use cases where the client is SAS 9.4 maintenance 5. Remember that SAS 9.4 maintenance 5 can make a direct connection to SAS CAS without requiring SAS/CONNECT. The use cases we will discuss will illustrate the example with a SAS 9.4 maintenance 5 web application, such as SAS Studio. However, the statements and basic flows remain the same if the SAS 9.4 maintenance 5 client is a desktop application like SAS Enterprise Guide.
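For reference, a direct connection from a SAS 9.4 maintenance 5 session to SAS CAS is only a few lines of SAS code; the hostname and port below are hypothetical:

options cashost="cascontroller.example.com" casport=5570;
cas mysess;                 /* start a CAS session using whatever credentials are available */
caslib _all_ assign;        /* assign librefs for the available caslibs */

The CAS statement starts the session using whichever credential mechanism the use cases below make available.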

Use Case 3 – SAS 9.4 maintenance 5

First, let’s consider the case where our SAS 9.4 maintenance 5 end-user enters their username and password to access the SAS 9.4 environment. This is illustrated in the following diagram.

In this case, since the SAS 9.4 Workspace Server is launched using a username and password, these are cached on the launch of the process. This enables the SAS 9.4 Workspace Server to use these cached credentials when connecting to SAS CAS. The same process occurs if, instead of being provided by the launching process, the cached credentials are provided by another mechanism: from the SAS 9.4 Metadata Server, or from an authinfo file in the user's home directory on the SAS 9.4 environment (an example follows). In any case, the process on the SAS Cloud Analytic Services Controller is the same.
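As a sketch, an authinfo file is a plain text file in the user's home directory, and a single line supplies the credentials for a given host (all values below are hypothetical):

host cascontroller.example.com user sasdemo password mySecret1

The file should be readable only by its owner, for example via chmod 600 ~/.authinfo on UNIX.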

The username and password used to connect are validated through the PAM stack on the SAS CAS Controller, as well as being used to generate an internal OAuth token from the SAS Viya 3.3 Logon Manager. The PAM stack, just as in the SAS Viya 3.3 programming interface use case 2 above, is responsible for initializing the Kerberos credentials for the end-user. These Kerberos credentials are placed into a Kerberos Ticket cache which makes them available to the SAS CAS session for the connection to the Secured Hadoop environment. Therefore, all the different sessions within SAS 9.4, SAS Viya 3.3, and the Secured Hadoop environment run as the end-user.

Use Case 4 – SAS 9.4 maintenance 5

Now what about the case where the SAS 9.4 maintenance 5 environment is configured for Kerberos authentication throughout? The case where we have Kerberos delegation configured in SAS 9.4 is shown here.

Here the SAS 9.4 Workspace Server is launched with Kerberos credentials, and the Service Principal for the SAS 9.4 Object Spawner will need to be trusted for delegation. This means that a Kerberos credential for the end-user is available to the SAS 9.4 Workspace Server. The SAS 9.4 Workspace Server can use this end-user Kerberos credential to request a Service Ticket for the connection to SAS CAS. SAS CAS is provided with a Kerberos keytab and principal it can use to validate this Service Ticket. Validating the Service Ticket authenticates the SAS 9.4 end-user to SAS CAS. The principal for SAS CAS must also be trusted for delegation, since we need the SAS CAS session to have access to the Kerberos credentials of the SAS 9.4 end-user.

These Kerberos credentials made available to SAS CAS are used for two purposes. First, they are used to make a Kerberized connection to SAS Logon Manager to obtain the SAS Viya internal OAuth token; therefore, SAS Logon Manager must be configured to accept Kerberos connections. Second, the Kerberos credentials of the SAS 9.4 end-user are used to connect to the Secured Hadoop environment.

In this case, since all the various principals are trusted for delegation, our SAS 9.4 end-user can perform multiple authentication hops using Kerberos with each component. This means that, through the use of Kerberos authentication, the SAS 9.4 end-user is authenticated into SAS CAS and out to the Secured Hadoop environment.

Use Case 5 – SAS 9.4 maintenance 5

Finally, what about cases where the SAS 9.4 maintenance 5 session is not running as the end-user but as a launch credential? This is illustrated here.

The SAS 9.4 session in this case could be a SAS Stored Process Server, a Pooled Workspace Server, or a SAS Workspace Server leveraging a launch credential such as sassrv. The key point is that the SAS 9.4 session is not running as the end-user and has no access to the end-user's credentials. In this case we can still connect to SAS CAS, and from there out to the Secured Hadoop environment, but it requires some additional configuration. This setup leverages One-Time-Passwords generated by the SAS 9.4 Metadata Server, so the SAS 9.4 Metadata Server must be made aware of SAS CAS. This is done by adding a SAS 9.4 metadata definition for SAS CAS. Our connection from SAS 9.4 must then be “metadata aware,” achieved by using authdomain=_sasmeta_ on the connection, as shown in the sketch below.
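As a sketch, the metadata-aware connection from the SAS 9.4 session might look like the following; the hostname and port are hypothetical, and the exact placement of the option can vary with your configuration:

options cashost="cascontroller.example.com" casport=5570;
cas mysess authdomain="_sasmeta_";   /* obtain a One-Time-Password via metadata */

The AUTHDOMAIN= option tells the connection to obtain the One-Time-Password from the SAS 9.4 Metadata Server rather than using cached credentials.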

Equally, the SAS Viya 3.3 side of the environment must be able to validate the One-Time-Password used to connect to SAS CAS. When SAS CAS receives the One-Time-Password on the connection, it is sent to SAS Logon Manager for validation and to obtain a SAS Viya internal OAuth token. We need to add some configuration to SAS Logon Manager to enable this validation: we configure it with the details of where the SAS 9.4 Web Infrastructure Platform is running, and it passes the One-Time-Password to the SAS 9.4 Web Infrastructure Platform for validation. After the One-Time-Password is validated, a SAS Viya internal OAuth token is generated and passed back to SAS CAS.

Since SAS CAS does not have access to the end-user credentials, the session that is created runs as the account used to launch the controller process, cas by default. And because the end-user credentials are not available, the Kerberos credentials initialized for the session come from the Kerberos keytab provided to SAS CAS. The connection to the Secured Hadoop environment is therefore made using the Kerberos credentials of the principal assigned to SAS CAS.

Summary

We have presented several use cases above. The key factors that differentiate them can be summarized as follows:

  • Use Case 1: Kerberos delegation throughout; the SAS CAS session runs as the end-user, and access to Hadoop is as the end-user.
  • Use Case 1a: Kerberos into SAS Logon Manager, but the end-user is not in CASHostAccountRequired; the SAS CAS session runs as the cas account, and access to Hadoop is as the sascas principal.
  • Use Case 1b: non-Kerberos authentication into SAS Logon Manager; the SAS CAS session runs as the cas account, and access to Hadoop is as the sascas principal.
  • Use Case 2: username/password via the programming interfaces; the SAS CAS session runs as the end-user (via PAM), and access to Hadoop is as the end-user.
  • Use Case 3: username/password from SAS 9.4 maintenance 5; the SAS CAS session runs as the end-user (via PAM), and access to Hadoop is as the end-user.
  • Use Case 4: Kerberos delegation from SAS 9.4 maintenance 5; the SAS CAS session runs as the end-user, and access to Hadoop is as the end-user.
  • Use Case 5: SAS 9.4 maintenance 5 launch credential with One-Time-Passwords; the SAS CAS session runs as the cas account, and access to Hadoop is as the sascas principal.

SAS Viya 3.3 - Some Kerberos principles was published on SAS Users.

December 7, 2017
 

In SAS Viya deployments, identities are managed by the environment's configured identity provider. In visual SAS Viya deployments, the identity provider must be an LDAP (Lightweight Directory Access Protocol) server. Initial setup of a SAS Viya deployment requires configuration to support reading the identity information (users and groups) from LDAP. SAS Viya 3.3 adds support for multi-tenancy, which has implications for the way users and groups must be stored in LDAP. For the SAS administrator, this means at least a rudimentary understanding of LDAP is required. In this blog post, I will review some key LDAP basics for the SAS Viya administrator.

A basic understanding of LDAP ensures SAS Viya administrators can speak the same language as the LDAP administrator.

What is LDAP?

LDAP is a lightweight protocol for accessing directory servers. A directory server is a hierarchical, object-oriented database. An LDAP server can be used to organize almost anything; one of the most common uses is as an identity management system, organizing user and group membership.

LDAP directories are organized in a tree manner:

  • A directory is a tree of directory entries.
  • An entry contains a set of attributes.
  • An attribute has a name, and one or more values.


Below is an entry for a user, Henrik. It has common attributes like:

  • uid: user ID
  • cn: common name
  • l: location
  • displayName: the name to display

These attribute-value pairs are the details of the entry.

The objectclass attribute is worth a special mention. Every entry has at least one objectclass attribute, and often more than one. The objectclass provides the rules for the object, including required and allowed attributes. For example, the inetorgperson object class specifies attributes about people who are part of an organization, including items such as uid, name, and employeeNumber.

LDAP Tree

Let’s now look at the organization of the tree. DC is the “domain component.” You will often see examples of LDAP structures that use DNS names for the domain component, such as: dc=sas,dc=com. This is not required, but since DNS itself often implies organizational boundaries, it usually makes sense to use the existing naming structure. In this example the domain component is “dc=geldemo,dc=com”. The next level down is the organizational unit (ou).  An organizational unit is a grouping or collection of entries. Organizational units can contain additional organizational units.

But how do we find the objects we want in the directory tree? Every entry in a directory has a unique identifier, called the Distinguished Name (DN). The distinguished name is the full path to the object in the directory tree. For example, the distinguished name of Henrik is uid=Henrik,ou=users,ou=gelcorp,dc=viyademo,dc=com. The distinguished name is the path to the object from lowest to highest (yes, it seems backward to me too).
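As a sketch, you can retrieve Henrik's entry with the OpenLDAP command-line client by combining a search base with a filter (the server hostname below is hypothetical):

ldapsearch -x -H ldap://ldap.viyademo.com -b "ou=users,ou=gelcorp,dc=viyademo,dc=com" "(uid=Henrik)"

Here -x requests simple authentication, -H names the server URI, and -b sets the base DN at which the search starts.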

LDAP Queries and Filters

Like any other database, LDAP can be queried, and it has its own particular syntax for defining queries. LDAP queries are Boolean expressions in the format

attribute operator value

for example:

uid=sasgnn

 

Attribute can be any valid LDAP attribute (e.g., name, uid, city) and value is the value that you wish to search for. The usual operators are available, including equality (=), greater than or equal to (>=), less than or equal to (<=), and approximate match (~=).

Using LDAP filters, you can link two or more Boolean expressions together using the “filter choices” & (AND) and | (OR). Unusually, the LDAP filter choices are always placed in front of the expressions. Each search criterion must be put in parentheses, and then the whole term has to be bracketed one more time. Here are some examples of LDAP queries that may make the syntax easier to follow:

  • (sn=Jones): returns all entries with a surname equal to Jones.
  • (objectclass=inetorgperson): returns entries that use the inetorgperson object class.
  • (mail=*): returns all entries that have the mail attribute.
  • (&(objectclass=inetorgperson)(o=Orion)): returns all entries that use the inetorgperson object class and that have the organization attribute equal to Orion (people in the Orion organization).
  • (&(objectclass=GroupofNames)(|(o=Orion)(o=Executive))): returns all entries that use the groupofNames object class and that have the organization attribute equal to Orion OR Executive (groups in the Orion or Executive organizations).

Why do I need to know this?

How will you apply this LDAP knowledge in SAS Viya? To enable SAS Viya to access your identity provider, you must update the SAS Identities service configuration. As an administrator, the most common items to change are:

  • BaseDN: the entry in the tree from which the LDAP server starts its search.
  • ObjectFilter: the filter used to identify and limit the users and groups returned.

There is a separate BaseDN and ObjectFilter for users and for groups.

To return users and groups to SAS Viya from our example LDAP server, we would set:

sas.identities.providers.ldap.group.baseDN=ou=gelcorp,ou=groups,dc=viyademo,dc=com

sas.identities.providers.ldap.users.baseDN=ou=gelcorp,ou=users,dc=viyademo,dc=com

 

This tells SAS Viya to begin its search for users and groups at those locations in the tree.

The object filter will then determine what entries are returned for users and groups from a search of the LDAP tree starting at the BaseDN. For example:

sas.identities.providers.ldap.group.objectFilter:
(&(objectClass=GroupOfNames)(o=GELCorp LTD))

sas.identities.providers.ldap.users.objectFilter:
(&(objectClass=inetOrgPerson)(o=GELCorp LTD))
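Before placing a filter in the SAS configuration, it is worth validating it directly against the LDAP server. A hedged example using the OpenLDAP client (server hostname hypothetical), requesting only the cn and uid attributes:

ldapsearch -x -H ldap://ldap.viyademo.com -b "ou=gelcorp,ou=users,dc=viyademo,dc=com" "(&(objectClass=inetOrgPerson)(o=GELCorp LTD))" cn uid

The entries returned should correspond to the users the Identities service would see with the same BaseDN and ObjectFilter.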

 

There are many LDAP clients available that will allow you to connect to an LDAP server and view, query, edit, and update LDAP trees and their entries. In addition, the LDIF file format is a text format that includes data and commands, providing a simple way to communicate with a directory so as to read, write, rename, and delete entries.

This has been a high-level overview of LDAP. Here are some additional sources of information that may help.

Basic LDAP concepts

LDAP Query Basics

Quick Introduction to LDAP

How To Use LDIF Files to Make Changes to an OpenLDAP System

LDAP basics for the SAS Viya administrator was published on SAS Users.

August 31, 2016
 

Update-in-place supports the ability to update a SAS deployment within a major SAS release. Updates often provide new versions of SAS products. However, when using the SAS Deployment Wizard to perform an update-in-place, you cannot selectively update a machine or product; as a general rule, if you want to update one product in a SAS deployment you have to update the whole deployment. With the latest version of SAS Studio, that's not the case. You can now update from version 3.4 to version 3.5 of SAS Studio without updating any other part of your SAS deployment.

SAS Studio 3.5 contains some interesting new functionality:

  • A new batch submit feature.
  • The ability to create global settings for all SAS Studio users at your site.
  • A new Messages window that displays information about the programs, tasks, queries, and process flows that you run.
  • A new table of contents in results.
  • New keyboard shortcuts to add and insert code snippets.
  • Many new tasks for statistical process control, multivariate analysis, econometric analysis, and power and sample size analysis. For more information, see SAS Studio Tasks.

For my purposes, I was really interested in using the batch submit feature. Using “Batch Submit,” a user can run a saved SAS program in batch mode, which means that the program runs in the background while you continue to use SAS Studio. When you run a program in batch mode, you can view the status of programs that have been submitted, and you can cancel programs that are currently running.

So how does this “selective update” work? Somewhat unusually for a product update, it is available via a hot fix, documented in SAS Note 57898: Upgrade SAS® Studio 3.4 to SAS® Studio 3.5 without upgrading other products.

SAS Studio is available in three different deployment flavors: SAS Studio Mid-Tier (the enterprise edition), SAS Studio Basic, and SAS Studio Single-User. The hot fix is available for the enterprise and basic editions. In addition, in order to apply the hot fix, the current deployment must be at SAS 9.4M3. For SAS Studio Single-User, an MSI file has been added to the downloads section of support.sas.com to allow users to download SAS Studio 3.5 to run against their existing Windows desktop SAS for releases 9.4M1 and higher.

The hot fix is a container hot fix, meaning the hot fix delivers one or more “MEMBER” hot fixes in one downloadable unit. Container hot fixes have some special rules you must follow when applying them.

  • They must be applied separately to each machine. The installation process will apply only those MEMBER hot fixes that are applicable, based on the SAS Deployment Registry, for each specific machine.
  • They may contain MEMBER hot fixes for multiple operating systems. The SAS Deployment Manager will apply only those MEMBER hot fixes that are applicable for the operating system on each specific machine.
  • They often contain pre- and/or post-installation steps outlined in the instructions provided.

A review of the hot fix instructions shows that, to complete the update for the SAS Studio Mid-Tier, the web application must be rebuilt and redeployed.

To apply the container hot fix on my three-tier deployment, which has a Windows metadata server, a Linux compute tier, and a Linux middle tier, I downloaded the hot fix to a network-accessible location and followed the process documented in the hot fix instructions. To summarize:

Create a deployment registry report on each machine. The reports showed that:

  • SAS Studio Basic is installed on the Linux compute tier.
  • SAS Studio Enterprise is installed on the Linux middle tier.

Update SAS Studio Basic

Stop all SAS servers in the deployment. Run the SAS Deployment Manager on the Linux compute tier, select Apply Hot Fixes, and then select the directory where the hot fix was downloaded. The SAS Deployment Manager updates SAS Studio Basic. A review of the hot fix documentation shows no post-deployment steps are required for SAS Studio Basic.


Update SAS Studio Mid-Tier (Enterprise)

Run the SAS Deployment Manager on the Linux middle tier, select Apply Hot Fixes, and then select the directory where the hot fix was downloaded. The SAS Deployment Manager updates the SAS Studio Mid-Tier.


A review of the hot fix documentation shows that, to complete the update, the SAS Studio Web Application must be rebuilt and redeployed.


Start the SAS Metadata Server and use the SAS Deployment Manager on the middle tier to rebuild just the SAS Studio Mid-Tier. Then start all SAS servers and use the SAS Deployment Manager on the middle-tier machine to redeploy just the SAS Studio Mid-Tier.

When the redeploy is completed, I log on to SAS Studio. Selecting Help > About shows that I now have SAS Studio 3.5.


If I navigate the folder tree and select a SAS program, I can now right-click on the program and select “Batch Submit” to run the program in the background.


If you are excited about the new functionality of SAS Studio 3.5, I think you will agree that the hot fix provides an easy path to update the software.

tags: deployment, SAS Administrators, SAS architecture, SAS Professional Services, sas studio

A quick way to update to SAS Studio 3.5 was published on SAS Users.

August 2, 2016
 

One of the jobs of SAS administrators is keeping the SAS license current. In the past, all you needed to do was update the license for Foundation SAS and you were done. This task can be performed by selecting the Renew SAS Software option in the SAS Deployment Manager.

More recently, many SAS solutions require an additional step that updates the license information in metadata. The license information is stored in metadata so that middle-tier applications can access it in order to check whether the license is valid. Not all solutions require that the SAS Installation Data (SID) file be stored in metadata; however, the list of solutions that do require it is growing and includes SAS Visual Analytics. For a full list, you can check this SAS Note. To update the license information in metadata, run the SAS Deployment Manager and select Update SID File in Metadata.

Recently, I performed a license renewal for a Visual Analytics environment. A couple of days later it occurred to me that I might not have performed the update of the SID file in metadata. That prompted the obvious question: how do I check the status of my license file in metadata?

To check the status of a SAS Foundation license you can use PROC setinit. PROC setinit will return the details of the SAS license in the SAS log.

proc setinit;run;


The above output of PROC setinit shows the:

  • Expiration Date as 25MAY2017
  • Grace Period ending on 09JUL2017
  • Warning Period ending on 04SEP2017

This indicates that the software expires on 25MAY2017; however, nothing happens during the Grace Period. During the Warning Period, messages in the SAS log warn the user that the software is expiring. When the Warning Period ends on 04SEP2017, the SAS software stops functioning. PROC setinit checks only the status of the Foundation SAS license, not the license in metadata.

If the Foundation license is up to date but the license stored in metadata is expired, the web applications will not work. It turns out SAS Environment Manager also monitors the status of the SAS license. But is it the Foundation license or the license stored in metadata?

To see the status of the license in SAS Environment Manager, select Resources, then select Browse > Platforms > SAS 9.4 Application Server Tier. The interface displays:

  • Days Until License Expiration: the number of days until the license expires.
  • Days Until License Termination: the number of days until the software stops working.
  • Days Until License Termination Warning: the number of days until the Grace Period.


Some testing revealed that SAS Environment Manager monitors not the status of the Foundation license but the status of the license in metadata. This is an important point because, as we noted earlier, not all SAS solutions require the SID file to be updated in metadata. Since SAS Environment Manager monitors the license by checking the status of the SID file in metadata, we recommend as a best practice that administrators always update the SID file in metadata.

SAS Environment Manager with Service Architecture configured will also generate events that warn of license termination when the license termination date is within a month.

In addition, as of SAS 9.4M3, SAS Management Console has an option to view metadata setinit details. To access this functionality you must be a member of the SAS Administrators group or have the Management Console: Advanced role.

To check on a SID file in metadata, open SAS Management Console and, on the Plug-ins tab:

  1. Expand Metadata Manager.
  2. Select Metadata Utilities.
  3. Right-click and select View metadata setinit details.


Selecting this option gives details of the current SID file in metadata, with similar information to what PROC setinit displays, including the expiration date, the Grace Period, and the Warning Period. In addition, it displays the date the SID file was last updated in metadata.


The takeaway: to fully renew your SAS software, and to ensure that SAS Environment Manager has the correct dates for its metrics on license expiration, always use the SAS Deployment Manager to both renew the SAS license AND update the SID file in metadata.

To check if your SAS Deployment license has been fully updated, do the following:

  1. Run PROC setinit to view the status of the SAS Foundation license.
  2. Use SAS Management Console or SAS Environment Manager to check whether the SID file has been updated in metadata.

For more information on this topic see the video, “Use SAS Environment Manager to Get SAS License Expiration Notice” and additional resources below:

 

SAS® Deployment Wizard and SAS® Deployment Manager 9.4: User's Guide: Update SID File in Metadata
SAS® Deployment Wizard and SAS® Deployment Manager 9.4: User's Guide: Renew SAS Software
SAS® 9.4 Intelligence Platform: System Administration Guide: Managing Setinit (License) Information in Metadata
SAS® Environment Manager 2.5 User's Guide

tags: configuration, SAS Administrators, SAS architecture, SAS Environment Manager, SAS Professional Services

Two steps to update your SAS License and check if it is updated was published on SAS Users.

October 26, 2015
 

SAS Grid Manager for Hadoop is a brand new product released with SAS 9.4M3 this summer. It gives you the ability to co-locate your SAS Grid jobs on your Hadoop data nodes, letting you further leverage your investment in your Hadoop infrastructure. This is possible because SAS Grid Manager for Hadoop is integrated with the native components of your Hadoop ecosystem, specifically YARN and Oozie. Let's review the architecture of this new offering.

First of all, the official name – SAS Grid Manager for Hadoop – shows that it is a brand new product, not just an addition to or a different configuration of the “classic” SAS Grid Manager, which I will subsequently refer to as “for Platform” to distinguish the two.

For an end user, grid usage and functionality remain the same, but an architect will notice that many components of the offering have changed. Describing these components will be the focus of the remainder of this post.

Let me start by showing a picture of a sample software architecture, so that it will be easier to recognize all the pieces with a visual schema in front of us. The following is one possible deployment architecture; there are other deployment choices.

SAS Grid Manager for Hadoop 9.4M3 sample architecture

Third party components

Just as SAS Grid Manager for Platform builds on top of third party software from Platform Computing (part of IBM), SAS Grid Manager for Hadoop requires Hadoop to function. There is a big difference, though.

SAS Grid Manager for Platform includes all of the required Platform Computing components, as they are delivered, installed and supported by SAS.

On the other hand, SAS Grid Manager for Hadoop considers all of the Hadoop components (highlighted in yellow in the above diagram) as prerequisites. As such, customers are required to procure, install, and support Hadoop before SAS gets installed.

Hadoop, as you know, includes many different components. The diagram lists the ones that are needed for SAS Grid Manager for Hadoop:

  • HDFS provides cluster-wide filesystem storage.
  • YARN is used for resource management.
  • Oozie is the scheduling service.
  • Hue is required if the Oozie web GUI is surfaced through Hue.
  • Hive is required at install time for the SAS Deployment Wizard to be able to access the required Hadoop configuration and jar files.
  • Hadoop jars and config files need to be on every machine, including clients.

YARN Resource Manager, HDFS Name Node, Hive, and Oozie are not necessarily on the same machine. By default, the SAS grid control server needs to be on the machine that YARN Resource Manager is on.

SAS Components

SAS programming interfaces to the grid have not changed, apart from the lower-level libraries that connect to the third-party software. As such, SAS will deploy the traditional SAS grid control server, SAS grid nodes, and either the SAS thin client (aka SASGSUB) or the full SAS client (SAS Display Manager).

In a typical SAS Grid deployment, a shared directory is used to share the installation and configuration directories between machines in the grid. With SAS Grid Manager for Hadoop, you can either use NFS to mount a shared directory on all cluster hosts or use the SAS Deployment Manager (SDM) to work with the cluster manager to distribute the deployment to the cluster hosts. The SDM has the ability to create Cloudera parcels and Ambari packages to enable the distribution of the installation and configuration directories from the grid control server to the grid nodes.

One notable missing component is the SAS Grid Manager plug-in for SAS Management Console. This management interface is tightly coupled with Platform Computing GMS, and cannot be used with Hadoop.

The Middle Tier

You will notice in the above diagram that the middle tier is faded. In fact, no middle-tier components are included in SAS Grid Manager for Hadoop. However, a middle tier will generally be included and deployed as part of other solutions licensed on top of SAS Grid Manager, so you will still be able to program using SAS Studio and monitor the SAS infrastructure using SAS Environment Manager.

Please note that I say “monitor the SAS infrastructure,” not “monitor the SAS grid.” There are no plug-ins or modules within SAS Environment Manager that are specific to SAS Grid Manager for Hadoop. This is by design: SAS is part of your overall Hadoop environment, and therefore the SAS Grid workload can be monitored using your favorite Hadoop management tools.

Hadoop provides plenty of web interfaces to monitor, manage, and configure its environment. As such, you will be able to use the YARN web UI to monitor and manage submitted SAS jobs, as well as the Hue web UI to review scheduled workflows.
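For example, assuming the YARN command-line client is configured on the host, running SAS Grid jobs can also be listed alongside any other YARN applications:

yarn application -list -appStates RUNNING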

The Storage

Discussing grid storage is never a quick task and could require a full blog post of its own, but it is worth noting some architecture peculiarities related to SAS Grid Manager for Hadoop. HDFS can be used to store shared data, and is used to store scheduled jobs, workflows, and logs. But we still require a traditional, POSIX-compliant filesystem for things such as SAS WORK, SASGSUB, solution-specific projects, etc.

Conclusion

SAS Grid Manager for Hadoop enables customers to co-locate their SAS Grid and all of the associated SAS workload on their existing Hadoop cluster. We have briefly discussed the key components that are included in – or are missing from – this new offering. I hope you found this post helpful. As always, any comments are welcome.

tags: grid, Hadoop, SAS architecture, SAS Grid Manager, SAS Grid Manager for Hadoop, SAS Professional Services

SAS Grid Manager for Hadoop architecture was published on SAS Users.

October 14, 2015
 

Sizing is a topic that solutions managers typically leave until the end, after decisions about the application have been settled. But there are often many variables that can impact the final size requirement. We have seen across our customer base that sizing and the number of environments are determined by predicted data volumes, the types of environments that need to be supported, and the budget available.

Environments

Technical architects spend time debating what environment is right for their business, and of course this is no easy decision. Often the business changes its mind, data volumes increase (often with little or no advance warning), data sources vary, different teams need access, and with this, performance issues creep in.

Production – this one is a must, so it is easy to say yes to. It's perhaps the easiest of the estimates, as long as the solutions team is able to predict the volume of data.

Undersizing is a common problem here for many reasons. The most common is that the solution has been far more successful than expected and has attracted more users and/or data sources. The second most common cause is that the procurement team has persuaded the solutions managers that they can make do with fewer resources. Finally, we also sometimes see incorrect assumptions being used when sizing.

Test – what and when will be tested, and how frequently, are the key questions here.

Development – increasingly we find customers trying to minimize the number of environments. However, for data quality work a development server is pretty key to have in place. Don't let the development server be an afterthought.

Architecture workshop

The most effective way we have found to determine optimum sizing is an in-depth workshop with an experienced architect. But such a workshop typically requires a lead time of two to four weeks to set up, as experts review requirements and work on proposed options. Beware the fallacy of budget constraints: companies may try to save money by reducing environments; however, this can end up costing more.

Sometimes, though, solutions managers discover very late in the implementation cycle that they need to revisit their sizing or number of environments. If a resizing has to take place, with additional budgets secured and a reworking of the installation, this can have a real impact on time, resources, and cost, and, more importantly, on the time it takes to start gaining business benefit.

In such situations, we have found three workarounds:

  1. Spend time with the vendor's architect team to understand what is required now and moving forward – get advice.
  2. Understand the business's expectations and requirements for the next 24 months, so it's a robust, scalable solution.
  3. Get back on track as soon as possible so the business can realize value from improved data quality, analytics, access to Hadoop, etc.

There is an interesting article here by David Lashin on virtualised environments.

Communication is king throughout the sizing exercise: between the technical teams of both vendor and customer, and between the business and IT teams, about exact usage, data volumes, and, critically, growth plans for the coming 12 to 18 months.

Summary

Across all organizations we see various debates happening. Our advice is to get this topic out on the table and into the open as early as possible. It is critical to the success of any project; it will enable the deployment to be smoother and the adoption faster, and therefore the time to value will be reduced (which is critical).

We would love to hear from you – does any of the above resonate with you? Have you had good/bad experiences? Share your thoughts with caroline.hermon@sas.com

For more hot topics, follow me @hermon100 for tips from Caroline at the Coalface!

tags: data management, SAS architecture, sizing

Sizing: the long and short of it was published on SAS Users.

June 10, 2015
 

Recently my wife and I took our annual anniversary trip – this time we went to the Grand Canyon, staying in Las Vegas. In researching our options to fly from Raleigh-Durham (RDU) to Las Vegas (LAS), we had several different selection criteria:

  • what time we wanted to leave
  • what time we wanted to arrive
  • price
  • number of stops
  • layover time
  • airline – loyalty program
  • type of aircraft – seating, amenities, food, wifi

All of the flights we looked at would get us from RDU to LAS and back, so the destination wasn't the issue – it was how much value we placed on each of the attributes: arriving in the afternoon (hotel check-in is 3:00 p.m.) versus spending more money for a nonstop flight, for example. We made our decisions based on our specific needs at that time. We also had different opinions of what was important (I'm basically cheap, and my wife refuses to take the red-eye flight).

The evaluation of storage for a SAS solution can be viewed in a similar fashion. There are tradeoffs to be made, or certainly criteria which will be evaluated and prioritized. This blog posting will briefly examine three such attributes and how they may impact storage in a SAS environment.

Who says you can’t have it all?

This diagram highlights three of the more common attributes that are considered when evaluating storage. While there are certainly other considerations (capacity, interfaces, architecture), these three are usually involved in most storage decisions. The diagram also suggests that there are tradeoffs to be considered, for example between price and performance (higher performance may require a higher price). Let's briefly examine each of these, and where we may see tradeoffs in a SAS environment.

Price

Price is usually among the first attributes that come up in any discussion of storage. Everyone is looking to save money, and unfortunately storage often gets compromised. Consider this scenario: our SAS deployment needs about 5 terabytes (TB) of storage. In terms of raw capacity, a new 5TB disk drive can be bought from a number of online vendors for around $150.00 USD. While this drive may meet the capacity requirement, it is most likely not the best selection for a SAS deployment, especially if there are performance or availability considerations. Typical enterprise-class SAS storage may involve configurations with multiple disks and controllers, and perhaps shared storage such as Network Attached Storage (NAS) or Storage Area Networks (SAN). Factoring in these, and possibly other, considerations would most likely (significantly!) increase the price of our storage.

Performance

SAS applications are consumers of storage and have significant performance expectations for I/O throughput. Many SAS field consultants can share stories of under-performing storage leading to failed deployments and unhappy customers. SAS publishes minimum recommended I/O throughput rates for file systems that are to be used in a SAS environment, and the Performance Evaluation team within SAS R&D has written several papers that document best practices and tuning guidelines. There is even a usage note about testing throughput for your SAS 9 file systems. Multiple configuration options are reviewed and discussed, ranging from shared file systems to external SAN or NAS arrays.
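For a rough, first-pass spot-check of sequential write throughput on a candidate file system (not a substitute for the SAS-provided I/O test scripts referenced in the usage note), something like the following can be run on UNIX/Linux, with the target path hypothetical:

time dd if=/dev/zero of=/saswork/iotest.dat bs=64k count=160000 oflag=direct

This writes roughly 10 GB with direct I/O, bypassing the file cache, so the elapsed time gives a crude MB/s estimate.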

Availability

Deploying SAS applications into a business-critical environment, or one with availability requirements such as a Service Level Agreement (SLA), requires careful attention to the type and configuration of storage used. Since SAS is implemented on the host OS file systems, commonly used high-availability strategies can be applied effectively. From simple strategies, such as configuring local storage with RAID mirroring, to more complex enterprise-class solutions, such as redundancy through a SAN, the appropriate level of high availability can be designed and deployed to ensure that the storage meets the needs of the business.

So how does all this fit together?

As you can see, none of these criteria should be considered independently of the others when designing and evaluating storage solutions for SAS environments. There will be tradeoffs made in the evaluation process, and priorities will be established. For some areas, such as performance, there are guidelines established by SAS R&D. In other areas, specific needs of the customer (a specific SLA, for example) may dictate design decisions. In addition, there's some flexibility in certain areas: file systems containing permanent SAS data should be allocated to a more available, more protected storage area than the temporary SASWORK file system. A detailed analysis of the storage needs of the SAS deployment, as part of the overall architecture design, will consider these three criteria in addition to others.

 

In case you were wondering, we didn’t take the red-eye.

tags: SAS architecture, SAS Professional Services, storage

The post Evaluating tradeoffs when designing storage for SAS applications appeared first on SAS Users.

February 25, 2015
 

My Performance Validation team in SAS R&D is constantly working with our partners to test how their storage arrays work with SAS. In late 2014, we finalized several papers that discuss how a mixed analytics workload performs on several storage arrays. While doing this testing, we also captured lessons learned in the tuning guidelines of each paper.

Please review the papers listed below:

 

These papers, along with many other papers for other storage systems, can be found in Usage Note 53874: Troubleshooting system performance problems: I/O subsystem and storage papers. Please bookmark this SAS Usage Note, as we update this list of papers regularly.

Let me know if you have questions about these papers or if there are other new storage systems that you would like SAS to test.

tags: flash storage, SAS Administrators, SAS architecture