SAS Grid Manager

January 26, 2017
 

SAS® Viya™ 3.1 represents the third generation of high performance computing from SAS. Our journey started a long time ago and, along the way, we have introduced a number of high performance technologies into the SAS software platform.

Introducing Cloud Analytic Services (CAS)

SAS Viya introduces Cloud Analytic Services (CAS) and continues this story of high performance computing. CAS is the runtime engine and microservices environment for data management and analytics in SAS Viya, and it introduces some new and interesting innovations for customers. CAS is an in-memory technology designed for scale and speed. Whilst it can be set up on a single machine, it is more commonly deployed across a number of nodes in a cluster of computers for massively parallel processing (MPP). The parallelism is further increased when we consider using all the cores within each node of the cluster for multi-threaded, analytic workload execution. In an MPP environment, having many nodes does not mean that using all of them is always the most efficient choice for a given process. CAS maintains node-to-node communication in the cluster and uses an internal algorithm to determine the optimal distribution and number of nodes to run a given process.

However, processing in-memory can be expensive, so what happens if your data doesn’t fit into memory? Well, CAS has that covered. CAS automatically spills data to disk in such a way that only the data required for processing are loaded into the memory of the system. The rest of the data are memory-mapped to the filesystem in an efficient way for loading into memory when required. This way of working means that CAS can handle data that are larger than the memory that has been assigned to it.

The CAS in-memory engine is made up of a number of components - namely the CAS controller and, in an MPP distributed environment, CAS worker nodes. Depending on your deployment architecture and data sources, data can be read into CAS either serially or in parallel.

What about resilience to data loss if a node in an MPP cluster becomes unavailable? Well, CAS has that covered too. CAS maintains a replicate of the data within the environment. The number of replicates can be configured, but the default is to maintain one extra copy of the data within the environment. This is done efficiently by caching the replicate data blocks to disk rather than consuming resident memory.

One of the most interesting developments with the introduction of CAS is the way that an end user can interact with SAS Viya. CAS actions are a new programming construct, and with CAS, if you are a Python, Java, SAS or Lua developer, you can communicate with CAS using an interactive computing environment such as a Jupyter Notebook. One of the benefits of this is that a Python developer, for example, can utilize SAS analytics on a high performance, in-memory distributed architecture, all from their Python programming interface. In addition, we have introduced open REST APIs which means you can call native CAS actions and submit code to the CAS server directly from a Web application or other programs written in any language that supports REST.
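As an illustration, here is a minimal sketch of calling CAS actions from a SAS session with PROC CAS; the session name, host, port and input table below are placeholders and assume a suitable data set is available in the Casuser caslib. Equivalent calls can be made from Python, Java, Lua or REST clients.

/* A minimal sketch: host, port, and the input table are illustrative placeholders. */
cas mysess host="cas-controller.example.com" port=5570;   /* start a CAS session        */
caslib _all_ assign;                                       /* surface caslibs as librefs */

proc cas;
   /* load a table into memory and run the simple.summary action against it */
   table.loadTable / path="cars.sas7bdat" caslib="casuser" casout={name="cars"};
   simple.summary / table={name="cars", caslib="casuser"};
run;
quit;

cas mysess terminate;   /* end the session and release its resources */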

Whilst CAS represents the most recent step in our high performance journey, SAS Viya does not replace SAS 9. These two platforms can co-exist, even on the same hardware, and indeed can communicate with one another to leverage the full range of technology and innovations from SAS. To find out more about CAS, take a look at the early preview trial. Or, if you would like to explore the capabilities of SAS Viya with respect to your current environment and business objectives, speak to your local SAS representative about arranging a ‘Path to SAS Viya workshop’ with SAS.

Many thanks to Fiona McNeill, Mark Schneider and Larry LaRusso for their input and review of this article.

 

tags: Global Technology Practice, high-performance analytics, SAS Grid Manager, SAS Visual Analytics, SAS Visual Statistics, SAS Viya

A journey of SAS high performance was published on SAS Users.

December 27, 2016
 

We have seen in a previous post of this series how to configure SAS Studio to better manage user preferences in SAS Grid environments. There are additional settings that an administrator can leverage to properly configure a multi-user environment; as you may imagine, these options deserve special considerations when SAS Studio is deployed in SAS Grid environments.

SAS Studio R&D and product management often collect customer feedback and suggestions, especially during events such as SAS Global Forum. We received several requests for SAS Studio to provide administrators with the ability to globally set various options. The goal is to eliminate the need to have all users define them in their user preferences or elsewhere in the application. To support these requests, SAS Studio 3.5 introduced a new configuration option, webdms.globalSettings. This setting specifies the location of a directory containing XML files used to define these global options.

Tip #1

How can I manage this option?

The procedure is the same as we have already seen for the webdms.studioDataParentDirectory property. They are both specified in the config.properties file in the configuration directory for SAS Studio. Refer to the previous blog for additional details, including considerations for environments with clustered mid-tiers.

Tip #2

How do I configure this option?
By default, this option points to the directory path !SASROOT/GlobalStudioSettings. SASROOT translates to the directory where the SAS Foundation binaries are installed, such as /opt/sas/sashome/SASFoundation/9.4 on UNIX or C:/Program Files/SASHome/SASFoundation/9.4/ on Windows. It is possible to change the webdms.globalSettings property to point to any chosen directory, as sketched below.
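For example, an administrator could edit the SAS Studio config.properties file to point the option at a directory on shared storage. The path below is purely illustrative:

# config.properties for the SAS Studio web application (illustrative path)
# point global settings at a directory that every grid node can reach
webdms.globalSettings=/sasshare/config/GlobalStudioSettings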

SAS Studio 3.6 documentation provides an additional key detail: in a multi-machine environment, the GlobalStudioSettings directory must be on the machine that hosts the workspace servers used by SAS Studio. We know that, in grid environments, this means that this location should be on shared storage accessible by every node.

Tip #3

Configuring Global Folder Shortcuts


In SAS Studio, end users can create folder shortcuts from the Files and Folders section in the navigation pane. An administrator might want to create global shortcuts for all users, so that each user does not have to create these shortcuts manually. This is achieved by creating a file called shortcuts.xml in the location specified by webdms.globalSettings, as detailed in the SAS Studio administration documentation.
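As a purely hypothetical sketch of what such a file might contain (the element and attribute names below are illustrative only; consult the SAS Studio administration documentation for the exact schema and path attributes):

<?xml version="1.0" encoding="UTF-8"?>
<!-- hypothetical shortcuts.xml: one global folder shortcut for every user -->
<Shortcuts>
   <Shortcut name="Shared Project Data" dir="/sasshare/projects/data"/>
</Shortcuts>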

Tip #4

Configuring Global Repositories

SAS Studio repositories are an easy way to share tasks and snippets between users. An administrator may want to configure one or more centralized repositories and make them available to everyone. SAS Studio users could add these repositories through their Preferences window, but it’s easier to create global repositories that are automatically available from the Tasks and Utilities and Snippets sections. Again, this is achieved by creating a file called repositories.xml in the location specified by webdms.globalSettings, as detailed in the SAS Studio administration documentation.

tags: SAS Administrators, SAS Grid Manager, SAS Professional Services, sas studio

More SAS Studio Tips for SAS Grid Manager Administrators: Global Settings was published on SAS Users.

December 12, 2016
 

In a previous blog post about SAS Studio I briefly introduced the concept of using the Webwork library instead of the default Work library. I also suggested, in the SAS Global Forum 2016 paper Deep Dive with SAS Studio into SAS Grid Manager 9.4, saving intermediate results in the Webwork library, because this special library is automatically assigned at start-up and is shared across all workspace server sessions. Over the past few days, I have received some requests to expand on the properties of this library and how it is shared across different sessions. What better way to share this information than to write it up in a blog post?

As always, I’d like to start with a reference to the official documentation. SAS® Studio 3.5: User’s Guide describes the Webwork library, along with its differences from the Work library, in the section about interactive mode. The main points are:

  • Webwork is the default output library in interactive mode. If you refer to a table without specifying both the libref and the table name, SAS Studio assumes it is stored in the Webwork library.
  • The Webwork library is shared between interactive mode and non-interactive mode. Any data that you create in the Webwork library in one mode can be accessed in the other mode.
  • The Work library is not shared between interactive mode and non-interactive mode. Each workspace server session has its own separate Work library, and data cannot be shared between them.
  • Any data that you save to the Work library in interactive mode cannot be accessed from the Work library in non-interactive mode. Also, you cannot view data in the Work library from the Libraries section of the navigation pane if the data was created in interactive mode.

In addition to this, we can list some additional considerations:

  • The Webwork library is shared between every workspace server session started when using parallel process flows from the Visual Programming perspective.
  • The Webwork library is not shared between different SAS Studio sessions. When using multiple SAS Studio sessions, each one has a different Webwork, just like traditional SAS Foundation sessions do not share their Work libraries.
  • The Webwork library is cleared at the end of the SAS Studio session and its content is temporary in nature, just like the Work library.

Here are the logs of the same lines of code executed in different SAS Studio sessions to show the actual path, on a Windows machine, of the Work and Webwork directories:
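The screenshots that follow are referenced by name only; a minimal sketch of code along these lines, using the PATHNAME function, prints the two paths to the log:

/* print the physical paths of the Work and Webwork libraries to the log */
%put NOTE: WORK path    = %sysfunc(pathname(work));
%put NOTE: WEBWORK path = %sysfunc(pathname(webwork));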

First SAS Studio session, non-interactive mode

sas-studio-webwork-library01

Same session, interactive mode

sas-studio-webwork-library02

Second SAS Studio session, non-interactive mode

sas-studio-webwork-library03

And since a picture is worth a thousand words, the following diagram depicts the relationship between SAS Studio sessions, Work libraries and Webwork libraries.

sas-studio-webwork-library04

Finally, I’d like to remind you that, in distributed environments where the workspace server sessions are load balanced across multiple hosts, it is imperative to configure the Webwork library on a shared filesystem, following the instructions explained in the SAS Studio tips for SAS Grid Manager Administrators blog post.

tags: SAS Grid Manager, SAS Professional Services, sas studio, SAS Studio Webwork library

SAS Studio Webwork library demystified was published on SAS Users.

September 21, 2016
 

In a previous blog post I explained how end users should code and use shared locations for SAS artifacts to avoid issues in a SAS Grid Manager environment. Still, they could fall into sharing issues with very obscure manifestations. For example, users opening SAS Studio might notice that it automatically opens the last program that they were working on in a previous session… sometimes. Other times, they may log on and find that SAS Studio opens to a blank screen. What causes SAS Studio to “sometimes” remember a previous program and other times not? And why should this matter, when all I am looking for is my preferences?

Where are my preferences?

SAS Studio has a Preferences window that enables end users to customize several options that change the behavior of different features of the software. By default, these preferences are stored under the end user’s home directory on the server where the workspace server session is running (%AppData%/SAS/SASStudio/preferences on Windows or ~/.sasstudio/preferences on UNIX). Does this sentence ring any alarm bells? With SAS Studio Enterprise Edition running in a grid environment, there is no such thing as “the server where the workspace server session is running!” One invocation of SAS Studio could run on one grid node and the next invocation could run on a different grid node. For this reason, a preference that we just set to a custom value might revert to its default value on the next sign-in. This issue can become worse because SAS Studio follows the same approach to store code snippets, tasks, autosave files, the WEBWORK library, and more.

Until SAS Studio 3.4, the only solution to this uncertainty was to have end users’ home directories shared across all the grid nodes. SAS Studio 3.5 removes this requirement by providing administrators with a new configuration option: webdms.studioDataParentDirectory. This option specifies the location of SAS Studio preferences, snippets, my tasks, and more. The default value is blank, which means that the behavior is the same as in previous releases. An administrator can point it to any shared location to access all of this common data from any workspace server session.

Tip #1

This option sounds cool, how can I change its value?

SAS Studio 3.5: Administrator’s Guide provides information on this topic, but the specific page contains a couple of errors – not to worry, they have been flagged and are in the process of being amended. The property is specified in the config.properties file in the configuration directory for SAS Studio. Remember that when you deploy using multiple Web Application Servers (which is common with many SAS solutions, and mandatory in modern clustered environments), SAS Studio is deployed in SASServer2_1, not in SASServer1_1. It is also worth noting that, in the case of clustered middle tiers, this change should be applied to every deployed instance of the SAS Studio web application.

The documentation page also incorrectly states how to enforce this change. The correct procedure is to restart the Web Application Server hosting SAS Studio, i.e. SASServer2_1.

Tip #2

I do not want all my users to share a common directory; they would override each other’s settings!

This is a fair request, but, unfortunately, the official documentation can be confusing.

The correct syntax for a per-user directory is to append the <userid> token to the end of the path. SAS Studio substitutes this token with the userid of the user who is currently logged on. For example, given the following configuration:

sas-studio-tips-for-sas-grid-manager-administrators

will lead to this directory structure:

sas-studio-tips-for-sas-grid-manager-administrators02
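The referenced screenshots are not reproduced here, but the configuration and the resulting layout might look like the following sketch; the parent path and user names are illustrative:

# config.properties (illustrative path; SAS Studio replaces <userid> per user at sign-in)
webdms.studioDataParentDirectory=/sasshare/studiodata/<userid>

which, for hypothetical users alice and bob, would produce a per-user layout such as:

/sasshare/studiodata/alice/
/sasshare/studiodata/bob/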

Tip #3

You – or your SAS Administrator – changed the value of the webdms.studioDataParentDirectory option. How can you know whether the change has been correctly applied? This is the same as asking: where is this option currently pointing? Here is a quick tip. This property influences the location of the WEBWORK directory. Simply open the Libraries pane, right-click WEBWORK, select “Properties”, and there you are. The highlighted path in the following screenshot shows the current value of the option:
sas-studio-tips-for-sas-grid-manager-administrators03

Conclusion

As you may have understood, the daily duties of SAS Administrators include ghost hunting as well as debugging weird issues. I hope that the tips contained in this post will make your lives a little easier, leaving more time for all the other paranormal activities. And I promise, more tips will follow!

tags: SAS Administrators, SAS Grid Manager, SAS Professional Services, sas studio

SAS Studio tips for SAS Grid Manager Administrators: Where are my preferences? was published on SAS Users.

May 4, 2016
 

New default parameter values for Platform Suite for SAS

Sometimes, when your kids grow older, they change their habits and you don’t recognize their behaviors any more. “We play this game every year at the beach. Don’t you like it anymore?” you ask. “Dad, I’m not seven years old any more”.

Well, Platform Suite for SAS is not seven any more. And its default behavior has changed, too.

Recognizing the release

Platform Suite for SAS ships with the SAS Grid Manager offering, and (almost) every SAS maintenance release changes the bundled version. It includes different products that do not share the same numbering sequence. We are currently (as of SAS 9.4M2) shipping Platform Suite for SAS version 8.1, which includes LSF 9.1.1.

Any new release adds features, expands the list of supported operating systems and increases the flexibility in configuring your environments. But the default values of the main parameters that characterize how the software behaves out of the box are usually left untouched. Until now.

A faster start

After installing a new environment, we suggest submitting many jobs, all at once, to check that LSF dispatches them correctly. If you have ever done this more than once, you surely remember that jobs take a while to start. The following screenshot shows the results of a bhist command issued about a minute after submitting 15 jobs on a newly installed, three-node grid that uses LSF 8.01 (SAS 9.4 and 9.4M1).

15 jobs submitted to an LSF 8 grid

You can see that jobs are kept pending (column highlighted in red) while LSF decides which host is the best to run them. LSF starts jobs in subsequent batches every 20 seconds, and after a minute some jobs have not started yet (column highlighted in green).

Here is another screenshot that shows the same bhist command issued only thirty seconds after submitting 15 jobs on a newly installed, three-node grid that uses LSF 9.1.1 (SAS 9.4M2). Can you spot the difference? All jobs start almost immediately, without losing any time in the pending state:

15 jobs submitted to an LSF 9 grid

Isn’t this good?

Well, it depends. Platform LSF is a system built and tuned for batch processing. As such, many components need some “think time” before being able to react. In the end, when submitting a two-hour batch job, does it really matter whether it takes 1 or 25 seconds to start?

Things change when we use LSF to manage interactive SAS workloads. End users do care whether a SAS session takes 5 or 25 seconds to start when submitting a project from SAS Enterprise Guide. If it takes more than 60 seconds, the object spawner may even time out. The practice we use is to tune some LSF parameters, as shown in this blog post, to reduce the grid services’ sleep times so that interactive sessions start faster.

Looking at the above results, the speed at which jobs start with LSF 9.1.1 hints that the new release ships with these parameters already tuned by default. That’s good! Isn’t it?

Comparing the values

To understand which parameters have new default values, it is possible to compare the Platform LSF Configuration Reference Version 9.1 with Platform LSF Configuration Reference Version 8.01. I verified what I found there by checking the actual LSF configuration files in two grid installations, then I built the following table to compare the main parameters we usually tune:

New default parameter values for Platform Suite for SAS3

The “Default” column reports values that are automatically set in configuration files after a default deployment. When a parameter is not defined in the configuration files, it takes the value listed under the “Undefined” column. As you can see, all of the actual values have been lowered.

What does this mean?

With the new default LSF 9 values, a SAS 9.4M2 grid is more responsive to interactive users and can accept more jobs that are submitted all at once, increasing the overall job throughput. Grid-launched workspace servers now start almost immediately (if there are enough resources to run them, obviously) with no timeouts or long waiting.

… BUT …

There is one problem with this configuration.

If you are familiar with LSF tuning, you may remember this note from the official documentation:

JOB_ACCEPT_INTERVAL: If 0 (zero), a host may accept more than one job. By default, there is no limit to the total number of jobs that can run on a host, so if this parameter is set to 0, a very large number of jobs might be dispatched to a host all at once […]  It is not recommended to set this parameter to 0.

Wait a minute. It is not recommended to set this parameter to 0, and the default for LSF 9.1.1 is 0?

If you check what actually happens inside the grid, you will find that yes, jobs start faster, but this comes at a price. LSF doesn’t have time to check how the server load is impacted by these new jobs. LSF simply dispatches all jobs to the SAME server until that node is full. Only then does it send jobs to another server. The following screenshot shows how all jobs end up running on the same server until it is full (same colors).

New default parameter values for Platform Suite for SAS4

Imagine if the jobs are grid-launched workspace servers. Many sessions will land on the same host all at once. The same users that were happy because their sessions started immediately will soon complain because they are contending with each other for resources on the same server.

How can I change this?

To solve this issue, set the JOB_ACCEPT_INTERVAL parameter back to 1. As with LSF 8, you may also want to lower MBD_SLEEP_TIME and SBD_SLEEP_TIME a bit, but that depends on the actual load on each grid environment. This screenshot shows the same environment running the same jobs after being tuned. Jobs take a bit longer to start, but they are now distributed evenly across all machines.

New default parameter values for Platform Suite for SAS5
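In practice, the tuning might look like the following sketch of the Parameters section of lsb.params; the sleep-time values are illustrative, not a recommendation for every site:

Begin Parameters
# wait one dispatch turn before sending another job to the same host
JOB_ACCEPT_INTERVAL = 1
# seconds between mbatchd and sbatchd cycles (illustrative values, tune for your workload)
MBD_SLEEP_TIME = 10
SBD_SLEEP_TIME = 7
End Parameters

After editing lsb.params, the change is applied with a reconfiguration of the batch system (for example, badmin reconfig).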

The JOB_ACCEPT_INTERVAL parameter can also be set at the queue level, so a more advanced tuning could be to use different values (0, 1 or even more) based on the desired behavior of each queue. This second option is an advanced tuning that usually cannot be implemented during an initial configuration, as it requires careful design, testing and validation.

Changes in SAS 9.4M3

Did you think this long blog post was over? Wait, there is more. With SAS 9.4M3, things changed again. This release bundles Platform Suite for SAS version 9.1, which includes LSF 9.1.3. It’s a small change in the absolute number, but the release includes a new parameter:

LSB_HJOB_PER_SESSION: Specifies the maximum number of jobs that can be dispatched in each scheduling cycle to each host. LSB_HJOB_PER_SESSION is activated only if the JOB_ACCEPT_INTERVAL parameter is set to 0.

Now you can configure JOB_ACCEPT_INTERVAL=0 to achieve increased grid responsiveness and job throughput, and at the same time put a limit on how many jobs are sent to the same server before LSF starts dispatching them to a different node.

Or you can simply accept the default values for all the parameters: they have changed again, but this time they all fit our recommended practice, as shown in this final table:

New default parameter values for Platform Suite for SAS6

tags: SAS Administrators, SAS Grid Manager, SAS Professional Services

Is your new grid behaving oddly? was published on SAS Users.

March 2, 2016
 

I recently received a call from a colleague who is using parallel processing in a grid environment; he lamented that SAS Enterprise Guide did not show, in the Work library, any of the tables that were successfully created in his project.

The issue was very clear in my mind, but I was not able to find any simple description or picture to show him: so why not put it all down in a blog post so everyone can benefit?

Parallel processing can speed up your projects by an incredible factor, especially when programs consist of subtasks that are independent units of work and can be distributed across a grid and executed in parallel. But when these parallel execution environments are not kept in sync, it can also introduce unforeseen problems.

This specific issue of “disappearing” temporary tables can happen with different client interfaces, because it does not depend on a particular piece of software, but rather on the business logic that is implemented. Let’s look at two practical examples.

SAS Studio

We want to run an analysis – here a simple PROC PRINT – on two independent subsets of the same table. We decide to use two parallel grid sessions to partition the data, and then we run the analysis in the parent session. The code we submit in SAS Studio could be similar to the following:

%let rc = %sysfunc(grdsvc_enable(_all_, server=SASApp));  /* enable grid-launched sessions */
signon grid1;                        /* start two remote grid sessions */
signon grid2;

proc datasets library=work nolist nowarn;   /* clean up any previous results */
   delete sedan SUV;
run;
quit;

rsubmit grid1 wait=no;               /* partition 1 runs in the first grid session */
   data sedan;
      set sashelp.cars;
      where Type="Sedan";
   run;
endrsubmit;

rsubmit grid2 wait=no;               /* partition 2 runs in the second grid session */
   data SUV;
      set sashelp.cars;
      where Type="SUV";
   run;
endrsubmit;

waitfor _all_ grid1 grid2;           /* wait for both remote sessions to finish */

proc print data=sedan;               /* analysis in the parent session */
run;
proc print data=SUV;
run;

After submitting the code by pressing F3 or clicking Run, we do not get the expected RESULT window and the LOG window shows some errors:

ThePitfallsofParallelJobs

SAS Enterprise Guide

Suppose we have a project similar to the following.

ThePitfallsofParallelJobs2

Many items can be created independently of others. The orange arrows illustrate the potential tasks which can be executed in parallel.

Let’s try to run these project tasks in parallel. Open File, Project Properties, select Code Submission, and check “Allow parallel execution on the same server.”

ThePitfallsofParallelJobs3

This property enables SAS Enterprise Guide to create one or more additional workspace server connections so that parallel process flow paths can be run in parallel.

Note: Despite the description, when used in a grid environment the additional workspace server sessions do not always execute on the same server. The grid master server decides where these sessions start.

After we select Run, Run Process Flow to submit the code, SAS Enterprise Guide submits the tasks in parallel respecting the required dependencies.  Unfortunately, some tasks fail and red X’s appear over the top right corner of the task icons. In the log summary we may find the following:

ERROR: File WORK.QUERY_FOR_PRODUCTS.DATA does not exist.

What's going on?

The WORK library is the temporary library that is automatically defined by SAS at the beginning of each SAS session or job. The WORK library stores temporary SAS files that are written by a DATA step or a procedure and then read as input by subsequent steps. After enabling parallel execution on the grid, tasks run in multiple SAS sessions, and each grid session has its own dedicated WORK library that is not shared with any other grid session, or with the parent session that started it.

In the SAS Studio example, the DATA steps output their results – the SEDAN and SUV tables – to the WORK library of one SAS session each, and then the PROC PRINT tries to read those tables from the WORK library of a different SAS session. Obviously the tables are not there, and the step fails.

ThePitfallsofParallelJobs4

This is quite a common issue when dealing with multiple sessions – even without a grid. One simple solution is to avoid using the WORK library and any other non-shared resources. It is possible to assign a common library in many ways, such as in autoexec files or in metadata.
The issue is solved:

ThePitfallsofParallelJobs5
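In code, the fix might look like the following minimal sketch. The SHARED libref and its physical path are illustrative; in practice the library would typically be pre-assigned to every session through metadata or an autoexec, so it would not have to be re-issued inside each rsubmit block.

libname shared "/sasshare/projects/cars";      /* a path visible to all grid nodes */

rsubmit grid1 wait=no;
   libname shared "/sasshare/projects/cars";   /* assign the same library in the remote session */
   data shared.sedan;                          /* write the partition to the shared library ...  */
      set sashelp.cars;
      where Type="Sedan";
   run;
endrsubmit;

waitfor _all_ grid1;

proc print data=shared.sedan;                  /* ... so the parent session can read it back */
run;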

No shared resources, am I safe?

Well, maybe not. Coming back to the original issue presented at the opening of this post, sometimes we overlook what ‘shared’ means. I’ll show you with this very simple SAS Enterprise Guide project: it’s just a simple query that writes a result table to the WORK library. You can test it on your laptop, without any grid.

ThePitfallsofParallelJobs6

After running it, the FILTER_FOR_AIR table appears in the Servers pane:

ThePitfallsofParallelJobs7

Now, let’s say we have to prepare for a more complex project, and we again follow the steps to “Allow parallel execution on the same server.” Just to be safe, we resubmit the project to test what happens. All seems unchanged, so we save everything and close SAS Enterprise Guide. Later, suppose we realize we forgot to write down something about the results. We reopen the project and, knowing that the result table in the WORK library was temporary, we rerun the project to recreate it.

This time something is wrong.

ThePitfallsofParallelJobs8

The Output Data pane shows an error, and the Servers pane does not list the FILTER_FOR_AIR table anymore.
Even if we rerun the project, the table will not reappear.

The reason lies, again, in the realm of shared vs. local libraries.

As soon as we enable “Allow parallel execution on the same server,” SAS Enterprise Guide starts at least one additional SAS session to process the code, even if there is nothing to parallelize. Results are saved only there, but SAS Enterprise Guide always tries to read them from the original, parent session. So we are again in the trap of local WORK libraries.

ThePitfallsofParallelJobs9

Why didn’t we uncover the issue the first time we ran the project? If you run your code at least once without the “Allow parallel execution on the same server” option, the results are saved in the parent session, and they remain there even after enabling parallelization. As a result, we actually have two copies of the FILTER_FOR_AIR table!
As soon as we close SAS Enterprise Guide, both tables are deleted. So, on the next run, the result is created only in the additional session’s WORK library, and there is nothing in the parent session’s WORK library for SAS Enterprise Guide to display!

The solution? Same as before – only use shared libraries.

Is this all?

As you might have guessed, the answer is no. Libraries are not the only objects that should be shared across sessions. Every local setting – be it the value of an option, a macro, or a format – has to be shared across all parallel sessions. Not difficult, but we have to remember to do it, as in the sketch below!
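For instance, here is a minimal sketch of pushing a macro variable and re-declaring an option in an already-signed-on grid session; the session name grid1, the path, and the shared formats catalog are illustrative assumptions.

%let projdir=/sasshare/projects/cars;        /* defined in the parent session            */
%syslput projdir=&projdir / remote=grid1;    /* copy the macro variable to grid1         */

rsubmit grid1 wait=no;
   options fmtsearch=(shared.formats work);  /* re-declare options in the remote session */
   libname shared "&projdir";                /* uses the macro variable pushed above     */
   /* ... remote code that relies on &projdir and the shared formats ... */
endrsubmit;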


tags: parallel processing, sas enterprise guide, SAS Grid Manager, SAS Professional Services, SAS Programmers, sas studio

Avoid the pitfalls of parallel jobs was published on SAS Users.

February 9, 2016
 

As another year goes by, many people think about New Year’s resolutions. It’s probably the third year in a row that I’ve promised myself I’d start exercising in the fabulous SAS gym. Of course, I blame the recently concluded holiday season. With all the food-focused events, I couldn’t resist, ate way too much and now feel like I can barely move. The same can happen to a SAS Grid environment, as I learned by helping some colleagues who were debugging a SAS Grid that “ate” too many jobs and often refused to move. In this blog post I’ll share that experience in the hope that you can learn how to keep your SAS Grid from slowing down.

The issue

The symptoms our customer encountered were random “freezes” of their entire environment, without any predictable pattern: even with no or few jobs running on the SAS Grid, the LSF daemons stopped responding for minutes. When that happened, not only did new jobs not start, but it was also impossible to query the environment with simple commands:

$ bhosts

LSF is processing your request. Please wait ...
LSF is processing your request. Please wait ...
LSF is processing your request. Please wait ...

Then, as unpredictably as the problem started, it also self-resolved and everything went back to normal… until the next time.

Eventually we were able to find the culprit. It all depends on the way mbatchd, the Master Batch Daemon, manages its internal events file: lsb.events.

Let’s see what this file is, and why it can cause troubles.

The record keeper

You can find details about the LSF events file in the official documentation.

What is it?

The LSF batch event log file lsb.events is used to display LSF batch event history and for mbatchd failure recovery. Whenever a host, job, or queue changes status, a record is appended to the event log file.

How is this file managed?

Use MAX_JOB_NUM in lsb.params to set the maximum number of finished jobs whose events are to be stored in the lsb.events log file. Once the limit is reached, mbatchd starts a new event log file. The old event log file is saved as lsb.events.n, with subsequent sequence number suffixes incremented by 1 each time a new log file is started. Event logging continues in the new lsb.events file.

The official documentation does not state much more, but additional online research reveals an interesting detail:

lsb.events file is moved to lsb.events.1, and each old lsb.events.n file is moved to lsb.events.n+1. The mbatchd never deletes these files. If disk storage is a concern, the LSF administrator should arrange to archive or remove old lsb.events.n files occasionally.

So what?

Well, by default LSF rolls over the events file every 1000 records (this is the default value up to LSF version 8). This small value is a legacy from the past, when an LSF restart could take forever if the value was too high. In our situation, a high job throughput caused the events file to roll over so often that mbatchd produced tons and tons of old events files. And for each rollover, it had to rename each and every file present in the log directory from lsb.events.n to lsb.events.n+1. This became such an overwhelming task that mbatchd stopped responding to everything for minutes, just to get it done.

Do not fall into the same trap

To avoid the problem we encountered, I encourage you to implement a good maintenance practice: monitor the LSF log directory (by default <LSF_TOP>/work/<cluster name>/logdir) and periodically archive or delete older files.

With older versions of LSF, this issue can be prevented with the tuning described in this blog. All of the versions of LSF that have shipped with SAS 9.4 have solved this problem, according to LSF documentation:

Switching and replaying the events log file, lsb.events, is much faster. The length of the events file no longer impacts performance.

That’s why the latest documentation suggests keeping a higher number of records in the events file before triggering a rollover: set MAX_JOB_NUM in lsb.params to 10000, or even 100000 for high-throughput environments, as in the sketch below.
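For example, the relevant lsb.params entry might look like this (applied with badmin reconfig, as before); pick a value that fits your site’s throughput and recovery requirements:

Begin Parameters
# number of finished jobs kept in lsb.events before the file rolls over
MAX_JOB_NUM = 10000
End Parameters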

With a bit of maintenance your SAS Grid can perform much better. I really have to do the same, and start the new year at the gym!

 

tags: SAS Administrators, SAS Grid Manager, SAS Professional Services

New year resolution: Don't let old stuff slow down your SAS Grid was published on SAS Users.

January 21, 2016
 

As of SAS 9.4M3, SAS Grid Computing has a new tool: the Grid Manager Module for SAS Environment Manager 2.5. This module provides some of the same monitoring and management functions as IBM Platform RTM for SAS, so you can monitor and manage your grid using the same application that you use to manage the rest of your SAS environment.

As an LSF administrator, if you are also defined as a SAS Environment Manager user, you can access this web application/module from SAS Environment Manager (http://YourSAS.AppServer:7080/).

GridManagerModule

Note: It is also possible to access the Grid Manager Web application directly through http://YourSAS.AppServer:7980/SASGridManager.

Out of the box, and based on deployment and configuration best practices, the Grid Manager Module for SAS Environment Manager 2.5 may not work as expected. The SAS Grid environment requires some post installation/configuration steps.

The goal of this post is to help you learn about these steps…

Why does SAS Grid Manager Module for SAS Environment Manager 2.5 require post installation/configuration steps?

The best practices for SAS Grid Computing are to deploy all SAS software components using a specific SAS Installation User account (e.g., sas or sasinst), and to deploy all IBM Platform Suite for SAS software components using a specific LSF Administrator User account (e.g., lsfadmin). The SAS Installation User and the LSF Administrator User are both members of a single user group, generally sas.

Traditionally, the SAS software components are installed and configured using the SAS Deployment Wizard application, and the IBM Platform Suite for SAS software components are installed and configured using the IBM Platform installation tools.

But since SAS 9.4M3, there is one component of the IBM Platform Suite for SAS software, IBM Platform PWS (Platform Web Services), that is installed and configured using the SAS Deployment Wizard. Because of that, IBM Platform PWS is installed, configured and managed by the SAS Installation User, not the LSF Administrator.

IBM Platform PWS, like the SAS Grid Manager Module for SAS Environment Manager 2.5, is a middle-tier application that interfaces with the IBM Platform LSF components that reside on the SAS Grid Control Server.

Each time an administrator manages the LSF or HA (High Availability) configuration using the SAS Grid Manager Module for SAS Environment Manager 2.5, the module has to read and/or write IBM Platform LSF configuration information on the SAS Grid Control Server, going through IBM Platform PWS on the middle-tier host. If all of these components are not installed, configured, and managed by the same user, these actions may generate errors that make it impossible to manage the LSF and High Availability configurations.

For this reason, post installation/configuration steps are required to make the SAS Grid Manager Module for SAS Environment Manager 2.5 fully functional.

The post installation/configuration steps…

1 -   Obtain the required IBM Platform PWS fix for SAS Grid

Contact SAS Tech Support to obtain the IBM Platform PWS fix for SAS Grid.
The fix is named pws9.1.3_build65123.zip. The fix will contain a Readme.txt file that explains the installation process.

2 -   Configure your SAS Grid environment

A.  It is required that passwordless SSH be configured between the IBM Platform PWS host and the IBM Platform LSF master host (i.e., the SAS Grid Control Node) for the user who starts the IBM Platform PWS SpringSource tcServer instance (the SAS Installation User). This allows the IBM Platform PWS user to SSH to the IBM Platform LSF master host as the Primary LSF Administrator (e.g., ssh lsfadmin@lsf_master_host) without a password prompt; a sketch of this setup follows step 2 below.

B.  On the IBM Platform LSF master host (I.E.: SAS Grid Control Node) you need to modify/adjust the permissions against the IBM Platform LSF configuration files.

i.  Go to the IBM Platform LSF configuration directory.
[PlatformSuiteForSAS-Top-Directory]/lsf/conf/

ii.  Add the group write (‘w‘) permission to all files located under this directory which will allow the SAS Installation User to write/modify these files.
chmod -R g+w *

iii.  Add the group read (‘r‘) permission to certain files located under this directory which do not have it by default which will allow the SAS Installation User to read these files.

a.  Go to [PlatformSuiteForSAS-Top-Directory]/lsf/conf/ego/sas_cluster/kernel

b.  Change the group permissions on specific files
chmod -R g+r dh512.pem pamauth.conf server.pem users.xml
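As a purely illustrative sketch of step 2.A, run on the middle-tier host as the SAS Installation User (the host name is a placeholder; follow your site’s SSH key policies):

ssh-keygen -t rsa                             # create a key pair, accepting the defaults
ssh-copy-id lsfadmin@gridcontrol.example.com  # authorize the key for the LSF administrator
ssh lsfadmin@gridcontrol.example.com lsid     # verify: no password prompt, lsid reports the cluster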

3 -   Install the IBM Platform PWS fix for SAS.

A.  Stop IBM Platform PWS

B.  Apply the IBM Platform PWS webapps/platform fix.

i.  Go to [SAS-Configuration-Directory]/Levn/Web/Staging/exploded/platformpws/platform

ii.  Back up and replace these specific .class and .properties files (detailed in the Readme.txt file) with the files from the IBM Platform PWS fix.

./WEB-INF/classes/com/platform/gui/pac/util/shell/ShellHelper.class
./WEB-INF/classes/com/platform/pws/pwsResource.properties
./WEB-INF/classes/com/platform/pws/util/lsfConfig/LSFConfigApplyUtil.class
./WEB-INF/classes/com/platform/pws/util/lsfConfig/LSFConfigApplyUtil$DataFormatException.class
./WEB-INF/classes/com/platform/pws/util/NonShareHelper.class
./WEB-INF/classes/com/platform/pws/util/NonShareHelper$FileException.class
./WEB-INF/classes/com/platform/pws/util/PWSUtil.class
./WEB-INF/classes/com/platform/pws/util/PWSUtil$ResultEntry.class
./WEB-INF/classes/com/platform/pws/webservice/impl/LSFConfigWebServiceImpl.class

iii.  Add this specific .class file from the IBM Platform PWS fix.

./WEB-INF/classes/com/platform/pws/util/lsfConfig/LSFConfigApplyUtil$LSFConfigApplyException.class

iv.  Go to [SAS-Configuration-Directory]/Levn/Web/WebAppServer/SASServer14_1/sas_webapps/platform.web.services.war

v.  Repeat steps "ii." and "iii." against this directory.

C.   Start/Restart IBM Platform PWS.

 

After applying these post configuration steps, you should now be able to use the SAS Grid Manager Module for SAS Environment Manager 2.5 to manage your grid much like you do with IBM Platform RTM.

GridManagerModule2

I hope this article has been helpful to you.

tags: SAS Administrators, SAS Environment Manager, SAS Grid Manager, SAS Professional Services

SAS Grid Series: Grid Manager Module for SAS Environment Manager – post configuration was published on SAS Users.

October 26, 2015
 

SAS Grid Manager for Hadoop is a brand new product released with SAS 9.4M3 this summer. It gives you the ability to co-locate your SAS Grid jobs on your Hadoop data nodes to let you further leverage your investment in your Hadoop infrastructure. This is possible because SAS Grid Manager for Hadoop is integrated with the native components, specifically YARN and Oozie, of your Hadoop ecosystem. Let's review the architecture of this new offering.

First of all, the official name – SAS Grid Manager for Hadoop – shows that it is a brand new product, not just an addition or a different configuration of the “classic” SAS Grid Manager, which I will subsequently refer to as “for Platform” to distinguish the two.

For an end user, grid usage and functionality remains the same, but an architect will notice that many components of the offering have changed. Describing these components will be the focus of the remainder of this post.

Let me start by showing a picture of a sample software architecture, so that it will be easier to recognize all the pieces with a visual schema in front of us. The following is one possible deployment architecture; there are other deployment choices.

SAS_Grid_Manager_for_Hadoop_9_4M3_Architecture_v1_1_full

Third party components

Just as SAS Grid Manager for Platform builds on top of third party software from Platform Computing (part of IBM), SAS Grid Manager for Hadoop requires Hadoop to function. There is a big difference, though.

SAS Grid Manager for Platform includes all of the required Platform Computing components, as they are delivered, installed and supported by SAS.

On the other hand, SAS Grid Manager for Hadoop considers all of the Hadoop components (highlighted in yellow in the above diagram) to be prerequisites. As such, customers are required to procure, install and support Hadoop before SAS is installed.

Hadoop, as you know, includes many different components. The diagram lists the ones that are needed for SAS Grid Manager:

  • HDFS provides cluster-wide filesystem storage
  • YARN is used for resource management
  • Oozie is the scheduling service
  • Hue is required if the Oozie web GUI is surfaced through Hue.
  • Hive is required at install time for the SAS Deployment Wizard to be able to access the required Hadoop configuration and jar files.
  • Hadoop jars and config files need to be on every machine, including clients.

YARN Resource Manager, HDFS Name Node, Hive, and Oozie are not necessarily on the same machine. By default, the SAS grid control server needs to be on the machine that YARN Resource Manager is on.

SAS Components

SAS programming interfaces to the grid have not changed, apart from the lower-level libraries that connect to the third party software. As such, SAS will deploy the traditional SAS grid control server, SAS grid nodes, the SAS thin client (aka SASGSUB) or the full SAS client (SAS Display Manager).

In a typical SAS Grid deployment, a shared directory is used to share the installation and configuration directories between machines in the grid. With SAS Grid Manager for Hadoop, you can either use NFS to mount a shared directory on all cluster hosts or use the SAS Deployment Manager (SDM) to work with the cluster manager to distribute the deployment to the cluster hosts. The SDM has the ability to create Cloudera parcels and Ambari packages to enable the distribution of the installation and configuration directories from the grid control server to the grid nodes.

One notable missing component is the SAS Grid Manager plug-in for SAS Management Console. This management interface is tightly coupled with Platform Computing GMS, and cannot be used with Hadoop.

The Middle Tier

You will notice in the above diagram that the middle tier is faded. In fact, no middle-tier components are included in SAS Grid Manager for Hadoop. However, a middle tier will generally be included and deployed as part of other solutions licensed on top of SAS Grid Manager, so you will still be able to program using SAS Studio and monitor the SAS infrastructure using SAS Environment Manager.

Please note that I say “monitor the SAS infrastructure”, not “monitor the SAS grid.” There are no plug-ins or modules within SAS Environment Manager that are specific to SAS Grid Manager for Hadoop. This is by design: SAS is part of your overall Hadoop environment, and therefore the SAS Grid workload can be monitored using your favorite Hadoop management tools.

Hadoop provides plenty of web interfaces to monitor, manage and configure its environment. As such, you will be able to use YARN Web UI to monitor and manage submitted SAS jobs, as well as Hue web UI to review scheduled workflows.

The Storage

Discussing grid storage is never a quick task and could require a full blog post on its own, but it is worth noting some architecture peculiarities related to SAS Grid Manager for Hadoop. HDFS can be used to store shared data, and it is used to store scheduled jobs, workflows, and logs. However, we still require a traditional, POSIX-compliant filesystem for items such as SAS WORK, SASGSUB, and solution-specific projects.

Conclusion

SAS Grid Manager for Hadoop enables customers to co-locate their SAS Grid and all of the associated SAS workload on their existing Hadoop cluster. We have briefly discussed the key components that are included in – or are missing from – this new offering. I hope you found this post helpful. As always, any comments are welcome.

tags: grid, Hadoop, SAS architecture, SAS Grid Manager, SAS Grid Manager for Hadoop, SAS Professional Services

SAS Grid Manager for Hadoop architecture was published on SAS Users.

August 27, 2015
 

If you have, or are considering, SAS Grid Manager, you’ll be excited to hear about two new changes to the product that make it even better for managing and processing in your analytics environment.

The two changes were made in conjunction with the release of SAS 9.4M3 in July, and included:

  • The introduction of the SAS Grid Manager agent plug-in for SAS Environment Manager.
  • The substitution of the SAS Grid Manager Plug-in for SAS Environment Manager with the SAS Grid Manager Module for SAS Environment Manager in the middle tier.

The SAS Grid Manager agent plug-in

The SAS Grid Manager agent plug-in monitors grid resources over time and generates events and alerts based on the monitored information.

The plug-in provides metric data for the grid cluster and for individual grid hosts and uses that data to:

  • display the current state of grid resources
  • graph the data over time, providing a historical view and enabling you to see how the data changes
  • create alerts that notify you whenever a selected measurement reaches a selected state

These new features allow administrators to closely monitor their grid and more effectively use their resources.

The SAS Grid Manager Module

The SAS Grid Manager Module for SAS Environment Manager replaces the SAS Grid Manager Plug-in for SAS Environment Manager and enables administrators to configure and perform actions on grid resources and high availability applications. It also provides a view of the grid at any moment in time.

These changes – the introduction of the SAS Grid Manager agent plug-in and the new SAS Grid Manager Module – give administrators new tools, knowledge and know-how that help them better manage a modern grid.

Additional Considerations

The SAS 9.4M2 SAS Grid Manager Plug-in for SAS Environment Manager, and the SAS 9.4M3 SAS Grid Manager Module for SAS Environment Manager and SAS Grid Manager agent plug-in, use the Platform Web Services application as a gateway, storing data and configuration information within the WIP Data Server.

SAS Grid Manager mid-tier components use the SAS Environment Manager to monitor, manage and configure a SAS grid cluster with high availability applications.

The SAS Grid environment is administered from a single web application, eliminating the need for RTM and the SAS Management Console Grid Plug-ins.

If you already use SAS Grid Manager, be sure to check out these new enhancements. More information on both changes can be found in our official documentation.

And, if you want to learn more about SAS Grid Manager, visit our product page.

Thanks for reading.

Edoardo

tags: SAS Administrators, SAS Grid Manager, SAS Professional Services

What’s new in SAS Grid Manager 9.4M3 architecture was published on SAS Users.