February 23, 2012
I recently presented a session on big data at the 13th Annual Privacy and Security Conference, hosted by the Province of British Columbia and held in Victoria. There were a number of interesting discussions and presentations on the privacy and security ramifications of big data. The discussion was timely given [...]
November 23, 2010
Many people mistakenly assume that just because you want to use a SAS program to access a protected resource (such as a database table), you must include the credentials for the resource inside your program.

Few things cause a database administrator to lose more sleep than coming across this within a SAS program:

 libname ora10 oracle
    path=ora10g2 schema=PAYROLL
    user=ADMIN password="Secret01";  /* placeholder credentials, hard-coded in plain text */
It doesn't have to be this way! SAS 9.2 offers so many security-related features that you should never have to code user IDs and passwords in your SAS programs again. This blog post summarizes my five favorite approaches.

1. Use the AUTHDOMAIN= option to access SAS libraries
The AUTHDOMAIN= option delegates authentication by letting SAS "look up" credentials as needed from the SAS metadata environment. For each domain that has a unique set of credentials, your SAS administrator can create an "auth domain" in metadata and associate it with a database server (or other resource). Every SAS/ACCESS database engine supports AUTHDOMAIN=, in LIBNAME statements as well as in PROC SQL CONNECT statements.

With AUTHDOMAIN, the above LIBNAME example becomes:

 libname ora10 oracle
    path=ora10g2 schema=PAYROLL
    authdomain="OracleAuth";  /* placeholder name of an auth domain defined in metadata */
The beauty of this solution is that SAS can resolve the database credentials differently for each user or group who runs this program, using the credentials that are defined in metadata for those identities.

Bonus: AUTHDOMAIN works for other resources too, such as FTP connections.
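
The PROC SQL CONNECT and FTP cases mentioned above might look something like the sketch below. Treat every name here as an assumption: "OracleAuth" and "FtpAuth" are placeholder auth domains that your SAS administrator would have to define in metadata, and the table, host, and file names are made up for illustration.

```sas
/* Explicit pass-through: SAS looks up the Oracle credentials in metadata. */
proc sql;
  connect to oracle (path=ora10g2 authdomain="OracleAuth");  /* placeholder auth domain */
  create table work.payroll_sample as
    select * from connection to oracle
      (select * from PAYROLL.EMPLOYEES);  /* hypothetical table */
  disconnect from oracle;
quit;

/* FTP access method: no USER= or PASS= appears in the program. */
filename src ftp 'report.csv'        /* hypothetical remote file */
  host='ftp.example.com'             /* hypothetical host */
  authdomain="FtpAuth";              /* placeholder auth domain */
```

Both statements need an active SAS metadata connection for the session (as you would have in SAS Enterprise Guide or a stored process), because that connection is what resolves the credentials.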

2. Use the META engine or INFOMAPS engine to access data sources
The META engine provides a layer of indirection in front of a SAS library, so that not only do you not need credentials to access it, you don't even need to know the details of how the data tables are stored. A LIBNAME statement can be as simple as:

libname MYLIB meta Library="My SAS Library";
where "My SAS Library" is an administered library that a SAS administrator defined in SAS Management Console. MYLIB might resolve to a folder with SAS data sets, or it might be a set of tables in a Teradata database. The implementation details are hidden from the program and the programmer.

Sometimes you might see an example that looks like this:

libname MYLIB meta Library="My SAS Library"
   metaserv="" /* don't want this */
   metaport=8561   /* don't want this */
   user="userid"      /* don't want this */
   pw="mySecret";  /* don't want this */
If you run your programs in SAS Enterprise Guide or within a SAS stored process, you don't need (and don't want) the connection/credential information! The SAS session you're connected to already knows who you are, and the extra connection information isn't necessary. (If you make use of the META engine in client applications, you'll want to read up on the nuances described in this tech support paper -- based on 9.1.3 but still mostly relevant for SAS 9.2.)

The INFOMAPS engine provides an administered view of data, further abstracted from the physical structure of the data tables. Check out my previous post for a detailed example of programming with Information Maps.

3. Use SAS Token Authentication
SAS Token Authentication uses your established SAS metadata connection to generate and validate single-use tokens for every other SAS-related resource that you might need access to. To put this another way, once the SAS metadata server knows who you are, it "vouches for you" and facilitates connections to anything else you might need, including SAS workspace servers, OLAP servers, database servers (using AUTHDOMAIN) and more.

A big advantage of SAS Token Authentication is that you don't actually need a host account for all of the resources that you might connect to. This cuts down on the sys admin tasks required to get a group of users up and running. It's a best-practice alternative to using group host accounts on a SAS workspace server; configure the SAS workspace server to use SAS token authentication instead.

4. Use Integrated Windows Authentication
The best way to hide passwords? Don't have them in the first place. That's what many customers do when they implement fingerprint readers or retina scanners in their corporate workstations. The users of these workstations wouldn't know how to supply a password if you asked for one.

Even if your users need a password to log in to their workstations, you can use Integrated Windows Authentication to prevent them from seeing another password challenge as they connect to their SAS environment.

SAS Token Authentication and Integrated Windows Authentication are examples of authentication mechanisms that SAS 9.2 supports. Using these authentication mechanisms can reduce the "authentication friction" that results when your SAS applications must hop among different protected resources that would traditionally require a user ID and password to access.

5. If you must, use PROC PWENCODE to obscure passwords
Sometimes, despite your best efforts, you cannot avoid the odd password in your programs. For example, if you've got to access password-protected SAS data sets, you need to specify the password. But there is no need to have the clear-text password appear in your code. You can use the PWENCODE procedure to encode the password so that prying eyes cannot guess at it. For example:

proc pwencode in="ItzaSecret"
  method=sas002;
run;
yields this in the log:

18         proc pwencode in=XXXXXXXXXXXX
19           method=sas002;
20         run;


NOTE: PROCEDURE PWENCODE used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
You can then use the encoded password in your program:

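proc means
  data=mylib.mytable(pw="{sas002}...");  /* placeholder data set; paste the encoded value from your log */
run;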
WARNING: Using an encoded password does not prevent someone else from running your program, encoded-password intact, and accessing the same protected resource. So you still need to protect the content of your program. However, at least someone glancing over your shoulder won't be able to guess your password, and most likely won't be able to memorize the encoded gobbledy-gook that appears in the password field.
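
If you'd rather not paste the encoded value into the program at all, one option is to have PROC PWENCODE write it to a file with the OUT= option and read it back at run time. This is a minimal sketch, not a hardened recipe: the file path and the macro variable name `encoded_pw` are assumptions, and the file itself still needs to be protected with operating system permissions.

```sas
/* Hypothetical location; restrict access to this file at the OS level. */
filename pwfile "/home/myuser/sas_pw.txt";

/* Run once: write the encoded password to the file instead of the log. */
proc pwencode in="ItzaSecret" out=pwfile method=sas002;
run;

/* In the program that needs the password: read the encoded value
   from the file into a macro variable. */
data _null_;
  length encoded $200;
  infile pwfile obs=1 truncover;
  input encoded :$200.;
  call symputx('encoded_pw', encoded);
run;

/* The macro variable can then stand in wherever the encoded
   password would otherwise be typed, for example: pw="&encoded_pw" */
```

The same warning applies here: anyone who can read the file can reuse the encoded value, so this only moves the secret out of the program text.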
October 22, 2010
Recently Facebook launched a new anti-bullying campaign. Andrew Noyes, Facebook’s Communications Manager, described the methods Facebook uses to help stop cyber bullies:
  1. "Neighborhood Watch" – the Facebook community reports offensive web pages for a Facebook team to review and take action.
  2. Technology to capture and flag the offending comment or message.
Andrew is somewhat coy about #2, focusing instead on the self-policing of the community. In this post we will investigate some of the likely approaches used in step #2 which, particularly when combined with #1, could be very effective in countering cyber bullying.

Following the optimal hybrid approach – human expertise plus machine learning – we would:
  • Gather information.
  • Analyze the data.
  • Learn from the results.
  • Use discovery and topic migration to stay on top of the problem.
I’ll outline how this hybrid approach can be applied in the remainder of this post.
Continue reading "Text analytics: how to capture the online bully"