data preparation

August 2, 2016

A long time ago, I worked for a company that had positioned itself as basically a third-party “data trust” to perform collaborative analytics. The business proposition was to engage different types of organizations whose customer bases overlapped, ingest their data sets, and perform a number of analyses using the accumulated […]

The post Crowdsourcing data assets in the data lake appeared first on The Data Roundtable.

July 28, 2016

Data preparation before modeling is an unavoidable chore. One of the most time-consuming tasks can be cleaning up categorical data that may have misspellings, inconsistent capitalization and abbreviations, and the like. The Recode tool in JMP makes data prep a lot easier. Watch this video by my colleague Ryan DeWitt […]

The post Video: Using Recode in JMP for data preparation appeared first on JMP Blog.

April 11, 2016

I used JMP to explore a recent FAA drone data set, inspired by a weekly data visualization challenge called 52Vis. The data set contains "reports of unmanned aircraft (UAS) sightings from pilots, citizens and law enforcement." I decided to focus on exploring the time data. I'll describe how I prepared the time […]

The post Visualizing data quality with FAA drone data appeared first on JMP Blog.

April 8, 2016

A soccer fairy tale

Imagine it's Soccer Saturday. You've got 10 kids and 10 loads of laundry – along with buried soccer jerseys – that you need to clean before the games begin. Oh, and you have two hours to do this. Fear not! You are a member of an advanced HOA […]

The post Can SAS Data Management get you to soccer on time? appeared first on The Data Roundtable.

February 5, 2016

Gerhard Svolba is a colleague at SAS who is not only an experienced analyst and a caring father, but also an author for SAS Press and an enthusiastic sailor. He has done valuable research about detecting data quality problems and their consequences for data analysis. As a statistician, I’m well […]

The post Sailing and the art of data quality assessment appeared first on JMP Blog.

January 11, 2016

When my band first started and was in need of a sound system, we bought a pair of cheap yet indestructible Peavey speakers, some Radio Shack microphones and a power mixer. The result? We sounded awful and often split our eardrums from high-pitched feedback and raw, untrained vocals. It took us years […]

The post Self-service data preparation transforms data professionals into data rock stars appeared first on The Data Roundtable.

December 22, 2015

In two previous posts (Part 1 and Part 2), I explored some of the challenges of managing data beyond enterprise boundaries. These posts focused on issues around managing and governing extra-enterprise data. Let’s focus a bit on one specific challenge now – satisfying the need for business users to rapidly ingest new data sources. Sophisticated business […]

The post Agility in external data ingestion appeared first on The Data Roundtable.

December 16, 2015

If you read my last post, then you know that I’m giving myself the gift of data this holiday season! For me, collecting data on my diet and fitness habits is a gift that just keeps on giving. Although I may not look at all my data sets on a […]

The post Visualizing holiday food log patterns appeared first on JMP Blog.

October 13, 2015

For anyone doing data analysis, it’s common to devote a large proportion of time to data cleaning. Real data is dirty data. Even if you are using automated collection tools, errors, anomalies and inconsistencies will still make their way into your data. Over the next few blog posts, I’d like […]

The post It's a dirty job, but somebody has to do it: Data cleaning with JMP appeared first on JMP Blog.