In my previous post I discussed the practice of putting data quality processes as close to data sources as possible. Historically this meant data quality happened during data integration in preparation for loading quality data into an enterprise data warehouse (EDW) or a master data management (MDM) hub. Nowadays, however, there’s a lot of […]
Throughout my long career of building and implementing data quality processes, I've consistently been told that data quality could not be implemented within data sources, because doing so would disrupt production systems. Therefore, source data was often copied to a central location – a staging area – where it was cleansed, transformed, deduplicated, restructured […]
A soccer fairy tale: Imagine it's Soccer Saturday. You've got 10 kids and 10 loads of laundry – along with buried soccer jerseys – that you need to clean before the games begin. Oh, and you have two hours to do this. Fear not! You are a member of an advanced HOA […]
The post Can SAS Data Management get you to soccer on time? appeared first on The Data Roundtable.
I am currently cycling through a schema-on-read data modeling process on a specific task for one of my clients. I have been presented with a data set and have been asked to consider how that data can be best analyzed using a graph-based data management system. My process is to […]
Traditional data governance is all about establishing a boundary around a specific data domain. This translates to establishing authority to define key business terms within that domain; establishing business-driven decision-making processes for changing the business terminology and the rules that apply to them; defining content standards (e.g., metadata and […]
(Otherwise known as Truncate – Load – Analyze – Repeat!) After you’ve prepared data for analysis and then analyzed it, how do you complete this process again? And again? And again? Most analytical applications are created to truncate the prior data, load new data for analysis, analyze it and repeat […]
The post Data management for analysis – Feeding the analytical monster more than once appeared first on The Data Roundtable.
Once you have assessed the types of reporting and analytics projects and activities that are to be done by the community of data analysts and consumers, and have assessed their business needs and requirements for performance, you can then evaluate – with confidence – how different platforms and tools can be combined to satisfy […]
The post Integration and publication: Data management for analytics appeared first on The Data Roundtable.
In my previous post I used junk drawers as an example of the downside of including more data in our analytics just in case it helps us discover more insights, only to end up with more flotsam than findings. In this post I want to float some thoughts about a two-word concept […]
In April, the free trial of SAS Data Loader for Hadoop became available globally. Now, you can take a test drive of our new technology designed to increase the speed and ease of managing data within Hadoop. The downloads might take a while (after all, this is big data), but I think you’ll […]
The post Self-service big data preparation in the age of Hadoop appeared first on The Data Roundtable.