Tamara Dull

September 12, 2015
 

If a data lake isn’t a data warehouse, as I proposed in my last post, then it behooves us to better understand this “new” data lake structure. In this fifth and final post in the series Big Data Cheat Sheet on Hadoop, we’ll highlight some of the pros and cons of a data lake using a SWOT diagram.

Question 5: What are some of the pros and cons of a data lake?

This discussion comes from an online debate I had earlier this year with my colleague, Anne Buff, where we discussed the pros and cons of a data lake in the context of this resolution: The data lake is essential for any organization that wants to take full advantage of its data. I took the Pro stance, while Anne took the Con stance.

Even though our online debate was focused on the data lake, it forced us to address the larger discussion of managing growing volumes of data in a big data world. With the onslaught of big data technologies in recent years—the most popular being the open source project, Apache Hadoop—organizations are having to look once again at the underlying technologies supporting their data collection, processing, storage, and analysis activities.

The Hadoop-based data lake happens to be a popular option right now. The SWOT diagram below identifies some of the key factors when considering a data lake. Keep in mind that this is just a quick snapshot (with brief explanations following), and not a comprehensive list:

[Figure: SWOT diagram for the data lake]

Strengths

  • Lower costs. A Hadoop-based data lake is largely dependent on open source software and is designed to run on low-cost commodity hardware. So from a software and hardware standpoint, there’s a huge cost savings that cannot be ignored.
  • One-stop data shopping. Hadoop is no respecter of data. It will store and process it all (structured, semi-structured, and unstructured) at a fraction of the cost and time of your existing, traditional systems. There’s much to be gained from having all (or much of) your data in one place, mixing and matching data sets like never before.

Weaknesses

  • Data management. We can get hung up talking about the volume, variety, and velocity of (big) data, but equally important to this discussion is being able to govern and manage all of it, regardless of the underlying technologies. For a Hadoop-based data lake, both open source projects and vendor products continue to mature to support this increasing demand. We’re moving in the right direction, and rapidly, but we’re not quite there yet.
  • Security. Hadoop-based security has been a long-time issue, but there’s significant effort and progress being made by the open source community and vendors to support an organization’s security and privacy requirements. While it’s easy to finger wag at this particular “weakness,” it’s important to recognize that the weekly (and almost daily) reports we hear about this-&-that data breach are primarily attacks on existing traditional systems, not these newer big data systems.

Opportunities

  • Discovery. This feature allows users to discover the “unknown unknowns.” In existing data warehouses, users are limited in both the questions they can ask and the answers they can get; with a Hadoop-based data lake, the sky’s the limit. A user can go to the data lake with the same set of questions she had for the data warehouse and get the same, or even better, answers. But she can also discover previously unknown questions, thus driving her to more answers and, ideally, better insights.
  • Advanced analytics. A lot of software apps include descriptive analytics, showing a user pretty visuals about what’s happened. We’ve had this capability for decades. With big data, however, organizations need advanced analytics—such as prescriptive, predictive, and diagnostic—to really get ahead of the game (and one could even argue to stay in the game). A Hadoop-based data lake provides that opportunity.

Threats

  • Status quo. This is not a new threat, especially for software vendors, but it’s a very real threat. The cost and time required to migrate towards these newer big data technologies are not insignificant. This is not a case of hot-swapping technologies while no one is looking. It will also impact the people, processes, and culture in your organization, if done right.
  • Skills. There is no question that there is a skills shortage for these big data technologies. Even though this shortage can be viewed as a threat to Hadoop adoption, it shouldn’t be seen as a negative. These big data technologies are new, they’re evolving, and there’s a lot of experimentation going on to figure out what’s needed, what’s not, what should stick, what shouldn’t, etc. Thus, it should be no surprise that as our technologies evolve, so will the skills required. We have an opportunity to take what we have and know to a new level and help prepare the next generation to excel in our data-saturated society.

The bottom line. There are well-known weaknesses and threats associated with a data lake—some of which I have highlighted here—and we cannot ignore these. But there are also significant strengths and opportunities to explore. I believe an organization can take full advantage of its data if there’s a way for them to bring it all together without breaking the bank. A data lake can help make this dream a reality.

This is the final post in a 5-part series, "Big Data Cheat Sheet on Hadoop." This spin-off series for marketers was inspired by a popular big data presentation I delivered to executives and senior management at a recent SAS Global Forum Executive Conference.


Editor’s note:

If you did not read the previous posts in this series, I encourage you to read those as well. Tamara's goal here has been to enable you to have an informed view of how this area of technology can support your marketing strategy. Armed with these perspectives, hopefully you can partner even more closely with I.T. and operations to deliver the best possible customer experience.

Once you're comfortable with Hadoop and want to delve deeper into analytically-driven marketing solutions, start with our Customer Intelligence home page at: www.sas.com/customerjourney.

And as always, thank you for following!

 

tags: big data, Big Data Cheat Sheet on Hadoop, customer experience, Hadoop, sas global forum executive conference

Marketers ask: What are some of the pros and cons of a data lake? was published on Customer Analytics.

August 28, 2015
 

[Image: a serene lake at sunset. Caption: “This lake is NOT a data lake.”]

In this 5-part blog series on the Big Data Cheat Sheet on Hadoop, we’re taking a look at these five questions from the perspective of a marketer:

  • What can Hadoop do that my data warehouse can’t?
  • Why do we need Hadoop if we’re not doing big data?
  • Is Hadoop enterprise-ready?
  • Isn’t a data lake just the data warehouse revisited?
  • What are some of the pros and cons of a data lake?

We’ve already tackled the first three questions, and we’re now on question 4, so it’s time to talk about the data lake.

Question 4: Isn’t a data lake just the data warehouse revisited?

Some of us have been hearing more about the data lake, especially during the last six months. There are those that tell us the data lake is just a reincarnation of the data warehouse—in the spirit of “been there, done that.” Others have focused on how much better this “shiny, new” data lake is, while others are standing on the shoreline screaming, “Don’t go in! It’s not a lake—it’s a swamp!”

All kidding aside, the commonality I see between the two is that they are both data storage repositories. That’s it. But I’m getting ahead of myself. Let’s first define data lake to make sure we’re all on the same page. James Dixon, the founder and CTO of Pentaho, has been credited with coming up with the term. This is how he describes a data lake:

“If you think of a datamart as a store of bottled water – cleansed and packaged and structured for easy consumption – the data lake is a large body of water in a more natural state. The contents of the data lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples.”

And earlier this year, my colleague, Anne Buff, and I participated in an online debate about the data lake. My rally cry was #GOdatalakeGO, while Anne insisted on #NOdatalakeNO. Here’s the definition we used during our debate:

“A data lake is a storage repository that holds a vast amount of raw data in its native format, including structured, semi-structured, and unstructured data. The data structure and requirements are not defined until the data is needed.”

The table below helps flesh out this definition. It also highlights a few of the key differences between a data warehouse and a data lake. This is, by no means, an exhaustive list, but it does get us past this “been there, done that” mentality:

[Table: key differences between a data warehouse and a data lake]

Let’s briefly take a look at each one:

  • Data. A data warehouse only stores data that has been modeled/structured, while a data lake is no respecter of data. It stores it all: structured, semi-structured, and unstructured. [See my “big data is not new” diagram. The data warehouse can only store the orange data, while the data lake can store all the orange and blue data.]
  • Processing. Before we can load data into a data warehouse, we first need to give it some shape and structure—i.e., we need to model it. That’s called schema-on-write. With a data lake, you just load in the raw data, as-is, and then when you’re ready to use the data, that’s when you give it shape and structure. That’s called schema-on-read. Two very different approaches.
  • Storage. By definition, a relational data warehouse stores data in a hierarchical manner, while a data lake is object-based and is not dependent on any hierarchy.
  • Agility. A data warehouse is a highly-structured repository, by definition. It’s not technically hard to change the structure, but it can be very time-consuming given all the business processes that are tied to it. A data lake, on the other hand, lacks the structure of a data warehouse—which gives developers and data scientists the ability to easily configure and reconfigure their models, queries, and apps on-the-fly.
  • Security. Data warehouse technologies have been around for decades, while big data technologies (the underpinnings of a data lake) are relatively new. Thus, the ability to secure data in a data warehouse is much more mature than securing data in a data lake. It should be noted, however, that there’s a significant effort being placed on security right now in the big data industry. It’s not a question of if, but when.
  • Users. For a long time, the rally cry has been BI and analytics for everyone! We’ve built the data warehouse and invited “everyone” to come, but have they come? On average, 20-25% of them have. Is it the same cry for the data lake? Will we build the data lake and invite everyone to come? Not if you’re smart. Trust me, a data lake, at this point in its maturity, is best suited for the data scientists.
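
To make the Processing point above more concrete, here is a minimal sketch of schema-on-write versus schema-on-read. It is plain Python with an in-memory SQLite table standing in for the warehouse and a simple list standing in for the lake; it is not Hadoop code, and the sample events, the field names, and the helper read_with_schema are all hypothetical, invented for illustration.

```python
import json
import sqlite3

# Hypothetical raw events, as they might arrive from different sources.
raw_events = [
    '{"customer_id": 1, "channel": "web", "amount": 40.0}',
    '{"customer_id": 2, "channel": "store", "amount": 15.5, "coupon": "SAVE25"}',
    '{"customer_id": 1, "channel": "social", "text": "Going shopping today!"}',
]

# --- Schema-on-write: model the data first, then load only what fits. ---
REQUIRED_FIELDS = ("customer_id", "channel", "amount")
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE purchases (customer_id INTEGER, channel TEXT, amount REAL)"
)
rejected = []
for line in raw_events:
    record = json.loads(line)
    if all(field in record for field in REQUIRED_FIELDS):
        # Extra fields (such as "coupon") are dropped to fit the table.
        warehouse.execute(
            "INSERT INTO purchases VALUES (?, ?, ?)",
            tuple(record[field] for field in REQUIRED_FIELDS),
        )
    else:
        # Records that do not match the model never reach the warehouse.
        rejected.append(record)

# --- Schema-on-read: store everything as-is, shape it when you ask. ---
lake = list(raw_events)  # raw lines kept untouched


def read_with_schema(lines, fields):
    """Parse raw lines at read time, keeping only the requested fields.

    Fields missing from a record come back as None.
    """
    for line in lines:
        record = json.loads(line)
        yield {field: record.get(field) for field in fields}


# A question nobody modeled for up front: what are customers posting?
social_posts = [
    row
    for row in read_with_schema(lake, ("customer_id", "text"))
    if row["text"] is not None
]

loaded_rows = warehouse.execute("SELECT COUNT(*) FROM purchases").fetchone()[0]
print(loaded_rows, len(rejected), len(lake), social_posts)
```

In this sketch the warehouse side loads two of the three events and sets the social post aside, while the lake side keeps all three and only decides which fields matter when a question is asked.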

Why this matters

At the most fundamental level, big data is mostly driven by customer-related activity, and Hadoop is a very effective way to handle big data. And as a marketer, you may hear rumblings that your organization is setting up a data lake and/or your marketing data warehouse is a candidate to be migrated to this data lake. It’s important to recognize that while both the data warehouse and data lake are storage repositories, the data lake is not Data Warehouse 2.0 nor is it a replacement for the data warehouse.

So to answer the question (Isn’t a data lake just the data warehouse revisited?), my take is no. A data lake is not a data warehouse. They are optimized for different purposes, and the goal is to use each one for what it was designed to do. Or in other words, use the best tool for the job.

This is not a new lesson. We’ve learned this one before. Now let’s do it.

This is the 4th post in a 5-part series, "Big Data Cheat Sheet on Hadoop." This spin-off series for marketers was inspired by a popular big data presentation I delivered to executives and senior management at a recent SAS Global Forum Executive Conference.


Editor’s note:

If you did not read the previous posts in this series, I encourage you to read those as well. Tamara's goal with this series is to enable you to have an informed view of how this area of technology can support your strategy. Armed with these perspectives, hopefully you can partner even more closely with I.T. and operations to deliver the best possible customer experience.


tags: big data, Big Data Cheat Sheet on Hadoop, data lake, Hadoop, sas global forum executive conference

Marketers ask: Isn’t a data lake just the data warehouse revisited? was published on Customer Analytics.

August 14, 2015
 

In response to my last post (Marketers ask: Why do we need Hadoop if we’re not doing big data?), a Twitter follower asked this question: “Why should marketers worry about Hadoop at all?”

It’s a fair question. Typically, marketers are more interested in the car (in this case, big data) than they are in the engine (Hadoop). But Hadoop is not just another faster, more cost-effective engine option. It’s a game changer in the world of data management—much like the Prius and Tesla have been in the world of gas-guzzling cars, trucks, and SUVs.

Do marketers need to understand how Hadoop works? Not at all. But what should interest them is if and how this popular big data technology can help them gain better and more informed insights about their customers. If (big) data can indeed help take the customer experience from a 3-star to a 5-star experience, then isn’t it worth understanding what all the Hadoopla is about?

This dovetails nicely into our 3rd question in this 5-part series. My answer will be short—and it may surprise you.

Question 3: Is Hadoop enterprise-ready?

I have two answers to this question:

  • For your organization: Maybe.
  • For all organizations: No.

It all depends on how and why you want to use Hadoop in your organization. If you simply want to use it as an additional (or alternative) storage repository and/or as a short-term data processor, then by all means, Apache Hadoop is ready for you. (My last post discusses six ways Apache Hadoop can be used.)

However, if you want to go beyond data storage and processing, and you’re looking for some of the same data management and analysis capabilities you currently have with your existing data ecosystem, Apache Hadoop alone is not going to cut it.

As I mentioned in my first post, you will need to get technical assistance—from IT and developers, internal and external—to explore the vast ecosystem of Hadoop-related open source and proprietary projects and products to achieve your objectives. This will not be a small undertaking. Remember, you don’t want to “do Hadoop” just because everyone else is doing it or because it looks good on paper or it’s cheap for IT to install. You want to do Hadoop if it helps address or solve real business issues your organization is facing. Start with your requirements list first before you start looking at Hadoop.

One final point to consider is that many of these newer Hadoop-related technologies are still maturing—quite rapidly, I might add. They don’t have the decades of R&D behind them like our existing relational systems. That’s not a strike against Hadoop; it’s just the reality of where we are today. That’s why I say Hadoop—as in the Hadoop ecosystem—isn’t 100% ready for the enterprise. Yet.

This is the 3rd post in a 5-part series, "Big Data Cheat Sheet on Hadoop." This spin-off series for marketers was inspired by a popular big data presentation I delivered to executives and senior management at a recent SAS Global Forum Executive Conference.



tags: big data, Big Data Cheat Sheet on Hadoop, customer experience, Hadoop, sas global forum executive conference

Marketers ask: Is Hadoop enterprise-ready? was published on Customer Analytics.

July 24, 2015
 

“Our corporate data is growing at a rate of 27% each year and we expect that to increase. It’s just getting too expensive to extend and maintain our data warehouse.”

“Don’t talk to us about our ‘big’ data. We’re having enough trouble getting our ‘small’ data processed and analyzed in a timely manner. First things first.”

“We have to keep our data for 7 years for compliance reasons, but we’d love to store and analyze decades of data - without breaking the machine and the bank.”

Do any of these scenarios ring a bell? If so, Hadoop may be able to help. In this 5-part blog series, Big Data Cheat Sheet on Hadoop, we’re taking a look at five big data questions from the perspective of a marketer. This post answers the second question in the series to help marketers understand how these big data technologies are impacting (or can impact) the customer experience, and what you can do to take advantage of this data playground.

Question 2: Why do we need Hadoop if we’re not doing big data?

Contrary to popular belief, Hadoop is not just for big data. (For purposes of this discussion, big data simply refers to data that doesn't fit comfortably – or at all – into your existing relational systems.) Granted, Hadoop was originally developed to address the big data needs of web/media companies, but today, it's being used around the world to address a wider set of data needs, big and small, by practically every industry.

In my white paper, The Non-Geek’s Big Data Playbook: Hadoop and the Enterprise Data Warehouse, I propose six common Hadoop use cases—three of which don’t require “big” data at all to take full advantage of Hadoop:

[Figure: 6 Hadoop Use Cases]

Here’s a brief summary of each use case:

  1. Stage structured data. Use Hadoop as a data staging platform for your data warehouse.

What if you used Hadoop to process and transform your operational data before loading it into your data warehouse? The bonus is that because of the low cost of Hadoop storage, you could store both versions of the data in Hadoop: the raw, native data and the transformed data. Your data would now all be in one place, making it easier to manage, re-process, and analyze at a later date.

  2. Process structured data. Use Hadoop to update data in your data warehouse and/or operational systems.

Instead of using costly data warehouse resources to update data in the warehouse, why not send the necessary data to Hadoop, let Hadoop do its thing, and then send the updated data back to the warehouse? This use case not only applies to processing your warehouse data, but also data in any of your operational or analytical systems. Take advantage of Hadoop’s low-cost processing power so that your relational systems are freed up to do what they do best.

  3. Archive all data. Use Hadoop to archive all your data on-premises or in the cloud.

Since Hadoop runs on commodity hardware that scales easily and quickly, organizations can now store and archive a lot more data at a much lower cost. For example, what if you didn’t have to destroy data after its regulatory life to save on storage costs? What if you could easily and cost-effectively keep all your data? Or maybe it’s not just about keeping the data on-hand, but rather, being able to analyze more data. Why limit your analysis to the last three, five or seven years when you can easily store and analyze decades of data? Isn't this a data geek’s paradise?

  4. Process any data. Use Hadoop to take advantage of data that’s currently unavailable in your enterprise data warehouse ecosystem.

This use case focuses on two categories of data: (1) structured data sources that have not been integrated into your data warehouse and (2) unstructured and semi-structured data sources. More generally, it’s any data that’s currently not part of your warehouse ecosystem that could be providing additional insight into your customers, products and services. Because Hadoop can store and process any data, it can pick up the slack for data that your data warehouse cannot or doesn’t handle well.

  5. Access any data (via data warehouse). Use Hadoop to extend your data warehouse and keep it at the center of your organization’s data universe.

This use case is geared towards companies that want to keep the enterprise data warehouse as the de facto system of record, at least for now. As a complementary component, Hadoop can be used to process and integrate any type of data (structured, semi-structured, and unstructured) and load what is needed into the data warehouse. This allows companies to continue using their current BI/analytics tools with their enterprise data warehouse ecosystem.

  6. Access any data (via Hadoop). Use Hadoop as the landing platform for all data and exploit the strengths of both the data warehouse and Hadoop.

As mentioned earlier, one advantage of capturing data in Hadoop is that it can be stored in its raw, native state. It does not need to be formatted upfront as with traditional, structured data; it can be formatted at the time of the data request. This use case most closely supports the concept of using Hadoop as a “data lake”—which is a discussion/debate I had recently with a colleague in another forum.

Key takeaways for marketers

Don’t make the mistake of believing that Hadoop is synonymous with big data, because it’s not. It is, however, one of the more popular big data technologies out there, and you can use it even if you don’t have big data, as pointed out in the first three use cases above. But it’s not just about the technology. It’s about understanding the technology well enough to see how it relates to your focus on the customer experience.

Hadoop is here to stay and it’s ready to “play” with your enterprise data warehouse. Download my Non-Geek’s Big Data Playbook to help you figure out which use cases make sense for your organization. This playbook was written for the technologically-savvy business professional who prefers pictures to words, simplicity to complexity, and briefer explanations to longer ones. If this describes you, then what are you waiting for?

This is the 2nd post in a 5-part series, "Big Data Cheat Sheet on Hadoop." This spin-off series for marketers was inspired by a popular big data presentation I delivered to executives and senior management at a recent SAS Global Forum Executive Conference.


Editor’s note:

If you did not read the first post in this series, I encourage you to read that one as well. Tamara's goal with this series is to enable you to have an informed view of how this area of technology can support your strategy. Armed with these perspectives, hopefully you can partner even more closely with I.T. and operations to deliver the best possible customer experience.

Once you're comfortable with Hadoop and want to delve deeper into analytically-driven marketing solutions, start with our Customer Intelligence home page at: www.sas.com/customerjourney. And as always, thank you for following!

tags: big data, Big Data Cheat Sheet on Hadoop, customer experience, Hadoop, sas global forum executive conference

Marketers ask: Why do we need Hadoop if we’re not doing big data? was published on Customer Analytics.

July 11, 2015
 

Recently, I was given the opportunity to present a session titled, An Executive’s Cheat Sheet on Hadoop, the Enterprise Data Warehouse and the Data Lake at the SAS Global Forum Executive Conference. During this standing-room only session, I addressed these five questions:

  • What can Hadoop do that my data warehouse can’t?
  • We’re not doing “big” data, so why do we need Hadoop?
  • Is Hadoop enterprise-ready?
  • Isn’t a data lake just the data warehouse revisited?
  • What are some of the pros and cons of a data lake?

I've been inspired to re-think my answers to those 5 questions in terms of the customer experience and present them for marketers as a 5-part series in this blog. My goal is to help marketers understand how these big data technologies are impacting (or can impact) the customer experience, and what you can do to take advantage of this data playground. Let’s get started!

Question 1: What can Hadoop do that my data warehouse can’t?

Here’s the short answer: (1) Store any and all kinds of data more cheaply and (2) process all this data more quickly and cheaply.

The longer answer is:
[Please excuse me as I step up on one of my big data soapboxes to address this question.]

I’m here to tell you that big data is not new. Yet, with all the hype these last few years around these two little words, you’d think we’ve discovered the Holy Grail. Let me share with you the dirty little secret about big data: it’s just data—the same data we’ve had for decades.

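[Figure: Big data is not new (structured data sources in orange boxes; semi-structured and unstructured data sources in blue boxes)]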

They say that 20% of the data we deal with today is structured data (see examples in orange boxes above). I also call this traditional, relational data. The other 80% is semi-structured or unstructured data (examples in blue boxes), and this is what I call “big” data.

Are any of these blue-box data types new? Of course not. We’ve been collecting, processing, storing, and analyzing all this data for decades. What we haven’t been able to do very well, however, if at all, is mix the orange- and blue-box data together.

So here’s what’s new: We now have the technologies to collect, process, store, and analyze all this data together. In other words, with Hadoop, we can now mix-&-match the orange- and blue-box data together – at a fraction of the cost and time of our traditional, relational systems. You can’t do that with your data warehouse.
[I’m stepping off my soapbox now.]

Why this matters

Big data technologies like Hadoop take the “360-degree view of the customer” concept to a whole new level. Let’s say you want to provide your customers with an omnichannel experience, so that no matter how they choose to interact with you, you’re right there with them. It’s possible with data. The diagram above includes 25 sample data sources, many of which contain customer data. What if you could tie these data sources together to provide your customer with a satisfying and even fun experience?

Consider this scenario: One of your loyal customers posts on Facebook that she’s going shopping at one of your stores today. You know that she just purchased a pair of pants online last week, and that her abandoned online shopping cart has a few cute tops in it to go with the pants. When she goes to the store, the retail assistant is able to identify who she is and brings out the tops she abandoned online to try on with her new pants. And since your customer isn’t wearing her new pants, the retail assistant knows from her online order which size pants to go grab. Then while shopping, your customer gets a 25% off coupon delivered to her smartphone, good for today only.

All creepiness aside, this retail scenario is not as far-fetched as you may think. This is what mixing-&-matching your customer data will allow you to do. With Hadoop, not your data warehouse.
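
As a rough illustration of what mixing-&-matching could look like, the sketch below stitches three kinds of customer data into one profile. It is plain Python rather than Hadoop code, all of the sample records and field names are made up, and it assumes every source already shares a customer_id, which in practice is often the hard part.

```python
from collections import defaultdict

# Hypothetical, hand-made sample data; every field name is an assumption.
purchases = [  # structured ("orange-box") data
    {"customer_id": "c42", "item": "pants", "size": "8", "channel": "online"},
]
abandoned_carts = [  # semi-structured data
    {"customer_id": "c42", "items": ["striped top", "linen top"]},
]
social_posts = [  # unstructured ("blue-box") data
    {"customer_id": "c42", "text": "Going shopping at my favorite store today!"},
]


def build_profiles(purchases, carts, posts):
    """Group records from all three sources under each customer_id."""
    profiles = defaultdict(
        lambda: {"purchases": [], "cart_items": [], "posts": []}
    )
    for purchase in purchases:
        profiles[purchase["customer_id"]]["purchases"].append(purchase)
    for cart in carts:
        profiles[cart["customer_id"]]["cart_items"].extend(cart["items"])
    for post in posts:
        profiles[post["customer_id"]]["posts"].append(post["text"])
    return dict(profiles)


def store_visit_brief(profile):
    """Summarize what a retail assistant might want to know for a visit."""
    known_sizes = {p["item"]: p["size"] for p in profile["purchases"]}
    return {"bring_out": profile["cart_items"], "known_sizes": known_sizes}


profiles = build_profiles(purchases, abandoned_carts, social_posts)
brief = store_visit_brief(profiles["c42"])
print(brief)
```

The point of the sketch is only that the combined profile answers a question (which tops to bring out, and in which size to fetch the pants) that no single source could answer alone.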

Key takeaways for marketers

Before you go bust down IT’s door and ask them to install Hadoop so that you can have a better 360-degree view of your customers, please understand that this is easier said than done. While these big data technologies make mixing-&-matching your data possible (which is a huge feat in itself!), be aware that the tools themselves are still maturing. You will need technical assistance (from IT and developers, internally and externally) to get started with Hadoop.

But it’s not just about the technology. I strongly encourage you to follow these three steps if you want to be successful with Hadoop:

  • Identify the business issue. Don’t “do Hadoop” just because everyone else is doing it or because it looks good on paper or it’s cheap to install. Do Hadoop if it helps address or solve a real business issue for your organization.
  • Get executive buy-in before—not after—you get started. Don’t embark on a big data project without executive support. Even successful projects have been shot down because they couldn’t get executive support and/or they didn’t support corporate strategies.
  • Develop a multi-player plan. Don’t do Hadoop, or big data for that matter, alone. It’s not a single department play. Big data projects require multiple players from the business, IT, and executive management.

Many companies eager to jump on the Hadoop bandwagon have missed these three steps, and guess what they have to show for it now? Abandoned Hadoop installations.

Don’t be one of those companies.

 This is the 1st post in a 5-part series, “A Big Data Cheat Sheet: What Marketers Want to Know.” This spin-off series for marketers was inspired by a popular big data presentation I delivered to executives and senior management at the SAS Global Forum Executive Conference earlier this year.


Editor’s note:

Tamara's ability to make technology accessible to marketers is what makes her perspective so valuable to this blog. And my favorite part about her message is that marketers don't need to be experts in Hadoop to effectively harness the potential in big data. The key is to know just enough about Hadoop so you can have an informed discussion with your technical counterparts about meeting your business needs. The bottom line is that big data technologies such as Hadoop can indeed help marketers deliver a better customer experience.

For a little more detail about Hadoop, I'd recommend this paper by the International Institute for Analytics: The Current State of Hadoop in the Enterprise. Once you're comfortable with Hadoop and want to delve deeper into analytically-driven marketing solutions, start with our Customer Intelligence home page at: www.sas.com/customerjourney.

tags: big data, customer experience, Hadoop, sas global forum executive conference

Marketers ask: What can Hadoop do that my data warehouse can’t? was published on Customer Analytics.

June 16, 2015
 

People are such an important aspect of data analytics. I was reminded of this at the recent Strata+Hadoop World event, where I saw first hand that the UK is indeed facing the same skills gaps as elsewhere in the world. Perhaps that didn’t surprise me, but I also noticed the […]

The post UK Hadoopists: fashionably late to the big data party appeared first on SAS Voices.

December 13, 2014
 

The Big Data MOPS Series with Tamara Dull

In my last post, Where Do You Draw the Line Between Relevancy and Privacy, I talked about some of the plusses and minuses of behavioral online advertising as it pertains to personal (big data) privacy. Finding the balance between honoring people’s privacy while providing them with an interesting and relevant online experience is tricky, complicated, and an issue of context. What I may consider as a violation of my privacy, you may think nothing of—or what you may consider an invasion of your privacy, I may say, “Wow. That’s cool. And maybe a little creepy.”

And about those targeted online ads. We know marketers are working hard with advertising platforms, such as Google and Facebook, to make sure we’re seeing the “right” ad at the “right” time. Do they always get it right, though? Here’s an entertaining story about a colleague’s run-in with one of Facebook’s ads. You be the judge of who got it right.

The post and ad. Here’s what my colleague, Jeannette, posted on Facebook:

Facebook example

What we know about Jeannette. If Jeannette is your Facebook friend, then you also know this about her from the many posts and images she’s shared over the years:

  • Jeannette was born and raised in Los Angeles. She and her family just moved to Austin, Texas, this summer.
  • She is an L.A. fashionista who stockpiles highly fashionable footwear.
  • When Jeannette moved to Austin, she kept us entertained with a running commentary, complete with photos, on Austin’s (lack of) fashion scene.
  • Despite the fashion faux pas, Jeannette LOVES Austin!

What Facebook knows about Jeannette. If you’ve been reading this blog series, you know I haven’t been shy about calling Facebook a machine—a big data machine. It knows and understands Jeannette (and each of us) only through the data we share. The more we share, the more Facebook learns about us. Here’s what we can assume Facebook knows about Jeannette:

  • She’s been in Los Angeles since she created her account, but she recently moved to Austin. Facebook knows this through location data, and the fact that she updated her Current City from Los Angeles to Austin.
  • She expresses a lot of positive sentiment about fashion, shoes, restaurants and rock & roll.
  • She has expressed higher levels of negative sentiment about Austin’s sense of fashion and shoes (but very positive sentiment about their restaurants and music scene).
  • She fit into the demographics that the advertiser, QVC, was looking for. That’s why the ad showed up on her profile page.

Is it possible that Facebook really understood Jeannette’s disillusionment with Austin’s fashion scene, and offered up these Austin-appropriate shoes to help her fit in? Perhaps we will never know, but it does make you wonder how effective behavioral online advertising really is.

The IM chat afterwards. After this Facebook exchange, I IMed with Jeannette. The story gets even better. Despite her initial outrage at this ad, Jeannette wrote: “I may now buy something from QVC because of [the ad]. But not those sandals – even though they seem to be popular here.”

Mission accomplished. The advertiser, QVC, gets a +1 because they got a new customer. Jeannette gets a +1 because she discovered a new retail site that can keep her fashionista cravings satisfied. And Facebook gets a +1 because they connected the dots between an advertiser and a buyer.

My only question is: If Jeannette were still in L.A., would she have ever seen this ad for mandals? My vote is no. What do you think?

Author's note: This post was reviewed and approved by Jeannette Fino, who is now the proud owner of four new pairs of shoes from QVC.

Originally written for and published on Smart Data Collective as part of the Big Data MOPS Series


Editor's note:

John Balla will not hear Tom's canvas wedge sandals.

I can totally relate to the cool/creepy reaction here. Maybe Jeannette and I are living alternate lives, because Facebook is serving me ads for Tom's canvas wedge sandals that Jeannette would probably like (screen-shot to the right).

The more likely scenario is that the prompt from that ad came from my teenage daughter's not-so-subtle hint about what she'd like Santa to bring her. You see, two weeks ago, she searched for the shoes on my home computer and then left the browser up on the screen - with a handbag in another tab and then a couple of blouses in two other tabs.

So how do I get Facebook to understand that my daughter gave the same not-so-subtle hint to her grandmother, who has already bought the shoes? Our family has moved on from the canvas wedges.

I think that's the next evolution in personalization - where it becomes interactive somehow. Because I already find these ads annoying and I'm certain I'm not the only one.

tags: advertising, big data, facebook, personalization, privacy, social media
December 5, 2014
 

The Big Data MOPS Series with Tamara Dull

We have a love/hate relationship with ads. Whether they’re on television, in our favorite publications, or online, we love them if they’re relevant and interesting, or get annoyed when they get in the way of [insert whatever we’re doing]. I have to admit: I rarely watch a television show in real-time anymore. I’ll record a show, wait 20+ minutes, and “chase the show” with the recording—ad-free.

So what does this have to do with big data privacy, the “soapbox” I’ve been standing on for weeks? (Ha!) Well, some would have you believe that the big data privacy debate is all about online advertising: that is, you get interesting, relevant ads in exchange for your personal information. If this is what you believe, you’re sort of missing the point. Read on and see if you agree.

About online advertising. Do you remember September 3rd when Facebook had an 18-minute outage? Given that Facebook generates about $22,000 per minute, this means they lost almost $400K during that outage. This may sound like a drop in the bucket for them, but if you add in all the lost revenue from all the businesses that generate ad revenue on Facebook’s platform, a lot more than $400K was lost.

Behavioral online advertising example

Advertising is big money, and behavioral online advertising is even bigger money – and companies like Facebook, Google, and Yahoo! get that. They know what we’re clicking, posting, liking, and commenting on, and they’re using this information to better target us for advertisers. And contrary to popular belief, when it comes to advertising and privacy, advertisers really don’t care about what we do or where we go. They only care about one thing: getting us to buy whatever they’re selling.

Why this matters. You may be thinking, “So what? What’s the harm?” I mean, who doesn’t appreciate a targeted ad when you’re surfing for a certain item online or a coupon delivered to your smartphone when you’re near one of your favorite stores? It seems like a harmless trade-off: a little bit of your personal information in exchange for some helpful, free service that could help save you some money.

But here’s the catch: the information we freely share is not just used by these advertisers selling stuff to us. It’s being used, bought, and sold by a lot of other data “players”—some good, some bad, some we’ve given explicit permission to, some we haven’t—and none of which we have any control over.

The big data privacy debate is not just about online advertising—or even the collection of data. It’s about who’s using our data, why they’re using it, and how we can protect ourselves from privacy invasions when we don’t even know who’s watching us. It’s about you, me, them, and us.

The bottom line. Vigilance, not apathy, is the right response to the opportunities and challenges this big data era is ushering in. Be mindful of what you click and share. If you don’t click it or share it, “they” can’t use it or abuse it.

Originally written for and published on Smart Data Collective as part of the Big Data MOPS Series


Editor's note:

We all get it now - big data is both a challenge and an opportunity for marketers. And the opportunity is realized by applying analytics to garner the insights that lead to better marketing. And big data is really BIG - so the data now available to marketers is like a digital "horn of plenty," a virtual cornucopia just overflowing with potential information.

And the point for marketers in Tamara's post here is analogous to the message of this biblical parable once used by John F. Kennedy:

To those whom much is given, much is expected.

As the steward of the customer relationship, marketing can't just harvest and use the truckloads of customers' personal data irresponsibly - it's reasonable that we'd be expected to safeguard it and respect the rights of the owners of that data.

There's more information on consumers' expectations regarding digital behavior and personalization that you can read in this report based on a global study SAS conducted this year:

Finding the Right Balance Between Personalization and Privacy

Check it out and let us know what you think!

tags: advertising, big data, Big Data MOPS Series, privacy, search
November 28, 2014
 

Admit it: If you’re like many marketers, when you read or hear about “big data privacy,” you’re ready to move on to the next topic or swipe to the next screen. Even though you know the discussion is important, you know it’s not fun, it’s sometimes creepy, and it’s not easy to navigate its complexities. But please bear with me.

Restricted area: Authorized personnel only.

For this blog post, I’ve pulled out five steps from a recent webcast I did for the Association of National Advertisers (ANA) called “A Marketer’s 5-Step Guide to Data Privacy in a Big Data World.” It builds on the idea that big data privacy is caught in this tug-of-war between consumers, constituents, and the private and public sectors. There are steps we can take based on each of these perspectives.

Step 1. Take digital control and reduce your digital footprint.

You don't ever want mom to be alarmed, skeptical or speechless.

This step applies to all of us as consumers. I could easily spew out hundreds of tips and tricks on how to take digital control of your life, but for the sake of space and time, I’ll highlight three ideas:

  • Make sure all the information you share passes the Mom test. Or the Thanksgiving dinner test. The Mom test considers what your mom would think if she saw the information online. Would she approve? The Thanksgiving dinner test asks: Would you be willing to share this information at the turkey table? If you can’t pass these two tests, red flags should be going up.
  • Create professional and personal personas with your online networks. Use different email addresses. Use a different browser for each persona so there is no cross-tracking.
  • Become a stealth browser user. Here are two simple things you can do immediately: block third-party cookies and enable anti-tracking software like Disconnect. There are a lot of options here. Google it.

Bottom line: If you don’t share it, they can’t use it or abuse it.

Step 2. Give customers easy access and rights to their data.

The following “letter” is directed at the private sector, but could be applicable to any organization, for profit or not:

Dear Favorite Company,

I would like to make four requests:

  1. If I give you my personal data for free, don’t go behind my back and share or sell it to someone else without my knowledge.
  2. Figure out a simple way to help me understand who owns my data, who has rights to it, and for how long.
  3. Make it easy for me to access and manage my own data.   
  4. Be transparent about which of my data is being used and how, what requests have been made by external entities, and the steps you’re taking to keep my data secure.

Sincerely yours,
Me

Bottom line: If you’re in the business of collecting and using customer data, treat it as a corporate asset and respond accordingly.

Step 3. Become a privacy advocate.

Most of us are aware that politicians buy reports about us. They know the issues we support, how much money we make, and what we like and dislike on Facebook. We’ve all been profiled in sometimes not-so-flattering ways. The problem is that many Americans have no idea that they’ve been profiled. Not only do they have no idea, they have no way to control that process.

So what can we do? Some would argue that erasing your digital footprint as much as possible is the way to go. But let me ask you this: Are you ready to give up technology to preserve your privacy? I’m not. We have come a long way in the last 20 years and there are a lot of upsides in this new digital economy.

So, again, what can we do? The better answer is to fight back and become a privacy advocate. We talked about a few ideas in step 1 and taking digital control. But here, as fellow citizens, we need to work together, collectively and collaboratively, to make sure we have rights to our data. That we can review our data. Or correct it. Or remove it. Or dispute it.

Protecting privacy goes beyond signing online petitions and adding comments on Facebook. It’s something that we need to do for ourselves, and then do again on behalf of others. Here are a few more ideas:

  • Continue to educate yourself. The more informed you are about big data privacy issues, the more influential you can be in shaping and affecting data privacy policies, standards, and regulations.
  • Pick your battles. Start with the companies you do business with frequently or communities you’re involved with.
  • Understand that your choices aren’t everyone’s choices. What I deem important and valuable may not be as important to you, and vice versa.

Bottom line: Focus on the right issues and don’t get side-tracked. Be the voice that speaks up – even when it’s not convenient and you don’t feel supported. Because if it’s not you, then who?

Step 4. Take a lead role in the global privacy theater.

This step is directed at the public sector. No country is leading the way when it comes to data privacy—but this doesn’t mean significant efforts aren’t being made.

In the US, the White House released two reports earlier this year on big data privacy. There’s still lots of work to do, but it’s a start. The FTC, the agency that handles consumer protection issues, also released a report recently recommending that data brokers give consumers more control over their data. Again, it’s a start.

Europe is big in the news, too, with its right-to-be-forgotten ruling. In fact, Eric Schmidt from Google is over there right now on a 7-country tour to hear views about the best ways to remove search engine links to information that petitioners contend is intrusive and no longer relevant. To date, Google has received almost 150,000 requests for the removal of 500,000 URLs; 58% of them have been removed so far.

Big data privacy on a global scale is extremely complex. Given that data is growing exponentially and knows no borders, and views of privacy vary around the globe, we have our work cut out for us. With the US being home to one-third of all the data in the world, we have an opportunity to step forward.

Bottom line: As citizens, we can support politicians and policy makers who are passionate about these privacy issues and are actively engaged in moving the ball forward. Do we know who these players are? Let’s do our homework and find out.

Step 5. Stop the madness.

As I mentioned earlier, these five steps come from the ANA webcast I did earlier this month. In the webcast, I also presented 5 “facts” people think are true about data privacy – and then exposed these “facts” for what they really are: myths, distractions, and misunderstandings.

We need to educate ourselves. We need to be able to separate the facts from the fiction. We need to stop the madness. Let me give you an example: You hear people say “I’ve got nothing to hide” or “If I have done nothing wrong, I have nothing to worry about.”

If this is what you believe, you’re missing the point. Statements like this are just a distraction from the real discussion of online privacy. Consider this: Every day, someone new is coming online. Maybe it’s a young person who just got his first iPhone or it’s someone in a region that’s just getting affordable access for the first time. They don’t know the rules. And even though you may not care about being openly tracked, don’t put these newbies into a dangerous situation by letting them believe the internet is safe. Because it’s not.

Bottom line: As Maya Angelou so famously said, “When you know better, you do better.” We each have a role to play when it comes to big data privacy. What’s yours going to be?

tags: big data, Big Data MOPS Series, privacy