Big Data is Buzzing, But Small Data Packs a Punch

Offline doha

« on: March 27, 2014, 06:23:23 PM »
A decade ago, “sales-marketing alignment” was not a phrase on the tips of our tongues every time we entered a conversation about lead generation and revenue. But it is now. With the advent of platforms like social media and sales and marketing automation, we often see a blurred line where marketing stops and sales begins.

Traditionally, customer data wasn’t centralized and each organization had a different view into their customers and prospects. But as both sales and marketing organizations recognize a clear need to understand their prospects at a deeper, more analytical level, these teams are starting to employ a holistic data strategy.

Big data (read: expansive data sets that are difficult to navigate at the user level) should be replaced by small data. For the unfamiliar, small data means data sets that can inform daily business decisions and are appropriate for end-user consumption. Some of your sales reps might be adept at analyzing data, but that skill is not common, and analysis is definitely not a good use of their time when meeting quota is critical to your quarterly revenue goals.

Business analyst Rand Schulman knows the ins and outs of big data — so we asked him to make a case for small data in sales and marketing. In the Q&A session outlined below, we talked with Rand about how the shift from big data to small data calls for precision and accuracy. When the rubber hits the road, it doesn’t matter how many leads you have. Your stellar marketing message or sales introduction won’t even make it to first base if your data is unreliable.

Want to learn more about data, big and small? Check out this helpful article: “Little Data Vs. Big Data – Nine Types of Data and How They Should Be Used.”

For over two decades Rand Schulman’s area of interest and success has been in building organizations and products that disrupt old sectors and forge new ones. He was founder and CEO of one of the first SaaS-based web-analytics companies, Keylime Software (Yahoo!), and he led products and strategy at Webtrends. His trajectory also includes CMO of WebSideStory (Omniture/Adobe) through its IPO, general manager of Unica’s Internet Division (IBM), and founding board member / director emeritus of the DAA. He’s been a trustee of the Direct Marketing Educational Foundation and is an executive-in-residence at the University of the Pacific for New Media and Marketing.

Rand currently serves as a board member and advisor to numerous mobile and social media companies, including Qualcomm and SRI, on big data topics, and is a managing partner with Efectyv Digital.

Rand will be speaking on multiple points regarding the value and quality of small data for sales and marketing in a one-hour webinar next month. Stay tuned for details about “Doing Small Data In a BIG Way.”

You recently published an insightful article called “Your Small Data Just Sucks.” Can you share a bit more about the importance of small data to sales and marketing organizations?

Rand:  Yes, big data has been quite the buzzword out there, but we really need to recognize the value in small data too. That’s where we get the biggest bang: smaller data sets that provide users with more digestible information.

Just a few weeks ago, I was in my office working on a targeted email when I realized something essential, and it’s a bit embarrassing to admit. As a data-driven marketing guy, you’d think I’d already know that the most basic building block of any conversion is accurate “top of the funnel” CRM contact data. With garbage in, you only get garbage out.

As salespeople and marketers struggle with all of the new data tools, we need to better understand how the small data we’re using drives sales effectiveness. Without good contact information, these systems are just plain dumb and they cost us more than they help. According to Gartner, up to 50 percent of contact information goes stale in any given year, becoming inaccurate and out of date, which only compounds the issue. The top of the funnel data just has to be solid. And as salespeople and marketers, we have to be agile.

It is important to note that most vendors claim their data is the best. They often toss out words like “comprehensive” or “most.” When we hear this, we are highly skeptical and instead tell clients that they need to understand the specific claims and look at the numbers that support them. Remember the shape of the lead funnel? It’s wider at the top. A perfect funnel would look more like a rectangle, with high conversion rates flowing down each step toward revenue (plus a little drop-off). Poor-quality data wastes time and carries a huge opportunity cost. More leads with better conversion is good, while more leads with poor conversion is bad. Thus, a successful funnel is not about how many contacts or names you generate. It’s about what converts to revenue at the bottom of the funnel.
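
To make the funnel arithmetic concrete, here is a minimal Python sketch. The stage names and conversion rates are made-up illustrative numbers, not figures from Rand's tests. It only shows how a smaller list with clean contact data can yield more closed deals than a larger list with stale data.

```python
def funnel_output(leads, stage_rates):
    """Return the expected number of deals after applying each stage's conversion rate."""
    remaining = float(leads)
    for rate in stage_rates:
        remaining *= rate
    return remaining

# Hypothetical rates for three stages: contact reached -> qualified -> closed.
clean_rates = [0.60, 0.30, 0.20]  # accurate contact data: most emails land
dirty_rates = [0.20, 0.30, 0.20]  # stale contact data: most emails bounce

small_clean_list = funnel_output(1_000, clean_rates)  # fewer leads, good data
big_dirty_list = funnel_output(2_000, dirty_rates)    # twice the leads, bad data

print(f"1,000 clean leads: about {small_clean_list:.0f} deals")
print(f"2,000 dirty leads: about {big_dirty_list:.0f} deals")
```

Under these assumed rates, the only difference between the two funnels is the top stage, which is exactly where contact data quality matters.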

What are the different data segments you consider critical to use when creating quality small data sets for leads?

Rand:  There are several benchmarks we use to categorize content, or data. I generally think of these methodologies in the following segments of data: content collection, content triangulation, and content checking.

Though no one type is perfect and “truth” is relative, an objective set of accuracy benchmarks can provide a factual comparison. Various companies apply different methodologies. I’ve found that the data gathering model comparison matrix on is very informative.

And why do you consider content collection a critical piece of small data segmentation?

Rand:  The first benchmark, content collection, is the requirement to collect multiple sources of truth. As any reporter knows, it’s critical to ask the same question of many people. Since a human is not scalable over millions of records, we need to use technology to do that. Additionally, it’s critical to make sure more than a single data-gathering engine is collecting content. For example, crowdsourcing is great for breadth but tends to lack accuracy. Editorial, like the WSJ or NYT, is great for accuracy, but lacks breadth.

We tend to categorize content collection into four principal areas: 1) editorial aggregation of content from sources like the WSJ and NYT, 2) social media sources, like Twitter feeds and Facebook posts, 3) web crawling, which is good for finding out what web sites are reporting about people and company events, and 4) crowdsourcing, using the “wisdom of the crowds” to filter and create truth (e.g., Yelp, Jigsaw).

We know that incorrect data and results lead to wasted budget and lost time. Acknowledging that each method of content collection produces different results, can you describe the next step in the data segmentation process?

Rand:   After it’s collected, the data has to be triangulated and filtered. As I’ve said, having more than one source is required for ensuring powerful and accurate data. We need technology that can reconcile multiple data sources and scale across millions of records, merging only the most recent and most accurate pieces of company and contact data into a single record. I believe that this is a key area where vendor technology, using NLP, is a requirement to determine content’s “truthfulness.”

Many vendors offer some kind of data technology, and we recommend that the data buyer (and the end user) perform their own benchmark tests on their own content, people, company and event information to determine quality. It’s not easy to do, but you can ask the vendor if they have conducted these tests and check out their results.
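
A rough Python sketch of the merge step Rand describes, assuming each source hands back a record with a value per field and the date that value was last verified. The source names, field names, values, and dates are all hypothetical, and this version looks at recency only, where a real system would weigh accuracy as well.

```python
from datetime import date

def merge_most_recent(records):
    """Merge per-source records into one, keeping the most recently verified value per field."""
    merged = {}
    for record in records:
        for field, (value, verified_on) in record["fields"].items():
            current = merged.get(field)
            if current is None or verified_on > current["verified_on"]:
                merged[field] = {
                    "value": value,
                    "verified_on": verified_on,
                    "source": record["source"],
                }
    return merged

# Hypothetical input: two sources disagree about a contact's title.
records = [
    {"source": "crawler", "fields": {
        "title": ("VP of HR", date(2012, 5, 1)),
        "email": ("jane@example.com", date(2013, 11, 3)),
    }},
    {"source": "editorial", "fields": {
        "title": ("SVP of HR", date(2014, 1, 15)),
    }},
]

single_record = merge_most_recent(records)
for field, info in single_record.items():
    print(field, "=", info["value"], "(from", info["source"] + ")")
```

Keeping the winning source next to each value makes it easier to run the kind of benchmark test mentioned above, since you can see which source supplied each field.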

How can you create rules around triangulation and filtering this data, and why can’t you skip that step?

Rand:   Some vendors have created very effective triangulation logic. One of the most sophisticated approaches we have come across uses “entity triangulation”: algorithms that are based on both human judgment and machine learning. In the security and email deliverability fields, these are known as “reputation systems.” Editorial experts analyze vast sample files to assess the accuracy of different sources. This is done at both the overall vendor level (vendor X is more accurate than vendor Y) and the individual field level (vendor Y provides the most accurate data for specific field Z). The end result is that for any given field of data, on any given company or contact record, the algorithm can present the best available information. The marketer or salesperson no longer has to compare multiple sources of data and try to make sense of conflicting information all by themselves. It is now automated.
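
The vendor-level and field-level scoring described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any vendor's actual algorithm: the vendor names, the accuracy scores, and the helper names `reputation` and `best_value` are all hypothetical, and the scores stand in for whatever editors and machine learning would produce from sample files.

```python
# Hypothetical accuracy scores, as might come from editors auditing sample files.
VENDOR_ACCURACY = {"vendor_x": 0.90, "vendor_y": 0.80}  # overall vendor level
FIELD_ACCURACY = {("vendor_y", "phone"): 0.95}          # individual field level

def reputation(vendor, field):
    """Use the field-level score when one exists, else the vendor's overall score."""
    return FIELD_ACCURACY.get((vendor, field), VENDOR_ACCURACY[vendor])

def best_value(field, candidates):
    """Pick the candidate value whose source has the highest reputation for this field."""
    vendor, value = max(candidates.items(), key=lambda item: reputation(item[0], field))
    return value, vendor

# Vendor X wins on title (overall score), vendor Y wins on phone (field-level score).
title = best_value("title", {"vendor_x": "SVP, Human Resources", "vendor_y": "VP, HR"})
phone = best_value("phone", {"vendor_x": "555-0100", "vendor_y": "555-0199"})
print(title)
print(phone)
```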

And lastly, you mentioned that there must be a validation process. Can you describe the data validation process?

Rand:   There’s less magic in validation – it’s more about blocking and tackling. That said, vendors go to different lengths to validate data (and yes, some don’t bother at all!).

Several vendors use basic validation technologies such as email deliverability testing, which will “ping” an email service to confirm the existence of an email address (without actually sending a spam email). Some vendors rely on crowdsourcing via services like CrowdFlower or Amazon’s Mechanical Turk to evaluate specific information that can benefit from human judgment. Finally, just a few vendors use their own editorial experts to manually verify user-contributed data and conflicting information that has been flagged for review by end users or surfaced automatically by algorithms. Generally speaking, review and verification by an editorial expert is the last filter. The buck stops there.
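
As a loose illustration of that layering, the Python sketch below routes a contact through a cheap automated check first and sends it to a human only when sources disagree. It is an assumption-heavy simplification: the regular expression only checks that an address is shaped like an email (it does not ping a mail server or test deliverability), and the function name `triage` and its three outcomes are hypothetical.

```python
import re

# Very loose shape check: something@something.something, with no spaces.
EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def triage(contact, source_values):
    """Return 'reject', 'needs_review', or 'accept' for one contact's email."""
    email = contact.get("email", "")
    if not EMAIL_SHAPE.match(email):
        return "reject"        # fails the cheap automated check
    if len(set(source_values)) > 1:
        return "needs_review"  # sources disagree: send to a human reviewer
    return "accept"            # well-formed and all sources agree

print(triage({"email": "not-an-email"}, ["not-an-email"]))
print(triage({"email": "jane@example.com"}, ["jane@example.com", "j.doe@example.com"]))
print(triage({"email": "jane@example.com"}, ["jane@example.com", "jane@example.com"]))
```

Putting the automated check first keeps the expensive human review for the records that actually need judgment.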

Why is “Small Data” quality so important?

Rand:   We’ve analyzed vendor data accuracy results many times, testing with different contacts and companies. Much of the time, even the fundamental results differed between vendors. They showed incorrect or conflicting titles, email addresses, phone numbers and other basic information, all of which leads to poor conversion rates.

Can you share a specific example of one of your tests?

Rand:   At Efectyv Digital, we tested a few of the popular sales effectiveness tools to see why our conversion rates were so low. It didn’t take us long to confirm, as we suspected, that our data just sucked and we needed to start making it better. Here’s our analysis, or at least step one of it. We’ll always keep working on our marketing messages and segmentation to help our conversion rates.

While there are scores of products on the market, including LinkedIn, Zoom, and One Source, and some great new start-ups that have various degrees of content mash-ups like Tempo and Refresh, we chose to test three of the more popular products that come integrated with CRM systems, including D&B 360, which has mostly manually generated contact and company information;, the roots of which are crowdsourced with Jigsaw data acquired by Salesforce; and InsideView, which relies on a multisourced, triangulated, validated data approach to deliver results. The levels of integration vary, depending on the CRM system: Dynamics, Oracle, Sugar, or Salesforce.

For the test, we used a real person and a real institution — Krystin Mitchell, Senior Vice President of Human Resources at 7-Eleven Inc. Since 7-Eleven’s revenue is in excess of $80 billion and they’re public, we thought they might be a good test to see how we can find her in our test systems.

So, where is Krystin? According to their current company Web page, she is indeed at 7-Eleven, but according to Krystin Mitchell is not included in 7-Eleven’s “Find Contacts” search results. When we broadened the search, we found there were 16 wrong results with her name, company, and email address. That’s crazy and not acceptable. I can see why our emails bounce.

We then tested the trusty old saw, D&B 360. Since much commerce is based on its data, it has to yield accurate results, right? D&B is the gold standard of contact data — the truth. It’s built with human editorial control so we thought we’d get correct results. But, even with D&B, Krystin Mitchell is not included in “Build a List” of custom search results (this time 65 wrong contacts came up in her place).

To find her, we needed to do a “general people in search,” but like in it yielded multiple/duplicate results, and different types of incorrect contact info, which defeats the purpose of a sales and marketing effectiveness product.

We are drowning in data. It is no simple feat to filter this sea of data, but it seems to me that we need to get the basics right about “small data” before we can talk about optimizing big data, real-time data, and the impact of attribution models. Quality B2B or B2C contact information is fundamental. It’s best to walk before we run and finally sprint to the holy grail of real-time conversions, and revenue falling from the trees.