Are All Data Created Equal?

A tweet, a transaction, an email or a phone call: do you have a preference?

[Note: This is an ongoing series of interviews on analytics ahead of the Useful Business Analytics Summit. Feel free to check out the rest of the interviews beginning here]

I thought this was an important question, and one I knew the answer to. My thinking, based on experience, has been: certainly not. Some data is far richer and more important than other data. For instance, one tweet, or 10,000 for that matter, is nowhere near as important as one good data record of an actual customer calling or emailing your customer service center with a specific complaint, praise or suggestion.

That said, as I posed the question to our panel of client-side analytics experts, I began to think that maybe the question itself was a mistake: the all-too-common mistake of putting the data before the question.

Curious to hear your thoughts. Can we legitimately ask this question about data without first deciding what question the data is meant to answer? And if we can, at which end of the spectrum do you fall: all data is created equal, OR some data is priceless and other data is almost useless?




[Thomas Speidel - Suncor Energy]

It depends on what we are trying to find out. For mission-critical decisions, it's important to have data that was intentionally captured for that or a similar purpose (usually structured).

For exploration or low-consequence questions, any data will do, so long as we understand the limitations of our findings.


[Sofia Freyder – MasterCard]

I think all data is important: structured and unstructured, quantitative and qualitative, online and offline, behavioral or opinion-based. Each specific situation will define which data should be considered more accurate and precise.


[Deepak Tiwari - Google]

It depends. We use all types of data (structured, unstructured) and, depending on the problem, use them to varying degrees.


[Jonathan Isernhagen – Travelocity]

I’m a finance guy at heart, and believe in the idea of net present value: the idea that every allocation decision we make can be thought of as a project that should pay out more than the investment. I’m interested in any data which directly inform such “project” decisions, the ROI stuff. I’m less interested in other data. There’s a school of thought that I’d call “Pathism” or “Funnelism” which rejects channel attribution. If you don’t have the marketing budget to justify investing in an algorithmic attribution model, that’s one thing. If you imagine that knowing your fourth-most-popular path to conversion is SEO-to-Direct is better than knowing your individual channel ROIs, I would beg to differ.


[Farouk Ferchichi - Toyota Financial Services]

I don’t believe there is data that is not important. All data is important given the appropriate context. Internal and external structured data, in the form of financials or customers’ data, is important for analyzing histories and developing models, but internal and external unstructured data is equally important for discovering and accessing new types of information. The question becomes how to access data and what to acquire and store, and for that you need a data discovery and acquisition strategy.


[Anthony Palella - Angie's List]

Importance is determined by the high-value questions that need to be answered. When I start working with a business partner, I don't ask about KPIs. I ask, "What are the 10-12 questions you need answers to in order to successfully run your business?" The data needed to answer these questions is "important".


[Larry Shiller - Yale]

This is a "meta" answer... "Type" means a way to slice and dice. If you are slicing data only one way, that way may be a shiny object: look for other ways (i.e., other dimensions) to slice your data. For example, the most common dimension is time: look for other dimensions/pivots.


Thanks to our speakers at the upcoming Useful Business Analytics Summit for their thoughtful answers to the above question. This Q&A is part of an ongoing series focusing on big data and business analytics in general. Feel free to check out some of our past questions on Big Data, How to Keep Up to Date on Analytics, and Top 10 Analytics Tips. Our next post will be on my favorite topic, text analytics!





[Full Disclosure: Tom H. C. Anderson is Managing Partner of Anderson Analytics, developers of a patented Next Generation approach to text analytics known as OdinText. For more information and to inquire about software licensing visit OdinText INFO Request.]