Posts in Tips + Training
A New Trend in Qualitative Research

Almost Half of Market Researchers are doing Market Research Wrong! - My Interview with the QRCA (And a Quiet New Trend - Science Based Qualitative).

Two years ago I shared some research on research about how market researchers view quantitative and qualitative research. I stated that almost half of researchers don’t understand what good data is. Some ‘Quallies’ (about 25% of market researchers surveyed) tend to rely on and work almost exclusively with comment data from extremely small samples. Conversely, there is a large group of ‘Quant Jockeys’ who, while working with larger, more representative samples, purposefully avoid any unstructured data such as open-ended comments, either because they don’t want to deal with coding and analyzing it or because they don’t believe in its accuracy and its ability to add to the research objectives. In my opinion both groups have it totally wrong and are doing a tremendous disservice to their companies and clients. Today, I’ll be focusing on just the first group, those who tend to rely primarily on qualitative research for decisions.

Note that today’s blog post is related to a recent interview that I was asked to take part in by the QRCA’s (Qualitative Research Consultants Association) Views magazine. When they contacted me, I told them that in most cases (with some exceptions) text analytics really isn’t a good fit for qualitative researchers, and asked if they were sure they wanted to include someone with that opinion in their magazine. I was told that yes, they were OK with sharing different viewpoints.

I’ll share a link to the full interview in the online version of the magazine at the bottom of this post. But before that, a few thoughts to explain my issues with qualitative data and how it’s often applied as well as some of my recent experiences with qualitative researchers licensing our text analytics software, OdinText.

The Problem with Qualitative Research

IF Qual research was really used in the way it’s often positioned, ‘as a way to inform quant research’, that would be ok. The fact of the matter is though, Qual often isn’t being used that way, but instead as an end in and of itself. Let me explain.

First, there is one exception to this rule of only using Qual as pilot feedback for Quant. If you had a product which was made specifically and only for US state governors, for instance, then your total population is only N=50. It is of course highly unlikely that you would ever get the governors of each and every US state to participate in any research (which would be a census of all governors). So if you were fortunate enough to have a group of, say, 5 governors who were willing to give you feedback on your product or service, you would and should obviously hang on to and over-analyze every single comment they gave you.

IF however you have even a slightly more mainstream product (I’ll take a very common product like hamburgers as an example), and you are relying on 5-10 focus groups of n=12 to determine how different parts of the USA (North East, Mid-West, South and West) like their burgers, and rather than feeding the results directly into some quantitative research instrument with a greater sample, you issue a ‘Report’ that you share with management, well, then you’ve probably just wasted a lot of time and money for some extremely inaccurate and dangerous findings. Yet surprisingly, this happens far more often than one would imagine.

Cognitive Dissonance Among Qual Researchers when Using OdinText

How do I know this you may ask? Good Text Analytics software is really about data mining and pattern recognition. When I first launched OdinText we had a lot of inquiries from Qualitative researchers who wanted some way to make their lives easier. After all, they had “a lot” of unstructured/text comment data which was time consuming for them to process, read, organize and analyze. Certainly, software made to “Analyze Text” must therefore be the answer to their problems.

The problem was that the majority of Qual researchers work with tiny projects and samples: interviews and groups between n=1 and n=12. Even if they do several groups, as in the hamburger example I gave above, we’re still talking about a total of just around n=100 representing four or more regional groups of interest, and therefore about n=25 or fewer per group. It is impossible to get meaningful, statistically comparable findings and identify real patterns between the key groups of interest in this case.
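To put rough numbers on why roughly n=25 per group is so limiting, here is a minimal sketch in Python. It is an illustration added for this point, not something taken from OdinText: the function name `margin_of_error` is hypothetical, and it assumes a simple random sample and the usual normal approximation for a proportion, which is itself shaky at very small n.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p observed in a
    simple random sample of size n (normal approximation; an assumption)."""
    return z * math.sqrt(p * (1 - p) / n)

# Focus-group-sized samples versus larger survey-sized samples.
for n in (12, 25, 100, 400):
    moe_points = margin_of_error(n) * 100
    print(f"n={n:>3}: +/- {moe_points:.1f} percentage points")
```

Under these assumptions, a figure measured in a group of 25 carries a margin of error of close to 20 percentage points either way, so two regional groups would need to differ enormously before the gap could be told apart from noise.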

The Little Noticed Trend In Qual (Qual Data is Getting Bigger)

However, slowly across the past couple of years or so, for the first time I’ve seen a movement of some ‘Qualitative’ shops and researchers, toward Quant. They have started working with larger data sets than before. In some cases, it has been because they have been pulled in to manage larger ongoing community/boards, in some cases larger social media projects, and in others, they have started using survey data mixed with qual, or even better, employing qualitative techniques in quant research (think better open-ends in survey research).

For this reason, we now have a small but growing group of ‘former’ Qual researchers using OdinText. These researchers aren’t our typical mixed data or quantitative researchers, but qualitative researchers that are working with larger samples.

And guess what: “Qualitative” has nothing to do with whether data is in text or numeric format. Instead, it has everything to do with sample size. And so, perhaps unknowingly, these ‘Qualitative Researchers’ have taken the step across the line into Quantitative territory, where often for the first time in their careers, statistics can actually be used. And it can be shocking!

My Experience with ‘Qualitative’ Researchers going Quant/using Text Analytics

Let me explain what I mean. Recently several researchers that come from a clear ‘Qual’ background have become users of our software OdinText. The reason is that the amount of data they had was quickly getting “bigger than they were able to handle”. They believe they are still dealing with “Qualitative” data because most of it is text based, but actually because of the volume, they are now Quant researchers whether they know it or not (text or numeric data is irrelevant).

Ironically, for this reason, we also see much smaller data sizes/projects than ever before being uploaded to the OdinText servers. No, not typically single focus groups with n=12 respondents, but still projects that are often right on the line between quant and qual (n=100+).

The discussions we’re having with these researchers as they begin to understand the quantitative implications of what they have been doing for years are interesting.

Let me preface this with the fact that I have a great amount of respect for the ‘Qualitative’ researchers that begin using OdinText. Ironically, the simple fact that we have mutually determined that an OdinText license is appropriate for them means that they are no longer ‘Qualitative’ researchers (as I explained earlier). They are in fact crossing the line into Quant territory, often for the first time in their careers.

The data may be primarily text based, though usually mixed, but there’s no doubt in their minds or ours that one of the most valuable aspects of the data is the customer commentary in the text, and this can be a strength.

The challenge lies in getting them to quickly accept and come to terms with quantitative/statistical analysis, and thereby also the importance of sample size.

What do you mean my sample is too small?

When you have licensed OdinText you can upload pretty much any data set you have. So even though they may have initially licensed OdinText to analyze some projects with say 3,000+ comments, there’s nothing to stop them from uploading that survey or set of focus groups with just n=150 or so.

Here’s where it sometimes gets interesting. A sample size of n=150 is right on the borderline. It depends on what you are trying to do with it of course. If half of your respondents are doctors (n=75) and half are nurses (n=75), then you may indeed be able to see some meaningful differences between these two groups in your data.

But what if these n=150 respondents are hamburger customers, and your objective was to understand the difference between the 4 US regions in the hamburger example I referenced earlier? Then you have about n=37 in each subgroup of interest, and you are likely to have very few, IF ANY, meaningful patterns or differences.
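The contrast between two groups of n=75 and four groups of about n=37 can be sketched with a standard back-of-the-envelope power calculation. This is an illustration only, not how OdinText decides what counts as meaningful: the function name `min_detectable_difference` is hypothetical, and the sketch assumes two equal-sized independent groups, a two-sided test at the 5% level, 80% power, and a worst-case base rate of 50%.

```python
import math
from statistics import NormalDist

def min_detectable_difference(n_per_group, base_rate=0.5, alpha=0.05, power=0.80):
    """Approximate smallest gap between two group proportions that a
    two-sided z-test would detect with the given power (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for the test
    z_power = NormalDist().inv_cdf(power)           # quantile for the desired power
    se = math.sqrt(2 * base_rate * (1 - base_rate) / n_per_group)
    return (z_alpha + z_power) * se

for label, n in (("doctors vs nurses", 75), ("one region vs another", 37)):
    gap = min_detectable_difference(n) * 100
    print(f"{label} (n={n} per group): needs a gap of about {gap:.0f} points")
```

Under these assumptions the two-group design can pick up a gap in the low twenties of percentage points, while a pair of regions needs a gap of more than thirty. Four regions also mean six pairwise comparisons, which raises the odds that any single "difference" is a fluke.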

Here’s where that cognitive dissonance can happen --- and the breakthroughs if we are lucky.

Picture a former ‘Qual Researcher’ who has spent the last 15 years of their career making ‘management level recommendations’ on how to market burgers differently in different regions based on data like this. For the first time, they are looking at software which says that there are maybe just two or three small differences in their data or, even worse, NO MEANINGFUL PATTERNS OR DIFFERENCES WHATSOEVER. They may be in shock!

How can this be? They’ve analyzed data like this many times before, and they were always able to write a good report with lots of rich detailed examples of how North Eastern Hamburger consumers preferred this or that because of this and that. And here we are, looking at the same kind of data, and we realize, there is very little here other than completely subjective thoughts and quotes.

Opportunity for Change

This is where, to their credit, most of our users start to understand the quantitative nature of data analysis. Unlike the ‘Quant Only Jockeys’ I referenced at the beginning of the article, they already understand that many of the best insights come from text data gathered with free-form, unaided, non-leading, yet creative questions.

They only need to start thinking about their sample sizes before fielding a project. To understand the quantitative nature of sampling. To think about the handful of structured data points that they perhaps hadn’t thought much about in previous projects and how they can be leveraged together with the unstructured data. They realize they need to start thinking about this first, before the data has all been collected and the project is nearly over and ready for the most important step, the analysis, where rubber hits the road and garbage in really should mean garbage out.

If we’re lucky, they quickly understand that it’s not about Quant and Qual any more. It’s about Mixed Data, it’s about having the right data, it’s about having enough data to generate robust findings and then superior insights!

Final Thoughts on the Two Nearly Meaningless Terms of ‘Quant and Qual’

As I’ve said many times before here and on the NGMR blog, the terms “Qualitative” and “Quantitative”, at least the way they are commonly used in marketing research, are already passé.

The future is Mixed Data. I’ve known this to be true for years, and almost all our patent claims involve this important concept. Our research shows time and time again, that when we use both structured and unstructured data in our analysis, models and predictions, the results are far more accurate.

For this reason we’ve been hard at work developing the first ever truly Mixed Data Analytics Platform. We’ll be officially launching it three months from now, but many of our current customers already have access. [For those who are interested in learning more or would like early access, you can inquire here:].

In the meantime, if you’re wondering whether you have enough data to warrant advanced mixed data and text analysis, check out the online version of the article in QRCA Views magazine here. Robin Wedewer at QRCA did an excellent job of asking some really pointed questions that forced me to answer more honestly and clearly than I might otherwise have.

I realize not everyone will agree with today’s post or with my interview with QRCA, and I welcome your comments here. I just ask that you please read both the post above and the interview in QRCA Views, rather than commenting based solely on the title of this post.

Thank you for reading. As always, I welcome questions publicly in the comments below or privately via LinkedIn or our Inquiry form.


Best 10 Text Analytics Tips Posts of The Year

Our Top 10 Most Read Data and Text Mining Posts of 2017

Thank you for reading our blog this year. The OdinText blog has quickly become even more popular than the Next Gen Market Research blog, and I really appreciate the thoughtful feedback we’ve gotten here on the blog, via Twitter, and email.

In case you’re curious, here are the most popular posts of the year:

#10 NFL Players Taking a Knee is More Complex and Polarizing Than We Think (If a Topic is Worth Quantifying – It’s Also Worth Understanding The Why’s Behind It)

#9 Text Analytics Picks The 10 Strongest Super Bowl Ads (New Text Analytics Poll Shows Which Super Bowl Ads Really Performed Best)

#8 Why Your HR Survey is a Lie and How to Get The Truth (OdinText Discovers Job Satisfaction Drivers in Anonymous Employee Data)

#7 Of Tears & Text Analytics (An OdinText User Story – Text Analytics Guest Post (AI Meets VOC))

#6 65 CEO’s Share Thoughts on Insights (Insights Association’s Inaugural CEO Summit – A Future Tied to Collaboration and Technology)

#5 Why Machine Learning is Meaningless (Beware of Buzzwords! The Truth about ‘Machine Learning’ and ‘Artificial Intelligence’)

#4 Do You Speak Teen? OdinText Announces 2nd Annual List of Top 10 Slang Terms (How Text Analytics Can Help Marketers Move at the Speed of Slang)

#3 Text Analysis Reveals Potential French Election Upset (Text Analytics Poll Showed How Close Le Pen Came to ‘Trumping’ Macron)

#2 Text Analytics Poll: Why We Unfriend on Facebook (You Can’t Handle The Truth, And Other Top Reasons Why We Unfriend on Facebook)

#1 What Americans Really Think About Trump’s Immigration Ban and Why (Text Analysis of What People Say in Their Own Words Reveals More Than Multi-Choice Surveys)


I thought I’d also check what our top 5 posts were from last year. Here they are in case you missed them:

Top Posts From 2016

#1 Text Analysis Answers Is The Quran Really More Violent Than The Bible (3 Parts)

#2 Attensity Sold – What Does it Mean?

#3 Customer Satisfaction Surveys: What do Satisfied VS Dissatisfied Customers Talk About?

#4 What’s Really Wrong With Polling?

#5 What Your Customer Satisfaction Research Isn’t Telling You

Thanks again for reading and commenting. As always I welcome your thoughts and questions via LinkedIn, or feel free to request info on anything you’ve read above here.

Happy New Year!


Brandtrust Uses OdinText to Quantify Qual at Scale and Unearth Dormant Brand Equities

Editor’s note: Today’s post was contributed by Brandtrust, an OdinText client, as part of a new ongoing series from our users. We felt Brandtrust was an outstanding candidate for a use case because they already have a set of sophisticated, proprietary methodologies, which were made even more valuable by easily incorporating the OdinText platform.

Case Study: Realizing the Untapped Potential of Stories

A long-time client asked us to help determine the equity of an overlooked legacy sub-brand, as they were interested in how that sub-brand relates to the parent brand, both from the perspective of legacy sub-brand consumers and younger prospective consumers. They wondered if there was untapped potential in this neglected property.

Utilizing our Narrative Inquiry approach — text analytics with a unique take on unearthing human truth — Brandtrust asked legacy and prospective consumers to share their memories and experiences with the sub-brand and parent brand via an open-ended survey tool. We exposed prospective customers, who by definition lack experience with the sub-brand, to representative sub-brand stimulus, and then had them reflect on their exposure experience.

By tapping into stories around actual experiences, our team was able to elicit language around the relationships consumers had or have with the brand and sub-brand. Utilizing OdinText to analyze the unstructured data (a.k.a. stories) we received, we looked for narrative and emotional patterns across and between the legacy and prospective consumers.

Dormant Brand Equities at the Intersection

Our client’s operating hypothesis was that the perceptions and emotions of the two targets would vary dramatically.  As it turned out, legacy consumers did express more nostalgia with the sub-brand and recalled their past experiences with it fondly, associating the sub-brand with family connection and memorable special events. Prospective consumers, not surprisingly, expressed a greater sense of trepidation related to the unfamiliar-but-established sub-brand.

Interestingly, and most useful to our client, however, there were important areas of intersection between the two consumer groups, both in perception and experience of the sub-brand and the parent brand.

The sub-brand and parent brand elicited joyful emotions and communicated the concept of care, a key tenet of the parent brand. Additionally, the sub-brand reflections of both consumer groups contained elements of enjoyable education (think “learning is fun!”) and heartwarming interactions, equities that were well aligned with recent parent brand initiatives.

All in all, the client was pleased with the outcome and benefited greatly from the knowledge obtained: their quest to determine next steps in this endeavor was finally realized.

Methodological Review

The development and execution of this branch of methods at Brandtrust could have been daunting, but with OdinText at our fingertips, it was far less manual and labor-intensive than it would have been in the days of building code frames, buckets, and nets.  And yet, there is still a great deal of merit in the means by which text analysis was initially derived.

At this point in technological advancement machines cannot, and likely never fully will, replace humans; which is why Brandtrust employs a distinctive approach called Lateral Pattern Analysis — with parallel machine and human analysis, and a combined synthesis between the two, to determine the final outcome of our Narrative Inquiry studies.

Narrative Inquiry questionnaires are built on Brandtrust’s key research pillars, including grounded theory, phased dialogue, behavioral framing, narrative pattern identification, and priming reduction. Reliance on these key elements ensures that our Narrative Inquiry respondents — through a process of recall and reflection — can share with us the rational and emotional makeup of their perceptions and behaviors.

OdinText’s built-in emotional framework assists our team with the “machine” side of processing a vital but squishy element of human understanding through story: Emotion. Brandtrust draws from years of experience in processing story and emotion qualitatively, and OdinText’s features have helped us extend the reach and statistical certainty of that expertise.


1 Analyst + 15K Comments in 8 languages + 2 hours = Awesome Insights!

Please join us on September 14th for this free live webinar co-hosted by TMRE.

Space is limited and first come, first served, so please register here.

We’ll be covering our extremely well received multi-country, multilingual analysis case study. I think you’ll be surprised at the implications and amazed that this kind of global research can now be done quickly and inexpensively by anyone.

Look forward to seeing you there!


Seven Text Analytics Myths Exposed at IIEX

What I Learned from Attendees in IIEX Text Analytics Sessions

This week I had the opportunity to attend and to present at the Insights Innovation Exchange (IIEX) in Atlanta. This conference always provides a wonderful chance to connect with a lot of smart, forward-looking researchers.

For those who missed IIEX or weren’t able to attend my presentation, I provided a case study outlining how we conducted a massive international study in 10 countries and eight languages for almost no cost, with results analyzed in just two hours. If you’d like to know more, feel free to contact us for a free e-book detailing the project.

My presentation aside, what I’d like to cover here today actually came out of the Text Analytics Information Sessions we were asked to host on Monday, and which I’m pleased to report were well attended—notably by representatives from more than a few major supplier and client brands.

I had originally anticipated that there would be more group conversation and peer-to-peer sharing, but it turned out that most of the attendees were less interested in talking than they were in learning, and so the sessions involved quite a bit of Q&A, with my colleague Tim Lynch and me fielding more general questions about text analytics than expected.

What I took from these sessions was a sense that a lot of confusion and misperception around text analytics persists among researchers today and that the industry is urgently in need of more educational resources on the topic (more on this at the end of the blog).

I’ve cherry-picked for you here today the most common misconceptions revealed in these sessions. Hopefully, this will help dispel some persistent myths that do anyone interested in text analytics a huge disservice…

MYTH 1: Text analytics is synonymous with social media monitoring

As I feared, a common misconception about text analytics is that its primary application—and pretty much the extent of its practical utility—is for analyzing social media data. Nothing could be further from the truth!

While social media monitoring firms have done a great job marketing themselves, this is just ONE SMALL SUBSET of data that text analytics can be used to solve for. Moreover, while everyone seems fixated on social media analysis, in my honest opinion, social media monitoring is NOT where the greatest opportunity lies for using text analytics in market research.

And a word of caution: yes, text analytics platforms can easily handle social media data, but the same cannot be said about social media monitoring tools, so be careful not to limit yourself.

MYTH 2: Text analytics are perfect for analyzing qualitative transcripts

I cannot tell you how often I’ve been approached by researchers who want to use text analytics software to analyze focus group transcripts. My first response is always: why would you want to do that?

Just because focus group data contains a lot of text doesn’t mean you should run it through a text analytics platform, unless you have very large qualitative communities or run the same exact group 10 times within a category.

Bear in mind, text analytics can be applied quite effectively to small samples (I actually didn’t think so until I learned otherwise from a client), but using small sample IDIs or focus groups doesn’t typically make a lot of sense because text analytics is all about pattern identification.

If you talk to just 15 physicians, for example, you’ll still need to read each of their comments. Text analysis may add additional value, but usually it isn’t worthwhile UNLESS you either have a large enough sample to mine for patterns AND/OR the data is extremely important/valuable (e.g., these are the top 15 MD PhDs in their field working on a life-saving cure).

MYTH 3: Sentiment is REALLY important and useful

Sentiment has been COMPLETELY hyped. In the majority of our text analytics projects sentiment isn’t even a factor. In fact, some firms purporting to offer “text analytics” only offer sentiment analysis. This is unbelievable to me. Having worked with text analytics for the past 15 years I don’t understand why someone would approach data that simplistically. There are so many other, potentially more useful and valuable ways to look at data.

When thinking about text analytics, relevant feature/topic extraction is most important. As important is how this can be turned into actionable advice or a recommended course of action. If you analyze data and come back to management with something as simplistic as “this is what makes people angry,” or happy, chances are you’ll soon be replaced by someone who can tell management how to increase return behavior and revenue.

MYTH 4: Look for AI and Machine Learning

I’ve blogged about this before, and it still drives me nuts!

Everyone seems hung up on this year’s buzzwords—“artificial Intelligence” (AI) and/or “machine learning”—and just about every possible vendor is touting them, whatever the solution they’re selling. For your purposes, I’m telling you they are meaningless.

This is not to say that AI and machine learning are not important—in fact, they’re integral components to the OdinText platform—but they’re terms that are misused, abused, and thrown about cavalierly without any explanation as to how or why they matter. If someone tells you their tool uses AI or machine learning, ask them what they mean by “AI” specifically and to explain precisely how that enables their tool to deliver differentiated results. I’ll wager you’ll walk away from that conversation without any better understanding of why AI is a feature they’re touting than you did before the conversation began. (For more information on this topic, again, read this post.)

Beware also other technical-sounding terms (including sentiment, mentioned above) that frequently crop up around text analytics like NLP (natural language processing), ontologies, taxonomies, support vector machines… I could go on.

If a sales person is throwing jargon like this at you, chances are they are using it to conceal their own lack of knowledge about text analytics.

Conversations should instead focus on: How do I quickly identify the most important topics/ideas mentioned by my customers? How do I know they are important? How do they affect my KPIs? Show me with my data how I can quickly do these things.

MYTH 5: All text analytics are basically the same

Text analytics are not a commoditizable, standardized sort of item. Unlike the deliverables from panel companies or survey vendors, the variety of potential forms text analytics can take is diverse and complex, ranging from more linguistically-based approaches to more mathematical/statistical solutions.

Beyond this, though, practical experience in the given field of application also comes into play. What experience do the developers have in answering problems in your specific field? This will impact underlying thinking as well as user interface considerations.

DO NOT assume that just because a feature is listed in one company’s sell sheet (see buzzwords above, for example), it is a must-have or even a good-to-have, and that you should look for it across vendors.

Again, always fall back to your own data. How does this software tell me how customer group A is different than Group B? How will I know the impact of topics X, Y and Z on sales? These are the questions to ask.
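As a toy illustration of the "group A versus group B" question, the sketch below tags comments with topics using a small keyword dictionary and reports how often each group mentions each topic. Everything in it is hypothetical: the comments, the `TOPIC_KEYWORDS` dictionary and the `topic_rates` function are made up for the example, and naive keyword matching is a stand-in for illustration, not a description of how OdinText or any other platform works.

```python
from collections import Counter, defaultdict

# Hypothetical topic dictionary: topic name -> trigger words.
TOPIC_KEYWORDS = {
    "price": {"price", "expensive", "cheap", "cost"},
    "service": {"service", "staff", "rude", "friendly"},
}

# Hypothetical (group, comment) pairs standing in for real open-end data.
comments = [
    ("A", "The staff were friendly but it was expensive"),
    ("A", "Great service"),
    ("B", "Too expensive for what you get"),
    ("B", "Price keeps going up"),
]

def topic_rates(comments, topic_keywords):
    """Return, per group, the share of comments that mention each topic."""
    totals = Counter()
    mentions = defaultdict(Counter)
    for group, text in comments:
        totals[group] += 1
        words = set(text.lower().split())
        for topic, keywords in topic_keywords.items():
            if words & keywords:  # any trigger word present in the comment
                mentions[group][topic] += 1
    return {
        group: {topic: mentions[group][topic] / totals[group] for topic in topic_keywords}
        for group in totals
    }

rates = topic_rates(comments, TOPIC_KEYWORDS)
for group, by_topic in sorted(rates.items()):
    print(group, by_topic)
```

With four made-up comments the rates mean nothing, of course. The point is only the shape of the output a buyer should ask a vendor to produce on their own data: topic by group, in numbers that can then be tested and tied to sales or other KPIs.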

MYTH 6: Text analytics is as easy as just pressing a button and may be totally automated

I’m sorry, but again, no.

On the one hand there are extremely involved and expensive Mechanical Turk-style solutions you can purchase. Typically, using one of these solutions will require a few months to build a static dictionary for, say, your customer satisfaction data set, which is then dashboarded. You can easily expect to pay mid-six figures for something like this, and it won’t allow you to do any ad hoc analysis.

The other option is a pure AI/Machine learning solution like IBM’s Watson. It’s fast and cheap because it’s not valuable. (If it were, then IBM could charge a lot more for it.) Look for their case studies and actual customers who have been happy with their solutions. You won’t find many, if any.

Included in the same category as IBM Watson are Microsoft Azure, Amazon AWS and Google NLP tools, as well as vendors that do other things (surveys etc.); plug into one of these and they’ll claim they have “text analytics.” But these tools will not get management what it needs to make intelligent decisions.

The optimal solution is somewhere in between, where machine and human meet in the most effective and intuitive manner. This will mean high-value analysis. What you get back in terms of value of insights depends on the quality of data and the analytical thinking brought to bear by the analyst—just like on any quantitative data project!

MYTH 7: There are lots of great resources for learning about text analytics

Sadly, the net of these IIEX groups on Monday was that it became clear to me that we still don’t have ANY solid educational or training resources devoted to text analytics in this industry. NONE!!!

MR trade orgs don’t offer any; the top masters and MBA programs in research don’t offer much; Burke Institute (whose training I love, by the way) doesn’t offer any...

There aren’t any good books on the subject, either; they’re either way too academic and 10+ years behind, or they’re sales tools in disguise, or it’s just a chapter in a book written by a research generalist who does not specialize in text analytics.

We need educational and training resources rather desperately, it seems.

I plan on continuing to do my part by lecturing on the subject at a few MBA classes each year. I’ve also offered to work with the Burke Institute and the University of Georgia’s Terry School’s Master of Marketing Research program on developing resources.

BUT in the meantime, if you have any questions about text analytics, generally, and totally apart from OdinText, please consider me a resource. Feel free to ping me on LinkedIn or via the info request button here.

I hope this was helpful. Thanks for reading and I welcome your comments!


About Tom H. C. Anderson

Tom H. C. Anderson is the founder and managing partner of OdinText, a venture-backed firm based in Stamford, CT whose eponymous, patented SaaS platform is used by Fortune 500 companies like Disney, Coca-Cola and Shell Oil to mine insights from complex, unstructured and mixed data. A recognized authority and pioneer in the field of text analytics with more than two decades of experience in market research, Anderson is the recipient of numerous awards for innovation from industry associations such as CASRO, ESOMAR and the ARF. He was named one of the "Four under 40" market research leaders by the American Marketing Association in 2010. He tweets under the handle @tomhcanderson.

Attending Insights Association’s NEXT 2017 Tomorrow?

Are you attending the Insights Association’s inaugural analytics event, NEXT, in NYC tomorrow? If so, please stop by the OdinText presentation on text analytics at 3:30. I’ve been asked not just to present some of the findings of a very exciting blog post, but also to explain how you too can conduct this type of advanced analysis quickly and affordably.

The session is called 'Tap the Power of a Single Open-Ended Question', and it is even more exciting than the title suggests (follow the link for a more detailed description).

I hope to see you there!

If for whatever reason you can’t make it, as usual feel free to reach out if you’re interested and would like to learn more.



How to Be a Text Analytics Rock Star in your Organization: 5 Steps

5 MUST DOs to SUCCESSFULLY Implement Text Analytics Software and Maximize its Potential in your Company!

Introducing a new technology or approach for generating insights in any organization can be more challenging than one realizes. There is a succession of hurdles to overcome if you really want to achieve traction and make a lasting impact, and a misstep in any one of these can doom an otherwise promising new addition to your insights arsenal.

In fact, one of the top questions I get from managers these days is how to effectively implement something like text analytics software in the organization. The process begins well before a technology solution has been selected.

I’ve spoken with hundreds of users over the past few years at different types of companies of varying size in various industries. This post is based on those conversations, and to keep it simple—although we support a host of internal functions and disciplines—I’ll focus on one of the most popular and arguably best use cases: customer listening (marketing research).


1 Establishing a Need for Text Analytics (Do you have mixed data?)

This one may sound obvious, but unfortunately it isn’t. I often talk to prospective text analytics users who want a software demo, but they don’t have a data set to use in the demo. In other words, they haven’t thought far enough along to determine specifically what data they would actually analyze using text analytics software.

Almost any company of middle-market size or above—especially if they are consumer-facing—will have data from various sources of VOC (Voice of Customer) that would be perfectly suited for text analytics, and which, it goes without saying, are not being exploited to their full insights potential. These data may be small or large, more or less frequently collected, and longitudinal or ad hoc in nature. Sources include survey data, customer feedback and email, online research community threads, and call center transcripts (to name just a few).

The point is, wherever there is at least one unstructured/text “comment” field in a dataset, there is an opportunity to tremendously enrich analysis by leveraging this data. Furthermore, most of the time, truly valuable data consists of some mix of structured and unstructured data (i.e., text and numeric data).

Inventory the data you already have and identify which data sets look ripe for exploring with text analytics. Then select one that fairly represents the data you expect to analyze with your future tool and use this data set for your demos.
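As a rough illustration of that inventory step, here is a minimal Python sketch that scans a table of records and flags columns that look like free-text comment fields. The function name `find_text_fields`, the thresholds and the sample rows are all hypothetical; this is a simple heuristic for a first pass over your own data, not how any particular text analytics product works.

```python
from statistics import mean


def find_text_fields(rows, min_avg_words=3.0, min_unique_ratio=0.5):
    """Return column names that look like free-text comment fields.

    Heuristic (an assumption, tune for your data): a text field has
    multi-word values, and most of its values are distinct.
    """
    if not rows:
        return []
    candidates = []
    for column in rows[0].keys():
        # Collect the non-empty values of this column as strings.
        values = [str(row.get(column) or "").strip() for row in rows]
        values = [v for v in values if v]
        if not values:
            continue
        avg_words = mean(len(v.split()) for v in values)
        unique_ratio = len(set(values)) / len(values)
        if avg_words >= min_avg_words and unique_ratio >= min_unique_ratio:
            candidates.append(column)
    return candidates


# Invented sample rows mixing structured and unstructured fields.
survey = [
    {"respondent_id": "1", "satisfaction": "4",
     "comment": "The checkout process was slow and confusing"},
    {"respondent_id": "2", "satisfaction": "5",
     "comment": "Great service, friendly staff"},
    {"respondent_id": "3", "satisfaction": "2",
     "comment": "Delivery took far too long this time"},
]

print(find_text_fields(survey))  # ['comment']
```

Rows loaded with `csv.DictReader` have the same shape, so the same function could be pointed at an exported survey file. Any data set where it returns at least one column is a candidate for your demo.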

What about social media?

Yes, social media is an increasingly popular source of data that text analytics users are eager to analyze. I’ll emphasize here that when people refer to social media data, social listening, etc., in the context of research, they are almost exclusively talking about Twitter data (sometimes without even realizing it).

When I sit down with clients, I prefer to distinguish and separate social/Twitter from the aforementioned other sources of data because traditional sources have already proven themselves valuable enough to collect and analyze on an ongoing basis. This is often not the case with social media/Twitter data.

Many people are under the impression that Twitter will yield a goldmine of insights. In actuality, the extent to which Twitter has any meaningful insights value is limited and depends highly on the category/industry. In fact, many CPG companies will find very little of interest on Twitter, while high-ticket service industry brands may find a bit more.

The point is that if your company has not yet collected or looked at social/Twitter data, it’s probably not critically important for you (and it shouldn’t be the primary reason you adopt text analytics).

Moreover, if you have already determined that social media is actually important, then you should be able to articulate more than one research objective around what you expect to be able to answer with that data. If you cannot, then social/Twitter data will probably not provide a good insights ROI for your organization, and I would strongly suggest that you focus on a traditional data set first.

Bottom line: Get your feet wet with text analytics using data that has already shown clear value!


2 Identify the Text Analytics Software User/s (The “Analytics” in Text Analytics Requires an Analyst!)

Good text analytics is two parts science and one part art, so it requires a human analyst. It’s incumbent upon you to figure out who that person (or persons) will be.

What you need to know about artificial intelligence and machine learning…

This may come as a shock to some folks, as many people have been led to believe that software leveraging artificial intelligence and machine learning effectively removes the need for a human analyst. This is utter nonsense.

IBM’s Watson comes to mind here. I’m not picking on Watson, but the notion that analyses that will produce meaningful insights can be completely conducted without a human analyst by a human-like computer that will automatically intuit everything and anything about any dataset is a complete fallacy and a PR gimmick.

I’ve blogged about AI & machine learning before here. Luckily for those of us in market research, a human analyst still needs to be involved for anything meaningful or useful to come from the analysis. (I say “luckily” because if this weren’t the case you and I would both be out of a job! But don’t worry; human analysts will not be replaced by machines any time soon.)

Back to identifying a human user…

Having hopefully dispelled any myths about not needing a human analyst, I want to emphasize that this does not and should not mean that you need to hire a data scientist. On the contrary, if the tool requires expertise in scripting, for example, chances are it’s not very intuitive and more of a programming tool better suited to academics.

Good text analytics tools for researchers should provide immediate applied value, and allow a common business practitioner to start analyzing any data set right away or with minimal training (I usually recommend about an hour or two). With a good tool, text analytics will be learned in the trenches using actual data for actual analysis that has real value for your company.

So who will this analyst be? Who will use the software? How much will they use it? Surprisingly a lot of companies get tripped up at this step, too, which overlaps with step #3.

Hint: Unicorns don’t exist!

They say a camel is a horse designed by a committee, but in my experience the enterprise designs a unicorn. The user should never be “everyone.”

Many companies, especially those in which procurement departments play a significant role in the decision, tend to oversimplify steps #1 and #2, and these buyers are more likely to fall for sweeping marketing promises by providers that claim to offer an all-in-one solution for everyone that can do anything.

Frequently in these cases, a long wish list of features/attributes is compiled by a committee, often by adding wishes from various potential users in different functions across the company or by cobbling together features from very different types of text analytics software. This list ends up looking pretty unrealistic and usually calls for a solution that is suitable for all kinds of data, even calling for some sort of imaginary “merging” ability of completely non-complementary data that do not have any common unique identifiers or even meta-level merge fields.

This theoretical software is also supposed to be equally useful for marketing, marketing research, customer service, sales, HR, PR, operations, and legal departments, and, of course, IT, too. Not only that, but it must be simple enough for everyone to understand the output without any training or prior analytical knowledge (i.e., static dashboards).

This is an insane expectation!

Applying the same logic, imagine if a hospital bought tools this way: doctors across all departments, from neurology to OB/GYN, would have to settle for, say, one scalpel. Oh, and by the way, it should also be useful for the maintenance department, because, after all, they need to cut things, too (like electrical wires or plumbing). This universal scalpel should also be useful for the administrative staff, because they have envelopes. A scalpel should be able to open an envelope, right?

Here’s the frank talk: if you put together a $150K-$500K RFP, someone will answer it and claim to have the perfect one-size-fits-all universal scalpel. Good luck with that. (I feel especially sorry for the patients.)

There is no one-size-fits-all product. A text analytics decision should be handled at the department level according to that department’s unique data, objectives and staffing needs.

Will YOU be using it? Then you are the user. Congratulations!


3 Identify a Text Analytics Software Solution

You’ve identified that you have data of value, and that you have at least one user to whom this new task will fall and for which they will be directly responsible. Now it’s time to find the right tool for this user with the best ROI.

Provided you’re not looking for the mythical unicorn I mentioned in step 2, this step should be an easy one.

ALWAYS request a demo with your own data. Text analytics software providers should be happy to sign a mutual NDA; in fact, most enterprise companies require it. This MNDA covers your data and any discussions regarding your business, as well as the IP of the software provider, so it’s a win-win.

Why is this so important? Anyone can put a mock demo together on a mock data set and make it look like it works. The ONLY way to evaluate a software provider is to do so with your own data—data that you are familiar with and that is relevant to you and your business objectives.

One more thing (touched on in step 2): You should approach vendors with an open mind each time. Do not use one vendor’s approach as the basis for assessing another vendor; judge them based on actual output. Does the software have all the features needed to discover/answer your business questions and meet your objectives AND is it easy to learn and use?

One more important tip…

Do NOT allow the vendor to have a lot of time beforehand with your data. If they do, you will have no idea how much time they put into setting the demo up. For a $250K contract, a company might well invest two full-time analysts across the span of a couple of weeks to make your demo impressive. Sadly, they may even use “mechanical Turk” (human) coding.

I would advise allowing a vendor no more than a day or two with your data, so make sure to schedule the demo within a day or two of giving them the data. In some instances we’ve even been asked to do the demo the same day, or just an hour or two after receiving said data. Which data is that? The data we chose in step 1, of course!


4 Expect Immediate and Ongoing Results

Congratulations! You’ve purchased your software, and hopefully you’ve received some basic training. Ideally you’ve begun using the software right away after the training.

You won’t be a text analytics master on day one, but if you have real data and real objectives and at least one person is responsible for using the tool (and that means that they will have at least a few hours per month for this purpose), then you are in very good shape.

By the way, if you are just getting started with text analytics and/or you have staffing issues, some text analytics vendors may be able to offer initial support, handle special ad hoc analysis requests, or suggest trusted third-party agencies that are trained in the use of their tool.

Hopefully you didn’t buy the dashboard-only solution (the one everyone uses on all data with no analytical firepower). Instead, you were informed enough to select the tool that does what you need it to do using your data whenever you need it. Now you’re able to answer business-critical questions in new ways and management will take notice!


5 Socializing Text Analytics Findings (Recognition and Growth)

This last step is often neglected. It’s only fair that you get noticed for your smart software decision, and more importantly for the incredibly useful insights that you generate using text analytics. Often formerly stale data will come alive, and unstructured data usually has better predictive power than structured data.

Be prepared to evangelize your findings, and don’t be afraid to ask your software provider for suggestions about how to do so. In some cases, an initial small use case in one department ends up spreading to other departments. HR comes to marketing research asking, “Hey, I heard about that analysis you did. We think this data is kind of similar. Would you take a look at it?”

And then there are more formal opportunities, of course, if you are willing to share a case study in an article or conference presentation. The latter are not any more important than the former; in fact, the former is how you will ultimately be judged more immediately.

I hope the above was helpful. Please reach out if you have questions about any of the steps above. I would, of course, be honored if you included us in your process when you get to step 3, and happy to discuss steps 1 and 2 with you before that as well! Contact us to talk about it. 😊

Good luck!




Marketing Research Blooper Reveals Lots of Surprises and Two Important Lessons

April Foolishness: What Happens When You Survey People in the Wrong Language?

I’m going to break with convention today and, in lieu of an April Fool’s gag, I’m going to tell you about an actual goof we recently made that yielded some unexpected but interesting results for researchers.

As you know, last week on the blog we highlighted findings from an international, multilingual Text Analytics Poll™ we conducted around culture. This particular poll spanned 10 countries and eight languages, and when we went to field it we accidentally sent the question to our U.S. sample in Portuguese!

Shockingly, in many cases, people STILL took the time to answer our question! How?

First, bear in mind that these Text Analytics Polls™ consist of only one question and it’s open-ended, not multiple choice. The methodology we use intercepts respondents online and requires them to type an answer to our question before they can proceed to content they’re interested in.

Under the circumstances, you might expect someone to simply type “n/a” or “don’t understand” or even some gibberish in order to move on quickly, and indeed we saw plenty of that. But in many cases, people took the time to thoughtfully point out the error, and even with wit.

Verbatim examples [sic]:

“Are you kidding me, an old american who can say ¡adios!”

“Tuesday they serve grilled cheese sandwiches.”

“What the heck is that language?”

“No habla espanol”

“i have no idea what that means”

“2 years of Spanish class and I still don't understand”

Others expressed themselves more…colorfully…

“No, I don't speak illegal immigrant.”

“Speak English! I'm switching to News 13 Orlando. They have better coverage than FT.”

Author’s note: I suspect that last quote was from someone who was intercepted while trying to access a Financial Times article. ;-)

While a lot of people clearly assumed our question was written in Spanish, still others took the time to figure out what the language was and even to translate the question!

“I had to use google translate to understand the question.”

“what the heck does this mean i don't speak Portuguese”

But what surprised me most was that a lot of Americans actually answered our question—i.e., they complied with what we had asked—even though it was written in Portuguese. And many of those replies were in Spanish!!!

We caught our mistake quickly enough when we went to machine-translate the responses and we were told that replies to a question in Portuguese were now being translated from English to English, but two important lessons were learned here:

Takeaway One: Had we made this mistake with a multiple-choice instrument, we either might not have caught it until after the analysis or perhaps not at all. Not only would respondents not have been able to tell us that we had made a mistake, but they would’ve had the easy option of just clicking a response at random. And unless those random clicks amounted to a conspicuous pattern in the data, we could’ve potentially taken the data as valid!

Takeaway Two: The notion that people will not take the time to thoughtfully respond to an open-ended question is total bunk. People not only took the time to answer our question in detail when it was correctly served to them in their own language, but they even spared a thought for us when they didn’t understand the language!
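To make the first takeaway concrete, here is a minimal Python sketch of a fielding check that measures what share of open-ended replies signal confusion rather than an answer. The pattern list and the names `looks_confused` and `confusion_share` are hypothetical, loosely modeled on the verbatims quoted above; this is not the method we used, just one simple way such a check could be automated.

```python
import re

# Hypothetical patterns, loosely based on the verbatims quoted above.
CONFUSION_PATTERNS = [
    r"\bn/?a\b",
    r"don'?t understand",
    r"no idea",
    r"what .*language",
    r"speak english",
    r"translate",
]


def looks_confused(response):
    """True if the reply matches any confusion pattern."""
    # Lowercase and normalize curly apostrophes before matching.
    text = response.lower().replace("\u2019", "'")
    return any(re.search(pattern, text) for pattern in CONFUSION_PATTERNS)


def confusion_share(responses):
    """Fraction of replies that look like confusion, 0.0 if there are none."""
    if not responses:
        return 0.0
    return sum(looks_confused(r) for r in responses) / len(responses)


sample = [
    "i have no idea what that means",
    "What the heck is that language?",
    "I had to use google translate to understand the question.",
    "Freedom and diversity",
]

print(confusion_share(sample))  # 0.75
```

A sudden jump in this share for one sample would be a cue to check how the question was served. A multiple-choice instrument offers no equivalent signal, because a random click looks exactly like a valid one.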

I want to emphasize here that if you’re one of those researchers (and I used to be among this group, by the way) who thinks you can’t include an open-ended question in a quantitative instrument, compel the respondent to answer it, and get a meaningful answer to your question, you are not only mistaken but you’re doing yourself and your client a huge disservice.

Take it from this April fool: open-ended questions not only tell you what you didn’t know; they tell you what you didn’t know you didn’t know.

Thanks for reading. I’d love to hear what you think!


P.S. Find out how much more value an open-ended question can add to your survey using OdinText. Contact us to talk about it.


You Asked for It. Here’s a Chance to Learn More about Our International Culture Poll…

It’s True: You Only Need One Open-Ended Question and Language Doesn’t Matter!

First of all, thank you all so much for the incredible response to this week’s multi-country, multilingual Text Analytics Poll!

I’ve received a flood of email and calls for additional information and I’m always happy to share, so if you have questions or want to geek out with me, please feel free to contact me on our website, LinkedIn or Twitter.

While so many of you thought the findings of our poll were remarkable, I was pleased that the implications for researchers weren’t lost on anyone, notably:

  • A single analyst, speaking English only, can today analyze data in eight different languages.
  • In an age of steeply declining response rates, one can gather deep insights on a multi-dimensional subject with just a single question!

This analysis of more than 15,500 text comments spanning 11 cultures, 10 countries and eight languages really showcased the power and practicality of modern text analytics.

So much so, in fact, that I am delighted to announce that I’ve been invited by the Insights Association to present on this topic at their inaugural analytics conference, NEXT: Advancing Insights Through Innovation & Research, May 9-10 in New York.

For what it’s worth, I really got a lot out of attending the Insights Association’s CEO conference earlier this year (I blogged about it here).

Anyone interested in conducting international, multilingual research on the scale of our poll this week easily, quickly and affordably will not want to miss my presentation. Please feel free to use my speaker code [NEXTTA15] to register at a 15% discount.

If you won’t be able to attend NEXT, or you can’t wait until May to learn more about what OdinText can do for YOU, please request additional info or a demo here.

Thanks again for your readership, support and interest in what we are doing!




Text Analytics Answers - Is All Culture Becoming American? Part 2

Defined in Their Own Words: 11 Cultures, 10 Countries & Eight Languages  

In Part 1 of this series, I provided a top line from our analysis of comments from more than 15,500 people spanning 11 cultures in 10 countries and eight languages in response to one question:

“How would you explain <insert country> culture to someone who isn’t at all familiar with it?”

After translating and analyzing these data with OdinText—an exercise that took fewer than two hours—we discovered that across cultures, by and large, one of the defining characteristics of almost every culture represented in our sample is that it is multicultural, suggesting that there may indeed be some validity to the argument that globalization is having a “melting pot” effect on cultures around the world.

Of course, multiculturalism/diversity was far from the only common attribute that people mentioned across cultures (it was simply the most prevalent one); it took quite a few commonalities mentioned across cultures to generate what we saw in the aggregate visualization we shared in Part 1, which showed by graphic proximity how alike or dissimilar the 11 cultures in our sample are and which, not coincidentally, put U.S. culture at the relative center of it all.

Not surprisingly, though, we also found that every culture retains unique characteristics in the eyes of its respective members. Today we’re going to look closely at what those similarities and differences are for each culture.

Cultural Characteristics in Their Own Words

Each of the charts below contains primary cultural descriptors—features/attributes/topics—identified by OdinText at the country/culture level compared to the mean aggregate for all countries/cultures studied in the sample.
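For readers who want to try this kind of comparison on their own data, here is a minimal Python sketch. It computes the share of comments in each culture that mention a descriptor, then subtracts the mean of those shares across cultures. The keyword lists, the function names and the toy comments are invented for illustration. Simple keyword matching and an unweighted mean across cultures are assumptions of this sketch, not a description of how OdinText identifies descriptors.

```python
def descriptor_rates(comments_by_culture, descriptors):
    """Share of comments in each culture that mention each descriptor."""
    rates = {}
    for culture, comments in comments_by_culture.items():
        total = len(comments) or 1  # avoid dividing by zero on empty groups
        rates[culture] = {
            name: sum(
                any(keyword in comment.lower() for keyword in keywords)
                for comment in comments
            ) / total
            for name, keywords in descriptors.items()
        }
    return rates


def difference_from_mean(rates):
    """Each culture's rate minus the unweighted mean rate across cultures."""
    cultures = list(rates)
    names = list(rates[cultures[0]]) if cultures else []
    means = {
        name: sum(rates[c][name] for c in cultures) / len(cultures)
        for name in names
    }
    return {
        c: {name: rates[c][name] - means[name] for name in names}
        for c in cultures
    }


# Invented toy comments, not data from the poll.
comments = {
    "A": ["We value freedom", "A melting pot of people", "Freedom and food"],
    "B": ["Humour matters", "Freedom I guess", "Fish and chips"],
}
descriptors = {"freedom": ["freedom"], "humor": ["humour", "humor"]}

diff = difference_from_mean(descriptor_rates(comments, descriptors))
print(round(diff["A"]["freedom"], 3))  # 0.167
print(round(diff["B"]["humor"], 3))    # 0.167
```

A positive value means the culture mentions that descriptor more often than the cross-culture average, and a negative value means less often.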


Baseball, hotdogs, apple pie and Chevrolet are surprisingly NOT top-of-mind for Americans. In fact, only FOUR people out of 1,500 mentioned baseball. Instead, we found that Americans overwhelmingly view their cultural identity in terms of freedom and multiculturalism/melting pot.

Visualization is a powerful and important tool for telling a story through data in research today, so just to offer a little variety I rendered the same data in a spider chart. What do you think? What does this visualization say about U.S. culture compared to the international aggregate?


The Brits are well known for their humor, and apparently they consider it a key part of their cultural identity. They are also a little unusual in that Brits closely associate their culture with a culinary staple—fish & chips—something we had expected to see more of across cultures, but did not.



It’s almost cliché, but Aussies are laid back and they know it.



Brazilians are keenly aware of their cultural diversity and they think trendiness and sexiness set them apart.


Recall that in Part 1 yesterday I noted that terms like “French,” “American,” and “Spanish” turned up in people’s descriptions and that for our purposes here they weren’t terribly useful? Well, this is one case where the use of the term “French” speaks volumes. The French are unusually self-aware and see their culture as being so distinctive and pronounced that little explanation is actually needed. It’s almost self-evident in their minds, so they assume that characterizing something as “French” (“French cuisine,” for example) is sufficient to describe their culture.


Mexicans explain their culture in terms of tradition and vibrancy—color, beauty, flavor.


Like their neighbors in France, Spaniards have a sense of their culture as being highly distinctive. Lifestyle featured prominently here—things like siesta, the beach and sunshine, etc. I personally found it interesting that the Spanish simultaneously see diversity/multiculturalism as a key facet of their culture.


Asked about their culture, Germans point to beer, but there isn’t much in the way of fun or frivolity beyond that. They also consider their culture to be versatile/flexible, orderly/rule-abiding and efficacious.

More importantly, comments from our German sample had a conspicuously lower incidence of actual cultural features than those of other cultures. This would seem to indicate that Germans are somewhat uncomfortable talking about German culture, which isn’t entirely surprising. Obviously, there’s a great deal of sensitivity and angst around discussion of German identity today as a legacy of Nazism. Remember also that until relatively recently Germany was two different countries. What German culture is, exactly, post-reunification may not be entirely clear to Germans themselves.


The Japanese were unique in many ways, not the least of which being that describing their culture proved exceedingly difficult—and a very different kind of difficult from what we see in the German analysis. A significant number of Japanese respondents characterize Japanese culture as something that almost defies description and must instead be experienced to be understood. The Japanese also see their culture as being rigid and extremely pronounced and comments suggest that the Japanese find great comfort in rules. Indeed, this is the only group in our sample where not a single person mentioned “freedom.”

CANADA (English)

I promised you 11 cultures. To illustrate, here’s a side-by-side comparison of French Canadians and English Canadians.

For residents of English-speaking Canada, multiculturalism is a huge facet of their culture, while tradition is apparently less important. Canadians also take their national pastime, hockey, as a cultural hallmark (unlike their neighbors in the States who, again, hardly mentioned baseball).

CANADA (French)


French-speaking Canadians (aka the Québécois) are, of course, quite dissimilar from their English-speaking countrymen in many ways. First and foremost, they’re fiercely French—so much so that “Frenchness” is more important to their cultural identity than it is to the actual French in France!

This concludes Part 2 of our international culture expedition.

In Part 3, I’ll share the fascinating results of OdinText’s emotional analysis of this comment data. What people say about their respective cultures, analyzed for significant patterns of emotion, tells an entirely new story! Join us for Part 3 tomorrow.

Tomorrow: Part III – How Emotions Speak Louder than Words


@TomHCAnderson - @OdinText

PS. Have questions about today’s post? Feel free to post a comment or request more info here.
