Posts in Tom H. C. Anderson
Analitica de Texto En Español

Analitica de Texto En Español – Spanish Text Analysis

Analitica de Texto En Español. I didn’t write that; it is a machine translation of "Text Analytics in Spanish".

Mathematics has often been called the Universal Language, but in an age of instant machine translation, any text, or text data, is as understandable as math.

That’s one of the reasons I was very happy to take part in a special series of interviews in celebration of the Spanish Association of Market Research’s 50th Anniversary.

Several of our clients are analyzing non-English text with OdinText, but in some ways a single monolingual analyst being able to instantly analyze the comments of millions of customers speaking multiple foreign languages is even more exciting. And this isn’t science fiction; many of our global clients have been doing this for some time now.

The current issue of AEDMO’s Magazine (Asociación Española de Estudios de Mercado, Marketing y Opinión) celebrates technology in the world of research, and several prominent researchers have been invited to write on their core areas of expertise. I was honored to give an interview on text analytics.

If you don’t get their magazine you can read our Q&A on their blog here in Spanish or English.

Their Editor Xavier Moraño asked some very interesting and pertinent questions.

I’d love to hear your thoughts and questions.

Tom H. C. Anderson Chief Research Officer @OdinText

The State of Marketing Research Innovation

What You Missed at IIEX 2018 – 3 Takeaways

Walking the floor at the Insights Innovation Exchange (IIEX) for a day and a half with our new CEO, Andy Greenawalt, we spoke to several friends and client- and supplier-side partners, and ducked into quite a few exciting startup sessions.

Three things struck me this year:

-Insights Technology is Finally Getting More Innovative. By that I mean there are no longer just the slight immaterial modifications to existing ways of doing things, but actual innovation that has disruptive implications (passive monitoring, blockchain, image recognition, more intelligent automation…).

As expected most of this innovation is coming from startups, many of which, while they have interesting ideas, have little to no experience in marketing research - and have yet to prove their use cases.

-A Few Marketing Research Suppliers are picking up their consulting game. Perhaps surprisingly, in this area it seems that change is coming from the Qualitative side. For a while qualitative looked like a race to the bottom in terms of price, even more so than what was happening in Quantitative Research. But there are now a handful of Image/Brand/Ideation ‘Agencies’ whose primary methodologies are qualitative and who are leading the way to a higher value proposition. There are a few, but I will mention the two I’ve been most impressed with: Brandtrust and Shapiro+Raj. Bravo!

-The Opportunity. I think the larger opportunity, if there is one, lies in the ability of the traditional players to partner with and help prove the use cases of some of these newer startup technologies, incorporating them into consulting processes with higher end value propositions, similar to what the qualitative agencies I noted above have done.

This seems to be both an opportunity and a real challenge. Can Old help New, and New help Old? It may be more likely that the end clients, especially those that are more open to DIY processes, will be the ones that select and prove the use cases of these new technologies offered by the next generation of startups, and therefore benefit the most.

While this too is good, I fear that by leaving some of the traditional companies behind we will lose some institutional thinking and sound methodology along the way.

Either way, I’m more optimistic on new Marketing Research Tech than I’ve ever been.

Keep in mind though, Innovation in Marketing Research should be about more than just speed and lower cost (automation). It should be even more about doing things better, giving the companies and clients we work for an information advantage!

@TomHCAnderson

OdinText Names Andy Greenawalt CEO
Andy Greenawalt to lead OdinText accelerated growth phase

We are happy to announce serial Inc. 500 entrepreneur Andy Greenawalt as CEO effective June 1. OdinText founder and current CEO Tom H.C. Anderson will transition to the roles of Chief Research Officer and Chairman.

Andy Greenawalt

An accomplished tech entrepreneur and leader, Greenawalt has successfully built two Inc. 500 SaaS (software as a service) businesses. Most recently, he was CEO of Continuity, a pioneer in the Regulatory Technology industry, and he remains chairman of its board. Prior to Continuity, Greenawalt founded Perimeter eSecurity, now part of BAE Systems, serving as CEO and CTO and on its board. He is a graduate of the University of Massachusetts, Amherst with a degree in Philosophy and Cognitive Linguistics.

“With more Fortune 500 companies choosing OdinText, Andy Greenawalt’s credentials in innovation, his successful record of building SaaS businesses, and his singular focus on creating customer value make him a perfect fit to lead OdinText through its next phase of growth,” said Anderson.

“OdinText is a truly rare startup with Fortune 500 enterprise customers —  the most sophisticated buyers in the world,” said Greenawalt. “This is a testament to the vision and team that Tom Anderson has assembled and it’s a great position to be starting from as a pioneer in the text analytics market. The company is very well positioned to bring a new platform to bear and serve as a cornerstone to the smart enterprise of the future.”

Alison Malloy, the lead investor in OdinText from Connecticut Innovations, stated, “Connecticut Innovations has worked with Andy Greenawalt for 20 years. We have absolute confidence that he’s the right person to realize the market potential of OdinText — which has pioneered the next generation of text analytics — allowing Tom Anderson to focus on the research needed to continue to develop and lead the market with industry-leading products.”

“OdinText has developed patented IP, raised pre-seed funding and created an MVP product,” Greenawalt said. “OdinText is a transformative solution that is now poised to redefine how businesses improve satisfaction, retention and revenue. We expect to grow dramatically.”

GRIT Survey 2018

Celebrating Innovative Companies in Marketing Research

It's that time of year again when Greenbook fields their biannual GRIT market research industry survey.

Thankfully it looks like the Greenbook team has made the survey a bit shorter than last year. I do encourage fellow researchers to take the survey, as it does give everyone some direction in terms of where things seem to be heading.

You can take the survey here.

Thanks in advance for your participation.

@TomHCAnderson

PS. This is the GRIT survey which looks for the most innovative insights companies, both supplier side and client side. We encourage you to give some thought to this section as well. It’s nice to recognize up and coming companies, as well as your go-to favorites.

I also want to take this time to thank everyone who voted for OdinText in the most innovative supplier category last year. We were very encouraged by the support and have been working harder than ever to release a brand new version of the software next month!

A New Trend in Qualitative Research

Almost Half of Market Researchers are doing Market Research Wrong! - My Interview with the QRCA (And a Quiet New Trend - Science Based Qualitative).

Two years ago I shared some research on research about how market researchers view Quantitative and Qualitative research. I stated that almost half of researchers don’t understand what good data is. Some ‘Quallies’ (about 25% of market researchers surveyed) tend to rely on and work almost exclusively with comment data from extremely small samples. Conversely, there is a large group of ‘Quant Jockeys’ who, while working with larger, more representative samples, purposefully avoid any unstructured data such as open-ended comments because they don’t want to deal with coding and analyzing it, or don’t believe in its accuracy and ability to add to the research objectives. In my opinion both groups have it totally wrong, and are doing a tremendous disservice to their companies and clients. Today, I’ll be focusing on just the first group above, those who tend to rely primarily on qualitative research for decisions.

Note that today’s blog post is related to a recent interview, which I was asked to take part in by the QRCA’s (Qualitative Research Consultants Association) Views Magazine. When they contacted me I told them that in most cases (with some exceptions), Text Analytics really isn’t a good fit for Qualitative Researchers, and asked if they were sure they wanted to include someone with that opinion in their magazine. I was told that yes, they were ok with sharing different viewpoints.

I’ll share a link to the full interview in the online version of the magazine at the bottom of this post. But before that, a few thoughts to explain my issues with qualitative data and how it’s often applied as well as some of my recent experiences with qualitative researchers licensing our text analytics software, OdinText.

 The Problem with Qualitative Research

IF Qual research was really used in the way it’s often positioned, ‘as a way to inform quant research’, that would be ok. The fact of the matter is though, Qual often isn’t being used that way, but instead as an end in and of itself. Let me explain.

First, there is one exception to this rule of only using Qual as pilot feedback for Quant. If you had a product, for instance, which was specifically made only for US State Governors, then your total population is only N=50. And of course it is highly unlikely that you would ever get all the Governors of each and every US State to participate in any research (which would be a census of all governors), and so if you were fortunate enough to have a group of, say, 5 Governors who were willing to give you feedback on your product or service, you would and should obviously hang on to and overanalyze every single comment they gave you.

IF however you have even a slightly more mainstream product (I’ll take a very common product like hamburgers as an example), and you are relying on 5-10 focus groups of n=12 to determine how different parts of the USA (North East, Mid-West, South and West) like their burgers, and rather than feeding those findings directly into some quantitative research instrument with a greater sample, you issue a ‘Report’ that you share with management, well then you’ve probably just wasted a lot of time and money for some extremely inaccurate and dangerous findings. Yet surprisingly, this happens far more often than one would imagine.

Cognitive Dissonance Among Qual Researchers when Using OdinText

How do I know this, you may ask? Good Text Analytics software is really about data mining and pattern recognition. When I first launched OdinText we had a lot of inquiries from Qualitative researchers who wanted some way to make their lives easier. After all, they had “a lot” of unstructured/text comment data which was time consuming for them to process, read, organize and analyze. Certainly, software made to “Analyze Text” must therefore be the answer to their problems.

The problem was that the majority of Qual researchers work with tiny projects and samples: interviews and groups of between n=1 and n=12. Even if they do a couple of groups like in the hamburger example I gave above, we’re still talking about a total of just around n=100 representing four or more regional groups of interest, and therefore n=25 or fewer per group. It is impossible to get meaningful, statistically comparable findings and identify real patterns between the key groups of interest in this case.
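To make the sample-size point concrete, here is a minimal Python sketch of a standard pooled two-proportion z-test. The numbers are made up for illustration (a hypothetical 60% vs. 40% preference between two regions), and nothing here is taken from OdinText itself; it only shows how the same observed gap behaves at n=25 per group versus n=250 per group.

```python
import math

def two_proportion_p_value(hits_a, n_a, hits_b, n_b):
    """Two-sided p-value for a pooled two-proportion z-test."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical data: 60% of one region vs. 40% of another like a burger.
small = two_proportion_p_value(15, 25, 10, 25)      # n=25 per region
large = two_proportion_p_value(150, 250, 100, 250)  # n=250 per region

print(f"n=25 per region:  p = {small:.3f}")
print(f"n=250 per region: p = {large:.6f}")
```

Under these assumptions the 20-point gap at n=25 per region gives a p-value of roughly 0.16, well short of the conventional 0.05 threshold, while the identical gap at n=250 per region is clearly significant.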

The Little Noticed Trend In Qual (Qual Data is Getting Bigger)

However, slowly across the past couple of years or so, for the first time I’ve seen a movement of some ‘Qualitative’ shops and researchers, toward Quant. They have started working with larger data sets than before. In some cases, it has been because they have been pulled in to manage larger ongoing community/boards, in some cases larger social media projects, and in others, they have started using survey data mixed with qual, or even better, employing qualitative techniques in quant research (think better open-ends in survey research).

For this reason, we now have a small but growing group of ‘former’ Qual researchers using OdinText. These researchers aren’t our typical mixed data or quantitative researchers, but qualitative researchers that are working with larger samples.

And guess what, “Qualitative” has nothing to do with whether data is in text or numeric format; instead it has everything to do with sample size. And so, perhaps unknowingly, these ‘Qualitative Researchers’ have taken the step across the line into Quantitative territory, where, often for the first time in their career, statistics can actually be used. And it can be shocking!

My Experience with ‘Qualitative’ Researchers going Quant/using Text Analytics

Let me explain what I mean. Recently several researchers that come from a clear ‘Qual’ background have become users of our software OdinText. The reason is that the amount of data they had was quickly getting “bigger than they were able to handle”. They believe they are still dealing with “Qualitative” data because most of it is text based, but actually because of the volume, they are now Quant researchers whether they know it or not (text or numeric data is irrelevant).

Ironically, for this reason, we also see much smaller data sizes/projects than ever before being uploaded to the OdinText servers. No, not typically single focus groups with n=12 respondents, but still projects that are often right on the line between quant and qual (n=100+).

The discussions we’re having with these researchers as they begin to understand the quantitative implications of what they have been doing for years are interesting.

Let me preface this with the fact that I have a great amount of respect for the ‘Qualitative’ researchers that begin using OdinText. Ironically, the simple fact that we have mutually determined that an OdinText license is appropriate for them means that they are no longer ‘Qualitative’ researchers (as I explained earlier). They are in fact crossing the line into Quant territory, often for the first time in their careers.

The data may be primarily text based, though usually mixed, but there’s no doubt in their mind nor ours that one of the most valuable aspects of the data is the customer commentary in the text, and this can be a strength.

The challenge lies in getting them to quickly accept and come to terms with quantitative/statistical analysis, and thereby also the importance of sample size.

What do you mean my sample is too small?

When you have licensed OdinText you can upload pretty much any data set you have. So even though they may have initially licensed OdinText to analyze some projects with say 3,000+ comments, there’s nothing to stop them from uploading that survey or set of focus groups with just n=150 or so.

Here’s where it sometimes gets interesting. A sample size of n=150 is right on the borderline. It depends on what you are trying to do with it of course. If half of your respondents are doctors (n=75) and half are nurses (n=75), then you may indeed be able to see some meaningful differences between these two groups in your data.

But what if these n=150 respondents are hamburger customers, and your objective was to understand the difference between the 4 US regions in the example I referenced earlier? Then you have about n=37 in each subgroup of interest, and you are likely to have very few, IF ANY, meaningful patterns or differences.
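As a rough illustration of why n=150 is borderline, the sketch below computes the usual 95% margin of error for a proportion at each of the subgroup sizes mentioned above. It assumes simple random sampling and the worst case of p=0.5; the function name and labels are illustrative only.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion (simple random sample)."""
    return z * math.sqrt(p * (1 - p) / n)

subgroups = [
    ("all respondents", 150),
    ("doctors or nurses", 75),
    ("one of four regions", 37),
]
for label, n in subgroups:
    print(f"{label:>20} (n={n}): +/- {margin_of_error(n) * 100:.1f} points")
```

With these assumptions the margin is about 8 points for the full sample, about 11 points at n=75, and about 16 points at n=37, so only very large regional differences would stand out from sampling noise.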

Here’s where that cognitive dissonance can happen --- and the breakthroughs if we are lucky.

A former ‘Qual Researcher’ who has spent the last 15 years of their career making ‘management level recommendations’ on how to market burgers differently in different regions based on data like this may be in shock when, for the first time, they are looking at software which says that there are maybe just two or three small differences in their data or, even worse, NO MEANINGFUL PATTERNS OR DIFFERENCES WHATSOEVER!

How can this be? They’ve analyzed data like this many times before, and they were always able to write a good report with lots of rich detailed examples of how North Eastern Hamburger consumers preferred this or that because of this and that. And here we are, looking at the same kind of data, and we realize, there is very little here other than completely subjective thoughts and quotes.

Opportunity for Change

This is where, to their credit, most of our users start to understand the quantitative nature of data analysis. They, unlike the few ‘Quant Only Jockeys’ I referenced at the beginning of the article, already understand that many of the best insights come from text data in free form, unaided, non-leading, yet creative questions.

They only need to start thinking about their sample sizes before fielding a project. To understand the quantitative nature of sampling. To think about the handful of structured data points that they perhaps hadn’t thought much about in previous projects and how they can be leveraged together with the unstructured data. They realize they need to start thinking about this first, before the data has all been collected and the project is nearly over and ready for the most important step, the analysis, where rubber hits the road and garbage in really should mean garbage out.

If we’re lucky, they quickly understand it’s not about Quant and Qual any more. It’s about Mixed Data, it’s about having the right data, it’s about having enough data to generate robust findings and then superior insights!

Final Thoughts on the Two Nearly Meaningless Terms of ‘Quant and Qual’

As I’ve said many times before here and on the NGMR blog, the terms “Qualitative” and “Quantitative”, at least the way they are commonly used in marketing research, are already passé.

The future is Mixed Data. I’ve known this to be true for years, and almost all our patent claims involve this important concept. Our research shows time and time again, that when we use both structured and unstructured data in our analysis, models and predictions, the results are far more accurate.

For this reason we’ve been hard at work developing the first ever truly Mixed Data Analytics Platform. We’ll be officially launching it three months from now, but many of our current customers already have access. [For those who are interested in learning more or would like early access you can inquire here: OdinText.com/Predict-What-Matters].

In the meantime, if you’re wondering whether you have enough data to warrant advanced mixed data and text analysis, check out the online version of the article in QRCA Views magazine here. Robin Wedewer at QRCA really did an excellent job in asking some really pointed questions that forced me to answer more honestly and clearly than I might otherwise have.

I realize not everyone will agree with today’s post nor my interview with QRCA, and I welcome your comments here. I just please ask that you read both the post above, as well as the interview in QRCA before commenting solely based on the title of this post.

Thank you for reading. As always, I welcome questions publicly in the comments below or privately via LinkedIn or our Inquiry form.

@TomHCAnderson

Best 10 Text Analytics Tips Posts of The Year

Our Top 10 Most Read Data and Text Mining Posts of 2017

Thank you for reading our blog this year. The OdinText blog has quickly become even more popular than the Next Gen Market Research blog, and I really appreciate the thoughtful feedback we’ve gotten here on the blog, via Twitter, and email.

In case you’re curious, here are the most popular posts of the year:

#10 NFL Players Taking a Knee is More Complex and Polarizing Than We Think (If a Topic is Worth Quantifying – It’s Also Worth Understanding The Why’s Behind It)

#9 Text Analytics Picks The 10 Strongest Super Bowl Ads (New Text Analytics Poll Shows Which Super Bowl Ads Really Performed Best)

#8 Why Your HR Survey is a Lie and How to Get The Truth (OdinText Discovers Job Satisfaction Drivers in Anonymous Employee Data)

#7 Of Tears & Text Analytics (An OdinText User Story – Text Analytics Guest Post (AI Meets VOC))

#6 65 CEO’s Share Thoughts on Insights (Insights Association’s Inaugural CEO Summit – A Future Tied to Collaboration and Technology)

#5 Why Machine Learning is Meaningless (Beware of Buzzwords! The Truth about ‘Machine Learning’ and ‘Artificial Intelligence’)

#4 Do You Speak Teen? OdinText Announces 2nd Annual List of Top 10 Slang Terms (How Text Analytics Can Help Marketers Move at the Speed of Slang)

#3 Text Analysis Reveals Potential French Election Upset (Text Analytics Poll Showed How Close Le Pen Came to ‘Trumping’ Macron)

#2 Text Analytics Poll: Why We Unfriend on Facebook (You Can’t Handle The Truth (And Other Top Reasons Why We Unfriend on Facebook))

#1 What Americans Really Think About Trump’s Immigration Ban and Why (Text Analysis of What People Say in Their Own Words Reveals More Than Multi-Choice Surveys)

 

I thought I’d also check what our top 5 posts were from last year as well, here they are in case you missed them:

Top Posts From 2016

#1 Text Analysis Answers Is The Quran Really More Violent Than The Bible (3 Parts)

#2 Attensity Sold – What Does it Mean?

#3 Customer Satisfaction Surveys: What do Satisfied VS Dissatisfied Customers Talk About?

#4 What’s Really Wrong With Polling?

#5 What Your Customer Satisfaction Research Isn’t Telling You

Thanks again for reading and commenting. As always I welcome your thoughts and questions via LinkedIn, or feel free to request info on anything you’ve read above here.

Happy New Year!

@TomHCAnderson

2018 Predictions for Market Research and Analytics

What Kind of Researcher are You?

It’s that time of year again when RFL Communications and Greenbook request predictions from market researchers on what trends they expect to see in the new year. Of course no one knows for sure, but some are interesting and fun to read, and I always like searching for the overall patterns, if any.

That said, here’s the one I submitted this year. I’m curious to hear yours as well.

 

2018 The Best of Times & The Worst of Times

 The gap between what I’ll call ‘Just Traditional Research’ and more flexible, fluid ‘Advanced Analytics Generalists’ will continue to grow.

 There are three groups of marketing researchers along this dimension. Some ‘Just Traditional’ researchers and companies will not be able to adapt and will want to continue doing just the focus groups or panel surveys they have been doing and will become increasingly out of touch.

 A second group will feign expertise in these not so new areas of data and text mining (Advanced Analytics), they will prefer to call it “AI and Machine Learning” of course, but without any meaningful change to their products, services or analysis. It will be a sales and marketing treatment only.

 Both these groups are rather process oriented. The former doesn’t want to change its process; the latter just wants a shiny new process. In either case, the end goal suffers. For both of these groups the future is dim indeed.

 A third group of researchers, the group OdinText is invested in, doesn’t try to improve and change because they think they must in order to survive; they were already doing it because they are genuinely curious and ambitious. They don’t just want to run that survey a little faster and a little cheaper; they want much more than that. They want to add significant value for their company via their analysis.

 They will invest in learning new tools and techniques, and yet will not expect these tools to magically do the work for them after they push a button. These are not lazy employees/managers, they are A type employees, and they are the future of what ‘Marketing Research/Analytics’ is to become.

 They realize their own ingenuity and sweat need to be coupled with the new technology to achieve a competitive advantage and surpass management expectations and their competition. They are excited by those prospects, not scared.

 I too am very excited about meeting and working with more of these true ‘Advanced Analytics Generalists’, and with the Marketing Research Supplier firms who serve them and who realize that Co-Opetition with other firms that have key strengths they lack makes more sense than buzzwords and feigning expertise in all categories.

 For these ‘New Data Scientists’, no, these ‘Next Gen Market Researchers’, 2018 will be the best of times!

It’s a BIT lengthy and general for a prediction. But I believe it’s a real trend that will continue to accelerate. Do you agree or disagree?  What are your predictions?

If you subscribe to RFL Communications Business Report you’ll be receiving the annual writeup on this topic there. You can also check out the Greenbook version from 36 CEOs online here.

While you can tell all those participating take this with various degrees of seriousness, and answer from different points of view, I believe that reading all of them, and deciding what patterns, if any, are detectable across them, is well worth the 30 minutes or so it takes to do this.

Again, very much appreciate YOUR thoughts and predictions as well, so please feel free to comment below.

@TomHCAnderson

Artificial Intelligence in Consumer Insights

A Q&A session with ESOMAR’s Research World on Artificial Intelligence, Machine Learning, and implications in Marketing Research

[As part of an ESOMAR Research World article on Artificial Intelligence, OdinText Founder Tom H. C. Anderson recently took part in a Q&A style interview with ESOMAR’s Annelies Verheghe. For more thoughts on AI check out other recent posts on the topic including Why Machine Learning is Meaningless, and Of Tears and Text Analytics. We look forward to your thoughts or questions via email or in the comments section.]

 

ESOMAR: What is your experience with Artificial Intelligence & Machine Learning (AI)? Would you describe yourself as a user of AI or a person with an interest in the matter but with no or limited experience?

TomHCA: I would describe myself as both a user of Artificial Intelligence as well as a person with a strong interest in the matter, even though I have limited mathematical/algorithmic experience with AI. However, I have colleagues here at OdinText who have PhDs in Computer Science and are extremely knowledgeable, as they studied AI extensively in school and used it elsewhere before joining us. We continue to evaluate, experiment, and add AI into our application as it makes sense.

ESOMAR: For many people in the research industry, AI is still unknown. How would you define AI? What types of AI do you know?

TomHCA: Defining AI is a very difficult thing to do because people, whether they are researchers, data scientists, salespeople, or customers, will each have a different definition. A generic definition of AI is a set of processes (whether they are hardware, software, mathematical formulas, algorithms, or something else) that give anthropomorphically cognitive abilities to machines. This is evidently a wide-ranging definition. A more specific definition of AI pertaining to Market Research is a set of knowledge representation, learning, and natural language processing tools that simplifies, speeds up, and improves the extraction of meaningful data.

The most important type of AI for Market Research is Natural Language Processing. While extracting meaningful information from numerical and categorical data (e.g., whether there is a correlation between gender and brand fidelity) is essentially an easy and now-solved problem, doing the same with text data is much more difficult and still an open research question studied by PhDs in the field of AI and machine learning. At OdinText, we have used AI to solve various problems such as Language Detection, Sentence Detection, Tokenizing, Part of Speech Tagging, Stemming/Lemmatization, Dimensionality Reduction, Feature Selection, and Sentence/Paragraph Categorization. The specific AI and machine learning algorithms that we have used, tested, and investigated span a wide spectrum from Multinomial Logit to Principal Component Analysis, Principal Component Regression, Random Forests, Minimum Redundancy Maximum Relevance, Joint Mutual Information, Support Vector Machines, Neural Networks, and Maximum Entropy Modeling.
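For readers unfamiliar with a few of the preprocessing steps named above, here is a deliberately naive Python sketch of sentence detection, tokenizing, and stemming. It is a toy built on simple rules and the standard library, not a description of how OdinText or any production NLP system implements these steps; the function names and the suffix list are assumptions made for illustration.

```python
import re

def split_sentences(text):
    """Naive sentence detection: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Naive tokenizer: lowercase runs of letters, digits and apostrophes."""
    return re.findall(r"[a-z0-9']+", sentence.lower())

def naive_stem(token):
    """Toy stemmer: strip one common suffix if at least 3 characters remain."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

comment = "The burgers were great. I loved the fries!"
for sentence in split_sentences(comment):
    tokens = tokenize(sentence)
    print(tokens, "->", [naive_stem(t) for t in tokens])
```

Even this toy shows why the steps are harder than they look: the suffix rule turns "loved" into "lov" and "fries" into "fri", which is the kind of rough edge real stemmers and lemmatizers work to smooth out.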

AI isn’t necessarily something everyone needs to know a whole lot about. I blogged recently, how I felt it was almost comical how many were mentioning AI and machine learning at MR conferences I was speaking at without seemingly any idea what it means. http://odintext.com/blog/machine-learning-and-artificial-intelligence-in-marketing-research/

In my opinion, a little AI has already found its way into a few of the applications out there, and more will certainly come. But, if it will be successful, it won’t be called AI for too long. If it’s any good it will just be a seamless integration helping to make certain processes faster and easier for the user.

ESOMAR: What concepts should people that are interested in the matter look into?

TomHCA: Unless you are an Engineer/Developer with a PhD in Computer Science, or someone working closely with someone like that on a specific application, I’m not all that sure how much sense it makes for you to be ‘learning about AI’. Ultimately, in our applications, they are algorithms/code running on our servers to quickly find patterns and reduce data.

Furthermore, as we test various algorithms from academia, and develop our own to test, we certainly don’t plan to share any specifics about this with anyone else. Once we deem something useful, it will be incorporated as seamlessly as possible into our software so it will benefit our users. We’ll be explaining to them what these features do in layman’s terms as clearly as possible.

I don’t really see a need for your typical marketing researcher to know too much more than this in most cases. Some of the algorithms themselves are rather complex to explain and require strong mathematical and computer science backgrounds at the graduate level.

ESOMAR: Which AI applications do you consider relevant for the market research industry? For which task can AI add value?

TomHCA: We are looking at AI in areas of Natural Language Processing (which includes many problem subsets such as Part of Speech Tagging, Sentence Detection, Document Categorization, Tokenization, and Stemming/Lemmatization), Feature Selection, Data Reduction (i.e., Dimensionality Reduction) and Prediction. But we've gone well beyond that. As a simple example, take key driver analysis. If we have a large number of potential predictors, which are the most important in driving a KPI like customer satisfaction?
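As a very simplified picture of what a key driver question looks like, the sketch below ranks a handful of hypothetical predictors by the absolute value of their Pearson correlation with a satisfaction score. The data and feature names are invented, and plain correlation is only a crude stand-in chosen for brevity; it is not the method OdinText uses, which the interview does not describe.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: one row per respondent.
satisfaction = [1, 2, 3, 4, 5]
predictors = {
    "mentions_friendly_staff": [0, 0, 0, 1, 1],
    "mentions_wait_time":      [1, 0, 1, 0, 0],
    "mentions_price":          [1, 0, 1, 0, 1],
}

# Rank predictors by strength of association, regardless of direction.
ranked = sorted(
    ((name, pearson(values, satisfaction)) for name, values in predictors.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, r in ranked:
    print(f"{name:<26} r = {r:+.3f}")
```

In this toy data, mentions of friendly staff come out as the strongest (positive) driver, wait time as a weaker negative one, and price as unrelated. With hundreds of candidate predictors and correlated features, simple ranking like this breaks down, which is where feature selection methods come in.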

ESOMAR: Can you share any inspirational examples from this industry or related industries (advertisement, customer service) that can illustrate these opportunities?

TomHCA: As one quick example, a user of OdinText I recently spoke to used the software to investigate which text comments were most likely to drive membership in any of several predefined important segments. The nice thing about AI is that it can be very fast. The not so nice thing is that sometimes at first glance some of the items identified, the output, can either be too obvious or, on the other extreme, not make any sense whatsoever. The gold is in the items somewhere in the middle. The trick is to find a way for the human to interact with the output which gives them confidence and understanding of the results.

A human is not capable of correctly analyzing thousands, hundreds of thousands, or even millions of comments/datapoints, whereas AI will do it correctly in a few seconds. The downside of AI is that some outcomes are correct but not humanly insightful or actionable. It’s easier for me to give examples of when it didn’t work so well, since it’s hard for me to share info on how our clients are using it. For instance, AI recently found that people mentioning ‘good’ three times in their comments was the best driver of NPS score. This is evidently correct but not useful to a human.

In another project, a new AI approach we were testing reported that one of the most frequently discussed topics was “Colons”. But this wasn’t medical data! It turns out the plural of Colon is Cola; I didn’t know that. Anyway, people were discussing Coca-Cola, and AI read that as Colons… This is exactly the part of AI that needs work before it can become more prevalent in market research.
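The “Colons” mix-up is easy to reproduce. Below is a minimal sketch, assuming a naive lemmatizer that looks words up in a table of irregular plurals. The table, the sample comments and the `protected` word list are all hypothetical; the interview does not describe the approach that was actually being tested.

```python
import re
from collections import Counter

# Hypothetical irregular-plural table. "cola" really is a plural of "colon"
# (in the rhetorical sense), which is what makes the mistake possible.
IRREGULAR_PLURALS = {"cola": "colon", "criteria": "criterion"}

def tokenize(text):
    """Lowercase the text and split it into runs of letters."""
    return re.findall(r"[a-z]+", text.lower())

def lemma(token, protected=frozenset()):
    """Map a token to its singular form unless it is on the protected list."""
    if token in protected:
        return token
    return IRREGULAR_PLURALS.get(token, token)

def topic_counts(comments, protected=frozenset()):
    """Count lemmas across all comments."""
    counts = Counter()
    for comment in comments:
        counts.update(lemma(token, protected) for token in tokenize(comment))
    return counts

# Made-up comments for illustration.
comments = ["I love Coca-Cola", "Cola with ice, please", "Diet cola is fine"]

naive = topic_counts(comments)
guarded = topic_counts(comments, protected={"cola"})

print(naive["colon"], naive["cola"])      # every "cola" was turned into "colon"
print(guarded["colon"], guarded["cola"])  # the brand word is left alone
```

The fix here is a human telling the machine which words to leave alone, which is the kind of human and machine interaction the examples above point to.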

Since I can’t talk too much about how our clients use our software on their data, in a way it’s easier for me to give a non-MR example. Imagine getting into a totally autonomous car (notice I didn’t have to use the word AI to describe that). Anyway, you know it’s going to be traveling 65 mph down the highway, changing lanes, accelerating and stopping along with other vehicles, etc.

How comfortable would you be in stepping into that car today if we had painted all the windows black so you couldn’t see what was going on? Chances are you wouldn’t want to do it. You would worry too much at every turn that you might be a casualty of oncoming traffic or a tree. I think partly that’s what AI is like right now in analytics. Even if we’re able to perfect the output to be 99% or 100% correct, not knowing how we got there will make you feel a bit uncomfortable. Yet showing you exactly what was done by the algorithm to arrive at the solution is very difficult.

Anyway, the upside is that in a few years perhaps (not without some significant trial and error and testing), we’ll all just be comfortable enough to trust these things to AI. In my car example, you’d be perfectly fine getting into an autonomous car and never looking at the road, but instead doing something else like working on your PC or watching a movie.

The same could be true of a marketing research question. Ultimately the end goal would be to ask the computer a business question in natural language, written or spoken, and have the computer decide what information is already available and what needs to be gathered, then gather it, analyze it, and present the best actionable recommendation possible.

ESOMAR: There are many stories on how smart or stupid AI is. What would be your take on how smart AI is nowadays? What kind of research tasks can it perform well? Which tasks are hard for bots to take over?

TomHCA: You know I guess I think speed rather than smart. In many cases I can apply a series of other statistical techniques to arrive at a similar conclusion. But it will take A LOT more time. With AI, you can arrive at the same place within milliseconds, even with very big and complex data.

And again, the fact that we choose the technique based on which one takes a few milliseconds less to run, without losing significant accuracy or information really blows my mind.

I tell my colleagues working on this that hey, this can be cool, I bet a user would be willing to wait several minutes to get a result like this. But of course, we need to think about larger and more complex data, and possibly adding other processes to the mix. And of course, in the future, what someone is perfectly happy waiting for several minutes today (because it would have taken hours or days before), is going to be virtually instant tomorrow.

ESOMAR: According to an Oxford study, there is a 61% chance that the market research analyst job will be replaced by robots in the next 20 years. Do you agree or disagree? Why?

TomHCA: Hmm. 20 years is a long time. I’d probably have to agree in some ways. A lot of things are very easy to automate, others not so much.

We’re certainly going to have researchers, but there may be fewer of them, and they will be doing slightly different things.

Going back to my example of autonomous cars for a minute. I think it will take time for us to learn, improve and trust more in automation. At first, autonomous cars will let a human take over at any time. It will be like cruise control is now: an accessory at first. Then we will move toward trusting the individual human actors less and less, and once we’ve got enough statistics on computers being safe, we may even decide to take away the ability for humans to intervene in driving the car as a safety measure. They would have to reach a level of safety way beyond humans for this to happen though, probably 99.99% or more.

Unlike cars though, marketing research usually can’t kill you. So, we may well be comfortable with a far lower accuracy rate with AI here.  Anyway, it’s a nice problem to have I think.

ESOMAR: How do you think research participants will react towards bot researchers?

TomHCA: Theoretically they could work well. Realistically I’m a bit pessimistic. Bots can be used for spam, phishing and fraud in a global online wild west (it cracks me up how certain countries think they can control the web and make it safer), and that’s a problem no government or trade organization will be able to prevent.

I’m not too happy when I get a phone call or email about a survey now. But with the slower more human aspect, it seems it’s a little less dangerous, you have more time to feel comfortable with it. I guess I’m playing devil’s advocate here, but I think we already have so many ways to get various interesting data, I think I have time to wait RE bots. If they truly are going to be very useful and accepted, it will be proven in other industries way before marketing research.

But yes, theoretically it could work well. But then again, almost anything can look good in theory.

ESOMAR: How do you think clients will feel about the AI revolution in our industry?

TomHCA: So, we were recently asked to use OdinText to visualize what the 3,000 marketing research suppliers and clients thought about why certain companies were innovative or not in the 2017 GRIT Report. One of the analyses/visualizations we ran, which I thought was most interesting, showed the differences between why clients claimed a supplier was innovative VS why suppliers said these firms were innovative.

I published the chart on the NGMR blog for those who are interested [ http://nextgenmr.com/grit-2017 ], and the differences couldn’t have been starker. Suppliers kept on using buzzwords like “technology”, “mobile” etc. whereas clients used real end result terms like “know how”, "speed" etc.

So I’d expect to see the same thing here. And certainly, as AI is applied as I said above, and is implemented, we’ll stop thinking about it as a buzz word, and just go back to talking about the end goal. Something will be faster and better and get you something extra, how it gets there doesn’t matter.

Most people have no idea how a gasoline engine works today. They just want a car that will look nice and get them there with comfort, reliability and speed.

After that it’s all marketing and brand positioning.

 

[Thanks for reading today. We’re very interested to hear your thoughts on AI as well. Feel free to leave questions or thoughts below, request info on OdinText here, or Tweet to us @OdinText]

New Sentiment Analysis Reveals Reasons Behind Stance on Confederate Statues

Text Analytics Poll™ shows asking respondents to provide reasons for their opinions may increase cognition and decrease “No Opinion”

Asking People WHY They Support/Oppose Civil War Monuments May Affect Results. Judging from the TV news and social media, the entire country is up in arms over the status of Confederate Civil War monuments. What really is the mood of the country in regard to these statues?

A quick Google search turned up the chart below, which to YouGov’s credit broke out not just Democrats VS Republicans, but also blacks VS whites. On a high level this structured survey question, which allowed respondents to answer a standard five-point agreement scale from ‘strongly approve’ to ‘strongly disapprove,’ seems to indicate that “almost half of Americans (48%)” want the Charlottesville Robert E. Lee statue to stay.

While emotions as depicted on TV and social media are running high, there doesn’t seem to be much reasoned discussion about WHY people feel so strongly on either side. Therefore, we were curious what would happen if, rather than just answering a closed-ended agreement scale, respondents were asked to elaborate on their choice with a reason.

Note: the goal here is not to uncover all the best reasons for or against keeping the statues. If that were the case, we could approach a handful of social science professors with expertise in history, civil rights or ethics and psychology. Instead, we were curious to see if simply asking someone to consider a reason for their choice (even if they could not give a very good one) would affect the proportions of those agreeing or disagreeing. Of course, we were also curious about how many reasons each side might enumerate and what the quality of those reasons might be.

We asked a random sample of 1,500 Americans the following:

Q. Should Confederate Civil War Monuments be allowed in the US, why or why not?

Asking respondents to provide a reason, and using Text Analytics to measure sentiment, provided an almost identical number in favor of removing Confederate Civil War statues (29%) as the simple Likert scale poll; however, it halved the number of “Don’t Know/Don’t Care” responses (just 10%), apparently to the benefit of those who support keeping Confederate Civil War statues intact (61%).

EMOTIONS VS EXPLICIT REASONS

Let’s have a look at the reasons each side provided…

First, it’s noteworthy but not surprising that a number of the comments registered high emotional valence – especially anger – among both groups. Among those who favor keeping the statues, there is also significantly more fear/anxiety expressed in their comments.

As for the specific reasons, among those who want the statues to remain, ‘history’ (implicitly, its preservation) is by far the most frequently mentioned reason (46%), followed by the arguments that history shouldn’t be deleted (3%) and that history is both good and bad (2%).

The main argument among those who want to remove the statues is that Confederates were losers and traitors (9%) and that these statues should be limited to museums and battle grounds (8%), that glorifying what these men stood for is wrong (6%), as well as more general mentions of its symbolism of hate or slavery (6%).

A QUICK LOOK AT REGION

We took a quick look at answers by geography. Southerners were 5 percentage points more likely than the total to mention the historic importance of the statues (35% VS 30% in total). They were also about 40% less likely to have made the argument that statues for losers/traitors aren’t appropriate (1.7% VS 2.8% in total).
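Regional comparisons like these can be read two ways: as a gap in percentage points or as a relative difference. A minimal sketch of the arithmetic, using only the four percentages quoted above (the function name is just for illustration):

```python
def compare(group_pct, total_pct):
    """Return (gap in percentage points, relative difference in percent)."""
    point_gap = group_pct - total_pct
    relative_diff = point_gap / total_pct * 100
    return point_gap, relative_diff

# Southerners vs. total, figures taken from the paragraph above.
history_gap, history_rel = compare(35, 30)
traitor_gap, traitor_rel = compare(1.7, 2.8)

print(f"history: {history_gap:+.1f} points, {history_rel:+.1f}% relative")
print(f"losers/traitors: {traitor_gap:+.1f} points, {traitor_rel:+.1f}% relative")
```

On these figures the ‘history’ gap is 5 points, or roughly 17% in relative terms, and the losers/traitors gap is about 1 point, or roughly 39% in relative terms.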

Americans in the Northeast region were significantly more likely than average to say they weren’t sure or didn’t care (7% VS 5% in total), and were also significantly more likely to mention the importance of “remembering” (3% VS 1% in total).

Americans in the West Region were significantly less likely to mention the importance of ‘History’ (25% VS 30% in Total).

The Verdict Changes When Asked Why

The court of public opinion in a standard Likert scale instrument appears fairly evenly split on whether or not to remove Confederate Civil War monuments, but when we ask people to explain why they hold a position on this matter in their own words, we see a significant shift in the data toward keeping these monuments intact.

Most respondents didn’t offer any surprises in terms of their explanations for why they support/oppose keeping the monuments. Indeed, a few arguments on both sides have already been fleshed out in the media, and this may have affected how people responded.

The ah-ha for us in this exercise was that the “don’t care/don’t knows” shrank by half when respondents were asked to provide a reason for their opinion. Whether this is a matter of causality, of course, is debatable. But it does suggest that allowing people to explain in their own words will produce a different, possibly more accurate picture, and will also show which reasons have the strongest appeal.

@TomHCAnderson

*Note: n=1,500 responses were collected via Google Surveys 8/19-8/21 2017. Google Surveys allows researchers to reach a validated U.S. General Population Representative sample by intercepting people attempting to access high-quality online content or who have downloaded the Google Opinion Rewards mobile app. Results are accurate to within +/- 2.53 percentage points at the 95% confidence level. Data was analyzed using OdinText 8/21/17. Request more info on OdinText here.
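For readers who want to check the margin of error in the note, here is a minimal sketch using the standard formula for a proportion. It assumes simple random sampling and the worst case p = 0.5; the note does not say how the figure was actually computed, so treat this as a plausibility check only.

```python
from math import sqrt

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a confidence interval for a proportion.

    n: sample size
    p: assumed proportion (0.5 gives the widest interval)
    z: critical value (1.96 corresponds to 95% confidence)
    """
    return z * sqrt(p * (1 - p) / n)

moe = margin_of_error(1500)
print(f"+/- {moe * 100:.2f} percentage points")  # prints "+/- 2.53 percentage points"
```

With n = 1,500 this reproduces the 2.53 figure. Subgroup results, such as a single region, rest on smaller samples and carry wider margins.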

About Tom H. C. Anderson

Tom H. C. Anderson is the founder and managing partner of OdinText, a venture-backed firm based in Stamford, CT whose eponymous, patented SaaS platform is used by Fortune 500 companies like Disney, Coca-Cola and Shell Oil to mine insights from complex, unstructured and mixed data. A recognized authority and pioneer in the field of text analytics with more than two decades of experience in market research, Anderson is the recipient of numerous awards for innovation from industry associations such as CASRO, ESOMAR and the ARF. He was named one of the “Four under 40” market research leaders by the American Marketing Association in 2010. He tweets under the handle @tomhcanderson.

Seven Text Analytics Myths Exposed at IIEX

What I Learned from Attendees in IIEX Text Analytics Sessions

This week I had the opportunity to attend and to present at the Insights Innovation Exchange (IIEX) in Atlanta. This conference always provides a wonderful chance to connect with a lot of smart, forward-looking researchers.

For those who missed IIEX or weren’t able to attend my presentation, I provided a case study outlining how we conducted a massive international study in 10 countries and eight languages for almost no cost with results analyzed in just two hours. If you’d like to know more, feel free to contact us for a free e-book detailing the project.

My presentation aside, what I’d like to cover here today actually came out of the Text Analytics Information Sessions we were asked to host on Monday, and which I’m pleased to report were well attended—notably by representatives from more than a few major supplier and client brands.

I had originally anticipated that there would be more group conversation and peer-to-peer sharing, but it turned out that most of the attendees were less interested in talking than they were in learning, and so the sessions involved quite a bit of Q&A, with my colleague Tim Lynch and me fielding more questions about text analytics, generally, than expected.

What I took from these sessions was a sense that a lot of confusion and misperception around text analytics persists among researchers today and that the industry is urgently in need of more educational resources on the topic (more on this at the end of the blog).

I’ve cherry-picked for you here today the most common misconceptions revealed in these sessions. Hopefully, this will help dispel some persistent myths that do anyone interested in text analytics a huge disservice…

MYTH 1: Text analytics is synonymous with social media monitoring

As I feared, a common misconception about text analytics is that its primary application—and pretty much the extent of its practical utility—is for analyzing social media data. Nothing could be further from the truth!

While social media monitoring firms have done a great job marketing themselves, social media is just ONE SMALL SUBSET of the data that text analytics can be applied to. Moreover, while everyone seems fixated on social media analysis, in my honest opinion, social media monitoring is NOT where the greatest opportunity lies for using text analytics in market research.

And a word of caution: yes, text analytics platforms can easily handle social media data, but the same cannot be said about social media monitoring tools, so be careful not to limit yourself.

MYTH 2: Text analytics are perfect for analyzing qualitative transcripts

I cannot tell you how often I’ve been approached by researchers who want to use text analytics software to analyze focus group transcripts. My first response is always: why would you want to do that?

Just because focus group data contains a lot of text doesn’t mean you should run it through a text analytics platform, unless you have very large qualitative communities or run the same exact group 10 times within a category.

Bear in mind, text analytics can be applied quite effectively to small samples (I actually didn’t think so until I learned otherwise from a client), but using small sample IDIs or focus groups doesn’t typically make a lot of sense because text analytics is all about pattern identification.

If you talk to just 15 physicians, for example, you’ll still need to read each of their comments. Text analysis may add additional value, but usually it isn’t worthwhile UNLESS you either have a large enough sample to mine for patterns AND/OR the data is extremely important/valuable (e.g., these are the top 15 MD PhDs in their field working on a life-saving cure).

MYTH 3: Sentiment is REALLY important and useful

Sentiment has been COMPLETELY hyped. In the majority of our text analytics projects sentiment isn’t even a factor. In fact, some firms purporting to offer “text analytics” only offer sentiment analysis. This is unbelievable to me. Having worked with text analytics for the past 15 years I don’t understand why someone would approach data that simplistically. There are so many other, potentially more useful and valuable ways to look at data.

When thinking about text analytics, relevant feature/topic extraction is most important. As important is how this can be turned into actionable advice or a recommended course of action. If you analyze data and come back to management with something as simplistic as “this is what makes people angry,” or happy, chances are you’ll soon be replaced by someone who can tell management how to increase return behavior and revenue.

MYTH 4: Look for AI and Machine Learning

I’ve blogged about this before, and it still drives me nuts!

Everyone seems hung up on this year’s buzzwords, “artificial intelligence” (AI) and/or “machine learning,” and just about every possible vendor is touting them, whatever the solution they’re selling. For your purposes, I’m telling you they are meaningless.

This is not to say that AI and machine learning are not important—in fact, they’re integral components to the OdinText platform—but they’re terms that are misused, abused, and thrown about cavalierly without any explanation as to how or why they matter. If someone tells you their tool uses AI or machine learning, ask them what they mean by “AI” specifically and to explain precisely how that enables their tool to deliver differentiated results. I’ll wager you’ll walk away from that conversation without any better understanding of why AI is a feature they’re touting than you did before the conversation began. (For more information on this topic, again, read this post.)

Beware also other technical-sounding terms (including sentiment, mentioned above) that frequently crop up around text analytics like NLP (natural language processing), ontologies, taxonomies, support vector machines… I could go on.

If a sales person is throwing jargon like this at you, chances are they are using it to conceal their own lack of knowledge about text analytics.

Conversations should instead focus on: How do I quickly identify the most important topics/ideas mentioned by my customers? How do I know they are important? How do they affect my KPIs? Show me with my data how I can quickly do these things.

MYTH 5: All text analytics are basically the same

Text analytics are not a commoditizable, standardized sort of item. Unlike the deliverables from panel companies or survey vendors, the variety of potential forms text analytics can take is diverse and complex, ranging from more linguistically-based approaches to more mathematical/statistical solutions.

Beyond this, though, practical experience in the given field of application also comes into play. What experience do the developers have in answering problems in your specific field? This will impact underlying thinking as well as user interface considerations.

DO NOT assume that just because a feature is listed in one company’s sell sheet (see buzzwords above, for example), it is a must-have or even a good-to-have, and that you should look for it across vendors.

Again, always fall back to your own data. How does this software tell me how customer group A is different than Group B? How will I know the impact of topics X, Y and Z on sales? These are the questions to ask.

MYTH 6: Text analytics is as easy as just pressing a button and may be totally automated

I’m sorry, but again, no.

On the one hand there are extremely involved and expensive mechanical turk solutions you can purchase. Typically, using one of these solutions will require a few months to build a static dictionary for, say, your customer satisfaction data set, which is then dashboarded. You can easily expect to pay mid-six figures for something like this, and it won’t allow you to do any ad hoc analysis.

The other option is a pure AI/Machine learning solution like IBM’s Watson. It’s fast and cheap because it’s not valuable. (If it were, then IBM could charge a lot more for it.) Look for their case studies and actual customers who have been happy with their solutions. You won’t find many, if any.

Included in the same category as IBM Watson are Microsoft Azure, Amazon AWS and Google NLP tools, as well as vendors that do other things (surveys, etc.) and simply plug into one of these so they can claim to have “text analytics.” But these tools will not get management what it needs to make intelligent decisions.

The optimal solution is somewhere in between, where machine and human meet in the most effective and intuitive manner. This will mean high-value analysis. What you get back in terms of value of insights depends on the quality of data and the analytical thinking brought to bear by the analyst—just like on any quantitative data project!

MYTH 7: There are lots of great resources for learning about text analytics

Sadly, the net takeaway from these IIEX sessions on Monday was that we still don’t have ANY solid educational or training resources devoted to text analytics in this industry. NONE!!!

MR trade orgs don’t offer any; the top masters and MBA programs in research don’t offer much; Burke Institute (whose training I love, by the way) doesn’t offer any...

There aren’t any good books on the subject, either; they’re either way too academic and 10+ years behind, or they’re sales tools in disguise, or it’s just a chapter in a book written by a research generalist who does not specialize in text analytics.

We need educational and training resources rather desperately, it seems.

I plan on continuing to do my part by lecturing on the subject at a few MBA classes each year. I’ve also offered to work with the Burke Institute and the University of Georgia’s Terry School’s Master of Marketing Research program on developing resources.

BUT in the meantime, if you have any questions about text analytics, generally, and totally apart from OdinText, please consider me a resource. Feel free to ping me on LinkedIn or via the info request button here.

I hope this was helpful. Thanks for reading and I welcome your comments!

@TomHCAnderson
