Posts tagged artificial intelligence
What You Need to Know Before Buying AI/Machine Learning

7 Things to Know About AI/Machine Learning (Boiled Down to two Cliff Notes that are even more important).

In case you missed our session on Artificial Intelligence and Machine Learning (AI/ML) at the Insights Association’s NEXT conference last week, I thought I would share a bit on the blog about what we covered. We had a full room, with some great questions both during and after the session. However, 30 minutes wasn’t enough time to cover everything thoroughly. In the end we agreed on four takeaways:

  • AI is part of how research & insights pros will address the ever-increasing demand for fast research results
  • AI helps focus on the most important data
  • AI can’t compensate for bad data
  • AI isn’t perfect

So today I thought I would share seven additional points about AI/ML that I often get questions on, and then at the end of this post I’ll share the ‘Cliff Notes’, i.e. just the two most important things you really need to know. So, unless you want to geek out with me a bit, feel free to scroll to the bottom.

OK, first, before we can talk about anything, we need to define what Artificial Intelligence (AI) is and isn’t.

1. AI/ML definition is somewhat fuzzy

AI, and more specifically machine learning (ML), is a term that is abused almost as often as it is used. On the one hand this is because a lot of folks are inaccurately claiming to be using it, but it is also because, not unlike big data, its definitions can be a bit unclear and don’t always make perfect sense.

Let’s take this common 3-part regression analysis process:

  1. Data Prep (pre-processing including cleaning, feature identification, and dimension reduction)
  2. Regression
  3. Analysis of process & reporting

This process, even if automated, would not be considered machine learning. However, switch out regression with a machine learning technique like Neural Nets, SVM, Decision Trees or Random Forests and bang, it’s machine learning. Why?

Regression models are also created to predict something, and they also require training data. If the relationship in the data is linear, then there is no way any of these other models will beat regression in terms of ROI. So why would regression not be considered machine learning?

Who knows. Probably just because the authors of the first few academic papers on ML referenced these techniques, and not regression, as ML. It really doesn’t make much sense.
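To make the point concrete, here is a minimal sketch in plain Python (no libraries) of a one-variable regression being ‘trained’ on data and then used to predict, which is the same fit-then-predict pattern the techniques above follow. The function names and the toy numbers are made up for illustration; this is not how any particular product does it.

```python
def fit_line(xs, ys):
    """'Train' a simple regression: ordinary least squares for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Apply the fitted model to a new, unseen value."""
    intercept, slope = model
    return intercept + slope * x


# Toy training data (hypothetical): a perfectly linear relationship y = 1 + 2x
train_x = [1, 2, 3, 4, 5]
train_y = [3, 5, 7, 9, 11]

model = fit_line(train_x, train_y)
prediction = predict(model, 10)
print(model, prediction)
```

Training data goes in, a model comes out, and the model predicts values it has not seen. Whether you call that machine learning is mostly a matter of labeling.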

2. There are basically 2 types of ML

Some ML approaches, like SVM (Support Vector Machines), are binary, for predicting something like male or female, and others, like Decision Trees, handle multi-class classification.

If you are using decision trees to predict an NPS rating on an 11-point scale, then that’s a multi-class problem. However, you can ‘trick’ binary techniques like SVM into solving the multi-class problem by setting them up to run multiple times, for example once per class against all the others.
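Here is a rough sketch of that ‘run it multiple times’ trick, often called one-vs-rest. To keep it self-contained, the binary learner below is a deliberately simple centroid-based stand-in, not a real SVM, and the three class labels and the two numeric features per respondent are invented for the example.

```python
import math


def centroid(points):
    """Average position of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def train_binary(points, is_positive):
    """Toy binary learner (a stand-in for SVM, not an SVM).

    Returns a scoring function: a higher score means 'more likely positive'.
    """
    pos = centroid([p for p, flag in zip(points, is_positive) if flag])
    neg = centroid([p for p, flag in zip(points, is_positive) if not flag])
    return lambda p: math.dist(p, neg) - math.dist(p, pos)


def train_one_vs_rest(points, labels):
    """Run the binary learner once per class: that class vs. everything else."""
    return {
        cls: train_binary(points, [label == cls for label in labels])
        for cls in sorted(set(labels))
    }


def predict(models, point):
    """Pick the class whose binary model gives the highest score."""
    return max(models, key=lambda cls: models[cls](point))


# Hypothetical training data: two made-up features per respondent
points = [(0, 0), (2, 0), (10, 0), (8, 0), (5, 10), (5, 8)]
labels = ["detractor", "detractor", "passive", "passive", "promoter", "promoter"]

models = train_one_vs_rest(points, labels)
print(predict(models, (9, 1)))
print(predict(models, (4, 9)))
```

Three classes means three binary models here. An 11-point scale would mean eleven, which is one reason this approach can get slow.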

Either way, you are predicting something.

3. ML can be slow

Depending on the approach used, like Neural Nets for instance, training a model can take several days on a normal computer. There are other issues with Neural Nets as well, like the difficulty for humans to understand and control what they are doing.

But let’s focus on speed for now. Of course, if you can apply a previously trained model on very similar data, then results will be very fast indeed. This isn’t always possible though.

If your goal is to insert ML into a process to solve a problem which a user is waiting for, then training an algorithm might not be a very good solution. If another technique, ‘machine learning’ or not, can solve the problem much faster with similar accuracy, then that should be the approach to use.

4. Neural Nets are not like the brain

I’ll pick on Neural Nets a bit more, because they are almost a buzzword unto themselves. That’s because a lot of people have claimed they work like the human brain. This isn’t true. If we’re going to be honest, we’re not sure how the human brain works. In fact, what we do know about the human brain makes me think it is quite different.

The human brain contains nearly 90 billion neurons, each with thousands of synapses. Some of these fire and send information for a given task, some will not fire, and yet others fire and do not send any information. The fact is we don’t know exactly why. This is something we are still working on with hopes that new more powerful quantum computers may give us some insight.

We can however map some functions of the brain to robotics to do things like lift arms, without knowing exactly what happens in between.

There is one problematic similarity between the brain and Neural Nets though. That is, we’re not quite sure how Neural Nets work either. When running a Neural Net, we cannot easily control or explain what happens in the intermediary nodes. So, this (along with speed I mentioned earlier) is more of a reason to be cautious about using Neural Nets.

5. Not All Problems are best solved with Machine Learning

Are all problems best solved with ML? No, probably not.

Take pricing as an example. People have been solving this problem for years, and there are many different solutions depending on your unique situation. These solutions can factor in everything from supply and demand to cost.

Introducing machine learning, or even just a simpler non-ML based automated technique, can sometimes cause unexpected problems. As an example, consider the automated real-time pricing model which Uber used, with supply and demand as inputs. When fares skyrocketed to over $1,000 as drunk people were looking for a ride on New Year’s Eve, the model created a lot of angry customers and bad press.

More on dangers of AI/ML in a bit…

6. It’s harder to beat humans than you think

One of the reasons ML is often touted as a solution is because of how much better than humans computers allegedly are. While theoretically there is truth to this, when applied to real world situations we often see a less ideal picture.

Take self-driving cars as an example. Until recently they were touted as “safer than humans”. That was until they began crashing and blowing up.

Take the recent Tesla crash as an example. The AI/ML accidentally latched onto an older faded lane line rather than the newly painted correct lane line and proceeded without braking, at full speed, into a head-on collision with a divider. A specific fatal mistake no human would have been likely to make.

The truth is if we remove driving under the influence and falling asleep from the statistics (two things that are illegal anyway), then human accident statistics are incredibly low.

7. ML is Context Specific!

This is an important one. IBM Watson might be able to Google Lady Gaga’s age quickly, but Watson will be completely useless in identifying her in a picture. Machine learning solutions are extremely context specific.

This context specificity also comes into play when training any type of model. The model will only be as good as the training data used to create it, and as the similarity between that data and the future data it is used to make predictions on.

Model validation methods only test the accuracy of the model on the same exact type of data (typically a random portion of the same data). They do not test the quality of the data itself, nor how well the model applies to future data that differs from the training data.
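A small sketch of why that matters. The ‘real world’ relationship below is assumed to be a curve (y = x²), the model is a straight line, and the holdout set is a fixed slice of the same data rather than a random one, purely so the example is reproducible. All names and numbers are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean_abs_error(model, xs, ys):
    """Average absolute gap between predictions and actual values."""
    intercept, slope = model
    return sum(abs((intercept + slope * x) - y) for x, y in zip(xs, ys)) / len(xs)


def truth(x):
    """Assumed true relationship (hypothetical): a curve, not a line."""
    return x ** 2


# Data collected in a narrow range: x from 0.0 to 1.0
xs = [i / 10 for i in range(11)]
holdout_idx = {1, 4, 7, 10}  # fixed holdout slice, standing in for a random split
train_x = [x for i, x in enumerate(xs) if i not in holdout_idx]
holdout_x = [x for i, x in enumerate(xs) if i in holdout_idx]

model = fit_line(train_x, [truth(x) for x in train_x])

# Validation on held-out data of the same kind looks fine...
holdout_error = mean_abs_error(model, holdout_x, [truth(x) for x in holdout_x])

# ...but 'future' data from outside the training range tells a different story
future_x = [3.0, 3.5, 4.0]
future_error = mean_abs_error(model, future_x, [truth(x) for x in future_x])

print(round(holdout_error, 3), round(future_error, 3))
```

The holdout error is small because the holdout data looks just like the training data. The validation step never sees the kind of data where the model falls apart.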

Be wary of anyone who claims their AI does all sorts of things well, or does so with 100% accuracy.

My final point about Machine Learning & two Cliff Notes…

If some of the above points make it sound as if I’m not bullish on machine learning, I want to clarify that in fact I am. At OdinText we are continuously testing and implementing ML when it makes sense. I’m confident that we as an industry will get better and better at machine learning.

In the case of Tesla above, there are numerous ways to make the computers more effective, including using special paint that would be easier for computer cameras to see, and traffic lights that send signals telling the computer “I am red”, “I am green” etc., rather than having it guess via color/light sensing. Things will certainly change, and AI/ML will play an important part.

However, immediately after my talk at the Insights Association I had two very interesting conversations on how to “identify the right AI solution”. In both instances, the buyer was evaluating vendors that made a lot of claims. Way too many in my opinion.

If you forget everything else from today’s post, please remember these two simple Cliff Notes on AI:

  1. You don’t buy AI; you buy a solution that does a good job solving your need (which may or may not involve AI)
  2. Remember AI is context specific, and not perfect. Stay away from anyone who says anything else. Select vendors you know you can trust.

There’s no way to know whether something is AI or not without looking at the code.

Unlike academics who share everything under peer review, companies protect their IP, Trade Secrets and code, so there will technically be no way for you to evaluate whether something actually is “AI” or not.

However, the good news is that this makes your job easier. Rather than reviewing someone’s code, your job is still simply to decide whether the product solves your needs well or not.

In fact, in my opinion it is far more important to choose a vendor who is honest with you about what they can do to solve your problems. If a vendor claims they have AI everywhere that solves all kinds of various needs, and does so with 100% accuracy, run!

@TomHCAnderson

AI and Machine Learning NEXT at The Insights Association
Insight practitioners from Aon, Conagra and Verizon speak out on what they think about AI and Machine Learning

Artificial Intelligence and Machine Learning are hot topics today in many fields, and marketing research is no exception. At the Insights Association’s NEXT conference on May 1 in NYC I've been asked to take part in a practitioner panel on AI to share a bit about how we are using AI in natural language processing and analytics at OdinText.

While AI is an important part of what data mining and text analytics software providers like OdinText do, before the conference I thought I’d reach out to a couple of the client-side colleagues to see what they think about the subject.

With me today I have David Lo, Associate Partner at the Scorpio Partnership (a collaboration between McLagan and the Aon Hewitt Corporation); Thatcher Schulte, Sr. Director, Strategic Insights at Conagra Brands; and Jonathan Schwedel, Consumer & Marketplace Insights at Verizon, all of whom will also be speaking at NEXT.

THCA: Artificial Intelligence means different things to different people and companies. What does it mean to you, and how, if at all, are you planning to use it in your departments?

Thatcher Schulte – Conagra:

Artificial intelligence is like many concepts we discuss in business, it’s a catch all that loses its meaning as more and more people use it.  I’ve even heard people refer to “Macros” as AI.  To me it means trying to make machines make decisions like people would, but that would beg the question on whether it would be “intelligent.”  I make stupid decisions all the time.

We’re working with Voice to make inferences on what help consumers might need as they make decisions around food.

Jonathan Schwedel – Verizon:

I'm not a consumer insight professional - I'm a data analyst who works in the insights department, so my perspective is different. There are teams in other parts of Verizon who are doing a lot with more standard artificial intelligence and machine learning approaches, so I want to be careful not to conflate the term with broader advanced analytics. I have this image of cognitive scientists sitting in a lab, and am tempted to reduce "AI" to that.

For our specific insights efforts, we work on initiatives that are AI-adjacent - with automation, predictive modeling, machine learning, and natural language processing, but with a few exceptions those efforts are not scaled up, and are ad hoc on a project by project basis. We dabble with a lot of the techniques that are highlighted at NEXT, but I'm not knowledgeable enough about our day to day custom research efforts to speak well to them. One of the selling points of the knowledge management system we are launching is that it's supposed to leverage machine learning to push the most relevant content to our researchers and partners around our company.

David Lo – Scorpio Partnership/McLagan:

Working in the financial services space and specifically within wealth management, AI is a hot topic as it relates to how it will change advice delivery.

[we are looking at using it for] Customer journey mapping through the various touchpoints they have with an organization.

 

THCA: There’s a lot of hype these days around AI. What is your impression of what you’ve been hearing, and of the companies you’ve been hearing it from? Is it believable?

Thatcher Schulte – Conagra:

I don’t get pitched on AI a lot except through email, which frankly hurts the purpose of those people pitching me solutions.  I don’t read emails from vendors.

Jonathan Schwedel – Verizon:

It's easy to tell if someone does not have a minimum level of domain expertise. The idea that any tool or platform can provide instant shortcuts is fiction. Most of the value in these techniques is very matter of fact and practical. Fantastic claims demand a higher level of scrutiny. If instead the conversation is about how much faster, cheaper, or easier they are, those are at least claims that can be quickly evaluated.

David Lo – Scorpio Partnership/McLagan:

Definitely a lot of hype.  I think as it relates to efficiency, the hype is real.  We will continue to see complex tasks such as trade execution optimized through AI.

 

THCA: For the Insights function specifically, how ready do you think the idea of completely unsupervised vs. supervised/guided AI is? In other words, do you think that the one-size-fits-all AI provided by the likes of Microsoft, Amazon, Google and IBM is very useful for research, or does AI need to be more customized and fine-tuned/guided before it can be very useful to you?

And related to this, what areas of Market Research do you think AI is currently better suited to?

Thatcher Schulte – Conagra:

Data sets are more important to me than the solutions that are in the market.  Food decision making is specialized and complex and it varies greatly by what life stage you are in and where you live. Valid data around those factors are frankly more important than the company we push the data through.

David Lo – Scorpio Partnership/McLagan:

Guard rails are always important, particularly as it relates to unique customer needs.

[In terms of usefulness to market research] Data mining.

Jonathan Schwedel – Verizon:

Most custom quantitative research studies use small sample sizes, making it often not feasible to do bespoke advanced analytics. When you are working with much larger data sets (the kind you'd see in analytics as a function as opposed to insights), AWS and Azure let you scale, especially with limited resources. It's a good general approach to use algorithmic type approaches with brand new data sets, and then start customizing when you hit the point of diminishing returns, in a way that your work can later be automated at scale.

[In regard to marketing research] It depends how you're defining research - are we broadening that to customer experience? Then text analytics is a most prominent area, because there are many prominent use cases for large companies at the enterprise level. If "market research" covers broader buckets of customer data, then there's potentially a lot you can do.

 

THCA: OK, so which areas are currently less well suited to AI?

David Lo – Scorpio Partnership/McLagan:

Hard to say, but probably less suited toward qualitative research.  In my line of business we do a lot of work among UHNW investors where sample sizes are very small and there isn’t a lot of activity in the online space.

Jonathan Schwedel – Verizon:

I think sample size is often an issue when talking about research studies. Then it comes down to the research design. Is the machine learning component going to be baked in from the start, or is it just bolted on? A lot of these efforts are difficult to quantify. Verizon's insights group learns things all the time from talking to and observing consumers that we would not have otherwise thought to ask.

 

THCA: Does anyone have thoughts on the usefulness of chatbots and/or other social media/Twitter bots currently?

Jonathan Schwedel – Verizon:

They could potentially allow you to collect a lot more data, and reach under-represented consumer groups in the channels that they want to be in. A lot of our team's focus at Verizon is on the user experience and building a great digital experience for our customers. I think they will be important tools to understand and improve in that area.

 

THCA: Realistically where do you see AI in market research being 3-4 years from now?

David Lo – Scorpio Partnership/McLagan:

Integrated more fully with traditional quantitative research techniques, with researchers re-focusing their efforts on the more creative and thoughtful interpretations of the output.

Jonathan Schwedel – Verizon:

They will provide some new techniques that will be important for specific use cases, but I think the bulk of the fruitful efforts will come from automation and improved scalability. The desire to do more with less is pretty universal, and there's a good roadmap there. The prospect of genuinely groundbreaking insights offers a lot more uncertainty, but it would be great if we do see that level of innovation.

 

Big thanks to Jonathan, David and Thatcher for sharing their insights and opinions on AI.

If you’re interested in further discussion on AI and Machine Learning please feel free to post a comment here, or join me for the 'What’s New & What’s Ahead for AI & Machine Learning?' Panel on May 1st. I will be joined by John Colias of Decision Analyst, Andrew Konya of remesh, and moderator Kathryn Korostoff of Research Rockstar.

-Tom H. C. Anderson @OdinText

 

PS. If you would like to learn more about how OdinText can help you better understand your customers and employees, feel free to request more info here. If you’re planning on attending the conference, feel free to use my speaker code for a $150 discount [ODINTEXT]. I look forward to seeing some of you at the event!

 

A New Trend in Qualitative Research

Almost Half of Market Researchers are doing Market Research Wrong! - My Interview with the QRCA (And a Quiet New Trend - Science Based Qualitative).

Two years ago I shared some research on research about how market researchers view Quantitative and Qualitative research. I stated that almost half of researchers don’t understand what good data is. Some ‘Quallies’ tend to rely on and work almost exclusively with comment data from extremely small samples (about 25% of market researchers surveyed). Conversely, there is a large group of ‘Quant Jockeys’ who, while working with larger, more representative sample sizes, purposefully avoid any unstructured data such as open-ended comments because they don’t want to deal with coding and analyzing it, or don’t believe in its accuracy and ability to add to the research objectives. In my opinion both researcher groups have it totally wrong, and are doing a tremendous disservice to their companies and clients. Today, I’ll be focusing on just the first group above, those who tend to rely primarily on qualitative research for decisions.

Note that today’s blog post is related to a recent interview, which I was asked to take part in by the QRCA’s (Qualitative Research Consultants Association) Views magazine. When they contacted me I told them that in most cases (with some exceptions), Text Analytics really isn’t a good fit for Qualitative Researchers, and asked if they were sure they wanted to include someone with that opinion in their magazine. I was told that yes, they were ok with sharing different viewpoints.

I’ll share a link to the full interview in the online version of the magazine at the bottom of this post. But before that, a few thoughts to explain my issues with qualitative data and how it’s often applied as well as some of my recent experiences with qualitative researchers licensing our text analytics software, OdinText.

The Problem with Qualitative Research

IF Qual research was really used in the way it’s often positioned, ‘as a way to inform quant research’, that would be ok. The fact of the matter is though, Qual often isn’t being used that way, but instead as an end in and of itself. Let me explain.

First, there is one exception to this rule of only using Qual as pilot feedback for Quant. If you had a product, for instance, which was specifically made only for US State Governors, then your total population is only N=50. And of course it is highly unlikely that you would ever get all the Governors of each and every US State to participate in any research (which would be a census of all governors), and so if you were fortunate enough to have a group of say 5 Governors who were willing to give you feedback on your product or service, you would and should obviously hang on to and over-analyze every single comment they gave you.

IF however you have even a slightly more common mainstream product, I’ll take a very common product like hamburgers as an example, and you are relying on 5-10 focus groups of n=12 to determine how different parts of the USA (North East, Mid-West, South and West) like their burgers, and rather than feeding the results directly into some quantitative research instrument with a greater sample, you issue a ‘Report’ that you share with management; well then you’ve probably just wasted a lot of time and money for some extremely inaccurate and dangerous findings. Yet surprisingly, this happens far more often than one would imagine.

Cognitive Dissonance Among Qual Researchers when Using OdinText

How do I know this you may ask? Good Text Analytics software is really about data mining and pattern recognition. When I first launched OdinText we had a lot of inquiries from Qualitative researchers who wanted some way to make their lives easier. After all, they had “a lot” of unstructured/text comment data which was time consuming for them to process, read, organize and analyze. Certainly, software made to “Analyze Text” must therefore be the answer to their problems.

The problem was that the majority of Qual researchers work with tiny projects/samples, interviews and groups between n=1 and n=12. Even if they do a couple of groups like in the hamburger example I gave above, we’re still talking about a total of just around n=100 representing four or more regional groups of interest, and therefore n=25 or fewer per group. It is impossible to get meaningful/statistically comparable findings and identify real patterns between the key groups of interest in this case.

The Little Noticed Trend In Qual (Qual Data is Getting Bigger)

However, slowly across the past couple of years or so, for the first time I’ve seen a movement of some ‘Qualitative’ shops and researchers, toward Quant. They have started working with larger data sets than before. In some cases, it has been because they have been pulled in to manage larger ongoing community/boards, in some cases larger social media projects, and in others, they have started using survey data mixed with qual, or even better, employing qualitative techniques in quant research (think better open-ends in survey research).

For this reason, we now have a small but growing group of ‘former’ Qual researchers using OdinText. These researchers aren’t our typical mixed data or quantitative researchers, but qualitative researchers that are working with larger samples.

And guess what, “Qualitative” has nothing to do with whether data is in text or numeric format; instead it has everything to do with sample size. And so, perhaps unknowingly, these ‘Qualitative Researchers’ have taken the step across the line into Quantitative territory, where, often for the first time in their career, statistics can actually be used. And it can be shocking!

My Experience with ‘Qualitative’ Researchers going Quant/using Text Analytics

Let me explain what I mean. Recently several researchers that come from a clear ‘Qual’ background have become users of our software OdinText. The reason is that the amount of data they had was quickly getting “bigger than they were able to handle”. They believe they are still dealing with “Qualitative” data because most of it is text based, but actually because of the volume, they are now Quant researchers whether they know it or not (text or numeric data is irrelevant).

Ironically, for this reason, we also see much smaller data sizes/projects than ever before being uploaded to the OdinText servers. No, not typically single focus groups with n=12 respondents, but still projects that are often right on the line between quant and qual (n=100+).

The discussions we’re having with these researchers as they begin to understand the quantitative implications of what they have been doing for years are interesting.

Let me preface this with the fact that I have a great amount of respect for the ‘Qualitative’ researchers that begin using OdinText. Ironically, the simple fact that we have mutually determined that an OdinText license is appropriate for them means that they are no longer ‘Qualitative’ researchers (as I explained earlier). They are in fact crossing the line into Quant territory, often for the first time in their careers.

The data may be primarily text based, though usually mixed, but there’s no doubt in their mind nor ours that one of the most valuable aspects of the data is the customer commentary in the text, and this can be a strength.

The challenge lies in getting them to quickly accept and come to terms with quantitative/statistical analysis, and thereby also the importance of sample size.

What do you mean my sample is too small?

When you have licensed OdinText you can upload pretty much any data set you have. So even though they may have initially licensed OdinText to analyze some projects with say 3,000+ comments, there’s nothing to stop them from uploading that survey or set of focus groups with just n=150 or so.

Here’s where it sometimes gets interesting. A sample size of n=150 is right on the borderline. It depends on what you are trying to do with it of course. If half of your respondents are doctors (n=75) and half are nurses (n=75), then you may indeed be able to see some meaningful differences between these two groups in your data.

But what if these n=150 respondents are hamburger customers, and your objective was to understand the difference between the 4 US regions in the example I referenced earlier? Then you have about n=37 in each subgroup of interest, and you are likely to have very few, IF ANY, meaningful patterns or differences.
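For a rough feel of what those subgroup sizes mean, here is a back-of-the-envelope sketch using the textbook margin of error for a percentage. The assumptions are mine, not anything specific to OdinText: a simple random sample, a 95% confidence level (z = 1.96), the normal approximation, and the worst case where the true share is 50%.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a sample of size n.

    Assumes simple random sampling and the normal approximation;
    p=0.5 is the worst case (widest margin).
    """
    return z * math.sqrt(p * (1 - p) / n)


# Sample sizes taken from the examples in this post
scenarios = [
    ("one focus group", 12),
    ("one of 4 regions, n=150 total", 150 // 4),
    ("doctors vs. nurses, per group", 75),
    ("project with 3,000 comments", 3000),
]

for label, n in scenarios:
    print(f"{label}: n={n}, margin of error = +/- {margin_of_error(n) * 100:.1f} points")
```

At n=37, a figure like ‘40% of North East respondents mentioned X’ carries a margin of roughly 16 percentage points either way, so modest regional gaps are hard to tell apart from noise. At n=12 the margin is wider still.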

Here’s where that cognitive dissonance can happen, and the breakthroughs too if we are lucky.

A former ‘Qual Researcher’ who has spent the last 15 years of their career making ‘management level recommendations’ on how to market burgers differently in different regions based on data like this may be in shock when, for the first time, they are looking at software which says that there are maybe just two or three small differences, or even worse, NO MEANINGFUL PATTERNS OR DIFFERENCES WHATSOEVER, in their data!

How can this be? They’ve analyzed data like this many times before, and they were always able to write a good report with lots of rich detailed examples of how North Eastern Hamburger consumers preferred this or that because of this and that. And here we are, looking at the same kind of data, and we realize, there is very little here other than completely subjective thoughts and quotes.

Opportunity for Change

This is where, to their credit, most of our users start to understand the quantitative nature of data analysis. They, unlike the ‘Quant Only Jockeys’ I referenced at the beginning of the article, already understand that many of the best insights come from text data in free form, unaided, non-leading, yet creative questions.

They only need to start thinking about their sample sizes before fielding a project. To understand the quantitative nature of sampling. To think about the handful of structured data points that they perhaps hadn’t thought much about in previous projects and how they can be leveraged together with the unstructured data. They realize they need to start thinking about this first, before the data has all been collected and the project is nearly over and ready for the most important step, the analysis, where rubber hits the road and garbage in really should mean garbage out.

If we’re lucky, they quickly understand that it’s not about Quant and Qual any more. It’s about Mixed Data, it’s about having the right data, it’s about having enough data to generate robust findings and then superior insights!

Final Thoughts on the Two Nearly Meaningless Terms of ‘Quant and Qual’

As I’ve said many times before here and on the NGMR blog, the terms “Qualitative” and “Quantitative”, at least the way they are commonly used in marketing research, are already passé.

The future is Mixed Data. I’ve known this to be true for years, and almost all our patent claims involve this important concept. Our research shows time and time again, that when we use both structured and unstructured data in our analysis, models and predictions, the results are far more accurate.

For this reason we’ve been hard at work developing the first ever truly Mixed Data Analytics Platform. We’ll be officially launching it three months from now, but many of our current customers already have access. [For those who are interested in learning more or would like early access you can inquire here: OdinText.com/Predict-What-Matters].

In the meantime, if you’re wondering whether you have enough data to warrant advanced mixed data and text analysis, check out the online version of the article in QRCA Views magazine here. Robin Wedewer at QRCA really did an excellent job in asking some really pointed questions that forced me to answer more honestly and clearly than I might otherwise have.

I realize not everyone will agree with today’s post nor my interview with QRCA, and I welcome your comments here. I just ask that you please read both the post above and the interview in QRCA, rather than commenting solely based on the title of this post.

Thank you for reading. As always, I welcome questions publicly in the comments below or privately via LinkedIn or our Inquiry form.

@TomHCAnderson

Artificial Intelligence in Consumer Insights

A Q&A session with ESOMAR’s Research World on Artificial Intelligence, Machine Learning, and implications in Marketing Research

[As part of an ESOMAR Research World article on Artificial Intelligence, OdinText Founder Tom H. C. Anderson recently took part in a Q&A style interview with ESOMAR’s Annelies Verheghe. For more thoughts on AI check out other recent posts on the topic including Why Machine Learning is Meaningless, and Of Tears and Text Analytics. We look forward to your thoughts or questions via email or in the comments section.]

 

ESOMAR: What is your experience with Artificial Intelligence & Machine Learning (AI)? Would you describe yourself as a user of AI or a person with an interest in the matter but with no or limited experience?

TomHCA: I would describe myself as both a user of Artificial Intelligence as well as a person with a strong interest in the matter even though I have limited mathematical/algorithmic experience with AI. However, I have colleagues here at OdinText who have PhD's in Computer Science and are extremely knowledgeable as they studied AI extensively in school and used it elsewhere before joining us. We continue to evaluate, experiment, and add AI into our application as it makes sense.

ESOMAR: For many people in the research industry, AI is still unknown. How would you define AI? What types of AI do you know?

TomHCA: Defining AI is very difficult because people, whether they are researchers, data scientists, salespeople, or customers, will each have a different definition. A generic definition of AI is a set of processes (whether they are hardware, software, mathematical formulas, algorithms, or something else) that give human-like cognitive abilities to machines. This is evidently a wide-ranging definition. A more specific definition of AI pertaining to Market Research is a set of knowledge representation, learning, and natural language processing tools that simplifies, speeds up, and improves the extraction of meaningful data.

The most important type of AI for Market Research is Natural Language Processing. While extracting meaningful information from numerical and categorical data (e.g., whether there is a correlation between gender and brand fidelity) is essentially an easy and now-solved problem, doing the same with text data is much more difficult and still an open research question studied by PhDs in the field of AI and machine learning. At OdinText, we have used AI to solve various problems such as Language Detection, Sentence Detection, Tokenizing, Part of Speech Tagging, Stemming/Lemmatization, Dimensionality Reduction, Feature Selection, and Sentence/Paragraph Categorization. The specific AI and machine learning algorithms that we have used, tested, and investigated range a wide spectrum from Multinomial Logit to Principal Component Analysis, Principal Component Regression, Random Forests, Minimum Redundancy Maximum Relevance, Joint Mutual Information, Support Vector Machines, Neural Networks, and Maximum Entropy Modeling.

AI isn’t necessarily something everyone needs to know a whole lot about. I blogged recently about how it felt almost comical how many people were mentioning AI and machine learning at the MR conferences I was speaking at, seemingly without any idea what the terms mean. http://odintext.com/blog/machine-learning-and-artificial-intelligence-in-marketing-research/

In my opinion, a little AI has already found its way into a few of the applications out there, and more will certainly come. But, if it will be successful, it won’t be called AI for too long. If it’s any good it will just be a seamless integration helping to make certain processes faster and easier for the user.

ESOMAR: What concepts should people that are interested in the matter look into?

TomHCA: Unless you are an Engineer/Developer with a PhD in Computer Science, or someone working closely with someone like that on a specific application, I’m not all that sure how much sense it makes for you to be ‘learning about AI’. Ultimately, in our applications, they are algorithms/code running on our servers to quickly find patterns and reduce data.

Furthermore, as we test various algorithms from academia, and develop our own to test, we certainly don’t plan to share any specifics about this with anyone else. Once we deem something useful, it will be incorporated as seamlessly as possible into our software so it will benefit our users. We’ll be explaining to them what these features do in layman’s terms as clearly as possible.

I don’t really see a need for your typical marketing researcher to know too much more than this in most cases. Some of the algorithms themselves are rather complex to explain and require strong mathematical and computer science backgrounds at the graduate level.

ESOMAR: Which AI applications do you consider relevant for the market research industry? For which task can AI add value?

TomHCA: We are looking at AI in areas of Natural Language Processing (which includes many problem subsets such as Part of Speech Tagging, Sentence Detection, Document Categorization, Tokenization, and Stemming/Lemmatization), Feature Selection, Data Reduction (i.e., Dimensionality Reduction) and Prediction. But we've gone well beyond that. As a simple example, take key driver analysis. If we have a large number of potential predictors, which are the most important in driving a KPI like customer satisfaction?
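For readers who want to see what a bare-bones key driver analysis looks like, here is a minimal sketch. It simply ranks candidate predictors by the strength of their correlation with the KPI. This is an illustration only, not the method OdinText uses; the function names and the toy data are hypothetical, and a real project would use many more respondents and a more robust technique.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column cannot "drive" anything
    return cov / sqrt(var_x * var_y)


def rank_drivers(predictors, kpi):
    """Return (name, correlation) pairs, strongest absolute correlation first."""
    scores = {name: pearson(values, kpi) for name, values in predictors.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy data: one row per respondent, 1 = topic mentioned in the comment.
predictors = {
    "mentions_delivery_problem": [1, 0, 1, 0, 0, 1],
    "mentions_price": [0, 1, 0, 1, 0, 1],
    "mentions_friendly_staff": [0, 1, 1, 1, 1, 0],
}
satisfaction = [2, 9, 3, 8, 9, 4]  # hypothetical 0-10 satisfaction scores

ranked = rank_drivers(predictors, satisfaction)
for name, r in ranked:
    print(f"{name}: r = {r:+.2f}")
```

Correlation only says which predictors move together with the KPI, not why, so a human still has to judge whether the top-ranked items are actually meaningful.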

ESOMAR: Can you share any inspirational examples from this industry or related industries (advertisement, customer service) that can illustrate these opportunities?

TomHCA: As one quick example, a user of OdinText I recently spoke to used the software to investigate which text comments were most likely to drive membership in any of several predefined important segments. The nice thing about AI is that it can be very fast. The not so nice thing is that sometimes at first glance some of the items identified, the output, can either be too obvious, or on the other extreme, not make any sense whatsoever.  The gold is in the items somewhere in the middle. The trick is to find a way for the human to interact with the output which gives them confidence and understanding of the results.

A human is not capable of correctly analyzing thousands, hundreds of thousands, or even millions of comments/datapoints, whereas AI will do it correctly in a few seconds. The downside of AI is that some outcomes are correct but not humanly insightful or actionable. It’s easier for me to give examples of when it didn’t work so well, since it’s hard for me to share info on how our clients are using it. For instance, recently AI found that people mentioning ‘good’ 3 times in their comments was the best driver of NPS score. This is evidently correct but not useful to a human.

In another project a new AI approach we were testing reported that one of the most frequently discussed topics was “Colons”. But this wasn’t medical data! Turns out a plural of “colon” is “cola”; I didn’t know that. Anyway, people were discussing Coca-Cola, and AI read that as Colons…  This is exactly the part of AI that needs work before it can be more prevalent in Market Research.

Since I can’t talk too much about how our clients use our software on their data, in a way it’s easier for me to give a non-MR example. Imagine getting into a totally autonomous car (notice I didn’t have to use the word AI to describe that). Anyway, you know it’s going to be traveling 65mph down the highway, changing lanes, accelerating and stopping along with other vehicles etc.

How comfortable would you be in stepping into that car today if we had painted all the windows black so you couldn’t see what was going on?  Chances are you wouldn’t want to do it. You would worry too much at every turn that you might be a casualty of oncoming traffic or a tree.  I think partly that’s what AI is like right now in analytics. Even if we’ll be able to perfect the output to be 100 or 99% correct, without knowing what/how we got there, it will make you feel a bit uncomfortable.  Yet showing you exactly what was done by the algorithm to arrive at the solution is very difficult.

Anyway, the upside is that in a few years perhaps (not without some significant trial and error and testing), we’ll all just be comfortable enough to trust these things to AI. In my car example, you’d be perfectly fine getting into an Autonomous car and never looking at the road, but instead doing something else like working on your pc or watching a movie.

The same could be true of a marketing research question. Ultimately the end goal would be to ask the computer a business question in natural language, written or spoken, and the computer deciding what information was already available, what needed to be gathered, gathering it, analyzing it, and presenting the best actionable recommendation possible.

ESOMAR: There are many stories on how smart or stupid AI is. What would be your take on how smart AI is nowadays? What kind of research tasks can it perform well? Which tasks are hard to take over by bots?

TomHCA: You know I guess I think speed rather than smart. In many cases I can apply a series of other statistical techniques to arrive at a similar conclusion. But it will take A LOT more time. With AI, you can arrive at the same place within milliseconds, even with very big and complex data.

And again, the fact that we choose the technique based on which one takes a few milliseconds less to run, without losing significant accuracy or information really blows my mind.

I tell my colleagues working on this that hey, this can be cool, I bet a user would be willing to wait several minutes to get a result like this. But of course, we need to think about larger and more complex data, and possibly adding other processes to the mix. And of course, in the future, what someone is perfectly happy waiting for several minutes today (because it would have taken hours or days before), is going to be virtually instant tomorrow.

ESOMAR: According to an Oxford study, there is a 61% chance that the market research analyst job will be replaced by robots in the next 20 years. Do you agree or disagree? Why?

TomHCA: Hmm. 20 years is a long time. I’d probably have to agree in some ways. A lot of things are very easy to automate, others not so much.

We’re certainly going to have researchers, but there may be fewer of them, and they will be doing slightly different things.

Going back to my example of autonomous cars for a minute again. I think it will take time for us to learn, improve and trust more in automation. At first autonomous cars will have human capability to take over at any time. It will be like cruise control is now. An accessory at first. Then we will move more and more toward trusting less and less in the individual human actors, and once we’ve got enough statistics on computers being safe we may even decide to take away the ability for humans to intervene in driving the car as a safety measure. They would have to reach a level of safety way beyond humans for this to happen though, probably 99.99% or more.

Unlike cars though, marketing research usually can’t kill you. So, we may well be comfortable with a far lower accuracy rate with AI here.  Anyway, it’s a nice problem to have I think.

ESOMAR: How do you think research participants will react towards bot researchers?

TomHCA: Theoretically they could work well. Realistically I’m a bit pessimistic. It seems the ability to use bots for spam, phishing and fraud in a global online wild west (it cracks me up how certain countries think they can control the web and make it safer), well it’s a problem no government or trade organization will be able to prevent from being used the wrong way.

I’m not too happy when I get a phone call or email about a survey now. But with the slower more human aspect, it seems it’s a little less dangerous, you have more time to feel comfortable with it. I guess I’m playing devil’s advocate here, but I think we already have so many ways to get various interesting data, I think I have time to wait RE bots. If they truly are going to be very useful and accepted, it will be proven in other industries way before marketing research.

But yes, theoretically it could work well. But then again, almost anything can look good in theory.

ESOMAR: How do you think clients will feel about the AI revolution in our industry?

TomHCA: So, we were recently asked to use OdinText to visualize what the 3,000 marketing research suppliers and clients thought about why certain companies were innovative or not in the 2017 GRIT Report. One of the analyses/visualizations we ran that I thought was most interesting showed the differences between why clients claimed a supplier was innovative vs. why suppliers said these firms were innovative.

I published the chart on the NGMR blog for those who are interested [ http://nextgenmr.com/grit-2017 ], and the differences couldn’t have been starker. Suppliers kept on using buzzwords like “technology”, “mobile” etc. whereas clients used real end result terms like “know how”, "speed" etc.

So I’d expect to see the same thing here. And certainly, as AI is applied and implemented as I said above, we’ll stop thinking about it as a buzzword, and just go back to talking about the end goal. Something will be faster and better and get you something extra; how it gets there doesn’t matter.

Most people have no idea how a gasoline engine works today. They just want a car that will look nice and get them there with comfort, reliability and speed.

After that it’s all marketing and brand positioning.

 

[Thanks for reading today. We’re very interested to hear your thoughts on AI as well. Feel free to leave questions or thoughts below, request info on OdinText here, or Tweet to us @OdinText]

Of Tears and Text Analytics

An OdinText User Story - Text Analytics Tips Guest Post (AI Meets VOC)

Today on the blog we have another first in a soon to be ongoing series. We’re inviting OdinText users to participate more on the Text Analytics Tips blog. Today we have Kelsy Saulsbury guest blogging. Kelsy is a relatively new user of OdinText though she’s jumped right in and is doing some very interesting work.

In her post she ponders the apropos topic, whether automation via artificial intelligence may make some tasks too easy, and what if anything might be lost by not having to read every customer comment verbatim.

 

Of Tears and Text Analytics By Kelsy Saulsbury Manager, Consumer Insights & Analytics

“Are you ok?” the woman sitting next to me on the plane asked.  “Yes, I’m fine,” I answered while wiping the tears from my eyes with my fingers.  “I’m just working,” I said.  She looked at me quizzically and went back to reading her book.

I had just spent the past eight hours in two airports and on two long flights, which might make anyone cry.  Yet the real reason for my tears was that I had been reading hundreds of open-end comments about why customers had decided to buy less from us or stop buying from us altogether.  Granted eight hours hand-coding open ends wasn’t the most accurate way to quantify the comments, but it did allow me to feel our customers’ pain from the death of a spouse to financial hardship with a lost job.  Other reasons for buying less food weren’t quite as sad — children off to college or eating out more after retirement and a lifetime of cooking.

I could also hear the frustration in their voices on the occasions when we let them down.  We failed to deliver when we said we would, leaving the dessert missing from a party.  They took off work to meet us, and we never showed.  Anger at time wasted.

Reading their stories allowed me to feel their pain and better share it with our marketing and operations teams.  However, I couldn’t accurately quantify the issues or easily tie them to other questions in the attrition study.  So this year when our attrition study came around, I utilized a text analytics tool (OdinText) for the text analysis of our open ends around why customers were buying less.

It took 1/10th of the time to see more accurately how many people talked about each issue.  It allowed me to better see how the issues clustered together and how they differed based on levels of overall satisfaction.  It was fast, relatively easy to do, and directly tied to other questions in our study.

I’ve seen the benefits of automation, yet I’m left wondering how we best take advantage of text analytics tools without losing the power of the emotion in the words behind the data.  I missed hearing and internalizing the pain in their voices.  I missed the tears and the urgency they created to improve our customers’ experience.

 

Kelsy Saulsbury Manager, Consumer Insights & Analytics Schwan's Company

 

A big thanks to Kelsy for sharing her thoughts on OdinText's Text Analytics Tips blog. We welcome your thoughts and questions in the comment section below.

If you’re an OdinText user and have a story to share please reach out. In the near future we’ll be sharing more user blog posts and case studies.

@OdinText

Text Analytics Identifies Globalization Impact on Culture

International Text Analytics Poll™ Explores 11 Cultures in 10 Countries and 8 Languages! [Part I]

When pundits declare that the western world is now in the throes of a globalization “backlash,” they’re generally referring to the reversal of decades of economic and trade policy, things like Brexit.

But what of other concerns typically associated with globalization? What about culture?

Specifically, there are those who argue that globalization will mean the end of cultures, that the various cultures of the world will over time dilute and blend until there is ultimately just one global melting pot culture.

They may be right.

When we think about culture, it’s often in terms of food, music, customs, etc., but it turns out that when you ask people in countries around the world to describe their own culture in their own words, one nearly universal and unexpected attribute rises to the top: diversity/multiculturalism.

In fact, multiculturalism/diversity was one of the primary and most frequently mentioned attributes used by over 15,500 people to describe 11 different cultures across 10 countries and eight languages!

Text Analytics on a Massive, Multilingual International Scale

Last week on this blog, we published the results of a Text Analytics Poll™ for the favorite movie of all time across six countries and five languages. The project generated a flood of inquiries.

Since everyone is so interested in what can be accomplished on an international scale, we increased the scope of this project significantly.

This time, we asked more than 15,500 people (at least n=1,500 per country) in 10 countries and eight languages the following:

“How would you explain <insert country> culture to someone who isn’t at all familiar with it?”

Then we ran their comments through OdinText, which identified the top 200 cultural markers or features from more than 15,500 text comments and also analyzed those comments for significant patterns of emotion.

How We Translated AND Analyzed the Data (In Less Than Two Hours)

Author’s note: If you’re not interested in methodology, please feel free to skip ahead to the results down below!

Many of you contacted us asking for more details last week, so I’ve provided some additional nuts and bolts here…

Step 1: Data Prep (Translation)

I usually limit total analytical time for any of these Text Analytics Poll™ projects to fewer than two hours. I admit that’s going to be a challenge today, as I’m looking at more than 15,500 comments across 11 cultures from 10 countries in eight languages.

The first challenge is translation. I happen to speak a few languages in addition to English, but in this case I’m faced with seven languages that I don’t understand well enough to analyze. If I did understand each of the languages, or were working with analysts who did, we could easily conduct the analysis in OdinText in the native form.

I’ll point out that while some corporations claim to be “global” in everything they do, in reality there is never enough language fluency at corporate to handle this type of analysis, so analyses are typically divvied up and entrusted to local divisions—a time-consuming and imperfect task, especially when the goal in this case is to make head-to-head comparisons across these countries.

Therefore translation is necessary. While less precise than human translation, machine translation lends itself quite well to a project like this and is more than sufficient for OdinText to identify patterns and even to determine which quotes should be of interest. Nothing has a better ROI. Case in point, it took two minutes to translate the data. For those keeping track, I’m at

Above we have an example of machine translated raw data vs. the original French from the multi-country movie analysis I conducted last week. In the case above I’m looking at all mentions of “La Ligne Verte,” a title OdinText identified as appearing frequently among comments from French respondents. I don’t speak French, so I prefer to work with machine translated data on the left, which translated “La Ligne Verte” literally to “The Green Line” –the French title for the U.S. movie “The Green Mile.”

Step 2: Topic Identification

Using the top-down/bottom-up approach we teach in OdinText training and which we’ve blogged about here before, we identify 200 or so topics/features for analysis. This is a semi-supervised approach, and so a human is involved.

Given this somewhat larger multi-country data set, I allowed about 45 minutes for this task, so we’re at 

Step 3: Artificial Intelligence and Structuring the Analysis

Structuring the analysis is the most important and the most difficult part of any project, especially an exploratory mission where you don’t know what you are looking for at the outset.

You may be surprised to know that artificial intelligence and advanced machine learning algorithms can be a lot less useful than one might think. They have a tendency to identify the obvious—the attribute/topic “tradition” in this case—or, in cases, the unexplainable. For instance, terms like “French,” “American,” “Japanese,” “Spanish,” etc., came up in responses to our question. These are, of course, very useful if you’re building an algorithm to predict where comments originate, for example, but they aren’t terribly illuminating for us here.

Examples of other topics auto identified as ‘of interest’ by our AI include “friendliness,” “relaxed/laid back,” “freedom,” and “equality fraternity liberty.” (You can probably guess where that last one came from.) Some of these other, less expected ones warrant a closer look and will be included in the analysis.

We could move right into an exhaustive analysis of each country, but I’m looking to quickly find any interesting patterns in this data, so I elect to use a quick visualization first.

Cultural Differences and Similarities Visualized

Cultural Differences and Similarities Visualized (A Few Key Descriptive Dimensions Added)

These visualizations (above) plot cultures that were described in more similar terms by people closer together and those that were described more differently further apart, yielding some interesting patterns. The USA, UK, Brazil, France and even Spain look quite similar. Two countries—Germany and Japan—cluster slightly away from this main bunch, but very close to each other. Then there are those that appear to be most dissimilar from the rest—Mexico, French- and English-speaking Canada, respectively, and Australia.
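To give a feel for what "described in more similar terms" can mean numerically, here is a minimal sketch that compares cultures by the cosine similarity of their topic-mention profiles. This is an assumption made for illustration; the post does not say how OdinText computes the layout, and the culture names, topics and mention rates below are hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(name, profiles):
    """Return the other culture whose topic profile is closest to `name`."""
    others = [other for other in profiles if other != name]
    return max(others, key=lambda other: cosine_similarity(profiles[name], profiles[other]))


# Hypothetical share of comments mentioning each topic, per culture.
topics = ["diversity", "tradition", "food", "politeness"]
profiles = {
    "Culture A": [0.30, 0.10, 0.20, 0.05],
    "Culture B": [0.28, 0.12, 0.18, 0.06],
    "Culture C": [0.05, 0.30, 0.10, 0.35],
}

for name in profiles:
    print(name, "is closest to", most_similar(name, profiles))
```

A plotting step (for example multidimensional scaling) would then place cultures with high pairwise similarity near each other, but that part is left out here.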

To my earlier question about whether or not globalization is having a homogenizing effect on cultures, it would appear so at a glance. We’ve noted that several countries cluster closely around the U.S. But look again—the U.S. appears to occupy the center of the cultural universe here! That’s no coincidence, I suspect, as U.S. culture could in many ways be considered the “melting pot” model and, as we saw last week, culture is a major U.S. export.

Analytical time to review multiple visualizations and decide that this is a repeating pattern was 10 minutes. Total analytical time =

Given that we have a full hour left (remember I did not want to spend more than two hours on this analysis), as a next step we conducted a little bottom-up work to look at what makes each country unique from the international aggregate/total and to see whether the pattern in the visualization makes sense.

Example: Why do Germany and Japan look so similar to OdinText?

A glance at the two charts below shows significant differences between how the Japanese and Germans describe their cultures. For instance, the Japanese were 11 times more likely than Germans to say their culture was something that needed to be experienced in order to be understood, and they were four times more likely than Germans to mention their history. They were also 14 times less likely to mention certain places of interest and three times more likely than Germans to mention food.

In contrast, Germans were 27 times more likely to mention beer and eight times more likely to describe their culture as rule-abiding and orderly. (Of course, this does not mean that Japanese culture is any less rule-abiding or orderly; rather, it suggests that for the Japanese these are not defining cultural characteristics.)
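A statement like "X times more likely to mention" boils down to a ratio of mention rates between two groups. The sketch below shows that arithmetic under simplifying assumptions: plain case-insensitive substring matching stands in for real topic coding, and the comments are made up for illustration, not taken from the poll.

```python
def mention_rate(comments, keyword):
    """Share of comments that contain the keyword (case-insensitive substring match)."""
    if not comments:
        return 0.0
    hits = sum(1 for comment in comments if keyword.lower() in comment.lower())
    return hits / len(comments)


def times_more_likely(group_a, group_b, keyword):
    """Ratio of group A's mention rate to group B's; None if B never mentions it."""
    rate_b = mention_rate(group_b, keyword)
    if rate_b == 0:
        return None  # the ratio is undefined when the comparison group has no mentions
    return mention_rate(group_a, keyword) / rate_b


# Hypothetical comments from two groups of respondents.
group_a = [
    "Good beer and punctual trains",
    "Beer gardens in summer",
    "We like rules and beer",
    "Lots of forests",
]
group_b = [
    "You have to experience it",
    "Long history and great food",
    "Polite people",
    "Craft beer is getting popular",
]

ratio = times_more_likely(group_a, group_b, "beer")
print(f"Group A is {ratio:.1f} times more likely to mention beer")
```

With small groups a ratio like this swings wildly on one or two comments, which is one reason sample sizes like the n=1,500 per country used here matter.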

Respondents from both countries were more likely than average to mention language, tradition, and politeness, BUT the similarities between these two cultures actually lie primarily in the extent to which they both differ from the other cultures sampled, notably by how infrequently certain features mentioned by people from other cultures appeared in comments from German and Japanese respondents.

Total Analytical Time =

This concludes Part 1 of our cultural safari. In Part 2 tomorrow we’ll take a deeper dive into each of the 11 cultures in our study individually, exploring how their members define themselves and the extent to which key cultural drivers differ from or are similar to the international aggregate. Stay tuned!

Tomorrow: Part II – Key Cultural Drivers in Their Own Words

@TomHCAnderson - @OdinText

PS. Have questions about today's post? Feel free to post a comment or request more info here.

About Tom H. C. Anderson

Tom H. C. Anderson is the founder and managing partner of OdinText, a venture-backed firm based in Stamford, CT whose eponymous, patented SaaS platform is used by Fortune 500 companies like Disney, Coca-Cola and Shell Oil to mine insights from complex, unstructured and mixed data. A recognized authority and pioneer in the field of text analytics with more than two decades of experience in market research, Anderson is the recipient of numerous awards for innovation from industry associations such as CASRO, ESOMAR and the ARF. He was named one of the "Four under 40" market research leaders by the American Marketing Association in 2010. He tweets under the handle @tomhcanderson.

 

Shop Talk on Research Trends: Our Interview with the Industry’s Top Pundit!

GreenBook Interview Covers Partnering, AI/Machine Learning and the Latest Insights Applications for Text Analytics

“We should be less worried about each other and more worried about the potential new entrants to this industry.”

That’s what I told GreenBook Blog Editor-in-Chief Leonard Murphy in an interview recently when he asked me about the trend toward partnering and collaboration between research providers.

It’s not often that one gets to talk shop at length with the industry’s top pundit, so Tim Lynch and I were delighted when Lenny invited us for a frank and broad-based discussion that covered some important ground, including:

  • Why partnering and collaboration among research companies is becoming a critically important factor in today’s marketplace;
  • What the buzz around AI and machine learning is really about and what researchers need to know;
  • How text analytics are being deployed in powerful and novel ways to produce insights that either were not accessible or couldn’t be obtained practically in the past.

Check out Lenny’s post about it here and have a look at the interview below:

 

Special thanks again to Lenny Murphy for a great interview and for your efforts to keep us all informed and to help us get better at what we do!

@TomHCAnderson  - @OdinText

P.S. Want to know more about anything we covered in the interview? Contact us here.

 


 

Why Machine Learning is Meaningless

Beware These Buzzwords! The Truth About "Machine Learning" and "Artificial Intelligence"

Machine learning, artificial intelligence, deep learning… Unless you’ve been living under a rock, chances are you’ve heard these terms before. Indeed, they seem to have become a must for market researchers.

Unfortunately, so many precise terms have never meant so little!

For computer scientists these terms entail highly technical algorithms and mathematical frameworks; to the layman they are synonyms; but as far as most of us should be concerned, increasingly, they are meaningless.

My engineers would severely chastise me if I used these words incorrectly—an easy mistake to make since there is technically no correct or incorrect way to use these terms, only strict and less strict definitions.

Nor, evidently, is there any regulation about how they’re used for marketing purposes.

(To simplify the rest of this blog post, let’s stick with the term “machine learning” as a catch-all.)

Add to this ambiguity the fact that no sane company would ever divulge the specifics underpinning their machine learning solution for fear of intellectual property theft. Still others may just as easily hide behind an IP claim.

Bottom line: It is simply impossible for clients to know what they are actually getting from companies that claim to offer machine learning unless the company is able and chooses to patent said algorithm.

It’s an environment that is ripe for unprincipled or outright deceitful marketing claims.

A Tale of Two Retailers

Not all machine learning capabilities are created equal. To illustrate, let’s consider two fictitious competing online retailers who use machine learning to increase their add-on sales:

  • The first retailer suggests other items that may be of interest to the shopper by randomly picking a few items from the same category as the item in the shopper’s cart.
  • The second retailer builds a complex model of the customer, incorporating spending habits, demographic information and historical visits, then correlates that information with millions of other shoppers who have a similar profile, and finally suggests a few items of potential interest by analyzing all of that data.

In this simplistic example, both retailers can claim they use machine learning to improve shoppers’ experiences, but clearly the second retailer employs a much more sophisticated approach. It’s simply a matter of the standard to which they adhere.
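To make the gap between the two retailers concrete, here is a minimal sketch of both approaches. It is a deliberately simplified stand-in: the second function uses only purchase-history overlap (Jaccard similarity) between shoppers rather than the spending, demographic and visit data described above, and the catalog, shopper names and function names are all hypothetical.

```python
import random

# Hypothetical catalog: item -> category.
CATALOG = {
    "tent": "camping",
    "sleeping bag": "camping",
    "camp stove": "camping",
    "novel": "books",
    "cookbook": "books",
}


def recommend_random(cart_item, catalog, k=2, seed=None):
    """Retailer 1: pick k random items from the same category as the cart item."""
    rng = random.Random(seed)
    category = catalog[cart_item]
    candidates = [item for item, cat in catalog.items()
                  if cat == category and item != cart_item]
    return rng.sample(candidates, min(k, len(candidates)))


def recommend_from_similar_shoppers(shopper, histories, k=2):
    """Retailer 2 (simplified): score unseen items by how similar their buyers are."""
    own = histories[shopper]
    scores = {}
    for other, items in histories.items():
        if other == shopper:
            continue
        union = own | items
        if not union:
            continue
        similarity = len(own & items) / len(union)  # Jaccard similarity
        for item in items - own:
            scores[item] = scores.get(item, 0.0) + similarity
    # Highest score first; ties broken alphabetically so the output is stable.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, score in ranked[:k] if score > 0]


# Hypothetical purchase histories.
histories = {
    "ann": {"tent", "sleeping bag"},
    "bob": {"tent", "sleeping bag", "camp stove"},
    "cat": {"novel", "cookbook"},
    "dan": {"tent", "lantern"},
}

print(recommend_random("tent", CATALOG, k=2, seed=0))
print(recommend_from_similar_shoppers("ann", histories, k=2))
```

Both functions return "recommendations", and both could be marketed under the same label, yet only the second one learns anything from shopper behavior.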

This is precisely what I’m seeing in the insights marketplace today.

At the last market research conference I attended, I was stunned by how many vendors—no matter what they were selling—claimed their product leveraged advanced machine learning and artificial intelligence.

Many of the products being sold would not even benefit from what I would classify as machine learning because the problems they are solving are so simple.

Why run these data through a supercomputer and subject them to very complicated algorithms only to arrive at the same conclusions you could come to with basic math?

Even if all these companies actually did what they claimed, in many cases it would be silly or wasteful.

Ignore Buzzwords, Focus on Results

In this unregulated, buzzword-heavy environment, I urge you to worry less about what it’s called and focus instead on how the technology solves problems and meets your needs.

At OdinText, we use advanced algorithms that would be classified as machine learning/AI, yet we refrain from using these buzzwords because they don’t really say anything.

Look instead for efficacy, real-world results and testimonials from clients who have actually used the tool.

And ALWAYS ask for a real-time demo with your ACTUAL data!

Yours truly,

@TomHCanderson

Ps. See firsthand how OdinText can help you learn what really matters to your customers and predict real behavior. Contact us for a demo using your own data here!

About Tom H. C. Anderson

Tom H. C. Anderson is the founder and managing partner of OdinText, a venture-backed firm based in Stamford, CT whose eponymous, patented SaaS platform is used by Fortune 500 companies like Disney, Coca-Cola and Shell Oil to mine insights from complex, unstructured and mixed data. A recognized authority and pioneer in the field of text analytics with more than two decades of experience in market research, Anderson is the recipient of numerous awards for innovation from industry associations such as CASRO, ESOMAR and the ARF. He was named one of the "Four under 40" market research leaders by the American Marketing Association in 2010. He tweets under the handle @tomhcanderson.

65 CEOs Share Thoughts on Insights

Insights Association’s Inaugural CEO Summit: Future Tied to Collaboration and Technology

I’m writing this at the Miami airport, having just finished a great three-day meeting of the minds at the new Insights Association’s first official event, the Marketing Research CEO Summit.

Though this event was formerly part of the Marketing Research Association (MRA), after the merger between the MRA and the Council of American Survey Research Organizations (CASRO) it is now part of the larger, brand-new Insights Association. This is also the reason I chose to attend the event for the first time this year. I, like many others, am eager for positive change in our industry and optimistically welcome new initiatives (as I mentioned in a post on the association’s founding earlier this month).

Steve Schlesinger, CEO of Schlesinger Associates, and Merrill Dubrow of M/A/R/C Research did a great job putting together and hosting the event.

While the obvious benefit of any event like this is the attendees and not the speakers, we also had some interesting and well-respected client guests, including Walmart’s Urvi Bhandari, Merck’s Lisa Courtade, Electrolux’s Brett Townsend and Humana’s Dhan Kashyap. Their very candid evaluations of how well the industry is delivering (hint: not even close to as well as we think) were worth the cost of attendance.

Getting back to the attendees, though: market researchers as a breed are a cautious bunch, and CEOs in any industry are likely to be “alphas.” Quickly gaining trust and enabling sharing among this audience of would-be competitors is not an easy task. This was made possible in part by a fun case study competition sponsored by La Quinta CEO Keith Cline, who also spoke at the event.

Another interesting aspect of the event was the Hot Seat interviews, wherein a handful of the CEOs in attendance were asked a series of tough and sometimes semi-personal questions. I was one of those selected for this impromptu exercise and was asked what I thought about various aspects of the future of marketing research, including digital/social (which I like to separate from other text analytics) and, of course, the topic of machine learning/AI, which seems to be on everyone’s mind. For that reason I’ve decided to do a short blog post on AI and machine learning later this week.

What I’d like to end this post with, though, is re-answering one of the questions, which I think Merrill indirectly asked me and which I was also asked by a couple of other attendees. I think the question is also related to the future of research: Do you think of yourself as a marketing research company CEO or a software CEO? [Prior to founding OdinText Inc. in 2015, I ran boutique research firm Anderson Analytics for 10 years.]

I admit it’s a tricky question, and obviously if I didn’t consider myself at least in part a marketing research CEO I wouldn’t have attended. Yet many of our software users definitely aren’t market researchers.

So here goes: I think we as an industry have an important skill set and an understanding of our clients that no outsider has, and I’m proud of this background. As other speakers, including ZappiStore’s CRO Ryan Barry and Dan Foreman of Hatted, pointed out, the future is not in resisting technology, nor is it necessarily in building your own technology, which can be time-consuming and wasteful. It’s about embracing technology, often by learning how to rent from or partner with technology experts, and adding what you are best at (often data and, just as importantly, consultative insights and strategy).

Several of the CEOs I spoke with separately admitted having tried various internal technology builds that either weren’t right or, in some cases, may have been right when the effort began but didn’t evolve quickly enough and so were outdated by the time they came to market.

Yet it was also quite clear to most of these CEOs that while it’s critical to watch out for new technology-oriented entrants into the market research space, more often than not these entrants simply do not have the knowledge necessary to deliver truly actionable insights. Companies like IBM Watson, for instance, certainly have a strong brand name in computers, but their offerings as plug-in APIs for marketing research are sorely lacking, to say the least.

The point is, knowledge and trust are what we have in good supply, both at the event and in our industry overall. The key to evolving is to remember the knowledge and best practices our industry was built on while being open to understanding outside technologies and ideas, yet resisting the urge to just try to copy them. Importantly, as Merrill Dubrow pointed out, there are tremendous benefits in overcoming your fear of collaborating and partnering with other research and technology companies.

This is the idea I’m most optimistic about coming away from the conference. I made several new friends at the event, and I welcome anyone who attended to reach out with any questions about text analytics and data mining software, or to discuss potential mutually beneficial relationships.

Until Next Year!

@TomHCAnderson

ABOUT ODINTEXT

OdinText is a patented SaaS (software-as-a-service) platform for advanced analytics. Fortune 500 companies such as Disney and Shell Oil use OdinText to mine insights from complex, unstructured text data. The technology is available through the venture-backed Stamford, CT firm of the same name founded by Tom H. C. Anderson, a recognized authority and pioneer in the field of text analytics with more than two decades of experience in market research. Anderson and OdinText have received numerous awards for innovation from industry associations such as ESOMAR, CASRO, the ARF and the American Marketing Association. He tweets under the handle @tomhcanderson. Request OdinText Info or a free demo here.

Why Communicating with Aliens is Easier than You Think – And What It Means for Your Company

The Movie “Arrival,” Text Analytics and Machine Translation

When I speak with prospective OdinText users who’ve been exposed to other text analytics software providers, I find they tend to mention and ask about things like POS tagging, taxonomies, ontologies, etc.

These terms come from linguistics, the discipline upon which many of the text analytics software platforms in the market today are predicated.

But you may be surprised to learn that as a basis for text analytics, linguistics is shockingly inefficient compared to approaches that rely on mathematics/statistics.

One of the most popular movies in theaters right now, “Arrival,” inadvertently makes this case rather well.

Understanding Alien Languages is Easy (Provided You’re Not a Linguist)

“Arrival” begins with a flock of spaceships touching down in locations around the world. Linguistics professor Louise Banks (Amy Adams) is then recruited to lead an elite team of experts in a race against time to find a way to communicate with the extraterrestrial visitors and avert a global war.

The film proceeds to build a lot of drama around a pretty minor problem of language analysis and translation (conveniently consuming several months during which the plot can thicken) when, in fact, the task of understanding an alien language like the one in the movie would be quite EASY.

I daresay in all modesty that I could have done this in a fraction of the time with OdinText and with a much smaller team than Adams’ character had!

It Only Takes a Few Words

In her first conversation with the aliens, Louise introduces herself by writing the word “human” on a little whiteboard she carries, to which the aliens respond by introducing themselves in their language.

After this initial exchange, in the real world, only a few more words would be necessary to start creating and applying a code book (a taxonomy or ontology in linguistics speak), which would allow one to quickly translate anything else said and to then communicate via a small, imperfect but highly effective vocabulary.

For example, a little later in the movie, one of the aliens tells Louise that another alien who is missing from their meeting that day is “in the death process,” which, of course, means the other alien is absent because he is dying.

Everyone in the audience gets what the alien means by “in the death process.” Indeed, communicating successfully with a small, imperfect vocabulary like this is far more efficient and reliable than one might assume. My two-year-old son and I are quite good at communicating in these sorts of two- or three-word phrases. And no part-of-speech tagging is necessary (nor would it be very helpful here).
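As a rough illustration of how little machinery a code book needs, here is a minimal Python sketch of word-by-word translation with no grammar at all. The symbol names and the `gloss` helper are hypothetical, invented for this example; they come neither from the film nor from OdinText.

```python
# Hypothetical code book: symbol names -> English glosses.
# The symbol names are invented for illustration; they are not from the film.
CODE_BOOK = {
    "glyph_a": "human",
    "glyph_b": "alien",
    "glyph_c": "death",
    "glyph_d": "process",
    "glyph_e": "in",
}


def gloss(message, code_book):
    """Translate a sequence of symbols word by word, ignoring grammar.

    Unknown symbols are kept in brackets so the gaps stay visible.
    """
    return " ".join(code_book.get(symbol, f"[{symbol}]") for symbol in message)


print(gloss(["glyph_b", "glyph_e", "glyph_c", "glyph_d"], CODE_BOOK))
print(gloss(["glyph_a", "glyph_z"], CODE_BOOK))
```

The output is rudimentary ("alien in death process"), but like the phrase in the movie it is perfectly understandable, and every newly decoded symbol extends the code book without any change to the logic.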

I’ll come back to this idea of small, imperfect but surprisingly efficient vocabularies in a bit. But first, let’s consider a related but more challenging matter: breaking code.

How the Allies Used Text Analytics to Break the German Code

Compared to translating an alien language, it would be only slightly more difficult (though honestly not that much more difficult) to use OdinText today to crack the Nazi Enigma code, the breaking of which helped the Allies win WWII.

Why more difficult? Because unlike the aliens in “Arrival,” who actually want the humans to learn their language in order to communicate, the Nazis wanted their encrypted language to stay indecipherable.

BENEDICT CUMBERBATCH stars in THE IMITATION GAME

In the 2014 movie “The Imitation Game,” Benedict Cumberbatch stars as Alan Turing, the genius British mathematician, logician, cryptologist and computer scientist who led the effort to crack the German code.

In contrast to “Arrival,” the drama in “The Imitation Game” centers on Turing’s determination to build a decryption machine, instead of attempting to decode Enigma by hand like every other scientist assigned to the task.

When his boss refuses to fund his machine’s construction, Turing writes to Churchill, who arranges the funding and names him team leader. Turing subsequently fires the key linguists from the project and the linguistic approach to this text analysis (i.e., code breaking) is chucked in favor of computational mathematics.

Turing’s machine is, of course, critical to the solution (though the technology is simple by today’s standards), but the real breakthrough happens when the scientists realize that the machine can be sped up by recognizing routinely used phrases like “Heil Hitler” (again providing a basic code frame or taxonomy).
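The value of a known phrase (a "crib") can be shown with a deliberately tiny Python sketch. To be clear about the assumptions: this uses a toy letter-shift cipher, not Enigma, and it is not how Turing's machine worked; the message and the `crack_with_crib` helper are invented for illustration. The point is only that an expected phrase lets a program reject wrong keys automatically instead of a person reading every candidate.

```python
import string

ALPHABET = string.ascii_uppercase


def shift(text, key):
    """Shift each capital letter by `key` positions; leave other characters alone."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            out.append(ch)
    return "".join(out)


def crack_with_crib(ciphertext, crib):
    """Try every key and keep only decryptions that contain the crib."""
    hits = []
    for key in range(26):
        candidate = shift(ciphertext, -key)
        if crib in candidate:
            hits.append((key, candidate))
    return hits


# Invented message, encrypted with a key the "codebreaker" does not know.
secret = shift("WEATHER REPORT CLEAR SKIES HEIL HITLER", 7)
print(crack_with_crib(secret, "HEIL HITLER"))
```

With only 26 possible keys the saving here is trivial, but the same filtering idea matters far more as the number of possible settings grows.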

The Turing Test: Did You Know You Were Talking to a Computer?

In computer engineering classes on artificial intelligence there is an oft-mentioned thought experiment called “The Chinese Room,” which is used to think about the differences between human and computer cognition. It’s often referenced when discussing the Turing Test, which assesses computer intelligence based on whether a human being can distinguish between a computer’s and a human being’s replies to the same questions.

Let’s go back now to my earlier point about a small taxonomy being sufficient for communication, keeping in mind that today’s far more powerful computers running Google Translate or OdinText can process unstructured text data in any language orders of magnitude faster than any human or Turing’s machine. With that in mind, I think The Chinese Room analogy is not just an interesting AI thought experiment, but a good way to explain why translating the alien language in “Arrival” should have been so much easier than the film made it out to be.

The Chinese Room

Imagine for a moment a room with no windows, only a door with a small mail slot.

In the room, we find an average English speaker recruited randomly off the street, someone without any advanced education or background in foreign languages or linguistics.

This person has been paid to spend the day in this room and given a code book for a “squiggly language” he/she has been tasked with translating. In the story, it’s typically Chinese, but it could be any foreign language with which the person is totally unfamiliar. Let’s assume Chinese to stay close to the original story.

After giving him/her this code book—basically an English-to-Chinese/Chinese-to-English dictionary—we tell this person that on occasion we may pass them a note written in Chinese and that they will need to use the code book to figure out what the message means in English. Likewise, if they need anything—water, food, bathroom break, etc.—they will need to pass the request in a note written in Chinese back through the mail slot to us.

Note that this person has ABSOLUTELY NO TRAINING in the syntax or grammar of Chinese. His/her notes may be rudimentary, but certainly they will still be understood.

What’s more, if a native Chinese speaker walked by and observed the notes coming out, they would probably assume that there was a Chinese speaker in the room.

Now, instead of a code book, suppose the person in the room was using a computer program like Google Translate or OdinText, which can instantaneously translate or otherwise process any number of words coming out of the room, making it even more likely that the Chinese-speaking passerby assumes the person in the room speaks Chinese.

Think about this the next time you’re wondering whether data translated by machine—which is so much faster and cheaper than human translation—is sufficient for text analytics purposes (i.e. understanding what hundreds or hundreds of thousands of humans are saying in some foreign language).

My strong belief is yes, definitely. Whether I’m looking at Swedish or Chinese, I’m always rather impressed by how on point today’s computer translation is, and how irrelevant any nuance is, especially at the aggregate level, which is usually where we need to be.
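Here is a minimal Python sketch of why awkward wording matters less at the aggregate level. It is an illustration under stated assumptions, not a description of how OdinText works: the code frame, the comments (worded clumsily on purpose to mimic machine translation) and the `topic_shares` helper are all invented.

```python
from collections import Counter

# Hypothetical code frame: topic -> keywords. Invented for illustration.
CODE_FRAME = {
    "price": {"price", "expensive", "cheap", "cost"},
    "service": {"service", "staff", "support"},
    "delivery": {"delivery", "shipping", "late"},
}

# Invented comments, worded awkwardly on purpose to mimic machine translation.
COMMENTS = [
    "the price it is too much expensive",
    "staff was friendly very",
    "delivery arrive late two days",
    "cost is high but service good",
    "shipping was of slowness",
]


def topic_shares(comments, code_frame):
    """Return the share of comments that mention each topic at least once."""
    counts = Counter()
    for comment in comments:
        words = set(comment.lower().split())
        for topic, keywords in code_frame.items():
            if words & keywords:
                counts[topic] += 1
    return {topic: counts[topic] / len(comments) for topic in code_frame}


print(topic_shares(COMMENTS, CODE_FRAME))
```

The grammar of every comment is broken, yet each one still lands in the right topic, so the shares per topic are unaffected by the clumsy phrasing.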

You don’t need a team of NASA scientists, nor a month to do it. You can have it ready by morning! The technology is already here!

@TomHCAnderson

P.S. To learn more about how OdinText can help you learn what really matters to your customers and predict real behavior here on Earth, please contact us or request a FREE demo using your own data here!

[Key Terms: AI, Artificial Intelligence, Machine Translation, Text Analytics, Linguistics, Computational Linguistics, Taxonomies, Ontologies, Natural Language Processing, NLP]

Tom H. C. Anderson OdinText Inc. www.odintext.com
