Posts tagged qualitative research
A New Trend in Qualitative Research

Almost Half of Market Researchers Are Doing Market Research Wrong! - My Interview with the QRCA (And a Quiet New Trend - Science-Based Qualitative).

Two years ago I shared some research-on-research about how market researchers view quantitative and qualitative research. I stated that almost half of researchers don’t understand what good data is. Some ‘Quallies’ (about 25% of market researchers surveyed) tend to rely on and work almost exclusively with comment data from extremely small samples. Conversely, there is a large group of ‘Quant Jockeys’ who, while working with larger, more representative sample sizes, purposefully avoid any unstructured data such as open-ended comments, either because they don’t want to deal with coding and analyzing it or because they don’t believe in its accuracy and ability to add to the research objectives. In my opinion both researcher groups have it totally wrong, and are doing a tremendous disservice to their companies and clients. Today, I’ll be focusing on just the first group, those who tend to rely primarily on qualitative research for decisions.

Today’s blog post relates to a recent interview I was asked to take part in by the QRCA’s (Qualitative Research Consultants Association) Views Magazine. When they contacted me, I told them that in most cases (with some exceptions) text analytics really isn’t a good fit for qualitative researchers, and asked if they were sure they wanted to include someone with that opinion in their magazine. I was told that yes, they were OK with sharing different viewpoints.

I’ll share a link to the full interview in the online version of the magazine at the bottom of this post. But before that, a few thoughts to explain my issues with qualitative data and how it’s often applied, as well as some of my recent experiences with qualitative researchers licensing our text analytics software, OdinText.

The Problem with Qualitative Research

IF Qual research were really used the way it’s often positioned, ‘as a way to inform quant research’, that would be OK. The fact of the matter is, though, that Qual often isn’t being used that way, but instead as an end in and of itself. Let me explain.

First, there is one exception to this rule of only using Qual as pilot feedback for Quant. If you had a product which was made specifically and only for US State Governors, then your total population is only N=50. It is of course highly unlikely that you would ever get the Governors of each and every US State to participate in any research (which would be a census of all governors), so if you were fortunate enough to have a group of, say, 5 Governors who were willing to give you feedback on your product or service, you obviously would and should hang on to and over-analyze every single comment they gave you.

IF, however, you have even a slightly more common mainstream product (take hamburgers as an example), and you rely on 5-10 focus groups of n=12 to determine how different parts of the USA (North East, Mid-West, South and West) like their burgers, and rather than feeding the findings directly into some quantitative research instrument with a greater sample you issue a ‘Report’ that you share with management, then you’ve probably just wasted a lot of time and money on some extremely inaccurate and dangerous findings. Yet surprisingly, this happens far more often than one would imagine.

Cognitive Dissonance Among Qual Researchers when Using OdinText

How do I know this, you may ask? Good text analytics software is really about data mining and pattern recognition. When I first launched OdinText we had a lot of inquiries from qualitative researchers who wanted some way to make their lives easier. After all, they had “a lot” of unstructured text comment data which was time-consuming for them to process, read, organize and analyze. Certainly, software made to “Analyze Text” must therefore be the answer to their problems.

The problem was that the majority of Qual researchers work with tiny projects/samples, interviews and groups between n=1 and n=12. Even if they do a couple of groups, as in the hamburger example I gave above, we’re still talking about a total of just around n=100 representing four or more regional groups of interest, and therefore fewer than n=25 per group. It is impossible to get meaningful, statistically comparable findings and identify real patterns between the key groups of interest in this case.

The Little Noticed Trend In Qual (Qual Data is Getting Bigger)

However, slowly over the past couple of years, for the first time I’ve seen a movement of some ‘Qualitative’ shops and researchers toward Quant. They have started working with larger data sets than before. In some cases it is because they have been pulled in to manage larger ongoing communities/boards, in some cases larger social media projects; in others, they have started mixing survey data with qual, or even better, employing qualitative techniques in quant research (think better open-ends in survey research).

For this reason, we now have a small but growing group of ‘former’ Qual researchers using OdinText. These researchers aren’t our typical mixed data or quantitative researchers, but qualitative researchers that are working with larger samples.

And guess what: “Qualitative” has nothing to do with whether data is in text or numeric format; instead it has everything to do with sample size. And so, perhaps unknowingly, these ‘Qualitative Researchers’ have taken the step across the line into Quantitative territory, where, often for the first time in their careers, statistics can actually be used. And it can be shocking!

My Experience with ‘Qualitative’ Researchers going Quant/using Text Analytics

Let me explain what I mean. Recently several researchers who come from a clear ‘Qual’ background have become users of our software, OdinText. The reason is that the amount of data they had was quickly getting “bigger than they were able to handle”. They believe they are still dealing with “Qualitative” data because most of it is text based, but because of the volume, they are now Quant researchers whether they know it or not (text versus numeric data is irrelevant).

Ironically, for this reason, we also see much smaller data sizes/projects than ever before being uploaded to the OdinText servers. No, not typically single focus groups with n=12 respondents, but still projects that are often right on the line between quant and qual (n=100+).

The discussions we’re having with these researchers as they begin to understand the quantitative implications of what they have been doing for years are interesting.

Let me preface this with the fact that I have a great amount of respect for the ‘Qualitative’ researchers that begin using OdinText. Ironically, the simple fact that we have mutually determined that an OdinText license is appropriate for them means that they are no longer ‘Qualitative’ researchers (as I explained earlier). They are in fact crossing the line into Quant territory, often for the first time in their careers.

The data may be primarily text based, though usually mixed, but there’s no doubt in their minds or ours that one of the most valuable aspects of the data is the customer commentary in the text, and this can be a strength.

The challenge lies in getting them to quickly accept and come to terms with quantitative/statistical analysis, and thereby also the importance of sample size.

What do you mean my sample is too small?

When you have licensed OdinText you can upload pretty much any data set you have. So even though they may have initially licensed OdinText to analyze some projects with say 3,000+ comments, there’s nothing to stop them from uploading that survey or set of focus groups with just n=150 or so.

Here’s where it sometimes gets interesting. A sample size of n=150 is right on the borderline. It depends on what you are trying to do with it of course. If half of your respondents are doctors (n=75) and half are nurses (n=75), then you may indeed be able to see some meaningful differences between these two groups in your data.

But what if these n=150 respondents are hamburger customers, and your objective was to understand the differences between the 4 US regions in the hamburger example I referenced earlier? Then you have about n=37 in each subgroup of interest, and you are likely to find very few, IF ANY, meaningful patterns or differences.
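To make the sample-size point concrete, here is a minimal back-of-the-envelope sketch in Python (my illustration, not OdinText output; the mention rates and terms are hypothetical) of the two-proportion z-test that any cross-group comparison of text mentions ultimately boils down to:

```python
import math

def two_prop_z(p1, n1, p2, n2):
    """z statistic for the difference between two groups' mention rates."""
    p = (p1 * n1 + p2 * n2) / (n1 + n2)              # pooled proportion
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))  # standard error under H0
    return (p1 - p2) / se

# Hypothetical: 30% of North-East respondents mention "juicy" vs. 15% in the South.
print(two_prop_z(0.30, 37, 0.15, 37))  # ~1.5: below 1.96, NOT significant at 95%
print(two_prop_z(0.30, 75, 0.15, 75))  # ~2.2: the same gap IS significant at n=75 per group
```

In other words, even a doubled mention rate can vanish into sampling noise at n=37 per region.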

Here’s where that cognitive dissonance can happen --- and the breakthroughs if we are lucky.

A former ‘Qual Researcher’ who has spent the last 15 years of their career making ‘management level recommendations’ on how to market burgers differently in different regions based on data like this, and who for the first time is looking at software which says that there are maybe just two or three small differences, or even worse, NO MEANINGFUL PATTERNS OR DIFFERENCES WHATSOEVER, in their data, may be in shock!

How can this be? They’ve analyzed data like this many times before, and they were always able to write a good report with lots of rich detailed examples of how North Eastern Hamburger consumers preferred this or that because of this and that. And here we are, looking at the same kind of data, and we realize, there is very little here other than completely subjective thoughts and quotes.

Opportunity for Change

This is where, to their credit, most of our users start to understand the quantitative nature of data analysis. Unlike the few ‘Quant Only Jockeys’ I referenced at the beginning of the article, they already understand that many of the best insights come from text data: free-form, unaided, non-leading, yet creative questions.

They only need to start thinking about their sample sizes before fielding a project. To understand the quantitative nature of sampling. To think about the handful of structured data points that they perhaps hadn’t thought much about in previous projects, and how these can be leveraged together with the unstructured data. They realize they need to start thinking about all this first, before the data has been collected and the project is nearly over and ready for the most important step, the analysis, where the rubber hits the road and garbage in really does mean garbage out.

If we’re lucky, they quickly understand: it’s not about Quant and Qual any more. It’s about Mixed Data; it’s about having the right data; it’s about having enough data to generate robust findings and then superior insights!

Final Thoughts on the Two Nearly Meaningless Terms ‘Quant’ and ‘Qual’

As I’ve said many times before, here and on the NGMR blog, the terms “Qualitative” and “Quantitative”, at least the way they are commonly used in marketing research, are already passé.

The future is Mixed Data. I’ve known this to be true for years, and almost all our patent claims involve this important concept. Our research shows time and time again, that when we use both structured and unstructured data in our analysis, models and predictions, the results are far more accurate.

For this reason we’ve been hard at work developing the first ever truly Mixed Data Analytics Platform. We’ll be officially launching it three months from now, but many of our current customers already have access. [For those who are interested in learning more or would like early access, you can inquire here: OdinText.com/Predict-What-Matters].

In the meantime, if you’re wondering whether you have enough data to warrant advanced mixed data and text analysis, check out the online version of the article in QRCA Views magazine here. Robin Wedewer at QRCA did an excellent job of asking some really pointed questions that forced me to answer more honestly and clearly than I might otherwise have.

I realize not everyone will agree with today’s post or my interview with QRCA, and I welcome your comments here. I only ask that you please read both the post above and the interview in QRCA before commenting solely based on the title of this post.

Thank you for reading. As always, I welcome questions publicly in the comments below or privately via LinkedIn or our inquiry form.

@TomHCAnderson

Don’t Miss the EVOLVE Virtual Roundtable on Monday at Noon!


The ability to understand decision-making and to predict actual behavior today is advancing rapidly, and much of it is coming out of qualitative research.

This Monday, Nov. 14 at 12:00 EST, I’ll be participating in a virtual roundtable discussion as part of The Evolve Virtual Conference, a one-week online event dedicated to exploring the innovations changing the game for online qualitative.

This is a great opportunity for you to get up to speed on the latest innovations in online qualitative without leaving your office!

Check out the roundtable here and learn more about EVOLVE here!

REGISTRATION is free.

Hope to “see” you there!

@TomHCAnderson

Tom H.C. Anderson | @OdinText


To learn more about how OdinText can help you understand what really matters to your customers and predict actual behavior,  please contact us or request a Free Demo here >

[NOTE: Tom H. C. Anderson is Founder of Next Generation Text Analytics software firm OdinText Inc. Click here for more Text Analytics Tips ]

Five Reasons to NEVER Design a Survey without a Comment Field

Marketing Research Confessions Part II - Researchers Say Open-Ends Are Critical!

My last post focused on the alarmingly high number of marketing researchers (~30%) who, as a matter of policy, either do not include a section for respondent comments (a.k.a. “open-ended” questions) in their surveys or who field surveys with a comment section but discard the responses.

The good news is that most researchers do, in fact, understand and appreciate the value of comment data from open-ended questions.

Indeed, many say feedback in consumers’ own words is indispensable.

Among researchers we recently polled:

  • 70% would NEVER launch a tracker without a comment field (66% for ad-hoc surveys)
  • 80% DO NOT agree that analyzing only a subset of the comment data is sufficient
  • 59% say comment data is AT LEAST as important as the numeric ratings data (and many state they are the most important data points)
  • 58% ALWAYS allocate time to analyze comment data after fielding

In Their Own Words: “Essential”

In contrast to the flippancy we saw in comments from those who don’t see any need for open-ended survey questions, researchers who value open-ends felt pretty strongly about them.

Consider these two verbatim responses, which encapsulate the general sentiment expressed by researchers in our survey:

“Absolutely ESSENTIAL. Without [customer comments] you can easily draw the wrong conclusion from the overall survey.”

“Open-ended questions are essential. There is no easy shortcut to getting at the nuanced answers and ‘ah-ha!’ findings present in written text.”

As it happens, respondents to our survey provided plenty of detailed and thoughtful responses to our open-ended questions.

We, of course, ran these responses through OdinText and our analysis identified five common reasons for researchers’ belief that comment data from open-ended questions is critically important.

So here’s why, ranked in ascending order by preponderance of mentions, and in their own words:

 Top Five Reasons to Always Include an Open-End

 

#5 Proxy for Quality & Fraud

“They are essential in sussing out fraud—in quality control.”

“For data quality to determine satisficing and fraudulent behavior”

“…to verify a reasonable level of engagement in the survey…”

 

#4 Understand the ‘Why’ Behind the Numbers

“Very beneficial when trying to identify cause and effect”

“Open ends are key to understand the meaning of all the other answers. They provide context, motivations, details. Market Research cannot survive without open ends”

“Extremely useful to understand what is truly driving decisions. In closed-end questions people tend to agree with statements that seem a reasonable, logical answer, even if they have not considered them before at all”

“It's so critical for me to understand WHY people choose the hard codes, or why they behave the way the big data says they behave. Inferences from quant data only get you so far - you need to hear it from the horse’s mouth...AT SCALE!”

“OEs are windows into the consumer thought process, and I find them invaluable in providing meaning when interpreting the closed-ended responses.”

 

#3 Freedom from Quant Limitations

“They allow respondents more freedom to answer a question how they want to—not limited to a list that might or might not be relevant.”

“Extremely important to gather data the respondent wants to convey but cannot in the limited context of closed ends.”

“Open-enders allow the respondent to give a full explanation without being constrained by pre-defined and pre-conceived codes and structures. With the use of modern text analytics tools these comments can be analyzed and classified with ease and greater accuracy as compared to previous manual processes.”

“…fixed answer options might be too narrow.  Product registration, satisfaction surveys and early product concept testing are the best candidates…”

“…allowing participants to comment on what's important to them”

 

#2 Avoiding Wrong Conclusions

“We code every single response, even on trackers [longitudinal data] where we have thousands of responses across 5 open-end questions… you can draw the wrong conclusion without open-ends. I've got lots of examples!”

“Essential - mitigate risk of (1) respondents misunderstanding questions and (2) analysts jumping to wrong conclusions and (3) allowing for learnings not included in closed-ended answer categories”

“Open ended if done correctly almost always generate more right results than closed ended.  Checking a box is cheap, but communicating an original thought is more valuable.”

 

#1 Unearthing Unknowns – What We Didn’t Know We Didn’t Know

“They can give rich, in-depth insights or raise awareness of unknown insights or concerns.”

“This info can prove valuable to the research in unexpected ways.”

“They are critical to capture the voice of the customer and provide a huge amount of insight that would otherwise be missed.”

“Extremely useful.  I design them to try and get to the unexpected reasons behind the closed-end data.”

“To capture thoughts and ideas, in their own words, the research may have missed.”

“It can give good complementary information. It can also give information about something the researcher missed in his other questions.”

“Highly useful. They allow the interviewee to offer unanticipated and often most valuable observations.”

 

Ps. Additional Reasons…

Although it didn’t make the top five, several researchers cited one other notable reason for valuing open-ended questions, summarized in the following comment:

“They provide the rich unaided insights that often are the most interesting to our clients”

 

Next Steps: How to Get Value from Open-Ended Questions

I think we’ve established that most researchers recognize the tremendous value of feedback from open-ended questions and the reasons why, but there’s more to be said on the subject.

Conducting good research takes knowledge and skill. I’ve spent the last decade working with unstructured data, and I will be among the first to admit that while the quality of the tools to tackle this data has radically improved, understanding what kind of analysis to undertake, and how to better ask the questions, is just as important as the technology.

Sadly, many researchers, and just about all text analytics firms I’ve run into, understand very little about these more explicit techniques for actually collecting better data.

Therefore I aim to devote at least one, if not more, posts over the next few weeks to delving into some of the problems in working with unstructured data brought up by the researchers in our survey.

Stay tuned!

@TomHCAnderson

 

Ignoring Customer Comments: A Disturbing Trend

One-Third of Researchers Think Survey Ratings Are All They Need

You’d be hard-pressed to find anyone who doesn’t think customer feedback matters, but it seems an alarming number of researchers don’t believe they really need to hear what people have to say!

 


In fact, almost a third of market researchers we recently polled either don’t give consumers the opportunity to comment or flat out ignore their responses.

  • 30% of researchers report they do not include an option for customer comments in longitudinal customer experience trackers because they “don’t want to deal with the coding/analysis.” Almost as many (34%) admit the same for ad hoc surveys.
  • 42% of researchers also admit launching surveys that contain an option for customer comments with no intention of doing anything with the comments they receive.

Customer Comments Aren’t Necessary?


Part of the problem—as the first bullet indicates—is that coding/analysis of responses to open-ended questions has historically been a time-consuming and labor-intensive process. (Happily, this is no longer the case.)

But a more troubling issue, it seems, is a widespread lack of recognition for the value of unstructured customer feedback, especially compared to quantitative survey data.

  • Almost half (41%) of researchers said actual voice-of-customer comments are of secondary importance to structured rating questions.
  • Of those who do read/analyze customer comments, 20% said it’s sufficient to just read/code a small subset of the comments rather than each and every one.

In short, we can conclude that many researchers omit or ignore customer comments because they believe they can get the same or better insights from quantitative ratings data.

This assumption is absolutely WRONG.

Misconception: Ratings Are Enough

I’ve posted on the serious problems with relying exclusively on quantitative data for insights before here.

But before I discovered text analytics, I used to be in the same camp as the researchers highlighted in our survey.

My first mistake was that I assumed I would always be able to frame the right questions and conceive of all possible relevant answers.

I also believed, naively, that respondents actually consider all questions equally and that the decimal-point differences in mean ratings from (frequently onerous) attribute batteries are meaningful, especially if we can apply a t-test and the 0.23% difference is deemed “significant” (even if only at a directional 80% confidence level).
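As a toy illustration of that trap (simulated data, not from any real study): with a large enough sample, even a trivial 0.02-point gap on a 5-point scale clears the “significance” bar, which says nothing about whether the difference matters.

```python
import math
import random
import statistics

random.seed(1)

# Two simulated attribute ratings whose true means differ by a meaningless 0.02 points.
n = 20_000
a = [random.gauss(4.10, 1.0) for _ in range(n)]
b = [random.gauss(4.12, 1.0) for _ in range(n)]

# Two-sample t statistic; with n this large it is effectively a z statistic.
se = math.sqrt(statistics.variance(a) / n + statistics.variance(b) / n)
t = (statistics.mean(b) - statistics.mean(a)) / se
print(f"t = {t:.2f}")  # expected t ~ 0.02 / 0.01 = 2, i.e. borderline "significant"
```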

Since then, I have found time and time again that nothing predicts actual customer behavior better than the comment data from a well-crafted open-end.

For a real world example, I invite you to have a look at the work we did with Jiffy Lube.

There are real dollars attached to what our customers can tell us if we let them use their own words. If you’re not letting them speak, your opportunity cost is probably much higher than you realize.

Thank you for your readership,

I look forward to your COMMENTS!

@TomHCAnderson

[PS. Over 200 marketing research professionals completed the survey in just the first week in field (statistics above), and the survey is still fielding here. What I was most impressed with so far was, ironically, the quality and thoughtfulness of the answers to the two open-ended questions. I will therefore be doing initial analysis and reporting here on the blog over the next few days, so come back soon for part II, and maybe even a part III, of the analysis of this very short but interesting survey of research professionals.]

Customer Satisfaction: What do satisfied vs. dissatisfied customers talk about?

Group Comparison Example - Text Analytics Tips by Gosia

In this post we are going to discuss one of the first questions most researchers tend to explore using OdinText: what do satisfied versus dissatisfied customers talk about? Many market researchers seek not only to find out what the entire population of their survey respondents mentions; it is even more critical for them to understand the strengths mentioned by customers who are happy and the problems mentioned by those who are less happy with the product or service.

To perform this kind of analysis you need to first identify “satisfied” and “dissatisfied” customers in your data. The best way to do it is based on a satisfaction or satisfaction-related metric, e.g., Overall Satisfaction or NPS (Net Promoter Score) Rating (i.e., likelihood to recommend). In this example, satisfied customers are going to be those who answered 4 – “Somewhat satisfied” or 5 – “Very satisfied” to the Overall Satisfaction question (scale 1-5). And dissatisfied customers are those who answered 1 – “Very dissatisfied” or 2 – “Somewhat dissatisfied”.

Next, you can compare the content of the comments provided by the two groups of customers (Group Comparison tab). I suggest you first select the frequency-of-occurrence statistic for your comparison. You can use a dictionary, or create your own issues that are meaningful to you, and see whether the two groups of customers discuss these issues with different frequency; or you can look at any differences in the frequency of the most commonly mentioned automatic terms (which OdinText has generated for you).
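For readers who like to see the logic spelled out, here is a rough sketch of the same top-2-box/bottom-2-box comparison in Python with pandas (the file name, column names and three-issue dictionary are all hypothetical; OdinText does this for you):

```python
import pandas as pd

# Hypothetical survey export: one row per respondent, with a 1-5 overall
# satisfaction rating and a free-text comment.
df = pd.read_csv("survey.csv")  # columns: overall_sat, comment

satisfied = df[df["overall_sat"] >= 4]     # top-2 box (4-5)
dissatisfied = df[df["overall_sat"] <= 2]  # bottom-2 box (1-2)

# A tiny illustrative "dictionary": each issue is a list of trigger terms.
ISSUES = {
    "wait time": ["wait", "slow", "queue"],
    "staff":     ["staff", "friendly", "rude"],
    "price":     ["price", "expensive", "cheap"],
}

def mention_rate(group, terms):
    """Share of a group's comments mentioning any of the issue's terms."""
    return group["comment"].str.contains("|".join(terms), case=False, na=False).mean()

for issue, terms in ISSUES.items():
    print(f"{issue}: satisfied {mention_rate(satisfied, terms):.1%}, "
          f"dissatisfied {mention_rate(dissatisfied, terms):.1%}")
```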

Figure 1. Frequency of issues mentioned by satisfied (Overall Satisfaction 4-5) versus dissatisfied (Overall Satisfaction 1-2) customers. Descending order of frequency for satisfied customers.

In the attached figure you can see a chart based on a simple group comparison using a dictionary of terms of a sample service company. There you go, lots of exciting insights to present to your colleagues based on a very quick analysis!

Gosia

Text Analytics Tips with Gosia

[NOTE: Gosia is a Data Scientist at OdinText Inc. Experienced in text mining and predictive analytics, she is a Ph.D. with extensive research experience in mass media’s influence on cognition, emotions, and behavior.  Please feel free to request additional information or an OdinText demo here.]

Beyond Sentiment - What Are Emotions, and Why Are They Useful to Analyze?
Text Analytics Tips by Gosia

Emotions - Revealing What Really Matters

Emotions are short-term intensive and subjective feelings directed at something or someone (e.g., fear, joy, sadness). They are different from moods, which last longer, but can be based on the same general feelings of fear, joy, or sadness.

3 Components of Emotion: Emotions result from arousal of the nervous system and consist of three components: subjective feeling (e.g., being scared), physiological response (e.g., a pounding heart), and behavioral response (e.g., screaming). Understanding human emotions is key in any area of research because emotions are one of the primary causes of behavior.

Moreover, emotions tend to reveal what really matters to people. Therefore, tracking primary emotions conveyed in text can have powerful marketing implications.

The Emotion Wheel - 8 Primary Emotions

OdinText can analyze virtually any psychological content in text, but primary attention has been paid to the power of the emotions conveyed in text.

8 Primary Emotions: OdinText tracks the following eight primary emotions: joy, trust, fear, surprise, sadness, disgust, anger, and anticipation (see attached figure; primary emotions in bold).

[Figure: The Emotion Wheel - eight primary emotions and their intensity variants]

Bipolar Nature: These primary emotions have a bipolar nature; joy is opposed to sadness, trust to disgust, fear to anger, and surprise to anticipation. Emotions in the blank spaces are mixtures of the two neighboring primary emotions.

Intensity: The color-intensity dimension suggests that each primary emotion can vary in intensity, with darker hues representing a stronger emotion (e.g., terror > fear) and lighter hues representing a weaker emotion (e.g., apprehension < fear). The analogy between the theory of emotions and the theory of color was adopted from the seminal work of Robert Plutchik in the 1980s. [All 32 emotions presented in the figure above are a basis for the OdinText Emotional Sentiment tracking metric.]
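To make the wheel’s structure concrete, here is a minimal Python sketch encoding Plutchik’s eight primary emotions, their bipolar opposites, and the standard weaker/stronger intensity variants from his model (this encodes the published theory, not OdinText’s internal implementation):

```python
# Each primary emotion with its bipolar opposite and its weaker/stronger
# intensity variants (e.g., apprehension < fear < terror), per Plutchik.
PLUTCHIK = {
    "joy":          {"opposite": "sadness",      "weaker": "serenity",     "stronger": "ecstasy"},
    "trust":        {"opposite": "disgust",      "weaker": "acceptance",   "stronger": "admiration"},
    "fear":         {"opposite": "anger",        "weaker": "apprehension", "stronger": "terror"},
    "surprise":     {"opposite": "anticipation", "weaker": "distraction",  "stronger": "amazement"},
    "sadness":      {"opposite": "joy",          "weaker": "pensiveness",  "stronger": "grief"},
    "disgust":      {"opposite": "trust",        "weaker": "boredom",      "stronger": "loathing"},
    "anger":        {"opposite": "fear",         "weaker": "annoyance",    "stronger": "rage"},
    "anticipation": {"opposite": "surprise",     "weaker": "interest",     "stronger": "vigilance"},
}

# The opposites are symmetric: joy <-> sadness, trust <-> disgust, and so on.
assert all(PLUTCHIK[PLUTCHIK[e]["opposite"]]["opposite"] == e for e in PLUTCHIK)
```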

Stay tuned for more tips giving details on each of the above emotions.

Gosia

Text Analytics Tips with Gosia

[NOTE: Gosia is a Data Scientist at OdinText Inc. Experienced in text mining and predictive analytics, she is a Ph.D. with extensive research experience in mass media’s influence on cognition, emotions, and behavior.]

Celebrating Four Consumer Insights Industry Awards for Text Analytics!

A Note of Reflection on Text Analytics and Thanks to the Marketing Research Industry

Just posting a short video today to celebrate our recent awards in an industry that has always been so near and dear to OdinText (and Anderson Analytics before that).

If you had told me 25 years ago, when I began my career in consumer insights, that one day I would be running my own text analytics software company, I might have laughed. But the field has changed so dramatically since I was a freshly-minted market researcher.

Back when I started out, online research did not exist. There was no social media. Tweeting was something only birds did. And teenagers bugged their parents for their own landlines.

As a young researcher, the primary way we collected information and gleaned insights from consumers was by asking them questions, but one of the most pronounced shifts in research today is that so much of the data at our disposal is collected passively.

Today, in addition to the tried-and-true qualitative and quantitative tools of our trade, we have oceans of data flooding into our organizations from non-traditional sources.

And so our job has become less about accumulating information from consumers, and more about connecting the dots and making sense of it all. That’s a pretty significant shift, I think.

And it’s a monumental challenge for those of us in consumer insights.

Many companies now are turning to data scientists for answers, but I would argue that the onus is on those of us in market research to find a way to use these data for competitive advantage.

I got into this because I was conducting advanced text analytics for clients and none of the tools available did what we needed them to do as research analysts.

I did not set out to be a software developer; I just wanted something that worked for my team and our clients. It needed to be fast and easy to use for people who are not data scientists.

We are honored and humbled to be part of such a smart, creative and closely knit community of professionals.  To be recognized by peers for contributing to our progress as a profession is enormously gratifying.

Thank you all again! We hope and plan to continue to innovate and give back to our industry!


Yours faithfully,

@TomHCAnderson @OdinText

Text analysis answers: Is the Quran really more violent than the Bible? (3 of 3)

by Tom H. C. Anderson

Part III: The Verdict

To recap…

President Obama in his State of the Union last week urged Congress and Americans to “reject any politics that target people because of race or religion”—clearly a rebuke of presidential candidate Donald Trump’s call for a ban on Muslims entering the United States.

This exchange, if you will, reflects a deeper and more controversial debate that has wended its way into not only mainstream politics but the national discourse: Is there something inherently and uniquely violent about Islam as a religion?

It’s an unpleasant discussion at best; nonetheless, it is occurring in living rooms, coffee shops, places of worship and academic institutions across the country and elsewhere in the world.

Academics of many stripes have interrogated the texts of the great religions and no doubt we’ll see more such endeavors in the service of one side or the other in this debate moving forward.

We thought it would be an interesting exercise to subject the primary books of these religions—arguably the core of their philosophy and tenets—to comparison using the advanced data mining technology that Fortune 500 corporations, government agencies and other institutions routinely use to comb through large sets of unstructured text to identify patterns and uncover insights.

So, we’ve conducted a surface-level comparative analysis of the Quran and the Old and New Testaments using OdinText to uncover with as little bias as possible the extent to which any of these texts is qualitatively and/or quantitatively distinct from the others using metrics associated with violence, love and so on.

Again, some qualifiers…

First, I want to make very clear that we have not set out to prove or disprove that Islam is more violent than other religions.

Moreover, we realize that the Old and New Testaments and the Quran are neither the only literature in Islam, Christianity and Judaism, nor do they constitute the sum of these religions’ teachings and protocols.

I must also reemphasize that this analysis is superficial and the findings are by no means intended to be conclusive. Ours is a 30,000-ft, cursory view of three texts: the Quran and the Old and New Testaments, respectively.

Lastly, we recognize that this is a deeply sensitive topic and hope that no one is offended by this exercise.

 

Analysis Step: Similarities and Dissimilarities

Author’s note: For more details about the data sources and methodology, please see Part I of this series.

In Part II of the series, I shared the results of our initial text analysis for sentiment—positive and negative—and then broke that down further across eight primary human emotion categories: Joy, Anticipation, Anger, Disgust, Sadness, Surprise, Fear/Anxiety and Trust.

The analysis determined that of the three texts, the Old Testament was the “angriest,” which obviously does not appear to support an argument that the Quran is an especially violent text relative to the others.

The next step was, again staying at a very high level, to look at the terms frequently mentioned in the texts to see what, if anything, these three texts share and where they differ.

Similarity Plot

[Figure: Similarity plot of frequently mentioned terms across the three texts; color coding represents sentiment]

This is yet another iterative way to explore the data from a Bottom-Up data-driven approach and identify key areas for more in-depth text analysis.

For instance—and not surprisingly—“Jesus” is the most unique and frequently mentioned term in the New Testament, and when he is mentioned, he is mentioned positively (color coding represents sentiment).

“Jesus” is also mentioned a few times in the Quran, and, for obvious reasons, not mentioned at all in the Old Testament. But when “Jesus” is mentioned in the New Testament, terms that are more common in the Old Testament—such as “God” and “Lord”—often appear with his name; therefore the placement of “Jesus” on the map above, though definitely most closely associated with the New Testament, is still more closely related to the Old Testament than the Quran because these terms appear more often in the former.

Similarly, it may be surprising to some that “Israel” is mentioned more often in the Quran than the New Testament, and so the Quran and the Old Testament are more textually similar in this respect.

So…Is the Quran really more violent than the Old and New Testaments?

Old Testament is Most Violent

A look into the verbatim text suggests that the content in the Quran is not more violent than its Judeo-Christian counterparts. In fact, of the three texts, the content in the Old Testament appears to be the most violent.

Killing and destruction are referenced slightly more often in the New Testament than in the Quran (2.8% vs. 2.1%), but the Old Testament clearly leads—more than twice that of the Quran—in mentions of destruction and killing (5.3%).

New Testament Highest in ‘Love’, Quran Highest in ‘Mercy’

The concept of ‘Love’ is more often mentioned in the New Testament (3.0%) than either the Old Testament (1.9%) or the Quran (1.26%).

But the concept of ‘Forgiveness/Grace’ actually occurs more often in the Quran (6.3%) than the New Testament (2.9%) or the Old Testament (0.7%). This is partly because references to “Allah” in the Quran are frequently accompanied by “The Merciful.” Some might dismiss this as a tag or title, but we believe it’s meaningful because mercy was chosen above other attributes like “Almighty” that are arguably more closely associated with deities.
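For the curious, the mechanics behind percentages like these are straightforward dictionary matching: the share of verses mentioning at least one term from a concept’s term list. A minimal sketch follows (the mini-dictionaries and sample verses below are purely illustrative, not the ones used in this analysis):

```python
import re

def concept_frequency(verses, terms):
    """Percent of verses mentioning at least one term from a concept's dictionary."""
    pattern = re.compile("|".join(terms), re.IGNORECASE)
    return 100.0 * sum(1 for v in verses if pattern.search(v)) / len(verses)

# Illustrative mini-dictionaries; the real analysis used larger term lists.
CONCEPTS = {
    "destruction/killing": ["kill", "destroy", "slay", "slaughter"],
    "love":                ["love", "beloved"],
    "forgiveness/grace":   ["forgive", "mercy", "merciful", "grace"],
}

verses = [  # placeholders; in practice, load each text as a list of its verses
    "The Lord is merciful and gracious.",
    "They destroyed the city and all that was in it.",
]
for concept, terms in CONCEPTS.items():
    print(f"{concept}: {concept_frequency(verses, terms):.1f}%")
```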


‘Belief/Faith’, ‘Non-Members’ and ‘Enemies’

A key difference emerged immediately among the three texts around the concept of ‘Faith/Belief’.

Here the Quran leads with references to ‘believing’ (7.6%), followed by the New Testament (4.8%) and the Old Testament a distant third (0.2%).

Taken a step further, OdinText uncovered what appears to be a significant difference with regard to the extent to which the texts distinguish between ‘members’ and ‘non-members’.

Both the Old and New Testaments use the term “gentile” to signify those who are not Jewish, but the Quran is somewhat distinct in referencing the concept of the ‘Unbeliever’ (e.g.,“disbelievers,” “disbelieve,” “unbeliever,” “rejectors,” etc.).

And in two instances, the ‘Unbeliever’ is mentioned together with the term “enemy”:

“And when you journey in the earth, there is no blame on you if you shorten the prayer, if you fear that those who disbelieve will give you trouble. Surely the disbelievers are an open enemy to you”

 An-Nisa 4:101

“If they overcome you, they will be your enemies, and will stretch forth their hands and their tongues towards you with evil, and they desire that you may disbelieve”

Al-Mumtahina 60:2

That said, the concept of “Enemies” actually appears most often in the Old Testament (1.8%).

And while the concept of “Enemies” occurs more often in the Quran than in the New Testament (0.7% vs 0.5%, respectively), there is extremely little difference in how they are discussed (i.e., who and how to deal with them) with one exception: the Quran is slightly more likely than the New Testament to mention “the Devil” or “evil” as being an enemy (.2% vs 0.1%).

Conclusion

While A LOT MORE can be done with text analytics than what we’ve accomplished here, it appears safe to conclude that some commonly-held assumptions about and perceptions of these texts may not necessarily hold true.

Those who have not read or are not fairly familiar with the content of all three texts may be surprised to learn that no, the Quran is not really more violent than its Judeo-Christian counterparts.

Personally, I’ll admit that I was a bit surprised that the concept of ‘Mercy’ was most prevalent in the Quran; I expected that the New Testament would rank highest there, as it did in the concept of ‘Love’.

Overall, the three texts rated similarly in terms of positive and negative sentiment, as well, but from an emotional read, the Quran and the New Testament also appear more similar to one another than either of them is to the significantly “angrier” Old Testament.

Of course, we’ve only scratched the surface here. A deep analysis of unstructured data of this complexity requires contextual knowledge, and, of course, some higher level judgment and interpretation.

That being said, I think this exercise demonstrates how advanced text analytics and data mining technology may be applied to answer questions or make inquiries objectively and consistently outside of the sphere of conventional business intelligence for which our clients rely on OdinText.

I hope you found this project as interesting as I did and I welcome your thoughts.

Yours fondly,

Tom @OdinText


Text Analytics Tips

Text Analytics Tips, with your Hosts Tom & Gosia: Introductory Post

Today, we’re blogging to let you know about a new series of posts starting in January 2016 called ‘Text Analytics Tips’. This will be an ongoing series and our main goal is to help marketers understand text analytics better.

We realize Text Analytics is a subject with incredibly high awareness, yet sadly also a subject with many misconceptions.

The first generation of text analytics vendors over-hyped the importance of sentiment as a tool, as well as ‘social media’ as a data source, often preferring the even vaguer term ‘Big Data’ (usually just referring to tweets). They offered no evidence of the value of either, and they usually ignored the much richer techniques and sources of data for text analysis. Little to no information or training is offered on how to actually gain useful insights via text analytics.

What are some of the biggest misconceptions in text analytics?

  1. “Text Analytics is Qualitative Research”

FALSE – Text Analytics IS NOT qualitative. Text Analytics = Text Mining = Data Mining = Pattern Recognition = Math/Stats/Quant Research

  2. “It’s Automatic (artificial intelligence), you just press a button and look at the report/wordcloud”

FALSE – Text Analytics is a powerful technique made possible thanks to tremendous processing power. It can be easy if using the right tool, but just like any other powerful analytical tools, it is limited by the quality of your data and the resourcefulness and skill of the analyst.

  3. “Text Analytics is a Luxury” (i.e. structured data analysis is of primary importance and unstructured data is an extra)

FALSE – Nothing could be further from the truth. In our experience, usually when there is text data available, it almost always outperforms standard available quant data in terms of explaining and/or predicting the outcome of interest!

There are several other text analytics misconceptions of course and we hope to cover many of them as well.

While various OdinText employees and clients may be posting in the ‘Text Analytics Tips’ series over time, Senior Data Scientist, Gosia, and our Founder, Tom, have volunteered to post on a more regular basis…well, not so much volunteered as drawing the shortest straw (our developers made it clear that “Engineers don’t do blog posts!”).

Kidding aside, we really value education at OdinText, and it is our goal to make sure OdinText users become proficient in text analytics.

Though Text Analytics, and OdinText in particular, are very powerful tools, we will aim to keep these posts light, fun yet interesting and insightful. If you’ve just started using OdinText or are interested in applied text analytics in general, these posts are certainly a good start for you.

During this long-running series we’ll be posting tips, interviews, and various fun short analyses. Please come back in January for our first post, which will deal with the analysis of a very simple unstructured survey question.

Of course, if you’re interested in more info on OdinText, no need to wait, just fill out our short Request Info form.

Happy New Year!

Your friends @OdinText


[NOTE: Tom is Founder and CEO of OdinText Inc. A long-time champion of text mining, in 2005 he founded Anderson Analytics LLC, the first consumer insights/marketing research consultancy focused on text analytics. He is a frequent speaker and data science guest lecturer at university and research industry events.

Gosia is a Senior Data Scientist at OdinText Inc. A Ph.D. with extensive experience in content analytics, especially psychological content analysis (i.e. sentiment analysis and emotion in text), as well as predictive analytics using unstructured data, she is fluent in German, Polish and Spanish.]

 

Text Analytics for 2015 – Are You Ready?

OdinText SaaS Founder Tom H. C. Anderson is on a mission to educate market researchers about text analytics  [Interview Reposted from Greenbook]

Judging from the growth of interest in text analytics tracked in GRIT each year, those not using text analytics in market research will soon be a minority. But still, is text analytics for everyone?

Today on the blog I’m very pleased to be talking to text analytics pioneer Tom Anderson, the Founder and CEO of Anderson Analytics, which develops one of the leading Text Analytics software platforms designed specifically for the market research field, OdinText.

Tom’s firm was one of the first to leverage text analytics in the consumer insights industry, and they have remained a leader in the space, presenting case studies at a variety of events every year on how companies like Disney and Shell Oil are leveraging text analytics to produce remarkably impactful insights.

Lenny: Tom, thanks for taking the time to chat. Let’s dive right in! I think that you, probably more so than anyone else in the MR space, have witnessed the tremendous growth of text analytics within the past few years. It’s an area we post about often here on GreenBook Blog, and of course track via GRIT, but I wonder, is it really the panacea some would have us believe?

Tom: Depends on what you mean by panacea. If you think about it as a solution to dealing with one of the most important types of data we collect, then yes, it can and should be viewed exactly that way. On the other hand, it can only be as meaningful and powerful as the data you have available to use it on.

Lenny: Interesting, so I think what you’re saying is that it depends on what kind of data you have. What kind of data then is most useful, and which is not at all useful?

Tom: It’s hard to give a one size fits all rule. I’m most often asked about size of data. We have clients who use OdinText to analyze millions of records across multiple languages, on the other hand we have other clients who use it on small concept tests. I think it is helpful though to keep in mind that Text Analytics = Text Mining = Data Mining, and that data mining is all about pattern recognition. So if you are talking about interviews with five people, well since you don’t have a lot of data there’s not really going to be many patterns to discover.

Lenny: Good point! I’ve been really impressed with the case studies you’ve released in the past year or two on how clients have been using your software. One in particular was the NPS study with Shell Oil. A lot of researchers (and more importantly CMOs) really believed in the Net Promoter Score before that case study. Are those kinds of insights possible with social media data as well?

Tom: Thanks Lenny. I like to say that “not all data are created equal”. Social media is just one type of data that our clients analyze, often there is far more interesting data to analyze. It seems that everyone thinks they should be using text analytics, and often they seem to think all it can be used for is social media data. I’ve made it an early 2015 new year’s resolution to try to help educate as many market researchers as I can about the value of other text data.

Lenny: Is the situation any different than it was last year?

Tom: Awareness of text analytics has grown tremendously, but knowledge about it has not kept up. We’re trying to offer free mini consultations with companies to help them understand exactly what (if any) data they have are good candidates for text analytics.

Lenny: What sources of data, if any, don’t you feel text analytics should be used on?

Tom: It seems the hype cycle has been focused on social media data, but our experience is that these tools can often be applied much more effectively to a variety of other sources of data.

However, we often get questions about IDI (In-Depth Interview) and focus group data. With this smaller-scale qualitative data, while text analytics could theoretically help you discover things like emotions, there aren’t really many patterns to find because the data is so small. So we usually counsel against using text analytics for qual, in part due to the lower ROI.

Often it’s about helping our clients take an inventory around what data they have, and help them understand where if at all text analytics makes sense.

Many times we find that a client really doesn’t have enough text data to warrant text analytics. However, this is sad in cases where we also find out they run a considerable number of ad-hoc surveys and/or even longitudinal trackers that go out to tens of thousands of customers, and they’ve purposefully decided to exclude open ends because they don’t want to deal with looking at them later. Human coding is a real pain: it takes a long time, and it is inaccurate and expensive, so I understand their sentiment.

But this is awful in my opinion. Even if you aren’t going to do anything with the data right now, an open ended question is really the only question every single customer who takes a survey is willing and able to answer. We usually convince them to start collecting them.

Lenny: Do you have any other advice about how to best work with open ends?


Tom: Well, we find that our clients who start using OdinText end up completely changing how they leverage open ends. Usually they get far wiser about their survey real estate and end up asking both fewer closed-ended AND fewer open-ended questions. It’s like a light bulb goes off, and everything they learned about survey research is questioned.

Lenny: Thanks Tom. Well I love what your firm is doing to help companies do some really interesting things that I don’t think could have been done with any other traditional research techniques.

Tom: Thanks for having me Lenny. I know a lot of our clients find your blog useful and interesting.

If any of your readers want a free expert opinion on whether or not text analytics makes sense for them, we’re happy to talk to them about it. Best way to do so is probably to hit the info request button on our site, but I always try my best to respond directly to anyone who reaches out to me personally on LinkedIn as well.

Lenny: Thanks Tom, always a pleasure to chat with you!

For readers interested in hearing more of Tom’s thoughts on Text Analytics in market research, here are two videos from IIeX Atlanta earlier this year that are chock full of good information:

Panel: The Great Methodology Debate: Which Approaches Really Deliver on Client Needs?

Discussing the Future of Text Analytics with Tom Anderson of Odin Text