Text Analysis of What People Say in Their Own Words Reveals More Than Multiple-Choice Surveys
It’s been just over a week since President Trump issued his controversial immigration order, and the ban continues to dominate the news and social media.
But while the fate of Executive Order 13769—“Protecting the Nation from Foreign Terrorist Entry into the United States”—is being hashed out in federal court, another fierce battle is being waged in the court of public opinion.
In a stampede to assess where the American people stand on this issue, the news networks have rolled out a parade of polls. And once again, pundits on both sides of the issue have called the accuracy of polling data into question.
Notably, on Monday morning the president himself tweeted the following:
Any negative polls are fake news, just like the CNN, ABC, NBC polls in the election. Sorry, people want border security and extreme vetting.
— Donald J. Trump (@realDonaldTrump) February 6, 2017
Majority Flips Depending on the Poll
It’s easy to question the accuracy of polls when they don’t agree.
Although on the whole these polls all indicate that support is pretty evenly divided on the issue, the all-important sound bite of where the majority of Americans stand on the Trump immigration moratorium flips depending on the source:
NBC ran with an Ipsos/Reuters poll that found a plurality of Americans (49% vs. 41%) support the ban.
Fox News went with similar results from a Quinnipiac University poll (48% in favor vs. 42% opposed).
CNN publicized results from an ORC Poll with the majority opposed to the ban (53% vs. 47%).
A widely reported Gallup poll found the majority of Americans oppose the order (55% to 42%).
There are a number of possible reasons for these differences, of course. It could be the way the question was framed (as suggested in this Washington Post column); it could be the timing (much has transpired and been said between the dates these polls were taken); maybe the culprit is sampling; perhaps modality played a part (some polls were done online, others by phone with an interviewer), etc.
My guess is that all of these factors to varying degrees account for the differences, but the one thing all of these polls share is that the instrument was quantitative.
So, I decided to see what if anything happens when we try to “unstructure” this question, which seemingly lends itself so perfectly to a multiple-choice format. How would an open-ended version of the same question compare with the results from the structured version? Would it add anything of value?
Part I: A Multiple-Choice Benchmark
The first thing we did was to run a quantitative poll as a comparator using a U.S. online nationally representative sample* of n=1,531 (a larger sample, by the way, than any of the aforementioned polls used).
In carefully considering how the question was framed in the other polls and how it’s being discussed in the media, we decided on the following wording:
“Q. How do you personally feel about Trump's latest Executive Order 13769 ‘Protecting the Nation from Foreign Terrorist Entry into the United States’ aka ‘A Muslim Ban’”?
We also went with the simplest and most straightforward closed-ended Likert scale—a standard five-point agreement scale. Below are the results:
Given a five-point scale, the most popular answer by respondents (36%) was “strongly disagree.” Interestingly, the least popular choice was “somewhat disagree” (6.6%).
Collapsing “strongly” and “somewhat” (see chart below), we found that the share of Americans who disagree with Trump’s Executive Order (43%) is 4 percentage points higher than the share who agree with it (39%). A sizeable number (18%) indicated they aren’t sure/don’t know.
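One common way to gauge whether a gap like this exceeds sampling noise is a z-score for the difference between two answer shares drawn from the same sample. The sketch below is illustrative only: it assumes simple random sampling, uses the rounded percentages reported above, and ignores any weighting applied to the data. The function name `gap_z_score` is hypothetical.

```python
import math

def gap_z_score(p_a: float, p_b: float, n: int) -> float:
    """z-score for the gap between two answer shares from one sample.

    Both shares come from the same multinomial sample, so they are
    negatively correlated. The variance of (p_a - p_b) is
    (p_a + p_b - (p_a - p_b)**2) / n.
    """
    diff = p_a - p_b
    std_err = math.sqrt((p_a + p_b - diff ** 2) / n)
    return diff / std_err

# Rounded shares from the collapsed scale: 43% disagree, 39% agree, n = 1,531.
z = gap_z_score(0.43, 0.39, 1531)
print(f"z = {z:.2f}")
```

An absolute z value above roughly 1.96 corresponds to the conventional 95% threshold for a two-sided test.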
Will It Unstructure? - A Text Analytics Poll™
Next, we asked another 1,500 respondents from the same nationally representative U.S. online source* EXACTLY the same question, but instead of providing choices for them to select from, we asked them to reply in an open-ended comment box in their own words.
We ran the resulting comments through OdinText, with the following initial results:
As you can see, the results from the unstructured responses were remarkably close to those from the structured question. In fact, the open-ended responses suggest Americans are slightly closer to equally divided on the issue, though slightly more disagree (a statistically significant percentage given the sample size).
This, however, is where the similarities between unstructured and structured data end.
While there is nothing more to be done with the Likert scale data, the unstructured question data analysis has just begun…
Low-Incidence Insights are Hardly Incidental
It’s worth noting here that OdinText was able to identify and quantify many important but low-incidence insights, both positive and negative, that would have been treated as outliers in a limited code-base and dismissed by human coders:
“Just Temporary” (0.5%)
“Just Certain/Specific Countries” (0.9%)
“Not a Muslim Ban/Stop Calling it that” (2.9%)
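To make the idea of quantifying low-incidence themes concrete, here is a deliberately simple sketch that tags comments with keyword rules and reports each theme's share. It is not how OdinText works; the patterns, theme names and most of the sample comments are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical keyword rules, one per theme. Real text analytics software
# uses far richer methods than simple pattern matching.
THEME_PATTERNS = {
    "just_temporary": re.compile(r"\btemporar(y|ily)\b", re.IGNORECASE),
    "specific_countries": re.compile(r"\b(certain|specific)\s+countries\b", re.IGNORECASE),
    "not_a_muslim_ban": re.compile(r"\bnot\s+a\s+muslim\s+ban\b", re.IGNORECASE),
}

def theme_incidence(comments):
    """Return the share of comments that mention each theme."""
    counts = Counter()
    for comment in comments:
        for theme, pattern in THEME_PATTERNS.items():
            if pattern.search(comment):
                counts[theme] += 1
    total = len(comments)
    if total == 0:
        return {}
    return {theme: counts[theme] / total for theme in THEME_PATTERNS}

# Illustrative comments only, not the survey data.
sample_comments = [
    "I think it is good, as long as it is temporary.",
    "It only applies to certain countries.",
    "It is not a Muslim ban, stop calling it that.",
    "not my circus",
]
shares = theme_incidence(sample_comments)
for theme, share in shares.items():
    print(f"{theme}: {share:.1%}")
```

Because every comment is checked against every rule, a theme mentioned by only a handful of respondents still gets counted rather than being folded into an "other" bucket.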
An Emotionally-Charged Policy
It shouldn’t come as a surprise to anyone that emotions around this particular policy run exceptionally high.
OdinText quickly quantified the emotions expressed in people’s comments, and you can see that while there certainly is a lot of anger—negative comments are spread across anger, fear/anxiety and sadness—there is also a significant amount of joy.
What the heck does “joy” entail, you ask? It means that enough people expressed unbridled enthusiasm for the policy along the lines of, “I love it!” or “It’s about time!” or “Finally, a president who makes good on his campaign promises!”
Understanding the Why Behind People’s Positions
Last, but certainly not least, asking the same question in an open-ended format where respondents can reply in their own words enables us to also understand why people feel the way they do.
We can then quantify those sentiments using text analytics and see the results in context in a way that would not have been possible using a multiple-choice format.
Here are a few examples from those who disagree with the order:
“Just plain wrong. It scored points with his base, but it made all Americans look heartless and xenophobic in the eyes of the world.”
“Absolutely and unequivocally unconstitutional. The foundation, literally the reason the first European settlers came to this land, was to escape religious persecution.”
“I don't like and it was poorly thought out. I understand the need for vetting, but this was an absolute mess.”
“I think it is an overly confident action that will do more harm than good.”
“I understand that Trump's intentions mean well, but his order is just discriminating. I fear that war is among us, and although I try my best to stay neutral, it's difficult to support his actions.”
Here are a few from those who agree:
“I feel it could have been handled better but I agree. Let’s make sure they are here documented correctly and backgrounds thoroughly checked.”
“I feel sometimes things need to be done to demonstrate seriousness. I do feel bad for the law abiding that it affects.”
“Initially I thought it was ridiculous, but after researching the facts associated with it, I'm fine with it. Trump campaigned on increasing security, so it shouldn't be a surprise. I think it is reasonable to take a period of time to standardize and enforce the vetting process.”
“I feel that it is not a bad idea. The only part that concerns me is taking away from living the American Dream for those that aren’t terrorists.”
“good but needed more explanation”
“OK with it - waiting to see how it pans out over the next few weeks”
“I think it is good, as long as it is temporary so that we can better vet those who would come to the U.S.”
And just as important, yet often overlooked, are those who aren’t completely sure:
“not my circus”
“While the thought is good and just for our safety, the implementation was flawed, much like communism.”
Final Thoughts: What Have We Learned?
First of all, we saw that the results in the open-ended format replicated those of the structured question. With a total sample of 3,000, these results are statistically significant.
Second, we found that while emotions run high for people on both sides of this issue, comments from those who disagree with the ban tended to be more emotionally charged than comments from those who agree with it. I would add here that some of the former group tended not to distinguish between their feelings about President Trump and their feelings about the policy.
We also discovered that supporters of the ban appear to be better informed about the specifics of the order than those who oppose it. In fact, a significant number of the former group in their responses took the time to explain why referring to the order as “a Muslim ban” is inaccurate and how this misconception clouds the issue.
Lastly, we found that both supporters and detractors are concerned about the order’s implementation.
Let me know what you think. I’d be happy to dig into this data a bit more. In addition, if anyone is curious and would like to do a follow-up analysis, please contact me to discuss the raw data file.
P.S. Stay tuned for Part II of this study, where we’ll explore what the rest of the world thinks about the order!
*Note: Responses (n=3,000) were collected via Google Surveys. Google Surveys allows researchers to reach a validated (U.S. General Population Representative) sample by intercepting people who are attempting to access high-quality online content (such as news, entertainment and reference sites) or who have downloaded the Google Opinion Rewards mobile app. These users answer up to 10 questions in exchange for access to the content or Google Play credit. Google provides additional respondent information across a variety of variables, including source/publisher category, gender, age, geography, urban density, income, parental status and response time, as well as Google-calculated weighting. The margin of error is +/- 1.79 percentage points at the 95% confidence level.
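The 1.79 figure is consistent with the standard worst-case margin-of-error formula for a simple random sample of 3,000, evaluated at a 50% share. The sketch below assumes that formula and a z value of 1.96; it ignores weighting and design effects, and the function name `margin_of_error` is hypothetical.

```python
import math

def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Half-width of the confidence interval for a proportion.

    Assumes simple random sampling; p = 0.5 gives the widest (worst-case)
    interval, and z = 1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Combined sample, then each half of the study on its own.
for n in (3000, 1531, 1500):
    print(f"n = {n}: +/- {margin_of_error(n) * 100:.2f} percentage points")
```

Each half of the study, taken on its own, has a wider margin than the combined sample, which matters when comparing the structured and open-ended results against each other.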