Posts tagged analytical tools
When Oprah is President We Can Celebrate Family Day While Skiing!

Text Analytics Poll™ Shows What We’d Like Instead of Presidents Day

It’s been less than a week since our Valentine’s Day poll unearthed what people dislike most about their sweethearts, and already another holiday is upon us! Though apparently for most of us it’s not much of a holiday at all; well over half of Americans say they do nothing to commemorate ‘Presidents Day.’

You’ll note I put the holiday in single quotes. That’s because there’s some confusion around the name. Federally, it’s recognized as Washington’s Birthday. At the state level, it goes by a variety of names, including President’s Day, Presidents’ Day, and Presidents Day, depending on the state.

But the name is not the only inconsistency about Presidents Day. If you’re a federal employee OR you happen to be a state employee in a state where the holiday is observed OR you work for an employer who honors it, you get the day off work with pay. Schools may or may not be closed, but that again depends on where you live.

As for what we’re observing exactly, well, that also depends on the state, but people generally regard the holiday as an occasion to honor either George Washington, alone, or Washington and Abraham Lincoln, or U.S. presidents, in general.

Perhaps the one consistent aspect of this holiday is the sales? The holiday is particularly popular among purveyors of automobiles, mattresses, and furniture.

Yes, it’s a patriotic sort of holiday, but on the whole, we suspected that ‘Presidents Day’ fell on the weaker end of the American holiday spectrum, so we investigated a little bit…

About this Text Analytics Poll™

In this example for our ongoing series demonstrating the efficiency, flexibility, and practicality of the Text Analytics Poll™ for insights generation, we opted for a light-hearted poll using a smaller sample* than usual. While text analytics have obvious value when applied to larger-scale data where human reading or coding is impossible or too expensive, you’ll see here that OdinText also works very effectively with smaller samples!

I’ll also emphasize that the goal of these little Text Analytics Polls™ is not to conduct a perfect study, but to very quickly design and field a survey with only one open-ended question, analyze the results with OdinText, and report the findings here on this blog. (The only thing that takes a little time, usually 2-3 days, is the data collection.)

So while the research is representative of the general online population, and the text analytics coding is applied with 100% consistency throughout the entire sample, this very speedy exercise is meant to inspire users of OdinText to use the software in new ways to answer new questions. It is not meant to be an exhaustive exploration of the topic. We welcome our users to comment, suggest improvements to the questions we ask, and propose future topics!

Enough said, on to the results…

A Holiday In Search of a Celebrant in Search of a Holiday…

Poll I: Americans Celebrate on the Slopes, Not in Stores

When we asked Americans how they typically celebrate Presidents Day, the vast majority told us they don’t. And those few of us lucky enough to have the day off from work tend to not do much outside of sleeping.

But the surprise came from those few who actually said they do something on Presidents Day!

We expected people to say they go shopping on Presidents Day, but the most popular activity mentioned (after nothing and sleeping) was skiing! And skiing was followed by 2) barbecuing and 3) spending time with friends—not shopping.

Poll II: Change it to Family Day?

So, maybe as far as holidays go, Presidents Day is a tad lackluster? Could we do better?

We asked Americans:

Q. If we could create a new holiday instead of Presidents Day, what new holiday would you suggest we celebrate?

While some people indicated Presidents Day is fine as is, among those who suggested a new holiday there was no shortage of creativity!

The three most frequently mentioned ideas by large margins for replacement of Presidents Day were 1) Leaders/Heroes Day, 2) Native American Day (this holiday already exists, so maybe it could benefit from some publicity?) and 3) Family Day (which is celebrated in parts of Canada and other countries).

People also seemed to like the idea of shifting the date and making a holiday out of other important annually occurring events that lent themselves to a day off in practical terms like Election Day, Super Bowl Monday and, my personal favorite, Taxpayer Day on April 15!

Poll III: From Celebrity Apprentice to Celebrity POTUS

Donald Trump isn’t the first person in history to have not held elected office before becoming president, but he is definitely the first POTUS to have had his own reality TV show! Since it’s Presidents Day, we thought it might be fun to see who else from outside of politics might interest Americans…

Q. If you could pick any celebrity outside of politics to be President, who would it be?

Looks like we could have our first female president if Oprah ever decides to run. The media mogul’s name just rolled off people’s tongues, followed very closely by George Clooney, with Morgan Freeman in a respectable third.

Let Them Tell You in Their Own Words

In closing, I’ll remind you that none of these data were generated by a multiple-choice instrument, but via unaided text comments from people in their own words.

What never ceases to amaze me about these exercises is how even when we give people license to say whatever crazy thing they can think up—without any prompts or restrictions—people often have the same thoughts. And so open-ends lend themselves nicely to quantification using a platform like OdinText.

If you’re among the lucky folks who have the holiday off, enjoy the slopes!

Until next time, Happy Presidents Day!

@TomHCAnderson

PS.  Do you have an idea for our next Text Analytics Poll™? We’d love to hear from you. Or, why not use OdinText to analyze your own data!

[*Today’s OdinText Text Analytics Poll™ sample of n=500 U.S. online representative respondents was sourced through Google Surveys. The sample has a confidence interval of +/- 4.38 percentage points at the 95% confidence level. Larger samples have a smaller confidence interval. Subgroup analyses within the sample have a larger confidence interval.]
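For readers who want to check the footnote's arithmetic, here is a minimal Python sketch. It assumes the figure comes from the standard margin-of-error formula for a simple random sample with the worst-case proportion p = 0.5 and z = 1.96 for 95% confidence. The post does not say how the number was computed, so treat the formula choice and the function name `margin_of_error` as assumptions.

```python
import math

def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Margin of error (as a proportion) for a simple random sample of size n.

    p = 0.5 is the worst case; z = 1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)

# The interval narrows as the sample grows and widens for small subgroups.
for n in (100, 500, 2000):
    print(f"n={n}: +/- {margin_of_error(n) * 100:.2f} percentage points")
```

With n=500 this gives roughly +/- 4.38 percentage points, which matches the footnote. A subgroup of 100 respondents would have a much wider interval.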

About Tom H. C. Anderson

Tom H. C. Anderson is the founder and managing partner of OdinText, a venture-backed firm based in Stamford, CT whose eponymous, patented SaaS platform is used by Fortune 500 companies like Disney, Coca-Cola and Shell Oil to mine insights from complex, unstructured and mixed data. A recognized authority and pioneer in the field of text analytics with more than two decades of experience in market research, Anderson is the recipient of numerous awards for innovation from industry associations such as CASRO, ESOMAR, and the ARF. He was named one of the “Four under 40” market research leaders by the American Marketing Association in 2010. He tweets under the handle @tomhcanderson.

What Does the Co-Occurrence Graph Tell You?

Text Analytics Tips by Gosia

The co-occurrence graph in OdinText may look simple at first sight but it is in fact a very complex visualization. Based on an example we are going to show you how to read and interpret this graph. See the attached screenshots of a single co-occurrence graph based on a satisfaction survey of 500 car dealership customers (Fig. 1-4).

The co-occurrence graph is based on multidimensional scaling techniques that let you view the similarity between individual data items (e.g., automatic terms), taking into account several aspects of the data (i.e., frequency of occurrence, co-occurrence, and relationship with the key metric). The graph represents co-occurrence as spatial distance: as far as the layout allows, terms that are often mentioned together are plotted right next to each other (approximate overlap/concurrence).

Figure 1. Co-occurrence graph (all nodes and lines visible).

The attached graph (Fig. 1 above) is based on the 50 most frequently occurring automatic terms (words) mentioned by the car dealership customers. Each node represents one term. The node’s size corresponds to the number of occurrences, i.e., the number of customer comments in which a given word was found (the larger the node, the greater the number of occurrences). In this example, green nodes correspond to higher overall satisfaction and red nodes to lower overall satisfaction given by customers who mentioned a given term, whereas brown nodes reflect satisfaction scores close to the metric midpoint. Finally, the thickness of the line connecting two nodes shows how often the two terms are mentioned together (actual overlap/concurrence); the thicker the line, the more often they are mentioned together in a comment.
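To make the three ingredients concrete (node size, node color, and line thickness), here is a minimal Python sketch that computes them from raw comments. It is not OdinText's implementation. The comments, scores, and term list are hypothetical, and the multidimensional scaling layout itself is left out.

```python
from collections import Counter
from itertools import combinations

# Hypothetical comments with a 1-5 overall satisfaction score (illustrative only).
comments = [
    ("staff was nice and quick", 5),
    ("waiting room had coffee", 4),
    ("manager was unprofessional", 1),
    ("unprofessional staff and long waiting", 2),
    ("nice staff and good coffee", 5),
]
# Hypothetical list of terms to track.
terms = {"staff", "nice", "quick", "waiting", "room", "coffee", "manager", "unprofessional"}

node_size = Counter()    # number of comments mentioning each term
score_sum = Counter()    # running sum of satisfaction scores per term
edge_weight = Counter()  # number of comments mentioning both terms of a pair

for text, score in comments:
    present = sorted(terms & set(text.split()))
    for term in present:
        node_size[term] += 1
        score_sum[term] += score
    # Every pair of tracked terms found in the same comment co-occurs once.
    for pair in combinations(present, 2):
        edge_weight[pair] += 1

# The mean satisfaction per term would drive the node color (red to green).
mean_score = {term: score_sum[term] / node_size[term] for term in node_size}

print(node_size.most_common(3))
print(edge_weight.most_common(3))
print(mean_score["unprofessional"])
```

In this toy data "staff" gets the largest node, "nice" and "staff" share the thickest line, and "unprofessional" has the lowest mean score, so it would be drawn in red.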

Figure 2. Co-occurrence graph (“unprofessional” node and lines highlighted).

So what are the most interesting insights based on a quick look at the co-occurrence graph of the car dealership customer satisfaction survey?

  • “Unprofessional” is the most negative term (red node) and it is most often mentioned together with “manager” or “employees” (Fig. 2 above).
  • “Waiting” is a relatively frequently occurring term (medium-sized node) and a neutral one (brown node). It is often mentioned together with “room” (another neutral term) as well as “luxurious”, “coffee”, and “best”, which correspond to high overall satisfaction (light green nodes). Thus, it seems that the luxurious waiting room with available coffee is highly appreciated by customers and makes the waiting experience less negative (Fig. 3 below).
  • The dealership “staff” is often mentioned together with such positive terms as “always”, “caring”, “nice”, “trained”, and “quick” (Fig. 4 below). However, staff is also mentioned with more negative terms, including “unprofessional”, “trust”, and “helpful”, suggesting a few negative customer evaluations related to these terms, which may need attention and improvement.

Figure 3. Co-occurrence graph (“waiting” node and lines highlighted).

Figure 4. Co-occurrence graph (“staff” node and lines highlighted).

Hopefully, this quick example can help you extract quick and valuable insights based on your own data!

Gosia

[NOTE: Gosia is a Data Scientist at OdinText Inc. Experienced in text mining and predictive analytics, she is a Ph.D. with extensive research experience in mass media’s influence on cognition, emotions, and behavior.  Please feel free to request additional information or an OdinText demo here.]

Code by Hand? The Benefits of Automated and User-Guided Automated Customer Comment Coding

Why you should not code text data by hand: Benefits of automated and user-guided automated coding

Text Analytics Tips by Gosia

Most researchers know very well that the coding of text data manually (using human coders who read the text and mark different codes) is very expensive both in terms of time that coders need to take and money needed to compensate them for this effort.

However, the major advantage of human coders is their strong understanding of the complex meaning of text, including sarcasm and jokes.

Usually at least two coders are required to code any type of text data, and the calculation of inter-rater reliability or inter-rater agreement is a must. This statistic shows how similarly the coders have coded the data, i.e., how often they have agreed on using the exact same codes.
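As a rough illustration of what such a statistic looks like, here is a minimal Python sketch of simple percent agreement and Cohen's kappa for two coders. Cohen's kappa is one common choice that corrects for agreement expected by chance; the post does not say which statistic is meant, and the codes assigned below are made up.

```python
from collections import Counter

def percent_agreement(a, b):
    """Share of comments to which both coders assigned the same code."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Cohen's kappa for two coders: agreement corrected for chance.

    Undefined (division by zero) if both coders use a single identical code throughout.
    """
    n = len(a)
    observed = percent_agreement(a, b)
    freq_a, freq_b = Counter(a), Counter(b)
    # Chance agreement: probability that both coders pick the same code independently.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a.keys() | freq_b.keys()) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two coders to the same eight comments.
coder_1 = ["price", "service", "service", "price", "other", "service", "price", "other"]
coder_2 = ["price", "service", "price", "price", "other", "service", "service", "other"]

print(f"agreement: {percent_agreement(coder_1, coder_2):.2f}")
print(f"kappa:     {cohens_kappa(coder_1, coder_2):.2f}")
```

Here the coders agree on 6 of 8 comments (0.75), but kappa is lower (about 0.62) because some of that agreement would happen by chance alone.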

Often even with the simplest codes the accuracy of human coding is low. No two human coders consistently code larger amounts of data the same way because of different interpretations of text or simply due to error. The latter is a reason why no single coder will code the same text data identically when done for the second time (perfect reliability for a single coder could be achieved in theory though, e.g., for very small datasets that can be proofread multiple times).

Another limitation is that human coders can only keep a limited number of codes in their working memory while reading the text. Finally, any change to a code requires repeating the entire coding process from the beginning. Because manual coding of larger datasets is expensive and unreliable, automated coding using computer software was introduced.

Automated or algorithm-based text coding solves many of the issues of human coding:

  1. it is fast (thousands of text comments can be read in seconds)
  2. it is cost-effective (automated coding should always be cheaper than human coding because it requires much less time)
  3. it offers perfect consistency (the same rules are applied every time without errors)
  4. it allows an unlimited number of codes in theory (some software might have limitations)

However, this process also has disadvantages. As mentioned above, only humans can fully understand the complex meaning of text, and simple algorithms are likely to fail when trying to understand it (although some newer algorithms now under development can be almost as good as humans). Moreover, most software available on the market offers little flexibility because the user cannot see or change the codes.

Figure 1. Comparison of OdinText with “human coding” and “automated coding” approaches.

Therefore, OdinText developers decided to let users guide the automated coding. Users can view and edit the default codes and dictionaries, create and upload their own, or build custom dictionaries based on the exploratory results provided by the automated analysis. The codes can be very complex and specific, producing a good understanding of the meaning of text, which is the key goal of any text analytics software.
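The general idea of dictionary-based, user-editable coding can be sketched in a few lines of Python. This is only an illustration of the concept, not how OdinText works internally. The dictionary, the comments, and the `code_comment` helper are all hypothetical.

```python
import re

# Hypothetical code dictionary: code name -> keywords (illustrative only).
code_dictionary = {
    "staff": ["staff", "employee", "employees", "manager"],
    "waiting": ["wait", "waiting", "waited"],
    "price": ["price", "expensive", "cheap"],
}

def code_comment(text, dictionary):
    """Return the sorted list of codes whose keywords appear in the comment."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return sorted(code for code, keywords in dictionary.items() if words & set(keywords))

comments = [
    "The manager was rude and I waited an hour.",
    "Great price!",
    "Nothing to add.",
]

# First pass: the same rules are applied to every comment.
coded = [code_comment(c, code_dictionary) for c in comments]
print(coded)

# The user adds a new code; recoding everything is just a re-run, not a re-read.
code_dictionary["attitude"] = ["rude", "unprofessional"]
recoded = [code_comment(c, code_dictionary) for c in comments]
print(recoded)
```

The second pass shows the contrast with manual coding: changing the code frame costs one more run over the data rather than a full repeat of the reading process.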

OdinText is a user-guided automated text analytics solution, which has aspects and benefits of both fully automated and human coding. It is fast, cost-effective, accurate, and allows for an unlimited number of codes like many other automated text analytics tools. However, OdinText surpasses the capabilities of other software by providing high flexibility and customization of codes/dictionaries and thus a better understanding of the meaning of text. Moreover, OdinText allows you to conduct statistical analyses and create visualizations of your data in the same software.

Try switching from human coding to user-guided automated coding and you will be pleasantly surprised how easy and powerful it is!

Gosia

[NOTE: OdinText is NOT a tool for human assisted coding. It is a tool used by analysts for better and faster insights from mixed (structured and unstructured) data.]

Beyond Sentiment - What Are Emotions, and Why Are They Useful to Analyze?
Text Analytics Tips by Gosia

Emotions - Revealing What Really Matters

Emotions are short-term, intense, and subjective feelings directed at something or someone (e.g., fear, joy, sadness). They are different from moods, which last longer, but can be based on the same general feelings of fear, joy, or sadness.

3 Components of Emotion: Emotions result from arousal of the nervous system and consist of three components: subjective feeling (e.g., being scared), physiological response (e.g., a pounding heart), and behavioral response (e.g., screaming). Understanding human emotions is key in any area of research because emotions are one of the primary causes of behavior.

Moreover, emotions tend to reveal what really matters to people. Therefore, tracking primary emotions conveyed in text can have powerful marketing implications.

The Emotion Wheel - 8 Primary Emotions

OdinText can analyze any psychological content of text but the primary attention has been paid to the power of emotions conveyed in text.

8 Primary Emotions: OdinText tracks the following eight primary emotions: joy, trust, fear, surprise, sadness, disgust, anger, and anticipation (see attached figure; primary emotions in bold).

Bipolar Nature: These primary emotions have a bipolar nature; joy is opposed to sadness, trust to disgust, fear to anger, and surprise to anticipation. Emotions in the blank spaces are mixtures of the two neighboring primary emotions.

Intensity: The color intensity dimension suggests that each primary emotion can vary in intensity, with darker hues representing a stronger emotion (e.g., terror > fear) and lighter hues representing a weaker emotion (e.g., apprehension < fear). The analogy between the theory of emotions and the theory of color has been adopted from the seminal work of Robert Plutchik in the 1980s. [All 32 emotions presented in the figure above are the basis for the OdinText Emotional Sentiment tracking metric.]
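For those who think in data structures, the wheel can be written down as a small Python sketch. The order of the primary emotions and the opposite pairs follow the description above. The names of the weaker and stronger variants and of the mixed emotions are taken from Plutchik's published wheel rather than from this post, and nothing here reflects how OdinText stores or scores emotions.

```python
# Primary emotions in wheel order; opposites sit four positions apart.
WHEEL = ["joy", "trust", "fear", "surprise", "sadness", "disgust", "anger", "anticipation"]

def opposite(emotion):
    """Return the primary emotion on the opposite side of the wheel."""
    i = WHEEL.index(emotion)
    return WHEEL[(i + 4) % len(WHEEL)]

# (weaker, primary, stronger) tiers, per Plutchik's wheel.
INTENSITY = {
    "joy": ("serenity", "joy", "ecstasy"),
    "trust": ("acceptance", "trust", "admiration"),
    "fear": ("apprehension", "fear", "terror"),
    "surprise": ("distraction", "surprise", "amazement"),
    "sadness": ("pensiveness", "sadness", "grief"),
    "disgust": ("boredom", "disgust", "loathing"),
    "anger": ("annoyance", "anger", "rage"),
    "anticipation": ("interest", "anticipation", "vigilance"),
}

# Mixtures of two neighboring primary emotions, per Plutchik's wheel.
DYADS = {
    ("joy", "trust"): "love",
    ("trust", "fear"): "submission",
    ("fear", "surprise"): "awe",
    ("surprise", "sadness"): "disapproval",
    ("sadness", "disgust"): "remorse",
    ("disgust", "anger"): "contempt",
    ("anger", "anticipation"): "aggressiveness",
    ("anticipation", "joy"): "optimism",
}

all_emotions = {name for tier in INTENSITY.values() for name in tier} | set(DYADS.values())
print(opposite("joy"), opposite("fear"), len(all_emotions))
```

Eight primary emotions at three intensity levels plus eight mixtures gives 32 emotions in total, which is consistent with the count mentioned above.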

Stay tuned for more tips giving details on each of the above emotions.

Gosia

Text Analytics Tips

Text Analytics Tips, with your Hosts Tom & Gosia: Introductory Post

Today, we’re blogging to let you know about a new series of posts starting in January 2016 called ‘Text Analytics Tips’. This will be an ongoing series and our main goal is to help marketers understand text analytics better.

We realize Text Analytics is a subject with incredibly high awareness, yet sadly also a subject with many misconceptions.

The first generation of text analytics vendors overhyped the importance of sentiment as a tool, as well as ‘social media’ as a data source, often preferring to use the even vaguer term ‘Big Data’ (usually just referring to tweets). They offered no evidence of the value of either, and have usually ignored the much richer techniques and sources of data for text analysis. Little to no information or training is offered on how to actually gain useful insights via text analytics.

What are some of the biggest misconceptions in text analytics?

  1. “Text Analytics is Qualitative Research”

FALSE – Text Analytics IS NOT qualitative. Text Analytics = Text Mining = Data Mining = Pattern Recognition = Math/Stats/Quant Research

  2. “It’s Automatic (artificial intelligence), you just press a button and look at the report / wordcloud”

FALSE – Text Analytics is a powerful technique made possible thanks to tremendous processing power. It can be easy if using the right tool, but just like any other powerful analytical tool, it is limited by the quality of your data and the resourcefulness and skill of the analyst.

  3. “Text Analytics is a Luxury” (i.e., structured data analysis is of primary importance and unstructured data is an extra)

FALSE – Nothing could be further from the truth. In our experience, when text data is available, it almost always outperforms standard available quant data in terms of explaining and/or predicting the outcome of interest!

There are several other text analytics misconceptions of course and we hope to cover many of them as well.

While various OdinText employees and clients may be posting in the ‘Text Analytics Tips’ series over time, Senior Data Scientist, Gosia, and our Founder, Tom, have volunteered to post on a more regular basis…well, not so much volunteered as drawing the shortest straw (our developers made it clear that “Engineers don’t do blog posts!”).

Kidding aside, we really value education at OdinText, and it is our goal to make sure OdinText users become proficient in text analytics.

Though Text Analytics, and OdinText in particular, are very powerful tools, we will aim to keep these posts light, fun yet interesting and insightful. If you’ve just started using OdinText or are interested in applied text analytics in general, these posts are certainly a good start for you.

During this long-running series we’ll be posting tips, interviews, and various fun short analyses. Please come back in January for our first post, which will deal with analysis of a very simple unstructured survey question.

Of course, if you’re interested in more info on OdinText, no need to wait, just fill out our short Request Info form.

Happy New Year!

Your friends @OdinText

[NOTE: Tom is Founder and CEO of OdinText Inc. A longtime champion of text mining, in 2005 he founded Anderson Analytics LLC, the first consumer insights/marketing research consultancy focused on text analytics. He is a frequent speaker and data science guest lecturer at university and research industry events.

Gosia is a Senior Data Scientist at OdinText Inc. A Ph.D. with extensive experience in content analytics, especially psychological content analysis (i.e., sentiment analysis and emotion in text), as well as predictive analytics using unstructured data, she is fluent in German, Polish and Spanish.]