Posts tagged data visualization
When Oprah is President We Can Celebrate Family Day While Skiing!

Text Analytics Poll™ Shows What We’d Like Instead of Presidents Day

It’s been less than a week since our Valentine’s Day poll unearthed what people dislike most about their sweethearts, and already another holiday is upon us! Though apparently for most of us it’s not much of a holiday at all; well over half of Americans say they do nothing to commemorate ‘Presidents Day.’

You’ll note I put the holiday in single quotes. That’s because there’s some confusion around the name. Federally, it’s recognized as Washington’s Birthday. At the state level, it’s known by a variety of names—President’s Day, Presidents’ Day, Presidents Day and others, again, depending on the state.

But the name is not the only inconsistency about Presidents Day. If you’re a federal employee OR you happen to be a state employee in a state where the holiday is observed OR you work for an employer who honors it, you get the day off work with pay. Schools may or may not be closed, but that again depends on where you live.

As for what we’re observing exactly, well, that also depends on the state, but people generally regard the holiday as an occasion to honor either George Washington, alone, or Washington and Abraham Lincoln, or U.S. presidents, in general.

Perhaps the one consistent aspect of this holiday is the sales? The holiday is particularly popular among purveyors of automobiles, mattresses, and furniture.

Yes, it’s a patriotic sort of holiday, but on the whole, we suspected that ‘Presidents Day’ fell on the weaker end of the American holiday spectrum, so we investigated a little bit…

About this Text Analytics Poll™

In this example for our ongoing series demonstrating the efficiency, flexibility, and practicality of the Text Analytics Poll™ for insights generation, we opted for a light-hearted poll using a smaller sample* than usual. While text analytics have obvious value when applied to larger-scale data where human reading or coding is impossible or too expensive, you’ll see here that OdinText also works very effectively with smaller samples!

I’ll also emphasize that the goal of these little Text Analytics Polls™ is not to conduct a perfect study, but to very quickly design and field a survey with only one open-ended question, analyze the results with OdinText, and report the findings here on this blog. (The only thing that takes a little time, usually 2-3 days, is the data collection.)

So while the research is representative of the general online population, and the text analytics coding is applied with 100% consistency throughout the entire sample, this very speedy exercise is meant to inspire users of OdinText to use the software in new ways to answer new questions. It is not meant to be an exhaustive exploration of the topic. We welcome comments from our users, including improvements to the questions we ask and suggestions for future topics!

Enough said, on to the results…

A Holiday In Search of a Celebrant in Search of a Holiday…

Poll I: Americans Celebrate on the Slopes, Not in Stores

When we asked Americans how they typically celebrate Presidents Day, the vast majority told us they don’t. And those few of us lucky enough to have the day off from work tend to not do much outside of sleeping.

But the surprise came from those few who actually said they do something on Presidents Day!

We expected people to say they go shopping on Presidents Day, but the most popular activity mentioned (after nothing and sleeping) was skiing! And skiing was followed by 2) barbecuing and 3) spending time with friends—not shopping.

Poll II: Change it to Family Day?

So, maybe as far as holidays go, Presidents Day is a tad lackluster? Could we do better?

We asked Americans:

Q. If we could create a new holiday instead of Presidents Day, what new holiday would you suggest we celebrate?

While some people indicated Presidents Day is fine as is, among those who suggested a new holiday there was no shortage of creativity!

The three most frequently mentioned ideas by large margins for replacement of Presidents Day were 1) Leaders/Heroes Day, 2) Native American Day (this holiday already exists, so maybe it could benefit from some publicity?) and 3) Family Day (which is celebrated in parts of Canada and other countries).

People also seemed to like the idea of shifting the date and making a holiday out of other important annually occurring events that lent themselves to a day off in practical terms like Election Day, Super Bowl Monday and, my personal favorite, Taxpayer Day on April 15!

Poll III: From Celebrity Apprentice to Celebrity POTUS

Donald Trump isn’t the first person in history to become president without having held elected office, but he is definitely the first POTUS to have had his own reality TV show! With Presidents Day upon us, we thought it might be fun to see who else from outside of politics might interest Americans…

 Q: If you could pick any celebrity outside of politics to be President, who would it be?


Looks like we could have our first female president if Oprah ever decides to run. The media mogul’s name just rolled off people’s tongues, followed very closely by George Clooney, with Morgan Freeman in a respectable third.

Let Them Tell You in Their Own Words

In closing, I’ll remind you that none of these data were generated by a multiple-choice instrument, but via unaided text comments from people in their own words.

What never ceases to amaze me about these exercises is how even when we give people license to say whatever crazy thing they can think up—without any prompts or restrictions—people often have the same thoughts. And so open-ends lend themselves nicely to quantification using a platform like OdinText.

If you’re among the lucky folks who have the holiday off, enjoy the slopes!

Until next time, Happy Presidents Day!


PS.  Do you have an idea for our next Text Analytics Poll™? We’d love to hear from you. Or, why not use OdinText to analyze your own data!

[*Today’s OdinText Text Analytics Poll™ sample of n=500 U.S. online representative respondents was sourced through Google Surveys. The sample has a confidence interval of +/- 4.38 percentage points at the 95% confidence level. Larger samples have a smaller confidence interval. Subgroup analyses within the sample have a larger confidence interval.]
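For readers who want to check the arithmetic behind that footnote, here is a minimal sketch. It assumes the standard worst-case formula for a simple random sample (p = 0.5, z = 1.96 for 95% confidence); the function name is made up for illustration and is not part of OdinText or Google Surveys.

```python
import math

def margin_of_error(n, z=1.96, p=0.5):
    """Half-width of the confidence interval for a proportion.

    Assumes a simple random sample; p=0.5 is the worst case.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Larger samples give a smaller interval; subgroups (smaller n) a larger one.
for n in (125, 500, 2000):
    print(f"n={n}: +/- {margin_of_error(n) * 100:.2f} percentage points")
```

With n=500 this reproduces the +/- 4.38 quoted above, and it shows why a subgroup of, say, 125 respondents carries a much wider interval.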

About Tom H. C. Anderson

Tom H. C. Anderson is the founder and managing partner of OdinText, a venture-backed firm based in Stamford, CT whose eponymous, patented SaaS platform is used by Fortune 500 companies like Disney, Coca-Cola and Shell Oil to mine insights from complex, unstructured and mixed data. A recognized authority and pioneer in the field of text analytics with more than two decades of experience in market research, Anderson is the recipient of numerous awards for innovation from industry associations such as CASRO, ESOMAR, and the ARF. He was named one of the “Four under 40” market research leaders by the American Marketing Association in 2010. He tweets under the handle @tomhcanderson.

What Does the Co-Occurrence Graph Tell You?

Text Analytics Tips by Gosia

The co-occurrence graph in OdinText may look simple at first sight, but it is in fact a very complex visualization. Using an example, we will show you how to read and interpret this graph. See the attached screenshots of a single co-occurrence graph based on a satisfaction survey of 500 car dealership customers (Fig. 1-4).

The co-occurrence graph is based on multidimensional scaling techniques that allow you to view the similarity between individual cases of data (e.g., automatic terms), taking into account various aspects of the data (i.e., frequency of occurrence, co-occurrence, and relationship with the key metric). The graph represents the co-occurrence of words as the spatial distance between them, i.e., as well as it can, it plots terms that are often mentioned together right next to each other (aka approximate overlap/concurrence).

Figure 1. Co-occurrence graph (all nodes and lines visible).

The attached graph (Fig. 1 above) is based on the 50 most frequently occurring automatic terms (words) mentioned by the car dealership customers. Each node represents one term. The node’s size corresponds to the number of occurrences, i.e., in how many customer comments a given word was found (the larger the node, the greater the number of occurrences). In this example, green nodes correspond to higher overall satisfaction and red nodes to lower overall satisfaction given by customers who mentioned a given term, whereas brown nodes reflect satisfaction scores close to the metric midpoint. Finally, the thickness of the line connecting two nodes shows how often the two terms are mentioned together (aka actual overlap/concurrence); the thicker the line, the more often they are mentioned together in a comment.
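To make the three ingredients of the graph concrete (node size, node color, and line thickness), here is a small sketch in plain Python. The comments and the 1-5 satisfaction scores are invented for illustration, and this is only one straightforward way to compute these quantities, not a description of how OdinText does it internally.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical data: (terms found in a comment, overall satisfaction 1-5).
comments = [
    ({"waiting", "room", "coffee"}, 5),
    ({"waiting", "room"}, 3),
    ({"staff", "nice", "quick"}, 5),
    ({"staff", "unprofessional"}, 1),
    ({"manager", "unprofessional"}, 1),
    ({"waiting", "coffee", "staff"}, 4),
]

node_size = Counter()            # comments containing each term
score_sum = defaultdict(float)   # running satisfaction total per term
edge_weight = Counter()          # comments containing both terms of a pair

for terms, score in comments:
    for term in terms:
        node_size[term] += 1
        score_sum[term] += score
    # Sort so each pair is always stored in the same order.
    for a, b in combinations(sorted(terms), 2):
        edge_weight[(a, b)] += 1

# Average satisfaction per term drives the node color (red to green).
node_score = {term: score_sum[term] / node_size[term] for term in node_size}

for term, size in node_size.most_common(3):
    print(term, size, round(node_score[term], 2))
```

In this toy data, "unprofessional" ends up with the lowest average score (a red node), while "waiting" and "room" are connected by one of the thickest lines.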

Figure 2. Co-occurrence graph (“unprofessional” node and lines highlighted).

So what are the most interesting insights based on a quick look at the co-occurrence graph of the car dealership customer satisfaction survey?

  • “Unprofessional” is the most negative term (red node) and it is most often mentioned together with “manager” or “employees” (Fig. 2 above).
  • “Waiting” is a relatively frequently occurring (medium-sized node) and neutral term (brown node). It is often mentioned together with “room” (another neutral term) as well as “luxurious”, “coffee”, and “best”, which correspond to high overall satisfaction (light green nodes). Thus, it seems that the luxurious waiting room with available coffee is highly appreciated by customers and makes the waiting experience less negative (Fig. 3 below).
  • The dealership “staff” is often mentioned together with such positive terms as “always”, “caring”, “nice”, “trained”, and “quick” (Fig. 4 below). However, staff is also mentioned with more negative terms, including “unprofessional”, “trust”, and “helpful”, suggesting a few negative customer evaluations related to these terms that may need attention and improvement.

Figure 3. Co-occurrence graph (“waiting” node and lines highlighted).

Figure 4. Co-occurrence graph (“staff” node and lines highlighted).

Hopefully, this short example can help you extract quick and valuable insights from your own data!


[NOTE: Gosia is a Data Scientist at OdinText Inc. Experienced in text mining and predictive analytics, she is a Ph.D. with extensive research experience in mass media’s influence on cognition, emotions, and behavior.  Please feel free to request additional information or an OdinText demo here.]

Look Who’s Talking, Part 1: Who Are the Most Frequently Mentioned Research Panels?

Survey Takers Average Two Panel Memberships and Name Names

Who exactly is taking your survey?

It’s an important question beyond the obvious reasons, and odds are your screener isn’t providing all of the answers.

Today’s blog post will be the first in a series previewing some key findings from a new study exploring the characteristics of survey research panelists.

The study was designed and conducted by Kerry Hecht, Director of Research at Ramius. OdinText was enlisted to analyze the text responses to the open-ended questions in the survey.

Today I’ll be sharing an OdinText analysis of results from one simple but important question: Which research companies are you signed up with?

Note: The full findings of this rather elaborate study will be released in June in a special workshop at IIEX North America (Insight Innovation Exchange) in Atlanta, GA. The workshop will be led by Kerry Hecht, Jessica Broome and yours truly. For more information, click here.

About the Data

The dataset we’ve used OdinText to analyze today is a survey of research panel members with just over 1,500 completes.

The sample was sourced in three equal parts from leading research panel providers Critical Mix and Schlesinger Associates and from third-party loyalty reward site Swagbucks, respectively.

The study’s author opted to use an open-ended question (“Which research companies are you signed up with?”) instead of a “select all that apply” variation for a couple of reasons, not the least of which being that the latter would’ve needed to list more than a thousand possible panel choices.

Only those panels that were mentioned by at least five respondents (0.3%) were included in the analysis. As it turned out, respondents identified more than 50 panels by name.

How Many Panels Does the Average Panelist Belong To?

The overwhelming majority of respondents—approx. 80%—indicated they belong to only one or two panels. (The average number of panels mentioned among those who could recall specific panel names was 2.3.)

Less than 2% told us they were members of 10 or more panels.

Finally, even fewer respondents told us they were members of as many as 20+ panels; others could not recall the name of a single panel when asked. Some declined to answer the question.

Naming Names…Here’s Who

Figure 1. The 50 most frequently mentioned panel companies.

In Figure 1 we have the 50 most frequently mentioned panel companies by respondents in this survey.

It is interesting to note that even though every respondent was signed up with at least one of the three companies from which we sourced the sample, a third of respondents failed to name that company.

Who Else? Average Number of Other Panels Mentioned

Figure 2. Average number of other panels mentioned.

As expected (and, again, taking into account that the sample comes from just the three firms we mentioned earlier), larger panels are more likely than smaller, niche panels to contain respondents who belong to other panels (Figure 2).

Panel Overlap/Correlation

Finally, we correlate the mentions of panels (Figure 3) and see that while there is some overlap everywhere, it looks to be relatively evenly distributed.

Figure 3. Panel overlap/correlation.

In a few cases where correlation is higher, it may be that these panels tend to recruit in the same place online or that there is a relationship between the companies.
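For anyone wondering what "correlating mentions" can look like in practice, here is one simple possibility. Each respondent either mentioned a panel (1) or did not (0), and the Pearson correlation between two such 0/1 columns is known as the phi coefficient. The post does not say which correlation measure was actually used, so treat this as an assumption; the panel names and data below are invented.

```python
import math

def phi(mentions_a, mentions_b):
    """Phi coefficient: Pearson correlation between two 0/1 indicators."""
    pairs = list(zip(mentions_a, mentions_b))
    n11 = sum(1 for a, b in pairs if a and b)          # mentioned both
    n10 = sum(1 for a, b in pairs if a and not b)      # only the first
    n01 = sum(1 for a, b in pairs if not a and b)      # only the second
    n00 = sum(1 for a, b in pairs if not a and not b)  # neither
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0

# Hypothetical: did each of eight respondents mention the panel?
panel_a = [1, 1, 1, 0, 0, 0, 0, 0]
panel_b = [1, 1, 0, 0, 0, 0, 0, 1]

print(round(phi(panel_a, panel_b), 3))
```

A value near 0 means membership in one panel tells you little about membership in the other; a value near 1 means the two panels share most of their members.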

What’s Next?

Again, all of the data provided above are the result of analyzing just a single, short open-ended question using OdinText.

In subsequent posts, we will look into what motivates these panelists to participate in research, as well as what they like and don’t like about the research process. We’ll also look more closely at demographics and psychographics.

You can also look forward to deeper insights from a qualitative leg provided by Kerry Hecht and her team in the workshop at IIEX in June.

Thank you for your readership. As always, I encourage your feedback and look forward to your comments!

@TomHCanderson @OdinText

Tom H.C. Anderson

PS. Just a reminder that OdinText is participating in the IIEX 2016 Insight Innovation Competition!

Voting ends Today! Please visit MAKE DATA ACCESSIBLE and VOTE OdinText!


[If you would like to attend IIEX feel free to use our Speaker discount code ODINTEXT]

To learn more about how OdinText can help you understand what really matters to your customers and predict actual behavior,  please contact us or request a Free Demo here >

[NOTE: Tom H. C. Anderson is Founder of Next Generation Text Analytics software firm OdinText Inc. Click here for more Text Analytics Tips ]


Beyond Sentiment - What Are Emotions, and Why Are They Useful to Analyze?
Text Analytics Tips by Gosia

Emotions - Revealing What Really Matters

Emotions are short-term, intense, and subjective feelings directed at something or someone (e.g., fear, joy, sadness). They are different from moods, which last longer but can be based on the same general feelings of fear, joy, or sadness.

3 Components of Emotion: Emotions result from arousal of the nervous system and consist of three components: subjective feeling (e.g., being scared), physiological response (e.g., a pounding heart), and behavioral response (e.g., screaming). Understanding human emotions is key in any area of research because emotions are one of the primary causes of behavior.

Moreover, emotions tend to reveal what really matters to people. Therefore, tracking primary emotions conveyed in text can have powerful marketing implications.

The Emotion Wheel - 8 Primary Emotions

OdinText can analyze any psychological content of text but the primary attention has been paid to the power of emotions conveyed in text.

8 Primary Emotions: OdinText tracks the following eight primary emotions: joy, trust, fear, surprise, sadness, disgust, anger, and anticipation (see attached figure; primary emotions in bold).

Sentiment Analysis

Bipolar Nature: These primary emotions have a bipolar nature; joy is opposed to sadness, trust to disgust, fear to anger, and surprise to anticipation. Emotions in the blank spaces are mixtures of the two neighboring primary emotions.

Intensity: The color intensity dimension suggests that each primary emotion can vary in intensity, with darker hues representing a stronger emotion (e.g., terror > fear) and lighter hues representing a weaker emotion (e.g., apprehension < fear). The analogy between the theory of emotions and the theory of color has been adopted from the seminal work of Robert Plutchik in the 1980s. [All 32 emotions presented in the figure above are the basis for the OdinText Emotional Sentiment tracking metric.]
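The structure of the wheel (opposites, intensity levels, and mixtures of neighbors) is easy to write down as data. The sketch below assumes the figure follows Plutchik's standard wheel, which gives 8 primary emotions x 3 intensity levels plus 8 mixtures, i.e., 32 emotions in total. It only illustrates the structure; it is not OdinText's implementation, and the helper names are made up.

```python
# (stronger, primary, weaker), listed in wheel order so neighbors are adjacent.
WHEEL = [
    ("ecstasy", "joy", "serenity"),
    ("admiration", "trust", "acceptance"),
    ("terror", "fear", "apprehension"),
    ("amazement", "surprise", "distraction"),
    ("grief", "sadness", "pensiveness"),
    ("loathing", "disgust", "boredom"),
    ("rage", "anger", "annoyance"),
    ("vigilance", "anticipation", "interest"),
]

# BLENDS[i] is the mixture of primary i and the next primary on the wheel.
BLENDS = ["love", "submission", "awe", "disapproval",
          "remorse", "contempt", "aggressiveness", "optimism"]

PRIMARIES = [primary for _, primary, _ in WHEEL]

def opposite(primary):
    """The bipolar opposite sits directly across the wheel (4 steps away)."""
    return PRIMARIES[(PRIMARIES.index(primary) + 4) % 8]

def blend(a, b):
    """Mixture of two neighboring primaries, or None if not neighbors."""
    i, j = PRIMARIES.index(a), PRIMARIES.index(b)
    if (i + 1) % 8 == j:
        return BLENDS[i]
    if (j + 1) % 8 == i:
        return BLENDS[j]
    return None

ALL_EMOTIONS = [name for row in WHEEL for name in row] + BLENDS

print(opposite("fear"), blend("joy", "trust"), len(ALL_EMOTIONS))
```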

Stay tuned for more tips giving details on each of the above emotions.


[NOTE: Gosia is a Data Scientist at OdinText Inc. Experienced in text mining and predictive analytics, she is a Ph.D. with extensive research experience in mass media’s influence on cognition, emotions, and behavior.]

Brand Analytics – Branding and Gender

Text Analytics Tips: Branding Analytics – 500 Major Brands and Gender, by Tom H. C. Anderson (continuation of yesterday’s Brand Analytics post)

Thank you to everyone who contacted us for more information about OdinText yesterday. As a result, I’ve decided to dig a bit deeper into the same branding question, taking a look at one or two additional variables today and tomorrow. Of course, if anyone is interested in seeing just how easy and powerful OdinText is, feel free to request info or a demo.

Yesterday we looked at Brand Awareness by Age quite a bit. Today, I thought we’d look at gender. But before that, I’ve posted another visualization from the Brand vs. Age data below.

Though word clouds are popular, we’ve found at OdinText that they are almost completely useless. Yesterday, I showed a co-occurrence plot of the data, which is certainly more meaningful than a word cloud because, unlike in a word cloud, position is used to tell you something about the data (those terms mentioned most frequently together appear closer to each other).

Brand Analytics Unstructured Visualization

In the chart above, we are plotting two variables from yesterday’s data, Average Age on the x-axis and frequency of mentions (or popularity) on the y-axis. Visualizations like this are a great way to very quickly explore and understand unstructured data. Even without getting into the detail, often just the overall shape of a chart will tell you something about your data. In the case above, we have a triangle-type shape where the most popular brands, such as Samsung, Sony and Coca-Cola, tend to appear in the middle. Two outliers are Apple and Nike, which are not only our most popular brands but also skew a bit younger.

But let’s leave Age and take a look at Gender. Gender is a nominal variable, but because it typically has only two values in survey data (male and female), it is also a dichotomous variable and thus lends itself nicely to more advanced visualization. Basically, any dichotomous variable, including Yes or No (present vs. not present), can be very useful in OdinText’s patented text analytics process. What can we tell about brands when we look at gender?

Brands by Gender

There are certain stereotypes that seem validated. You’re far more likely to be a guy than a gal if, when you think of brands, the first things that pop into your mind are software and electronics brands such as Microsoft (15% vs. 6%) and Sony (17% vs. 11%), an auto brand like Ford (11% vs. 7%), or McDonald’s (4% vs. 1%).

Perhaps as expected, women are far more likely to think of consumer packaged goods brands like Kraft (14% vs. 7%), Johnson & Johnson (6% vs. 2%), Kellogg’s (5% vs. 0.4%), General Mills (6% vs. 2%), Dove (4% vs. 0.2%), and P&G (4% vs. 1%).

Interestingly, the list doesn’t stop there. Women tend to be able to mention several more brands than men, including many less frequently mentioned brands such as Coach (3% vs. 0.3%), Gap (3% vs. 1%), Colgate (3% vs. 0.5%), Tide (2.6% vs. 0.4%), Victoria’s Secret (2.5% vs. 0.2%), Michael Kors (2.8% vs. 0.7%), Kleenex (2.5% vs. 0.5%), Tommy Hilfiger (2% vs. 0.2%), Huggies (1.7% vs. 0%), Olay (1.7% vs. 0%), Hershey’s (1.7% vs. 0.2%), Mattel (1.5% vs. 0%) and Bath and Body Works (1.3% vs. 0%).
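The "X% vs. Y%" comparisons above boil down to treating each brand mention as a dichotomous (present vs. not present) variable and computing its share within each gender. Here is a minimal sketch with made-up respondents; the function name is hypothetical and the numbers are not from the actual survey.

```python
# Hypothetical respondents: (gender, set of brands they mentioned).
respondents = [
    ("male", {"Microsoft", "Sony"}),
    ("male", {"Ford", "Apple"}),
    ("male", {"Sony", "Nike"}),
    ("male", {"Apple"}),
    ("female", {"Kraft", "Dove"}),
    ("female", {"Apple", "Kraft"}),
    ("female", {"Sony", "Coach"}),
    ("female", {"Nike"}),
]

def mention_share(brand, gender):
    """Share of respondents of one gender who mentioned the brand."""
    group = [brands for g, brands in respondents if g == gender]
    # `brand in brands` is the present/not-present indicator.
    return sum(brand in brands for brands in group) / len(group)

for brand in ("Sony", "Kraft", "Apple"):
    men = mention_share(brand, "male")
    women = mention_share(brand, "female")
    print(f"{brand}: {men:.0%} of men vs. {women:.0%} of women")
```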

That’s it for today but come back tomorrow and we’ll look at one last data point related to this one unstructured branding question we’ve been looking at to see whether brands can have a political skew.

Of course in the meantime, please feel free to request more information on how you too can become a data scientist with OdinText. Text Analytics can be a great tool for brand analytics including answering brand positioning, brand loyalty and brand equity questions.

-Tom @OdinText


[NOTE: Tom H. C. Anderson is Founder of Next Generation Text Analytics software firm OdinText Inc.]

Brand Analytics Tips – How Old is Your Brand?

Text Analytics Tips Answers, How Old Is Your Brand? - Using OdinText on Brand Mention Type Comment Data By Tom H. C. Anderson

[METHODOLOGICAL NOTES (If you’re not a researcher, feel free to skip down to the ‘Brands & Age’ section below): In our first official Text Analytics Tips, I’ve started by exploring arguably one of the simplest types of unstructured/text data there is, the unaided top-of-mind ‘brand mention’ open-ended survey question. These kinds of questions are especially important to brand positioning, brand equity, brand loyalty and advertising effectiveness research. In this case we’ve allowed for more than one brand mention. The question reads “Q. When you think of brand names, what company’s product or service brand names first come to mind? [Please name at least 5]”. The question was fielded to n=1,089 US Gen Pop Representative survey respondents in the CriticalMix Panel in December of 2015. The confidence interval is +/-2.9% at the 95% confidence level.]

Making Good Use of Comment Data Can Be Easy and Insightful

An interesting and rather unique way to look at your brand is to understand for whom it is most likely to be top-of-mind.

Unfortunately, though they have proven more accurate than structured choice or Likert scale rating questions in predicting actual behavior, free-form (open-end) survey questions are rare due to the assumed difficulty in analyzing results. Even when they are included in a survey and analyzed, results are rarely expressed in anything more useful than a simple frequency-ranked table (or worse, a word cloud). Thanks to the unique patented approach to unstructured and structured data in OdinText, analyzing this type of data is both fast and easy, and insights are limited only by the savviness of the analyst.

The core question asked here is rather simple, i.e., “When you think of brand names, what company’s product or service brand names first come to mind?”. However, when you ask this question of over a thousand people, the sheer volume of brands mentioned (in our case well over 500) can make even this ‘small data’ seem overwhelming.

The purpose of this post is to show you just how easy and fast, yet insightful, analysis of even more specific and technically more basic comment data can be using Next Generation Text Analytics™.

After uploading the data into OdinText, there are numerous ways to look at this comment data, not only the somewhat more obvious frequency counts, but also several other statistics including any interesting relationships to available structured data. Today we will be looking at how brand mentions are related to just one such variable, the age of the respondent. [Come back tomorrow and we may take a look at a few other statistics and variable relationships.]

Brands by Age

Below is a sortable list of the most frequently mentioned brands ranked by the average age of those mentioning said brand. This is a direct export from OdinText. The best way to think about lists like these is comparatively (i.e., how old is my brand vs. other brands?). If showing a table such as this in a presentation, I would highly recommend color coding, which can be done either in OdinText (depending on your version) or in Excel using the conditional formatting tool.

[NOTE: For additional analytics notes and visualizations please scroll to the bottom of the table below]


Brand Name Average Age
Maxwell House 66
Hunts 66
Aspirin 66
Chrysler 64.6
Stouffers 63.7
Marie Callender's 63.7
Walgreen 63.7
Cooper (Mini) 63.7
Bayer 62.6
USAA 62.5
Epson 62.5
Brother 61.3
Aol 61.3
Comet 61.3
Snapple 61.3
Lowes 61.2
Marriott 60.3
Ritz 60.3
Hellman's 60.3
Ikea 60.3
Belk 60.3
State Farm 60.3
Oscar Mayer 60
Folgers 59.8
Libby's 59.8
Hormel 59.2
Depot 59.2
Heinz 59.2
Electric 59.2
Bordens 59.2
Nestles 59
Green Giant 59
Sargento 58.3
Del Monte 58
Prego 58
Kashi 58
Westinghouse 58
Stouffer 58
Taylor 58
Home Depot 57.6
Publix 57.5
Banquet (Frozen Dinners) 57.5
Buick 57
Krogers 57
Hellman's 57
Safeway 56.5
Purex 56.4
Hewlett 56.4
Unilever 56.1
RCA 56.1
Post 56.1
P&G 55.9
Budweiser 55.9
Yoplait 55.8
Chobani 55.7
Ragu 55.7
Campbell's 55.5
Wells Fargo 55.2
Hershey 55.1
Betty Crocker 55
Sharp 55
Hines 55
Trader Joe's 55
Palmolive 54.9
Kia 54.7
Lexus 54.7
Life 54.7
Hotpoint 54.7
Campbells 54.6
Oscar Mayer 54.5
Dial 54.4
Nissan 54.4
Hillshire Farms 54.3
Motorola 54.1
Keebler 54
CVS 53.8
Canon 53.8
Lakes 53.7
Pillsbury 53.3
Hilton 53.3
Faded Glory 53.3
Friskies 53.3
Duncan Hines 53.3
Puffs 53.3
Olay 52.8
Sketchers 52.5
Fred Meyer 52.5
Delta 52.5
Hunt 52.3
Bose 52.3
Ocean Spray 52.3
Ivory 52.3
Swanson 52.3
Dewalt 52.3
Firestone 51.8
Estee Lauder 51.5
Miller 51.5
Tide 51.4
Honda 51.3
Meijer 51.3
Perdue 51.3
Jeep 51.3
Head 51.3
Lee Jeans 51.3
Pantene 51
Chevrolet 51
Cannon 50.8
Chef Boyardee 50.8
Frito Lay 50.6
Avon 50.5
Motors 50.4
Kodak 50.4
General Mills 50.2
BMW 50
Lipton 49.8
Kohl's 49.8
Goodyear 49.7
Kraft 49.6
Craftsman 49.5
Sunbeam 49.4
IBM 49.3
Frigidare 49.1
Sears 49.1
Ford 49.1
Walgreens 49.1
Dole 49.1
Chevy 49
Wonder (Bread) 49
Dannon 49
JVC 49
Hyundai 49
Clinique 49
Marlboro 49
Mercedes 49
Gerber 49
Acme 49
Kleenex 48.8
Kelloggs 48.7
JC Penney 48.6
Louis Vuitton 48.5
Calvin 48.4
LL Bean 48.4
Gillette 48.4
Johnson & Johnson 48.3
Shell 48.3
Kenmore 48.1
Dawn 48
Hanes 48
Macdonalds 48
Tylenol 48
Colgate 47.5
Wrangler (Jeans) 47.3
Burger King 47.3
Whirlpool 47.1
GMC 47
Yahoo 46.9
Dish Network 46.8
Verizon 46.7
Hersheys 46.6
Whole Foods 46.5
Sara Lee 46.5
Hostess 46.5
Mazda 46.5
Toyota 46.4
Arm & Hammer 46.4
Nabisco 46.3
Tyson 46.1
Starbucks 46
Wal-Mart 45.9
Western Family 45.8
Wegmans 45.8
Dr Pepper 45.7
Hulu 45.7
Time Warner 45.7
Maybelline 45.7
MLB 45.7
Iams 45.7
Cox 45.7
Country Crock 45.7
Compaq 45.7
Sonoma 45.7
Quaker Oats 45.7
Nordstrom 45.4
Coca 45.3
Champion 45.3
Bass 45
Chrome 44.7
Coors 44.7
iPhone 44.6
Bounty 44.5
Dodge 44.4
Maytag 44.3
Black & Decker 44.2
Pfizer 44.2
Suave 44.2
HP 44
Scott 44
Subway 44
Skechers 44
Geico 44
Panasonic 43.9
Lays 43.8
KFC 43.8
Charmin 43.8
Dell 43.8
Polo 43.8
Windex 43.7
Burts Bees 43.5
Purina 43.5
Clorox 43.5
Columbia 43.3
Ralph Lauren 43.2
Visa 43.2
Pepsi 43
Crest 43
NFL 43
Sanyo 43
Dove 42.9
Intel 42.9
Wendy's 42.8
Kroger 42.8
Remington 42.3
Phillips 42.3
Mars 42.3
Cover Girl 42.3
Heb 42.3
Twitter 42.3
Amazon 42
Body Works 42
Best Buy 41.8
Costco 41.8
Banana Republic 41.8
Disney 41.7
Amway 41.7
Levi 41.5
Sony 41.4
Samsung 41.4
Macy's 41.1
Glade 41.1
Boost 41
Boost Mobile 41
Toshiba 40.8
Ebay 40.8
Comcast 40.7
Facebook 40.6
Walmart 40.5
Microsoft 40.5
Google 40.4
Kitchen 40.4
Nestle 39.8
Mcdonalds 39.5
Gucci 39.5
Vons 39.3
Philip Morris 39.3
Loreal 39.3
Mattel 39.1
Apple 39
Pepperidge Farm 39
Vizio 39
Lysol 39
Ugg 39
Tropicana 39
Sure 39
Fila 39
Tmobile 39
Coach 38.9
Acer 38.8
Tommy Hilfiger 38.6
Nike 38.1
Target 38
Old Navy 37.9
Chase 37.8
Michael Kors 37.7
K-Mart 37.5
Lenovo 37.5
Equate 37.2
Hoover 36.8
Under Armour 36.6
Windows 36.5
Asics 36.5
Kitchenaid 36.5
Victoria's Secret 36.2
Mac 36.1
Reebok 36.1
Android 36
Direct TV 36
Sprint 36
Netflix 35.9
Adidas 35.7
Citizen 35.7
New Balance 35.6
Guess 35.4
Bic 35.2
Great Value 35.2
Pizza Hut 35
Puma 34.9
Asus 34.4
Fox 34.3
Justice 34.3
North Face 34.1
Xbox 33.6
Gap 33.4
Doritos 33.4
HTC 33.4
Converse 33.3
Sprite 33.2
Febreeze 33
Axe 33
Kay 32.7
Glad 32.7
Mary Kay 32.7
Viva 32.7
Reese's 31.8
Lego 31.7
Amazon Prime 31.5
Nintendo 31.2
Vans 31.2
Taco Bell 31
Fisher Price 30.4
Chanel 29.7
Old Spice 29.7
Playstation 29.4
Eagle 29.4
Hamilton Beach 29.3
Footlocker 29.3
Pink 29.3
Swiffer 29.3
Timberlands 29.3
Naked Juice 29
Youtube 29
Bing 29
Air Jordans 28.4
Huggies 28.2
Aeropostale 27.7
Hollister 27.3
Prada 27.3
Carters 26.8
Kirkland 26.3
Forever 26.3
Aeropostle 26.3
Arizona 25.6
Pampers 24.5
Versace 24.5
Urban Outfitters 24.5


A few interesting points from the longer list of brands are:

The oldest brand, “Maxwell House Coffee”, has an average age of 66. (If anything, this mean age is actually conservative, as the age question gets coded as 66 for anyone answering that they are “65 or older”.) Coding each range at a single value, typically its mid-point, in order to calculate the mean is a typical technique in OdinText when the data are in numeric ranges, as is often the case with survey or customer entry form based data.

The youngest brand on the list, “Urban Outfitters”, with an average age of 24, also probably skews even younger in actuality for a similar reason (as is standard in studies representative of the US general population, only adults aged 18+ are included in the research).

Dr Pepper is in the exact middle of our list (46 years old). Brands like Dr Pepper, which are in the middle (with an average age close to the upper range of Generation X), are of course popular not just among those 46 years old, but are likely to be popular across a wider range of ages. A good example is Coca-Cola, also near the middle: mentioned by 156 people with an average age of 45, it is pulling from both young and old. The most interesting thing then, as is usual in almost any research, is comparative analysis. Where is Pepsi relative to Coke, for instance? As you might suspect, Pepsi does skew younger, but only somewhat younger on average: it was mentioned by 107 consumers, yielding an average age for the brand of 43. As is the case with most data, relative differences are often more valuable than specific values.

If there are any high-level category trends here related to age, they seem to be that clothing brands like Urban Outfitters and Versace (both with the youngest average age of 24), Aeropostale (26), and Forever 21 (ironically, with an average age of 26), along with several others in the clothing retail category, tend to skew very young. Snack foods, especially drinks like Arizona Iced Tea (age 25) and Naked Juice (29), as well as web properties (Bing and YouTube, both 29) and electronics (obviously PlayStation at 29 and the slightly older Nintendo at 31 being examples), are also associated with a younger demographic on average.

In the middle age group, other than products with a wide user base like major soda brands, anything related to the home, whether entertainment like Time Warner Cable or even Hulu (both 45) or major retailers like Wegmans and Wal-Mart (also both 45), is likely to skew more middle-aged.

The scariest position for a brand manager is probably at the top of the list. With an average age of 66 for Maxwell House and Hunts, and 64 for Stouffers and Marie Callender's, the question has got to be: who will replace my customer base when they die? What we see by looking at the data is in fact a slight negative correlation between age and number of mentions.

Again, it’s often the comparative differences that are interesting to look at, and of course the variance. Take Coca-Cola vs. Pepsi, for instance. While their mean ages are surprisingly close to each other at 45 and 43 respectively, looking at the variance associated with each gives us the spread (i.e. which brand is pulling from a broader demographic). Coca-Cola, with a standard deviation of 14.5 years, is pulling from a wider demographic than Pepsi, which has a standard deviation of 12.9 years. There are several ways to visualize these data and questions in OdinText, though some of our clients also like to use OdinText output in visualization software like Tableau, which can have more visualization options but little to no text analytics capabilities.
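As a rough illustration of why the spread matters as much as the mean, the short Python sketch below compares two brands. The ages are made up for the example (they are not the survey data), and the helper name `summarize` is hypothetical.

```python
from statistics import mean, stdev


def summarize(ages):
    """Return (mean age, sample standard deviation) for one brand."""
    return mean(ages), stdev(ages)


# Invented coded ages for respondents mentioning each brand.
# Brand A draws from a wide age range, Brand B from a narrower one.
brand_a = [20, 35, 45, 55, 70]
brand_b = [30, 38, 43, 48, 56]

mean_a, sd_a = summarize(brand_a)
mean_b, sd_b = summarize(brand_b)

print(f"Brand A: mean {mean_a:.1f}, sd {sd_a:.1f}")
print(f"Brand B: mean {mean_b:.1f}, sd {sd_b:.1f}")

# Two brands can have nearly the same mean age while one of them
# pulls from a noticeably broader demographic.
print("Brand A has the wider spread:", sd_a > sd_b)
```

The means alone would suggest the two brands appeal to the same customers. The standard deviation is what shows that one of them reaches both much younger and much older people.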

Co-Occurrence (aka Market Basket Analysis)

Last but not least, it can be interesting to look at which brands are often mentioned together, either because they are head-to-head competitors going after the exact same customers or because they may be complementary (market basket analysis type opportunities, if you will). Brands that co-occur frequently (are mentioned by the same customers) and are not competitors may in fact represent interesting opportunities for ‘co-opetition’. You may have noticed more cross-category partnering on advertising recently, as marketers seem to be catching on to the value of joining forces in this manner. Below is one such visualization created using OdinText with just the Top 20 brand mentions visualized in an x-y plot using multi-dimensional scaling (MDS) to plot co-occurrence of brand names.

Text Analytics of Brands with OdinText
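For readers curious about the raw material behind a plot like this, here is a minimal Python sketch of counting brand co-occurrence. It covers only the counting step, not the MDS layout, and it is not how OdinText works internally. The respondent data and the function name `cooccurrence_counts` are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(mentions_per_respondent):
    """Count how many respondents mentioned each pair of brands."""
    counts = Counter()
    for mentions in mentions_per_respondent:
        # De-duplicate and sort so each pair has one canonical order.
        brands = sorted(set(mentions))
        for pair in combinations(brands, 2):
            counts[pair] += 1
    return counts


# Invented answers: each inner list is one respondent's brand mentions.
responses = [
    ["Coca-Cola", "Pepsi"],
    ["Pepsi", "Coca-Cola", "Nike"],
    ["Nike"],
]

pair_counts = cooccurrence_counts(responses)
for pair, n in pair_counts.most_common():
    print(pair, n)
```

A table of pair counts like this can then be turned into distances (brands mentioned together often are "close") and handed to an MDS routine to produce the x-y positions.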

Hope you enjoyed today’s discussion of a very simple text question and what can be done with it in OdinText. Come back again soon as we will be giving more tips and mini analysis on interesting mixed data. In fact, if there is significant interest in today’s post we could look at one or two other variables and how they relate to brand awareness comment data tomorrow.

Of course if you aren’t already using OdinText, please feel free to request a demo here.


How to Select Your Analytics Software

Selecting the Best Analytics Software at the Right Price – What to Consider

In our last Useful Business Analytics Summit Q&A post I asked our client side data scientists about unstructured data and text analytics.

Today I ask them their thoughts on how they select analytics software whether for structured or unstructured data, and how they think about costs.

I’ve written before about the idea of “free software”, and I know that in the area of social monitoring at least it seems to be a topic up for debate… But in my experience, if you need highly specific software, that does what you need it to do efficiently without lots of set up and work, then there really is no other option than professional/paid software. I realize not everyone including some of our speakers may agree though.

As someone whose firm develops text analytics software, and who has evaluated and purchased all manner of analytics software for both structured and unstructured data in the past, I can’t help but provide my own 3 tips on how to select software today. If at all possible always:

  1. Ask for a demo of the software (under mutual NDA) with your own data. Don’t let them give you a canned demo with some fake data.
  2. Ask difficult questions about the value of the software. If your sales person can’t give very specific examples about how the software is successfully used by the company or their clients, then chances are you are evaluating software developed by developers only, and not by analysts. Chances are high that if they can’t tell you how valuable it is, there isn’t much value in it. You must learn that it can be used successfully!
  3. Don't forget to think about ease of use. While there is no such thing as magical analytics software in which you upload data and dashboards automatically tell you everything you need to know (if someone tells you this – run!), the software should however be intuitive and user friendly enough for a junior or mid-level analyst to easily learn how to use it. Anything that requires lots of customization and learning will not be used very much.

I think our speakers bring up some good points below, and I’m very curious which of their points you agree or disagree with?




Alex Uher (L’Oréal): I’d suggest starting by talking to others in your industry – what are they using? Are they happy with the solution? What are the shortfalls? What are the reasons they selected this package? From there, you should be able to identify a shortlist of relevant candidates. Pressure test them all – take a set of data and ask to be able to “play with” the solution.


Larry Shiller (Yale): It depends on your maturity. If your maturity is low (1 or 2), using one or more light-weight tools for 6-12 months is a good bet: You'll get the capability that is not overweight to your maturity at low cost and you'll learn a lot, which will make your eventual move to a sophisticated tool less risky. If your maturity is low and you buy, you'll probably let the tool drive your analytics rather than vice-versa.


Jonathan Isernhagen (Travelocity): I have never done this, but Sara Shikhman of Rent-a-Gent had a brilliant idea for solutions research generally, and that is to 1) get a bunch of interns from her business school, then 2) get 30- to 60-day free trial periods from the solution vendors who are eager to speak with her on the exhibition floor. She puts one intern in charge of learning each solution as it applies to her business, then like the Joker in the Dark Knight movie, she basically breaks the pool cue and lets the best solution win.


Farouk Ferchichi (Toyota): One should build an analytics strategy by understanding the difference between descriptive, predictive and prescriptive analytics, and whether you are looking for a tool for analytics and modeling development and/or for governance and management.


Anthony Palella (Angie’s List): If you work up to it by building the lower layers of your data capabilities stack first (identifying the important information, validating your ability to accurately get the important data, making the important data easily accessible and able to be rolled up to any level of aggregation needed, etc.), then it is much easier to identify the POCs needed to test candidate software solutions.


Deepak Tiwari (Google): Two things:

1. How relevant it is to solve the problem you are faced with - a generic solution usually doesn't work for advanced stuff.

2. Interoperability and ease of use (user friendliness) is key.


Thomas Speidel (Suncor):

1. Understand the type of problems you are dealing with and the costs of the improper decisions you may make

2. For low impact improper decisions, choose a tool that favors visualizations and visual exploratory analysis

3. For high impact improper decisions, invest in human expertise first and involve them in the decision regarding software.




Alex Uher (L’Oréal): This is a great question – often an enterprise level solution may seem like the only option, especially if you work in a large corporation and especially if you don’t do your homework. You have to honestly ask yourself what you will be using the package for and then, in my opinion, start from the cheaper and more basic solutions and work your way up. Try to make the cheapest solution pass your evaluation, but if it doesn’t, eliminate it and move one step up the food chain. It’s hard to do and requires discipline – I mean honestly, most of us don’t shop like this, we go for the flashiest car or shoes!

Jonathan Isernhagen (Travelocity): You have to think about analytic software solutions in terms of the value of the decisions they will help you make. If reducing shopping cart abandonment by 10% nets you $1K/week, Google Analytics is going to be your site monitoring tool. With regard to Big Data slicing, there seems to be a changing of the guard as the established stats PhDs have a strong SAS preference while the kids coming out of school mostly know R. Given the license costs and the fact that the Revolution Analytics version of R now enables multi-million-row processing (which had been a big SAS advantage), I would stick with R until you have to go to SAS. That rule is probably generalizable: stick with the cheap solution until you find yourself straining against its limitations.

Farouk Ferchichi (Toyota): Free is always good when you are starting, and analytics is no different. Once analytics production becomes integrated into the company’s operational and/or decision making process and/or formal management / governance processes, then the investment is inevitable but should be scaled to the enterprise (company) that you are in (i.e. size of the balance sheet, size of portfolio, equity, size and complexity of risk, etc.)

Anthony Palella (Angie’s List): In the absence of a fatal flaw associated with the free product (e.g., lack of support), if the free product performs well in the POC phase (see question to answer above) AND the free product does not close future options, then consider it. Keep in mind that the software license cost is only part of the total cost associated with adopting an analytic software solution.

Sofia Freyder (MasterCard): Small and mid-size companies can sometimes get away with free solutions. As the business grows and the company needs more specific and detailed information, it will have to pay for more robust tools. The cost of new tools should be justified by the incremental business that the additional business intelligence can potentially bring in (ROI).

Deepak Tiwari (Google): Free is almost always good enough if you have a highly qualified / talented team that knows how to get the best out of it. If you need a lot of hand holding and support then definitely use a commercial solution.

Thomas Speidel (Suncor): When we look at the landscape of powerful, flexible, well proven and even scalable analytics software solutions, there are still very few serious players out there: some that cost hundreds of thousands of dollars, some that are free and open source, and some in-between. Long gone are the times when free was a synonym for cheap. I started using R in 1999 when a professor gave it to me on a hand-labeled CD; he in turn had received it from colleagues in New Zealand. I could not afford S-PLUS to do my homework at home. He told me it was a clone. My immediate thought was that he was providing me with pirated software!

Fast forward to today when R is integrated in almost all analytics and database solutions, from SAS to Statistica, from SPSS to SAP, from IBM to Oracle. Yet, R has a steep learning curve. It's definitely not for everyone. We are starting to see solutions nowadays that attempt to fully incorporate R but in a way that is more user-friendly, at least for the simple things. How successful these efforts will be is too early to tell, in part because of R's licensing model. Other solutions leverage a different programming language, Python.

The question is not so much one of free vs. paid; the crucial question is how serious a company is about analytics. A small team of data scientists and analytics professionals usually has no problem leveraging free software to its full potential. Support here is seldom needed. But if a company is testing the waters, has very simple analytics needs, or wants to increase the analytical mindset of its people, it may want to look for solutions that are easy to learn and favor visual exploration but that at the same time promote good analytics practices.


Thanks again to our speakers. Check back again for our next two posts, which will give tips for consultants and software vendors wanting to do business with our client speakers, as well as their recommendations on how best to communicate findings to the C-Suite.



[Full Disclosure: Tom H. C. Anderson is Managing Partner of Anderson Analytics, developers of patented Next Generation Text Analytics™ software platform OdinText. For more information and to inquire about software licensing visit ODINTEXT INFO REQUEST]

[Above also posted on the Next Gen Market Research blog]

Top 10 Big Data Analytics Tips

As part of the interview series leading up to the Useful Business Analytics Summit, today we post the Top 10 Tips from our analytics experts. Whether you mine more structured data or, like myself, more often work with unstructured or mixed data using text analytics, I think you’ll agree that the following 10 tips are critical.

  1. Keep It [ridiculously] Simple (10 times more so than is necessary to get your point across).
  2. Hypothesize/Put Problem First
  3. Don’t Assume Data is Good – Check/Validate!
  4. Automate repeat tasks & Carve out time to go exploring
  5. Set a Data Strategy – don’t just collect data for the sake of collecting it
  6. In a rapidly expanding field, work with people on the leading edge
  7. Be a Skeptic about models etc.
  8. Look for the pragmatic and cost effective solutions
  9. Don’t torture Data – in the end it will confess
  10. Think like a Business Owner – what would you like to know?

Below are more detailed tips from some of our client experts. If you’ve got a tip to add, we’d love to hear it in the comments section.



Honestly, I think I’d boil it down to a single tip that is more important than all others, in my experience, but is the one most ignored and poorly executed. Keep it simple. Ridiculously simple. Ten times more simple than what you think necessary. Just about then, you are actually getting your point across in a way that people are starting to follow you. You can always increase the complexity from there, but the first time you have an experience and realize that you’ve actually conveyed a complex analytical presentation to a group of C-suite execs, you’ll understand what you’ve been doing wrong this whole time before. Hint – those head nods and blank stares aren’t what you are looking for…


- Understand that any problem is easier if you approach it correctly. Don't necessarily take a cookie cutter approach. Conventional wisdom is not so wise in a rapidly evolving field.

- Work with people who are able to work on the leading edge ...the people who are helping expand the envelope.



Automate anything you do more than once. It’s very easy to fill your time with routine pulls of data which lie just beyond the reach of the visualization tools available to business stakeholders. You can’t ignore these requests and it frankly feels great for us geeks to bask in the gratitude of camera-ready cool kids, but these tasks may not represent the highest-value use of your time. The more experience you have with the data, the more likely you are to be the only person with eyes on a particular business problem. So carve out time to go exploring. Think entrepreneurially like a business owner, and ask yourself “if I owned this P&L, what would I want to know?”


- Ensure you understand the purpose that makes analytics valuable to the organization. Purpose can come from a business sponsor, like discovering new ways (i.e. products, markets, etc.) to increase revenue, retention, or profit, or to control costs. So ask the tough questions and align with executives’ mandates.

- Ensure clarity around the level of effort you spend gathering data vs. designing experiments, mining and analyzing data. The need / urge to have data to accomplish a specific task can lead to a disparate / disjointed data gathering and management effort that can take over the data scientist’s or analytics professional’s work, and analytics can become an afterthought. So be a sponsor or an advocate for a data strategy.


1) Don't assume the data is good. Is the data lineage (with transformation rules) exposed? Is data quality measured and reportable as a trend?

2) Hypothesize and/or uncover non-time-based relationships: These are usually the richest.



Double check your results using data from different sources

Make sure it makes sense

In case of discrepancies use it directionally

Reach out to experts to obtain their opinion



1. Think of the broader perspective. Take a step back. Understand the business and the problem before jumping into solutions.

2. Be an analyst: Adopt a critical approach to thinking about all analytical problems. There is nothing wrong with a slight dose of skepticism about models and results. It is healthy.

3. Try to find pragmatic and cost-effective models / solutions. For example you can probably do machine learning and neural networks to solve a lot of problems but a linear regression might sometimes be enough.



 1. Be humble: sometimes data tells us nothing or, worse, will lie to us. Cognitive dissonance is the norm rather than the exception.

2. If you torture data it will confess to any sins (attributed to Frank Harrell).

3. Go ahead, ask questions, be curious, don't be afraid to cross cultures.


Big thanks again to our client side analytics experts. Feel free to check out our previous questions on Big Data and How to Keep Up on Analytics. Don’t forget to check back in for our next question about the value of various types of data… Look forward to seeing you at the Summit!




[Full Disclosure: Tom H. C. Anderson is Managing Partner of Anderson Analytics, developers of a patented Next Generation approach to text analytics known as OdinText. For more information and to inquire about software licensing visit OdinText INFO Request.]

Breakthrough Analysis: Text Analytics 2014

[The following interview is re-posted with permission from Seth Grimes’ Breakthrough Analysis] I post a yearly look at the Text Analytics industry — technologies and market developments — from the provider perspective. This year’s is Text Analytics 2014.

To gather background material for the article, and for my forthcoming report Text Analytics 2014: User Perspectives on Solutions and Providers (which should be out by late May), I interviewed a number of industry figures: Lexalytics CEO Jeff Catlin, Clarabridge CEO Sid Banerjee, Fiona McNeill of SAS, Daedalus co-founder José Carlos González, and Tom Anderson of Anderson Analytics and OdinText. (The links behind the names will take you to the individual Q&A articles.) This article is –

Text Analytics 2014: Q&A with Tom Anderson, Anderson Analytics and OdinText



1) How has the market for text technologies, and text-analytics-reliant solutions, changed in the past year? Any surprises?

Customers are starting to become a little more savvy than before, which is something we really welcome. One of two things used to happen: either we had to explain what text analytics was and what the value was, or we had to deal with a representative from purchasing who represented various departments, all with different unrealistic and unnecessary expectations on their wish list. The latter especially is a recipe for disaster when selecting a text analytics vendor.

Now more often we are talking directly to a manager who oftentimes has used one of our competitors, and knows what they like and don’t like, has very real needs and wants to see a demo of how our software works. This more transparent approach is a win-win for both us and our clients.

2) Do you have a 2013 user story, from a customer, that really illustrates what text analytics is all about?

I have several great ones, but perhaps my favorite this year was how Shell Oil/Jiffy Lube used OdinText to leverage data from three different databases and discover exactly how to drive profits higher:

3) How have perceptions and requirements surrounding sentiment analysis evolved? Where are sentiment capabilities heading, in your view?

OdinText handles sentiment quite a bit differently than other companies. Without getting into that in detail, I will say that I’m pleased to see that one good thing has happened in regard to the discourse around sentiment. Specifically, vendors have stopped making sentiment accuracy claims, as they seem to have figured out what we have known for quite some time: that accuracy is unique to the data source.

Therefore the claims you used to hear like “our software is 98% accurate” have stopped. This is refreshing. Now you are likely to only hear accuracy claims from academia, since usually they have very limited access to data and are less likely to work with custom data.

Equally important to the industry realizing that sentiment accuracy claims don’t make sense is the fact that even clients have started to realize that comparing human coding to text analytics is apples to oranges. Humans are not accurate, and they are very limited. Text analytics is better, but also very different!

4) What new features or capabilities are top of your customers’ and prospects’ wish lists for 2014? And what new abilities or solutions can we expect to see from your company in the coming year?

We’ve been adding several new powerful features. What we’re struggling with is adding more functionality without making the user interface more difficult. We’ll be rolling some of these out in early 2014.

5) Mobile’s growth is only accelerating, complicating the data picture, accompanied by a desire for faster, more accurate, and more useful, situational insights delivery. How are you keeping up?

I know “real time analytics” is almost as popular a buzzword as “big data”, but OdinText is meant for strategic and actionable insights. I joke with my clients when discussing various “real-time reporting” issues that (other than ad-hoc analysis of course) “if you are providing standard reports any more often than quarterly or at most monthly, then no one is going to take what you do very seriously”. I may say it as a joke, but of course it’s very true. Real-time analytics is an oxymoron.

6) Where does the greatest opportunity reside, for you as a solution provider? Internationalization? Algorithms, visualization, or other technical advances? In data integration and synthesis and expansion to new data sources? In providing the means for your customers to monetize data, or in monetizing data yourselves? In untapped business domains or in greater uptake in the domains you already serve?

I do think there’s a lot more opportunity in monetizing data, one of the things we are looking at.

7) Do you have anything to add, regarding the 2014 outlook for text analytics and your company?

End of 2013 was better than expected, so very excited about 2014.

Thank you to Tom!

Click on the links that follow to read other Text Analytics 2014 Q&A responses: Lexalytics CEO Jeff Catlin, Clarabridge CEO Sid Banerjee, Fiona McNeill of SAS, and Daedalus co-founder José Carlos González. And click here for this year’s industry summary, Text Analytics 2014.

Text Analytics World Interview

The Future Directions for Text Analytics [Text Analytics World Pre Conference Interview with Tom H. C. Anderson, CEO of Anderson Analytics - OdinText and Jeremy Bentley CEO, Smart Logic. April 2, 2013 Q&A Reposted with permission from Text Analytics World]

We asked two leading text analytics experts, Tom Anderson of Anderson Analytics – OdinText and Jeremy Bentley of Smart Logic, what their take was on some possible future directions for the field. Their answers are shown below:

Tom Reamy: What do you see as the major trends in text analytics in the next year or two?

Tom Anderson: Realizing that customization is key. I think we’re only at the tip of the iceberg. It’s great that we’re starting to finally leverage all the data (CRM, Survey etc.) that we’ve spent so much time and money collecting and storing. But over the next two years I predict we’ll be using it in several other areas that are hard for us to foresee now.

Tom Reamy: What are the problems and issues that are slowing down the field?

Tom Anderson: The infatuation with “Social Media Monitoring”, which really is mainly “Twitter Monitoring”. Until the walled gardens around Facebook and LinkedIn data come down (I’ve been waiting and waiting), there really is limited usefulness in this area and we may be better off concentrating more of our efforts elsewhere. As clients start realizing they’re just listening to the 8% of the population on Twitter or blogs, who often are somewhat different from normal customers, they begin to question the ROI here.

The reason this can be problematic is that clients are so wrapped up thinking that they need to listen to “what people are saying about us on the Internet” that they don’t think about all the valuable data sources text analytics companies can help them with today.

For instance many are already paying a lot of money to field incoming customer calls and emails, storing this data, and yet don’t take the time to listen to what these very real customers are saying.

This in my opinion is hindering the advancement of text analytics in some ways. The focus needs to be broader.

Tom Reamy: What new technologies and developments in text analytics or related fields (predictive analytics, machine learning, artificial intelligence, etc.) do you see or want to see in the next year or two?

Tom Anderson: I think data visualization today is incredibly poor. I can’t believe many of our competitors in the text analytics field still offer simple “word clouds” as output.

Conversely, I think clients have to realize that data visualization techniques are generally best used as exploration tools, and not as a one-click export to a management level PowerPoint slide.

There is currently an opportunity in finding the best ways to communicate insights from text analytics. Having powerful software and the right data is half the battle. But we also need more creative analysts who understand the respective business and data and who can communicate the findings effectively. This is more a problem of a shortage of good analysts with the time to use these tools than a need for additional technology.

Tom Reamy: Do you see any revolutionary changes for text analytics on the horizon?

Tom Anderson: Yes, what I’ve been talking about a lot is domain expertise. OdinText for instance is focused on the use of text analytics for consumer insights. That is a very different thing than using text analytics for engaging with Twitter users or detecting terrorists or fraud etc. All these require special knowledge and modification of rules and code.

I think there will be fewer “Enterprise” as well as “Twitter Monitoring” firms, and a lot more domain and industry specific text analytics tools/firms.

Also this technology will be incorporated by most of the companies that own sizeable amounts of unstructured data. So there will be more licensing and acquisitions going on.

Tom Reamy: Is there anything else you would like to say about the future of text analytics?

Tom Anderson: I’m so glad I got into text analytics as early as I did. It’s still in its infancy, not in terms of what we can do with it already/the power, but in terms of adoption and creatively thinking about how to leverage it in different ways. Very exciting times ahead!


Tom Reamy: What do you see as the major trends in text analytics in the next year or two?

Jeremy Bentley: To borrow from Big Data parlance – Velocity, Volume and Variety mean text analytics in real time over a lot of it, in different formats and from different places. Content Intelligence (which includes text analytics) brings structure to unstructured information so it can be joined with the data world. Data tells you what happened, and content tells you why. Associating the what with the why is the major requirement for organizations that protect, value and make money from their information.

Tom Reamy: What are the problems and issues that are slowing down the field?

Jeremy Bentley: The reality check that content is not clean, properly managed or sufficiently findable today. Information overload (the often cited big issue) is nothing but a filter problem – the problem is that the filter parameters are not present in the current information management systems of CMS, ERDMS and search engines. What has to be recognized is the gritty and unglamorous task of metadata management and the automatic application of whatever metadata is needed for a particular view of the content at any particular point in time. Once addressed, content becomes process-able and valuable.

Tom Reamy: What new technologies and developments in text analytics or related fields (predictive analytics, machine learning, artificial intelligence, etc.) do you see or want to see in the next year or two?

Jeremy Bentley: There is a balance to be drawn between what is fully automatic and what requires some human oversight – Classification and text analysis should be fully automatic – the methods and rules used to drive the analysis should be subject to user oversight. Machine learning and AI have a role to play in the latter – as software becomes more sophisticated, the effort needed to achieve quality analytics and metadata derivation will go down.

Tom Reamy: Do you see any revolutionary changes for text analytics on the horizon?

Jeremy Bentley: Most users see text analytics as pretty cutting edge as it is, so to this question we have to widen it from Text to Content – in all of its forms to see where the revolution comes.

Content Intelligence for Big Data will revolutionize how organizations use their information to gain insight and competitive advantage. This is already happening in forward thinking enterprises; increasingly it will not just be the larger organizations that benefit from such an approach.

Tom Reamy: Is there anything else you would like to say about the future of text analytics?

Jeremy Bentley: Being able to process content as we do data in a database will seem standard in a decade’s time.