Posts tagged word cloud
Brand Analytics – Branding and Gender

Text Analytics Tips: Branding Analytics – 500 Major Brands and Gender, by Tom H. C. Anderson (Continuation from yesterday’s Brand Analytics post)

Thank you to everyone who contacted us for more information about OdinText yesterday. As a result, I’ve decided to dig a bit deeper into the same branding question, taking a look at one or two additional variables today and tomorrow. Of course, if anyone is interested in seeing just how easy and powerful OdinText is, feel free to request info or a demo.

Yesterday we looked at Brand Awareness by Age quite a bit. Today, I thought we’d look at gender. But before that, I’ve posted another visualization from the Brand vs. Age data below.

Though word clouds are popular, we’ve found at OdinText that the typical word cloud is almost completely useless. Yesterday, I showed a co-occurrence plot of the data, which is certainly more meaningful than a word cloud because, unlike in a word cloud, position tells you something about the data (terms mentioned together most frequently appear closer to each other).

Brand Analytics Unstructured Visualization

In the chart above, we are plotting two variables from yesterday’s data: Average Age on the x-axis and frequency of mentions (or popularity) on the y-axis. Visualizations like this are a great way to very quickly explore and understand unstructured data. Even without getting into the detail, often just the overall shape of a chart will tell you something about your data. In the case above, we have a roughly triangular shape where the most popular brands, such as Samsung, Sony and Coca-Cola, tend to appear in the middle. Two outliers are Apple and Nike, which are not only our most popular brands but also skew a bit younger.

But let’s leave Age and take a look at Gender. Gender is a nominal variable, but because it typically has only two values in survey data (male and female), it is also a dichotomous variable and thus lends itself nicely to more advanced visualization. Basically, any dichotomous variable, including Yes or No (present vs. not present), can be very useful in OdinText’s patented text analytics process. What can we tell about brands when we look at gender?

Brand Analytics and Gender

Brands by Gender

There are certain stereotypes that seem validated. You’re far more likely to be a guy than a gal if, when you think of brands, the first things that pop into your mind are software and electronics brands such as Microsoft (15% vs. 6%) and Sony (17% vs. 11%), an auto brand like Ford (11% vs. 7%), or McDonalds (4% vs. 1%).

Text Analytics Gender by Brand with OdinText

Perhaps as expected, women are far more likely to think of consumer packaged goods brands like Kraft (14% vs. 7%), Johnson & Johnson (6% vs. 2%), Kellogg’s (5% vs. 0.4%), General Mills (6% vs. 2%), Dove (4% vs. 0.2%), and P&G (4% vs. 1%).

Interestingly, the list doesn’t stop there. Women tend to be able to mention several more brands than men, including many less frequently mentioned brands such as Coach (3% vs. 0.3%), Gap (3% vs. 1%), Colgate (3% vs. 0.5%), Tide (2.6% vs. 0.4%), Victoria’s Secret (2.5% vs. 0.2%), Michael Kors (2.8% vs. 0.7%), Kleenex (2.5% vs. 0.5%), Tommy Hilfiger (2% vs. 0.2%), Huggies (1.7% vs. 0%), Olay (1.7% vs. 0%), Hershey’s (1.7% vs. 0.2%), Mattel (1.5% vs. 0%) and Bath and Body Works (1.3% vs. 0%).
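For readers who want to compare these gaps more systematically, here is a minimal sketch in Python. It uses a handful of the percentages quoted above, and it assumes that the figures in the "guy" paragraph are listed men first while the figures in the two paragraphs about women are listed women first. The function name `gender_skew` is made up for illustration and is not part of OdinText.

```python
# Shares of men and women mentioning each brand, as quoted in the post,
# stored here as (men %, women %). Only a few brands are included.
mentions = {
    "Microsoft": (15.0, 6.0),
    "Sony": (17.0, 11.0),
    "Ford": (11.0, 7.0),
    "Kraft": (7.0, 14.0),
    "Dove": (0.2, 4.0),
    "Huggies": (0.0, 1.7),
}


def gender_skew(men_pct, women_pct):
    """Return (gap in percentage points, larger/smaller ratio, leaning)."""
    gap = round(men_pct - women_pct, 1)
    larger, smaller = max(men_pct, women_pct), min(men_pct, women_pct)
    # The ratio is undefined when one group never mentions the brand.
    ratio = round(larger / smaller, 1) if smaller > 0 else None
    leaning = "men" if gap > 0 else "women" if gap < 0 else "even"
    return gap, ratio, leaning


# Rank brands by the absolute size of the gap, largest first.
for brand, (men, women) in sorted(
    mentions.items(), key=lambda kv: abs(kv[1][0] - kv[1][1]), reverse=True
):
    gap, ratio, leaning = gender_skew(men, women)
    print(f"{brand:10s} gap={gap:+5.1f} pts  ratio={ratio}  leans {leaning}")
```

Looking at both the gap and the ratio is a deliberate choice: a small brand like Dove shows a modest gap in percentage points but a very large ratio, while a big brand like Sony shows the opposite.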

That’s it for today, but come back tomorrow and we’ll look at one last data point from this same unstructured branding question to see whether brands can have a political skew.

Of course in the meantime, please feel free to request more information on how you too can become a data scientist with OdinText. Text Analytics can be a great tool for brand analytics including answering brand positioning, brand loyalty and brand equity questions.

-Tom @OdinText


[NOTE: Tom H. C. Anderson is Founder of Next Generation Text Analytics software firm OdinText Inc.]

Brand Analytics Tips – How Old is Your Brand?

Text Analytics Tips Answers: How Old Is Your Brand? Using OdinText on Brand Mention Type Comment Data, by Tom H. C. Anderson

[METHODOLOGICAL NOTES (If you’re not a researcher feel free to skip down to the ‘Brands & Age’ section below): In our first official Text Analytics Tips I’ve started by exploring arguably one of the simplest types of unstructured/text data there is, the unaided top-of-mind ‘brand mention’ open-ended survey question. These kinds of questions are especially important to brand positioning, brand equity, brand loyalty and advertising effectiveness research. In this case we’ve allowed for more than one brand mention. The question reads “Q. When you think of brand names, what company’s product or service brand names first come to mind? [Please name at least 5]”. The question was fielded to n=1,089 US Gen Pop Representative survey respondents in the CriticalMix Panel in December of 2015. The margin of error is +/-2.9% at the 95% confidence level.]
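As a rough check on the +/-2.9% figure quoted in the methodological note, the sketch below applies the textbook margin-of-error formula for a proportion. It assumes a simple random sample, the most conservative proportion (p = 0.5) and z = 1.96 for the 95% level. These are assumptions on my part; the post does not say how the figure was calculated.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error for a proportion from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


moe = margin_of_error(1089)
print(f"+/-{moe * 100:.2f} percentage points")
```

With n = 1,089 this comes out a little under 3 percentage points, in line with the quoted figure.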

Making Good Use of Comment Data Can Be Easy and Insightful

An interesting and rather unique way to look at your brand is to understand for whom it is most likely to be top-of-mind.

Unfortunately, though they have proven more accurate than structured choice or Likert scale rating questions in predicting actual behavior, free-form (open-ended) survey questions are rare due to the assumed difficulty of analyzing the results. Even when they are included in a survey and analyzed, results are rarely expressed in anything more useful than a simple frequency ranked table (or worse, a word cloud). Thanks to the unique patented approach to unstructured and structured data in OdinText, analyzing this type of data is both fast and easy, and insights are limited only by the savviness of the analyst.

The core question asked here is rather simple, i.e. “When you think of brand names, what company’s product or service brand names first come to mind?”. However, when this question is asked of over a thousand people, the sheer volume of brands mentioned (in our case well over 500) can make even this ‘small data’ seem overwhelming.

The purpose of this post is to show you just how easy/fast yet insightful analysis of even more specific and technically more basic comment data can be using Next Generation Text Analytics™.

After uploading the data into OdinText, there are numerous ways to look at this comment data, not only the somewhat more obvious frequency counts, but also several other statistics including any interesting relationships to available structured data. Today we will be looking at how brand mentions are related to just one such variable, the age of the respondent. [Come back tomorrow and we may take a look at a few other statistics and variable relationships.]

Text Analytics Tips Age OdinText

Brands by Age

Below is a sortable list of the most frequently mentioned brands ranked by the average age of those mentioning said brand. This is a direct export from OdinText. The best way to think about lists like these is comparatively (i.e. how old is my brand vs. other brands?). If showing a table such as this in a presentation, I would highly recommend color coding, which can be done either in OdinText (depending on your version) or in Excel using the conditional formatting tool.

[NOTE: For additional analytics notes and visualizations please scroll to the bottom of the table below]


Brand Name Average Age
Maxwell House 66
Hunts 66
Aspirin 66
Chrysler 64.6
Stouffers 63.7
Marie Callender's 63.7
Walgreen 63.7
Cooper (Mini) 63.7
Bayer 62.6
USAA 62.5
Epson 62.5
Brother 61.3
Aol 61.3
Comet 61.3
Snapple 61.3
Lowes 61.2
Marriott 60.3
Ritz 60.3
Hellman's 60.3
Ikea 60.3
Belk 60.3
State Farm 60.3
Oscar Mayer 60
Folgers 59.8
Libby's 59.8
Hormel 59.2
Depot 59.2
Heinz 59.2
Electric 59.2
Bordens 59.2
Nestles 59
Green Giant 59
Sargento 58.3
Del Monte 58
Prego 58
Kashi 58
Westinghouse 58
Stouffer 58
Taylor 58
Home Depot 57.6
Publix 57.5
Banquet (Frozen Dinners) 57.5
Buick 57
Krogers 57
Hellman's 57
Safeway 56.5
Purex 56.4
Hewlett 56.4
Unilever 56.1
RCA 56.1
Post 56.1
P&G 55.9
Budweiser 55.9
Yoplait 55.8
Chobani 55.7
Ragu 55.7
Campbell's 55.5
Wells Fargo 55.2
Hershey 55.1
Betty Crocker 55
Sharp 55
Hines 55
Trader Joe's 55
Palmolive 54.9
Kia 54.7
Lexus 54.7
Life 54.7
Hotpoint 54.7
Campbells 54.6
Oscar Mayer 54.5
Dial 54.4
Nissan 54.4
Hillshire Farms 54.3
Motorola 54.1
Keebler 54
CVS 53.8
Canon 53.8
Lakes 53.7
Pillsbury 53.3
Hilton 53.3
Faded Glory 53.3
Friskies 53.3
Duncan Hines 53.3
Puffs 53.3
Olay 52.8
Sketchers 52.5
Fred Meyer 52.5
Delta 52.5
Hunt 52.3
Bose 52.3
Ocean Spray 52.3
Ivory 52.3
Swanson 52.3
Dewalt 52.3
Firestone 51.8
Estee Lauder 51.5
Miller 51.5
Tide 51.4
Honda 51.3
Meijer 51.3
Perdue 51.3
Jeep 51.3
Head 51.3
Lee Jeans 51.3
Pantene 51
Chevrolet 51
Cannon 50.8
Chef Boyardee 50.8
Frito Lay 50.6
Avon 50.5
Motors 50.4
Kodak 50.4
General Mills 50.2
BMW 50
Lipton 49.8
Kohl's 49.8
Goodyear 49.7
Kraft 49.6
Craftsman 49.5
Sunbeam 49.4
IBM 49.3
Frigidare 49.1
Sears 49.1
Ford 49.1
Walgreens 49.1
Dole 49.1
Chevy 49
Wonder (Bread) 49
Dannon 49
JVC 49
Hyundai 49
Clinique 49
Marlboro 49
Mercedes 49
Gerber 49
Acme 49
Kleenex 48.8
Kelloggs 48.7
JC Penney 48.6
Louis Vuitton 48.5
Calvin 48.4
LL Bean 48.4
Gillette 48.4
Johnson & Johnson 48.3
Shell 48.3
Kenmore 48.1
Dawn 48
Hanes 48
Macdonalds 48
Tylenol 48
Colgate 47.5
Wrangler (Jeans) 47.3
Burger King 47.3
Whirlpool 47.1
GMC 47
Yahoo 46.9
Dish Network 46.8
Verizon 46.7
Hersheys 46.6
Whole Foods 46.5
Sara Lee 46.5
Hostess 46.5
Mazda 46.5
Toyota 46.4
Arm & Hammer 46.4
Nabisco 46.3
Tyson 46.1
Starbucks 46
Wal-Mart 45.9
Western Family 45.8
Wegmans 45.8
Dr Pepper 45.7
Hulu 45.7
Time Warner 45.7
Maybelline 45.7
MLB 45.7
Iams 45.7
Cox 45.7
Country Crock 45.7
Compaq 45.7
Sonoma 45.7
Quaker Oats 45.7
Nordstrom 45.4
Coca 45.3
Champion 45.3
Bass 45
Chrome 44.7
Coors 44.7
iPhone 44.6
Bounty 44.5
Dodge 44.4
Maytag 44.3
Black & Decker 44.2
Pfizer 44.2
Suave 44.2
HP 44
Scott 44
Subway 44
Skechers 44
Geico 44
Panasonic 43.9
Lays 43.8
KFC 43.8
Charmin 43.8
Dell 43.8
Polo 43.8
Windex 43.7
Burts Bees 43.5
Purina 43.5
Clorox 43.5
Columbia 43.3
Ralph Lauren 43.2
Visa 43.2
Pepsi 43
Crest 43
NFL 43
Sanyo 43
Dove 42.9
Intel 42.9
Wendy's 42.8
Kroger 42.8
Remington 42.3
Phillips 42.3
Mars 42.3
Cover Girl 42.3
Heb 42.3
Twitter 42.3
Amazon 42
Body Works 42
Best Buy 41.8
Costco 41.8
Banana Republic 41.8
Disney 41.7
Amway 41.7
Levi 41.5
Sony 41.4
Samsung 41.4
Macy's 41.1
Glade 41.1
Boost 41
Boost Mobile 41
Toshiba 40.8
Ebay 40.8
Comcast 40.7
Facebook 40.6
Walmart 40.5
Microsoft 40.5
Google 40.4
Kitchen 40.4
Nestle 39.8
Mcdonalds 39.5
Gucci 39.5
Vons 39.3
Philip Morris 39.3
Loreal 39.3
Mattel 39.1
Apple 39
Pepperidge Farm 39
Vizio 39
Lysol 39
Ugg 39
Tropicana 39
Sure 39
Fila 39
Tmobile 39
Coach 38.9
Acer 38.8
Tommy Hilfiger 38.6
Nike 38.1
Target 38
Old Navy 37.9
Chase 37.8
Michael Kors 37.7
K-Mart 37.5
Lenovo 37.5
Equate 37.2
Hoover 36.8
Under Armour 36.6
Windows 36.5
Asics 36.5
Kitchenaid 36.5
Victoria's Secret 36.2
Mac 36.1
Reebok 36.1
Android 36
Direct TV 36
Sprint 36
Netflix 35.9
Adidas 35.7
Citizen 35.7
New Balance 35.6
Guess 35.4
Bic 35.2
Great Value 35.2
Pizza Hut 35
Puma 34.9
Asus 34.4
Fox 34.3
Justice 34.3
North Face 34.1
Xbox 33.6
Gap 33.4
Doritos 33.4
HTC 33.4
Converse 33.3
Sprite 33.2
Febreeze 33
Axe 33
Kay 32.7
Glad 32.7
Mary Kay 32.7
Viva 32.7
Reese's 31.8
Lego 31.7
Amazon Prime 31.5
Nintendo 31.2
Vans 31.2
Taco Bell 31
Fisher Price 30.4
Chanel 29.7
Old Spice 29.7
Playstation 29.4
Eagle 29.4
Hamilton Beach 29.3
Footlocker 29.3
Pink 29.3
Swiffer 29.3
Timberlands 29.3
Naked Juice 29
Youtube 29
Bing 29
Air Jordans 28.4
Huggies 28.2
Aeropostale 27.7
Hollister 27.3
Prada 27.3
Carters 26.8
Kirkland 26.3
Forever 26.3
Aeropostle 26.3
Arizona 25.6
Pampers 24.5
Versace 24.5
Urban Outfitters 24.5


A few interesting points from the longer list of brands are:

The oldest brand, “Maxwell House Coffee”, has an average age of 66. (If anything, this mean age is actually conservative, as the age question gets coded as 66 for anyone answering that they are “65 or older”). This is a typical technique in OdinText, choosing the mid-point to calculate the mean if the data are in numeric ranges, as is often the case with survey or customer entry form based data.
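The mid-point coding described above can be sketched in a few lines of Python. The bracket labels other than “65 or older” are hypothetical, since the post does not list the survey's actual age ranges, and the function names are mine rather than anything from OdinText.

```python
def code_age(label):
    """Map an age-bracket label to a single number for averaging."""
    if label == "65 or older":
        return 66.0  # open-ended top bracket, coded as 66 per the post
    low, high = label.split("-")
    return (int(low) + int(high)) / 2  # mid-point of a closed bracket


def average_age(labels):
    """Mean coded age of the respondents who mentioned a brand."""
    coded = [code_age(label) for label in labels]
    return sum(coded) / len(coded)


# Hypothetical respondents who all mentioned the same brand.
print(average_age(["55-64", "65 or older", "65 or older"]))
```

Because everyone in the top bracket is coded as 66, no brand can come out with an average above 66, which is why the mean for the oldest brands is best read as a floor.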

The youngest brand on the list, “Urban Outfitters”, with an average age of 24, probably also skews even younger in actuality for a similar reason (as is standard in studies representative of the US General Population, only adults aged 18+ are included in the research).

Dr Pepper is in the exact middle of our list (46 years old). Brands like Dr Pepper, which are in the middle (with an average age close to the upper range of Generation X), are of course popular not just among those 46 years old, but are likely to be popular across a wider range of ages. A good example is Coca-Cola, also near the middle: mentioned by 156 people with an average age of 45, it is pulling from both young and old. The most interesting thing then, as is usual in almost any research, is comparative analysis. Where is Pepsi relative to Coke, for instance? As you might suspect, Pepsi does skew younger, but only somewhat younger on average, mentioned by 107 consumers yielding an average for the brand of 43. As is the case with most data, relative differences are often more valuable than specific values.

If there are any high-level category trends here related to age, they seem to be that clothing brands like Urban Outfitters and Versace (both with the youngest average age of 24), Aeropostale (26), Forever 21 (ironically, with an average age of 26), and several others in the clothing retail category tend to skew very young. Snack foods, especially drinks like Arizona Iced Tea (age 25) and Naked Juice (29), as well as web properties (Bing and YouTube, both 29) and electronics (obviously PlayStation at 29, with the slightly older Nintendo at 31), are also associated with a younger demographic on average.

In the middle age group, other than products with a wide user base like major soda brands, anything related to the home, either entertainment like Time Warner Cable or even Hulu (both 45), or major retailers like Wegmans and Wal-Mart (also both 45), are likely to skew more middle age.

The scariest position for a brand manager is probably at the top of the list. With an average age of 66 for Maxwell House and Hunts, and 64 for Stouffers and Marie Callender's, the question has got to be: who will replace my customer base when they die? What we see in the data is in fact a slight negative correlation between age and number of mentions.

Again, it’s often the comparative differences that are interesting to look at, and of course the variance. Take Coca-Cola vs. Pepsi, for instance. While their mean ages are surprisingly close to each other at 45 and 43 respectively, looking at the variance associated with each gives us the spread (i.e. which brand is pulling from a broader demographic). Coca-Cola, with a standard deviation of 14.5 years, is pulling from a wider demographic than Pepsi, which has a standard deviation of 12.9 years. There are several ways to visualize these data and questions in OdinText, though some of our clients also like to use OdinText output in visualization software like Tableau, which can have more visualization options but little to no text analytics capabilities.
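To make the mean-versus-spread comparison concrete, here is a minimal Python sketch that computes mentions, mean age and standard deviation per brand from respondent-level records. The records are invented for illustration and are not the survey data, and the use of the sample standard deviation is an assumption, since the post does not say which variant OdinText reports.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical respondent-level records: (age, brands mentioned).
responses = [
    (22, ["Coca-Cola", "Nike"]),
    (35, ["Pepsi", "Coca-Cola"]),
    (48, ["Coca-Cola"]),
    (41, ["Pepsi"]),
    (67, ["Coca-Cola", "Pepsi"]),
]


def age_profile(records):
    """Return {brand: (mentions, mean age, sample std dev or None)}."""
    ages = defaultdict(list)
    for age, brands in records:
        for brand in set(brands):  # count each respondent once per brand
            ages[brand].append(age)
    return {
        brand: (len(a), mean(a), stdev(a) if len(a) > 1 else None)
        for brand, a in ages.items()
    }


for brand, (n, avg, sd) in sorted(age_profile(responses).items()):
    spread = "n/a" if sd is None else f"{sd:.1f}"
    print(f"{brand:10s} n={n} mean={avg:.1f} sd={spread}")
```

Two brands with nearly the same mean age can still differ in spread, and the spread is what indicates which brand draws from the broader demographic.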

Co-Occurrence (aka Market Basket Analysis)

Last but not least, it can also be interesting to look at which brands are often mentioned together, either because they are head-to-head competitors going after the exact same customers or because they may be complementary (market basket analysis type opportunities, if you will). Brands that co-occur frequently (are mentioned by the same customers) and are not competitors may in fact represent interesting opportunities for ‘co-opetition’. You may have noticed more cross-category partnering on advertising recently, as marketers seem to be catching on to the value of joining forces in this manner. Below is one such visualization created using OdinText, with just the Top 20 brand mentions visualized in an x-y plot using multi-dimensional scaling (MDS) to plot co-occurrence of brand names.
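The counting step behind a co-occurrence plot can be sketched as follows. This is a generic illustration in Python with invented responses, not a description of how OdinText computes co-occurrence, and it stops short of the MDS layout itself.

```python
from collections import Counter
from itertools import combinations

# Hypothetical coded responses: the set of brands each respondent named.
responses = [
    {"Apple", "Nike", "Samsung"},
    {"Apple", "Samsung", "Sony"},
    {"Nike", "Coca-Cola"},
    {"Apple", "Nike"},
]


def cooccurrence(brand_sets):
    """Count how many respondents mentioned each unordered pair of brands."""
    pairs = Counter()
    for brands in brand_sets:
        # Sorting makes ("Apple", "Nike") and ("Nike", "Apple") the same key.
        pairs.update(combinations(sorted(brands), 2))
    return pairs


for (a, b), n in cooccurrence(responses).most_common(3):
    print(f"{a} + {b}: {n} respondents")
```

To draw a plot like the one below, counts such as these would typically be turned into dissimilarities (pairs mentioned together more often are treated as closer) and handed to an MDS routine. That step is left out here.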

Text Analytics of Brands with OdinText

Hope you enjoyed today’s discussion of a very simple text question and what can be done with it in OdinText. Come back again soon, as we will be giving more tips and mini analyses on interesting mixed data. In fact, if there is significant interest in today’s post, we could look at one or two other variables and how they relate to brand awareness comment data tomorrow.

Of course if you aren’t already using OdinText, please feel free to request a demo here.


Text Analytics Tips

Text Analytics Tips, with your Hosts Tom & Gosia: Introductory Post

Today, we’re blogging to let you know about a new series of posts starting in January 2016 called ‘Text Analytics Tips’. This will be an ongoing series and our main goal is to help marketers understand text analytics better.

We realize Text Analytics is a subject with incredibly high awareness, yet sadly also a subject with many misconceptions.

The first generation of text analytics vendors overhyped the importance of sentiment as a tool, as well as ‘social media’ as a data source, often preferring to use the even vaguer term ‘Big Data’ (usually just referring to tweets). They offered no evidence of the value of either, and usually ignored the much richer techniques and sources of data for text analysis. Little to no information or training was offered on how to actually gain useful insights via text analytics.

What are some of the biggest misconceptions in text analytics?

  1. “Text Analytics is Qualitative Research”

FALSE – Text Analytics IS NOT qualitative. Text Analytics = Text Mining = Data Mining = Pattern Recognition = Math/Stats/Quant Research

  2. It’s Automatic (artificial intelligence), you just press a button and look at the report / word cloud

FALSE – Text Analytics is a powerful technique made possible thanks to tremendous processing power. It can be easy if using the right tool, but just like any other powerful analytical tool, it is limited by the quality of your data and the resourcefulness and skill of the analyst.

  3. Text Analytics is a Luxury (i.e. structured data analysis is of primary importance and unstructured data is an extra)

FALSE – Nothing could be further from the truth. In our experience, when text data is available, it almost always outperforms the standard available quant data in terms of explaining and/or predicting the outcome of interest!

There are several other text analytics misconceptions of course and we hope to cover many of them as well.

While various OdinText employees and clients may be posting in the ‘Text Analytics Tips’ series over time, Senior Data Scientist, Gosia, and our Founder, Tom, have volunteered to post on a more regular basis…well, not so much volunteered as drawing the shortest straw (our developers made it clear that “Engineers don’t do blog posts!”).

Kidding aside, we really value education at OdinText, and it is our goal to make sure OdinText users become proficient in text analytics.

Though Text Analytics, and OdinText in particular, are very powerful tools, we will aim to keep these posts light, fun yet interesting and insightful. If you’ve just started using OdinText or are interested in applied text analytics in general, these posts are certainly a good start for you.

During this long-running series we’ll be posting tips, interviews, and various fun short analyses. Please come back in January for our first post, which will deal with analysis of a very simple unstructured survey question.

Of course, if you’re interested in more info on OdinText, no need to wait, just fill out our short Request Info form.

Happy New Year!

Your friends @OdinText


[NOTE: Tom is Founder and CEO of OdinText Inc. A long time champion of text mining, in 2005 he founded Anderson Analytics LLC, the first consumer insights/marketing research consultancy focused on text analytics. He is a frequent speaker and data science guest lecturer at university and research industry events.

Gosia is a Senior Data Scientist at OdinText Inc. A PhD with extensive experience in content analytics, especially psychological content analysis (i.e. sentiment analysis and emotion in text), as well as predictive analytics using unstructured data, she is fluent in German, Polish and Spanish.]


Coca-Cola Social Media Team Chooses OdinText Text Analytics Software and Wins 2015 ARF Award

Coca-Cola Leverages OdinText analytics to develop new approach to social listening and wins ARF’s 2015 RE:Think Your Future “Make Your Mark” Award!

At the Advertising Research Foundation’s (ARF) annual RE:Think Conference this week?

Please join OdinText and The Coca-Cola Company this coming week at the 2015 Advertising Research Foundation’s annual RE:THINK event to learn more about the future of big data and text analytics.


On Monday in Sutton South, 2nd Floor at 3:30, OdinText Founder and CEO Tom H. C. Anderson will be joined by several other industry leaders in what promises to be a very interesting panel entitled “Applying Mobile Big Data at the Intersection of Content, Context & User Analytics”. As unstructured data is the key to the future of mobile, text analytics promises to be a key part of the discussion.



On Wednesday afternoon in the Grand Ballroom, The Coca-Cola Company’s Digital Anthropologist, Allison Barnes and Global Media Insights Director, Justin De Graaf, are being honored for their very interesting work and award winning paper entitled “Digital Anthropology: Researching Audiences Online”.



Allison and Justin will briefly discuss how Coca-Cola is leveraging OdinText to understand specific consumer groups, developing an innovative new approach that works across the Coca-Cola system: an approach that is scalable, utilizes existing tools, works across all markets and adapts to custom requirements.

“The new technique moves beyond word clouds [and other first generation social listening features] which represent the old way, and towards true analytics, which we’re getting from OdinText”

As Allison points out in their paper, great text analytics is critical for The Coca-Cola Company since “Social is a new avenue our consumers use to share their lives, and tapping in to those shared thoughts allows us to become more closely tied to the things that matter most to them. Our ability to continue innovating communications, products and happiness is at the heart of our future and harnessing social allows us to do exactly that”.

Congratulations to Justin, Allison and the rest of the social media team at Coca-Cola for a job well done!

We look forward to seeing you at The ARF Monday or Wednesday. And as always we welcome your questions about OdinText and text analytics anytime!

Selecting the Best Text Analytics Software

The Non-Dummies Guide for Selecting a Text Analytics (or any other) Partner

Text Analytics is a Process, Not an End!

What would you say should be the goal of good text analytics software?

Based on the questions we get from clients investigating text analytics solutions, there seems to be no small amount of confusion. The fault isn’t theirs; it’s the fault of the early text analytics and social media monitoring vendors who overpromised and underdelivered.

Rather than explaining to clients what kind of analysis and insights they should rightfully expect, these vendors chose to hide the fact that they knew very little themselves about how text analytics can and should actually be applied. Instead, most text analytics sales staff preferred to talk theoretically, using as many technical buzzwords like “natural language processing” as possible.

Here are questions you can safely set aside when investigating the right text analytics solution. They have next to no meaning whatsoever in terms of efficacy for your use case:

-How do you handle xyz stemming, semantic ABC, Ontologies and ______? [Insert other favorite buzz word you’ve heard but don’t really understand]

-What does the output look like, do you have a pretty dashboard? [If you buy text analytics software for pie charts and word clouds you’ll be in trouble. Dashboards, even if you find they make sense, need serious customization]

-Do you have a cool black sci-fi looking background with neon colored maps? [If you plan to put a bunch of monitors up and pretend you are on the bridge of the starship Enterprise I suppose this may make sense?!?!]

Instead, these kinds of questions are what you should be asking:

-Tell me about a client with the same kind of data that I have. How have they benefited from the tool? [They better be darn specific]

-Show me how it works with my own data!? [It’s easy to give a demo of poorly working software with canned data. Always make them use your data and never give them more than a day or two max to set it up]

Even better Text Analytics tools are becoming easier to use, and I admit, keeping OdinText intuitive as we add more features is challenging. However, one of the biggest single misconceptions about text analytics software is that it somehow has this magical “artificial intelligence” power, some sort of power to discern everything and automatically write the report for you. I’m really not exaggerating.

Text analytics is not an end, it is a process. Find a vendor who understands this and whose software is not a black box. Here, simple is better. If how the software does its coding is hidden in a black box, and the salesperson throws buzzwords at you to make you feel safe/confused about the fact that you have no idea how the sausage is made, it’s not because they have valuable “linguistic” or “machine learning” rules (more buzzwords); those can only be developed after carefully studying your own data. It’s because their software doesn’t actually work too well and will require a lot of expensive and time-consuming customization for unproven performance.

After choosing a text analytics software tool that is powerful and intuitive, a software that you can trust, then the fun begins. You or your analyst should be able to learn how to use the tool relatively quickly, but as with anything, you should expect to get better with experience.

Remember the early statistical software tools like SPSS and SAS. They worked very well on smaller data and you could trust that they actually did what you expected them to. However, you still needed to know what clustering and factor analysis were, and why to look at a mean vs. a median. Just like these tools, text analytics software also requires an analyst who can think about the data and how to get the most valuable insights for management.
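As a small illustration of why the choice between a mean and a median matters, consider the sketch below in Python. The ages are made up for the example.

```python
from statistics import mean, median

# Hypothetical ages of respondents mentioning one brand: mostly young,
# plus one much older respondent.
ages = [21, 22, 23, 24, 66]

print("mean:", mean(ages))
print("median:", median(ages))
```

A single much older respondent pulls the mean well above the age of the typical respondent in this group, while the median stays put. Knowing which of the two to report is exactly the kind of judgment the software cannot make for the analyst.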

Unfortunately, people who have never analyzed big data or conducted text analytics for real clients are building text analytics and “social listening” software. Find a vendor who understands your business. Their products will make you a data scientist. You’ll have to do a little more than press one button to understand the data, but since when has anything worthwhile been that easy?

To answer the question I posed earlier - what should be the goal of good text analytics software? – the answer depends on what field you’re in…

If you’re a marketer, then the main question you should be asking is how will this text analytics software help me sell more product to more customers less expensively?



[Full Disclosure: Tom H. C. Anderson is Managing Partner of Anderson Analytics, developers of patented Next Generation Text Analytics™ software platform OdinText. For more information and to inquire about software licensing visit ODINTEXT INFO REQUEST]

[Above also posted on the Next Gen Market Research blog]

What is Text Analytics?

Do You Know What Text Analytics Is? (Next Generation Text Analytics Explained)

[Anderson Analytics explains Text Analytics and the difference between First Generation approaches and Next Generation software OdinText]

While text analytics has been around for quite some time and has reached mainstream to the point of becoming a buzz word, few really know what it is. It’s not a word cloud. It is not a qualitative tool. IT IS data mining.

We’ve long felt the need to clear the confusion around text analytics in our industry. Surprisingly there aren’t really any good videos on the subject.

In making a video to explain what text analytics is, we first had to decide who our audience was. So many business videos these days are trying to reach such a broad audience that they become totally void of any real information. On the other hand, we didn’t want to make a geeky video just for ‘data scientists’ either.

We hope that the middle ground we chose to communicate to here, basically our core customer base (the consumer insights analyst/manager/research director), will provide the right level of detail within a reasonable amount of time, about 4 minutes (down from our original 7 minute version).

Thank you in advance for watching and sharing our video. Should you want to discuss text analytics in greater detail or have a question around OdinText specifically, please don’t hesitate to reach out or request a demo.

Your friends @OdinText




Text Analytics World Interview

The Future Directions for Text Analytics [Text Analytics World Pre Conference Interview with Tom H. C. Anderson, CEO of Anderson Analytics - OdinText and Jeremy Bentley CEO, Smart Logic. April 2, 2013 Q&A Reposted with permission from Text Analytics World]

We asked two leading text analytics experts, Tom Anderson of Anderson Analytics – OdinText and Jeremy Bentley of Smart Logic, what their take was on some possible future directions for the field. Their answers are shown below:

Tom Reamy: What do you see as the major trends in text analytics in the next year or two?

Tom Anderson: Realizing that customization is key. I think we’re only at the tip of the iceberg. It’s great that we’re starting to finally leverage all the data (CRM, Survey etc.) that we’ve spent so much time and money collecting and storing. But over the next two years I predict we’ll be using it in several other areas that are hard for us to foresee now.

Tom Reamy: What are the problems and issues that are slowing down the field?

Tom Anderson: The infatuation with “Social Media Monitoring”, which really is mainly “Twitter Monitoring”. Until the walled gardens around Facebook and LinkedIn data come down (I’ve been waiting and waiting), there really is limited usefulness in this area and we may be better off concentrating more of our efforts elsewhere. As clients start realizing they’re just listening to the 8% of the population on Twitter or blogs, who often are somewhat different from normal customers, they begin to question the ROI here.

The reason this can be problematic is that clients are so wrapped up thinking that they need to listen to “what people are saying about us on the Internet” that they don’t think about all the valuable data sources text analytics companies can help them with today.

For instance many are already paying a lot of money to field incoming customer calls and emails, storing this data, and yet don’t take the time to listen to what these very real customers are saying.

This in my opinion is hindering the advancement of text analytics in some ways. The focus needs to be broader.

Tom Reamy: What new technologies and developments in text analytics or related fields (predictive analytics, machine learning, artificial intelligence, etc.) do you see or want to see in the next year or two?

Tom Anderson: I think data visualization today is incredibly poor. I can’t believe many of our competitors in the text analytics field still offer simple “word clouds” as output.

Conversely, I think clients have to realize that data visualization techniques are generally best used as exploration tools, and not as a one-click export to a management-level PowerPoint slide.

There is currently an opportunity in finding the best ways to communicate insights from text analytics. Having powerful software and the right data is half the battle. But we also need more creative analysts who understand the respective business and data and who can communicate the findings effectively. This is more a shortage of good analysts with the time to use these tools than a need for additional technology.

Tom Reamy: Do you see any revolutionary changes for text analytics on the horizon?

Tom Anderson: Yes, what I’ve been talking about a lot is domain expertise. OdinText for instance is focused on the use of text analytics for consumer insights. That is a very different thing than using text analytics for engaging with twitters or detecting terrorists or fraud etc. All these require special knowledge, rule and code modification.

I think there will be fewer “Enterprise” as well as “Twitter Monitoring” firms, and a lot more domain- and industry-specific text analytics tools/firms.

Also this technology will be incorporated by most of the companies that own sizeable amounts of unstructured data. So there will be more licensing and acquisitions going on.

Tom Reamy: Is there anything else you would like to say about the future of text analytics?

Tom Anderson: I’m so glad I got into text analytics as early as I did. It’s still in its infancy, not in terms of what we can do with it already/the power, but in terms of adoption and creatively thinking about how to leverage it in different ways. Very exciting times ahead!


Tom Reamy: What do you see as the major trends in text analytics in the next year or two?

Jeremy Bentley: To borrow from Big Data parlance – Velocity, Volume and Variety mean text analytics in real time over a lot of it, in different formats and from different places. Content Intelligence (which includes text analytics) brings structure to unstructured information so it can be joined with the data world. Data tells you what happened, and content tells you why. Associating the what with the why is the major requirement for organizations that protect, value and make money from their information.

Tom Reamy: What are the problems and issues that are slowing down the field?

Jeremy Bentley: The reality check that content is not clean, properly managed or sufficiently findable today. Information overload (the often cited big issue) is nothing but a filter problem – the problem is that the filter parameters are not present in the current information management systems of CMS, ERDMS and search engines. That will remain the case until the gritty and unglamorous task of metadata management, and the automatic application of whatever metadata is needed for a particular view of the content at any particular point in time, is recognized as necessary. Once that is addressed, content becomes processable and valuable.

Tom Reamy: What new technologies and developments in text analytics or related fields (predictive analytics, machine learning, artificial intelligence, etc.) do you see or want to see in the next year or two?

Jeremy Bentley: There is a balance to be drawn between what is fully automatic and what requires some human oversight. Classification and text analysis should be fully automatic; the methods and rules used to drive the analysis should be subject to user oversight. Machine learning and AI have a role to play in the latter: as software becomes more sophisticated, the effort needed to achieve quality analytics and metadata derivation will go down.

Tom Reamy: Do you see any revolutionary changes for text analytics on the horizon?

Jeremy Bentley: Most users see text analytics as pretty cutting edge as it is, so to answer this question we have to widen it from Text to Content, in all of its forms, to see where the revolution comes.

Content Intelligence for Big Data will revolutionize how organizations use their information to gain insight and competitive advantage. This is already happening in forward-thinking enterprises, and increasingly it will not just be the larger organizations that benefit from such an approach.

Tom Reamy: Is there anything else you would like to say about the future of text analytics?

Jeremy Bentley: Being able to process content as we do data in a database will seem standard in a decade’s time.

What NASA and OdinText Have in Common

Word Clouds Don't Fly at NASA Either

This year I participated on the closing panel at the Text Analytics Summit in Boston. The most interesting presentation at the event was one given by NASA's Ashok Srivastava on how they are using text analytics to make air travel safer in the US, and in fact around the world, by monitoring what pilots and ground staff report in regard to various safety and mechanical issues.

Using text analytics in this way is obviously a bit different than how we at Anderson Analytics use our software OdinText. Understanding how different consumers view products and the problems these groups experience requires a different approach with different risk tolerances. While I believe understanding VOC to create the best products and services is very important, imagine the complexity and inherent danger involved in the system NASA analyzes.

Text Analytics Objective: Make Air Traffic Safer

[24 Hour System View]

After the presentation I spoke to Ashok and found it interesting that, while we obviously have different objectives and data and therefore approach text analytics somewhat differently, one of the several areas that our approaches have in common is data visualization. It has always surprised me how many 'text analytics' firms use simplistic word clouds as output; it is one of my many pet peeves in our industry. I wasn't surprised that NASA didn't use them either. Instead I found that we both include correlations of unstructured data and handle these with very similar statistical/visualization techniques for insights far more meaningful than word clouds.
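To make the contrast with word clouds concrete, here is a minimal Python sketch of one way to correlate unstructured text with a structured variable. It is an illustration only, not OdinText's or NASA's actual method: the comments, the satisfaction scores, and the helper names `pearson` and `term_correlations` are all hypothetical. Each term is turned into a present/not-present indicator per comment, and that indicator is correlated with the score.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A term that appears in every comment or in none has no variance.
        return 0.0
    return cov / sqrt(var_x * var_y)


def term_correlations(comments, scores, terms):
    """Correlate each term's presence (1/0 per comment) with a numeric score."""
    result = {}
    for term in terms:
        present = [1 if term in c.lower().split() else 0 for c in comments]
        result[term] = pearson(present, scores)
    return result


# Hypothetical data: four comments with made-up satisfaction scores.
comments = [
    "battery died fast",
    "love the screen",
    "battery is weak",
    "great screen and price",
]
scores = [2, 9, 3, 8]

corr = term_correlations(comments, scores, ["battery", "screen"])
for term, r in corr.items():
    print(f"{term}: {r:+.2f}")
```

A word cloud would show only that "battery" and "screen" are mentioned equally often; the sign of the correlation shows which term goes with low scores and which with high ones.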

Ahead of Text Analytics Summit West in San Francisco next month, Text Analytics News contacted us both for an interview on how we think text analytics and big data can become a default business solution. Obviously the ROI needs to be better understood by many companies before even greater adoption takes place. In the private sector, this means there is still a first mover advantage, and firms who implement text analytics early can gain an information advantage over their competition.

Again, industry and domain specifics aside, I was pleased to see we are in agreement on many of these issues as well. You can read the Q&A on the Text Analytics News website here.

@TomHCAnderson @OdinText

PS. I understand Ashok will be speaking at Text Analytics Summit West as well. You can get a glance at his presentation here: