WHEN IS SOCIAL MEDIA MINING GOOD ENOUGH? OR Help! I think I might be a scientist

Page 1

WHEN IS SOCIAL MEDIA MINING GOOD ENOUGH? OR HELP! I THINK I MIGHT BE A SCIENTIST.

Nick Buckley
Social Media Director, GfK NOP

Page 2

1. What are we talking about?

Page 3

Definition* of social media monitoring: “Social Media Monitoring (SMM) means the identification, observation, and analysis of user-generated social media content for the purpose of market research.”

What exactly are we talking about?

What they say:

• Newsgroups
• Public Communities
• Video sites
• Review sites – Professional & Consumer
• Blogs/Microblogs
• Forums
• Client sites
• News sites

* http://www.social-media-monitoring.org

Page 4

What was that 2.0 thing again?

[Figure panels: “Before the rise of the internet” and “Web 2.0”]

The “era of shout marketing” is over*:

* Marshall, 2012

Eh?

Page 5

Web Mining, Social Media Monitoring or Social Media Mining?

I like “Mining”. User generated content in social media lays down a rich seam of activity, opinion, thought and information… mess, echoes and ‘whimsy’.

For some time marketing and PR professionals have been monitoring Social Media to capture headline ‘buzz’ in real time, and to detect sudden changes requiring a response.

But collecting and counting this content is only the beginning of a process which can add value via many techniques… including integration with other sources such as market research data.

Page 6

Rapid supply-side evolution. What has driven it?

For the original PR and Marketing Users…

• Boring outputs – flat-lining “buzz share”
• Commoditisation [seeming] of the core process by technology newcomers
• Differentiation by interface… the “Dashboard” – to emphasise use-cases
• Making user self-service easier – for all kinds of reasons
• Increasingly sophisticated users… looking for outputs suggestive of insights
• The ‘social CRM’ branch

http://blog.glennz.com/evolution/

Page 7

2. What happens when Market Researchers get hold of it?

Page 8

Sony brand damage was driven by PlayStation breach (2011)

[Charts: Sony buzz this year; Sony sentiment this year; Sony buzz in April; Sony sentiment in April; PlayStation buzz; PlayStation sentiment]

Page 9

Market Researchers believe that SMM can also give clients a window on other dimensions of online conversations

SMM provides insights into:

• Category Dynamics
  - Consumer needs
  - Problems and issues consumers discuss
  - Product usage discussions
  - New product entries

• Corporate
  - Corporate mentions related to reputation
  - Crises
  - Social issues

• Brand
  - Brand/sub-brand mentions, brand “buzz”
  - Number of positive vs. negative sentiments for each brand
  - Brand content analysis, what’s being said about the brand
  - Advertising noticed most and related discussion
  - Source of mentions (specific sites) and the most influential sites

• Competition
  - All the above for competition

Page 10

© 2012 GfK NOP

Inevitably they think about comparison with surveys…

Strengths
• Very immediate
• Unconditioned by participant awareness of a research process
• Often more emotive than considered survey responses
• Spontaneously generated content – unconstrained by research frame
• Offers insight into active social media users
• Potentially global
• You can ‘ask a new question’ without having to issue a new questionnaire*
• Low cost – under certain circumstances

Weaknesses
• Not necessarily representative of the general population
• Difficult to weight back to general population, as demographic data is sparse
• Automated sentiment analysis only as good as the algorithms [and these vary greatly]
• Automated harvesting can capture a lot of ‘noise’ for certain words or brands
• No guarantee of sufficient data
• Costs rise when we use supplementary analysis to overcome some of these issues

*within certain technical limitations

Page 11

Different client needs indicate different SMM approaches
For example – Precision Extraction vs ‘Trawl & Filter’

• Crude mention & mood tracking
• Quantitative – Brand tracking and integration with traditional research
• Indicative Qual – e.g. using trends and volumes to guide focus of analysis
• Exploratory Qual – more complex collection. Manually manageable volumes and ‘tuning’

Higher data volumes from simple search terms

Lower data volumes from targeted & compound search terms

More post processing, applied to data by MR agency - to reduce noise and refine sentiment attribution

Accept raw data output from application

Page 12

3. Too Abstract?

Page 13

The raw material - Results from search terms

SMM applications extract results from wholesale supplies of data, conducting searches defined by “search terms”

• These can be anything from a simple and distinctive brand or product name, to a complex expression configured to capture discussions about a category or concept.

• A search term combines words or phrases via logical instructions such as AND, OR, NOT. They may also employ functions such as WITHIN to detect words in a certain proximity to each other. Finally – just as in mathematical equations – brackets can dictate the sequence in which the instructions are applied, e.g.

• “word1” AND ( “word2” OR “word3” )
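As a rough illustration of how such an expression behaves, the sketch below evaluates the example term against a few made-up posts. The helper names (`tokens`, `contains`, `within`, `matches`) and the sample posts are hypothetical; real SMM applications use their own query syntax, and the exact meaning of WITHIN varies between them.

```python
import re

def tokens(text):
    """Lower-case word tokens of a post."""
    return re.findall(r"[a-z0-9']+", text.lower())

def contains(text, word):
    """True if the word appears as a whole token in the text."""
    return word.lower() in tokens(text)

def within(text, word_a, word_b, distance):
    """True if word_a and word_b occur at most `distance` tokens apart."""
    toks = tokens(text)
    pos_a = [i for i, t in enumerate(toks) if t == word_a.lower()]
    pos_b = [i for i, t in enumerate(toks) if t == word_b.lower()]
    return any(abs(a - b) <= distance for a in pos_a for b in pos_b)

def matches(text):
    """The example term: "word1" AND ( "word2" OR "word3" )."""
    # The brackets mean the OR is resolved before the AND.
    return contains(text, "word1") and (
        contains(text, "word2") or contains(text, "word3")
    )

# Hypothetical posts, for illustration only.
posts = [
    "word1 appears here together with word3",
    "only word2 and word3 appear here",
    "word1 on its own",
]
hits = [p for p in posts if matches(p)]
```

Only the first post satisfies the term: the second lacks “word1”, and the third has neither “word2” nor “word3”.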

Page 14

A typical SMM application offers a dashboard view of data returned by these search terms – and the facility to export the underlying data

Page 15

Analyses

Whatever the Search Terms define – here is what can be measured about the results returned… in combination or in isolation

Volume – “how much is it talked about, and how is this changing over time”

Channels – “where on the web is it being talked about… Twitter, blogs, forums, comments?”

Location – “where in the world is it being talked about?”

Themes – “what other words and phrases are most regularly associated with it?”

People – “who is talking about it?” That may be by influence – according to various proprietary indices – or by demographics [to be used with caution]

Sentiment: Across all of these variables is superimposed automatically generated “Sentiment” analysis – positive, negative or neutral language associated with the subject of the posts…

Verbatims - drill-down to individual posts, in their own words – “what do people actually say?”
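To make these measures concrete, here is a minimal sketch of how several of them could be computed from an exported set of posts. The field names (`date`, `channel`, `sentiment`, `text`), the brand “brandx” and the posts themselves are invented for illustration and do not reflect any particular vendor’s export format.

```python
from collections import Counter

# Hypothetical exported posts; the schema is an assumption, not a vendor format.
posts = [
    {"date": "2012-04-20", "channel": "twitter", "sentiment": "negative",
     "text": "brandx outage again"},
    {"date": "2012-04-20", "channel": "forum", "sentiment": "neutral",
     "text": "brandx outage thread"},
    {"date": "2012-04-21", "channel": "twitter", "sentiment": "negative",
     "text": "still no brandx service"},
    {"date": "2012-04-21", "channel": "blog", "sentiment": "positive",
     "text": "brandx support was helpful"},
]

# Volume: how much it is talked about, per day.
volume_by_day = Counter(p["date"] for p in posts)

# Channels: where on the web it is being talked about.
volume_by_channel = Counter(p["channel"] for p in posts)

# Sentiment: the automatically assigned label, taken at face value here.
sentiment_mix = Counter(p["sentiment"] for p in posts)

# Themes: other words most often associated with the search word.
search_words = {"brandx"}
themes = Counter(
    word
    for p in posts
    for word in p["text"].lower().split()
    if word not in search_words
)

# Verbatims: drill down to the individual negative posts.
verbatims = [p["text"] for p in posts if p["sentiment"] == "negative"]
```

A real analysis would also strip stop words before counting themes; that step is left out to keep the sketch short.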

Page 16

Examples of outcomes from SMM studies

FINDING: Focus on the right social media channels at the right time. A manufacturer used a video from a high profile pop star to drive a major campaign. Predictably, when aired, the video generated a ‘spike’ of twitter activity. BUT – looking back down the timeline showed there had also been a burst of activity on forums, and some blogs, from fans of the artist when the video was being shot.

FINDING: Differentiate ‘trade press’ buzz from real engagement. A manufacturer used a novel approach, through Facebook, to support advice and collaboration between users of its product. This appeared to have some success in stimulating social media conversations about the product. However – deeper scrutiny revealed that this traffic was almost exclusively blogging by sector and marketing industry press, attracted by the novel approach, with further blog, forum and link-tweeting activity amongst sector insiders and social media enthusiasts.

Page 17

Examples of outcomes from SMM studies (2)

FINDING: Consumers don’t always talk about the product features that you highlight. Analysis of conversations about a newly launched electronics product revealed that the functional features most discussed [particularly those with largely positive sentiment attached] were not those which the manufacturer had chosen to highlight. Subsequent marketing was able to adjust to take account of these ‘more loved’ features.

FINDING: ‘The world’ can sometimes throw up more interesting stories about you than you could hope to generate for yourself… but not always with the connotations you would like. An automotive manufacturer which had enjoyed modest online buzz as a result of its own sponsorship activities experienced a ‘spike’ in online mentions which was 10 times the size – as a result of a much repeated witty comment. A high profile celebrity had appeared on TV news being interviewed from the driver’s seat of one of their vehicles. The comment – linking the celebrity to a negative ‘folk image’ of the vehicle – spread rapidly across a range of social media channels. The moral is that spontaneous, and genuinely social, media can currently still outperform marketers.

Page 18

BUT!

Page 19

There are many forces* which erode this nice model…

Accuracy?

Reach?

Relevance?

Reach image from titletrack.com

Page 20

Accuracy

Is the searched-for phrase even in the returned “snippet”?

Is it ‘content’ – or is it
• Navigation?
• Ticker or title content?
• Ad Content?
• Various species of spam [overlaps with ‘Relevance’]?

Is meta-data about the poster
• Present?
• Reliable?

Understanding this, apart from making your own manual checks, is about understanding your third party suppliers’ processes and content and, often, that of their ‘wholesale data suppliers’ – each of which may differ from the others.
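Part of the manual checking can be mechanised. The sketch below counts how many returned snippets actually contain the searched-for phrase and how many carry any poster meta-data. The record layout (`snippet`, `author`) and the sample records are assumptions made for illustration; whether the meta-data is reliable, or whether a snippet is navigation or spam, still needs a human eye.

```python
def audit(results, phrase):
    """Count snippets that contain the phrase and records that name a poster."""
    needle = phrase.lower()
    return {
        "total": len(results),
        "phrase_present": sum(1 for r in results if needle in r["snippet"].lower()),
        "author_present": sum(1 for r in results if r.get("author")),
    }

# Hypothetical results as they might come back for the term "brandx".
results = [
    {"snippet": "Loving my new BrandX phone", "author": "user_a"},
    {"snippet": "Home | Products | Contact us", "author": None},
    {"snippet": "BrandX BrandX BrandX cheap deals click here", "author": ""},
]
report = audit(results, "brandx")
```

Here the second record is navigation text with no mention of the phrase, and the third contains the phrase but is plainly spam, which this simple check cannot tell apart from genuine content.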

Page 21

Reach

[T]here are known knowns; there are things we know that we know.
There are known unknowns; that is to say there are things that, we now know we don't know.
But there are also unknown unknowns – there are things we do not know, we don't know.

Donald Rumsfeld

• Are these results from scrutiny of the entire [English speaking] social web? No
• Are they results from a very large, sometimes stated, number of social sources? Yes
• Could this range be skewed relative to the subject under scrutiny? Yes
• Where it’s Twitter data – is it from the whole of Twitter? Maybe
• Is historical data always on the same basis as current data, or data gathered since the search was defined? Not always
• Do we always have a good idea of what the ‘Reach’ is? No

Page 22

Relevance

Even when the application has collected exactly what we asked for, and it is legitimate content, with some nice useful data about the poster… it might not be relevant

“Cats are great company.”

“#EMT Bolt one cool cat!”

“Also, the Cat is a great resort”

“I love my aunt Cat!”

“I think Cat Stark is worse than any Lanister.”

“I think this hurricane was a scam cooked up by the fat cats in Big Grocer.”
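A crude first pass at this problem can be sketched with a couple of hand-written rules, applied here to the six example posts. The idiom list and the capitalisation rule are assumptions chosen to fit these examples, not a general solution; note that the first post passes simply because no rule fires, and that a study of, say, a brand called Cat would need the rules turned the other way round.

```python
import re

# Hypothetical list of figurative phrases; a real list would be much longer.
IDIOMS = ("fat cat", "cool cat")

def looks_like_name(text):
    """A capitalised 'Cat' that does not open the post suggests a proper noun."""
    return any(m.start() > 0 for m in re.finditer(r"\bCat\b", text))

def relevance_flag(text):
    """Label a post as an idiom, a possible name, or worth keeping."""
    lowered = text.lower()
    if any(idiom in lowered for idiom in IDIOMS):
        return "idiom"
    if looks_like_name(text):
        return "possible name"
    return "keep"

posts = [
    "Cats are great company.",
    "#EMT Bolt one cool cat!",
    "Also, the Cat is a great resort",
    "I love my aunt Cat!",
    "I think Cat Stark is worse than any Lanister.",
    "I think this hurricane was a scam cooked up by the fat cats in Big Grocer.",
]
flags = [relevance_flag(p) for p in posts]
```

Rules like these are cheap to write but brittle, which is one reason costs rise once relevance has to be checked properly.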

Page 23

Challenges include

However , commencing too early public smoking facts will just overstress your pet ; quite a fresh pet will not learn everything from services. Just after he has ended up perched for some a few moments, supply him with the particular take care of, plus for instance in advance of, make sure you compliment the pup. When dog house teaching your dog, continue to keep the dog house in the vicinity of the spot where you as well as the canine are usually conversing.

Page 24

And I haven’t mentioned automated Sentiment Analysis yet!

Irony – really?

Slang/Dialect/Register

Multiple meanings – “50 strong”

Adjacent subjects – “My beautiful FIAT next to a BMW”
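The “adjacent subjects” problem is easy to reproduce with a deliberately naive scorer. The sketch below is not how any particular SMM tool works; the word lists and the function `naive_sentiment` are hypothetical and exist only to show the failure mode.

```python
# Tiny illustrative lexicons; real tools use far larger and weighted lists.
POSITIVE = {"beautiful", "love", "great"}
NEGATIVE = {"ugly", "hate", "awful"}

def naive_sentiment(text, brand):
    """Score a post for a brand by counting sentiment words anywhere in it."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    if brand.lower() not in words:
        return None  # brand not mentioned in this post
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

post = "My beautiful FIAT next to a BMW"
fiat = naive_sentiment(post, "FIAT")
bmw = naive_sentiment(post, "BMW")
```

Both brands come out as “positive”, although “beautiful” describes only the FIAT. Attributing the adjective to the right subject needs some grasp of sentence structure, which is where NLP enters the picture.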

Page 25

4. And what is Good, and what is not Good?

Page 26

To Recap

• SMM tools make it very easy to “Super Google” certain Brands, people, objects and even categories or concepts – quickly generating convincing-looking tables and charts.

• But underneath there’s a complex story about accuracy, reach and relevance… which becomes apparent on scrutiny of drilled-down text samples – and can only fully be understood by getting inside the provider’s systems and sources.

• It doesn’t mean they are misleading users – it just means that they started out somewhere else.

• The conclusion is that you have to carefully consider use cases, or build your own better mouse trap, or wait for proprietary solutions to get better at certain things

• Sentiment analysis is part of this story – but doesn’t define it.

Page 27

Natural Language Processing [NLP] to the rescue?

Definition

“Specifically, it is the process of a computer extracting meaningful information from natural language input and/or producing natural language output”*

Most SMM applications claim some level of NLP. Whilst this may be legitimately contrasted with simple vocabulary, combination and probabilistic methods, it can end up meaning little. It may only mean that some rules of language have been ‘attended to’ in what is still essentially a pattern-matching exercise.

*Warschauer, M., & Healey, D. (1998). Computers and language learning: An overview

Page 28

But clearly sophisticated NLP would make a big difference

• Improved Accuracy – including filtering out of unstructured spam

• More tools available to achieve/check Relevance

• Much-improved Sentiment Analysis

Some commercial tools have become available in the last 12 months which offer an assessment of their confidence in their own NLP analysis – dividing snippets into those with Low, Medium and High confidence.

Significantly, ‘High’ is a minority of the output.
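If a tool exposes a confidence score per snippet, the share falling into each band is simple to work out. The sketch below assumes a score between 0 and 1 and invents both the band thresholds and the scores; how real tools derive and band their confidence is not described here.

```python
from collections import Counter

def band(confidence, low=0.5, high=0.8):
    """Map a 0-1 confidence score to a band; the thresholds are assumptions."""
    if confidence >= high:
        return "High"
    if confidence >= low:
        return "Medium"
    return "Low"

# Hypothetical (snippet, confidence) pairs, for illustration only.
scored = [
    ("snippet a", 0.92),
    ("snippet b", 0.61),
    ("snippet c", 0.35),
    ("snippet d", 0.55),
    ("snippet e", 0.20),
]
counts = Counter(band(c) for _, c in scored)
high_share = counts["High"] / len(scored)
```

Reporting `high_share` alongside any sentiment chart makes it plain how much of the output the tool itself is prepared to stand behind.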

Page 29

Barking up the wrong Tree?

The recap assumes that the Market Researcher’s instinct is correct… to make the fuzzy working of the social web itself… the collection mechanisms and enterprises, and the analytical engines… into a familiar data collection process, somehow isomorphic with surveys.

But “what is good” is, as many of the ancient philosophers would tell us, about function and purpose.

I think we’ve now learned enough,

• and experienced enough un-straightforwardness

• and contemplated enough need for manual evaluation or augmentation - dispelling the notion that this is a self-evident labour saving device along the way…

to stop and ask, “what was it we were trying to do?”

Page 30

To Recap

• SMM tools make it very easy to “Super Google” certain Brands, people, objects and even categories or concepts – quickly generating convincing-looking tables and charts.

• But underneath there’s a complex story about accuracy, reach and relevance… which becomes apparent on scrutiny of drilled-down text samples – and can only fully be understood by getting inside the provider’s systems and sources.

• It doesn’t mean they are misleading users – it just means that they started out somewhere else.

• The conclusion is that you have to carefully consider use cases, or build your own better mouse trap, or wait for proprietary solutions to get better at certain things

• Sentiment analysis is part of this story – but doesn’t define it.

Page 31

What are we trying to do?

• Use the social web as a proxy for the population?

• Understand how the social web is responding – for the benefit of those solely interested in this sub-set of the population as a channel or marketplace?

• Access particular niches which are more concentrated online than off?

• Detect significant events?

• Measure shifts and changes?

• Make rough comparisons?

• Discover new insights, themes and connections?

Page 32

How useful is extracted Social Media content?

Mechanically extracted content is inevitably imperfect as regards:
• relevance
• comprehensiveness relative to ‘total web’
• accuracy of classification, sentiment etc.
• representativeness of general population

In general web mining is therefore useful for:
• relative measures
• measuring and detecting change or discontinuity
• iterative discovery of related concepts and drivers
• comparing channels
• matching to events and schedules

It’s important to know when this matters, and how much. It is vital to work honestly with the constraints and exploit the strengths…

…and, of course, integration with other sources of data.
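Detecting change or discontinuity is a relative measure, so it tolerates much of the imperfection listed above as long as the noise is roughly constant. One simple way to flag a discontinuity is to compare each day’s volume with a trailing window, as sketched below. The window length, the threshold and the daily counts are all invented for illustration; this is one common approach, not a method prescribed here.

```python
from statistics import mean, stdev

def spikes(daily_counts, window=7, threshold=3.0):
    """Indices of days whose count sits more than `threshold` standard
    deviations above the mean of the preceding `window` days."""
    flagged = []
    for i in range(window, len(daily_counts)):
        history = daily_counts[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and (daily_counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# Hypothetical daily mention counts with a burst on the tenth day (index 9).
counts = [100, 104, 98, 101, 97, 103, 99, 102, 100, 450, 300, 120]
flagged = spikes(counts)
```

Only the first day of the burst is flagged: once the spike enters the trailing window it inflates the baseline, so the days that follow no longer stand out. The flagged dates can then be matched to events and schedules.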

Page 33

Different client needs indicate different SMM approaches
For example – Precision Extraction vs ‘Trawl & Filter’

• Crude mention & mood tracking
• Quantitative – Brand tracking and integration with traditional research
• Indicative Qual – e.g. using trends and volumes to guide focus of analysis
• Exploratory Qual – more complex collection. Manually manageable volumes and ‘tuning’

Higher data volumes from simple search terms

Lower data volumes from targeted & compound search terms

More post processing, applied to data by MR agency - to reduce noise and refine sentiment attribution

Accept raw data output from application

Not radical enough!

Too much like hard work

Sensible

Page 34

Rather than wait for NLP utopia…

Settle for:

1. SMM as a powerful and novel Qual exploration tool

2. Do big number crunching on brands but take a “hyena” approach. Accept all* occurrences of a brand or product name in posts as an indication of significance… even the spam and the adverts and the competitions

Similarly look for pure correlations between words/phrases and other words/phrases

Or between trends in these numbers and classes of offline events – such as sales, complaints and other behaviours… with a view to predicting, explaining or causing such events in the future.

*Except for the most obvious duplication errors such as over-indexing
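The correlation idea can be sketched in a few lines. The example below computes a Pearson correlation between weekly mention counts and weekly sales; both series are made up purely to show the mechanics, and the function name `pearson` is our own. A real study would also need to look at time lags, and a high correlation on its own says nothing about cause.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical weekly figures, for illustration only.
weekly_mentions = [120, 150, 90, 200, 170, 130]  # all occurrences, spam included
weekly_sales = [1000, 1150, 900, 1400, 1250, 1050]

r = pearson(weekly_mentions, weekly_sales)
```

With these invented figures `r` comes out close to 1, which is the kind of signal the “hyena” approach looks for before asking what lies behind it.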

Page 35

5. Some Conclusions

Page 36

I am not a scientist

OK – I’m a scientist amongst researchers, and possibly amongst programmers

But amongst scientists – and text analysis specialists – I’m a mere researcher.

Because I couldn’t use these tools “as is” with confidence I had to start delving…

… and delving is time consuming in a commercial environment.

Our technology suppliers have become more like partners… increasingly transparent as they’ve understood, but not challenged, what we tried to do. The software and services will now adapt to us – whether they should or not.

PR monitors, real time trackers and ‘social CRM’ folks will carry on using the tools the same way they always have… and may even benefit from changes my industry has now initiated.

Page 37

But

How will commercial SMM applications and services with the best accuracy, reach and relevance capabilities be recognised, validated and promoted?

Is the ‘bit in the middle’ just a holy grail until such time as the NLP part of the reckoning makes a step change – driven by all its other exploitations, such as ordinary language driven IT interfaces?

If you’re a researcher and you want to use this stuff tomorrow… what must be done?

Fortunately – there’s enough to learn by “super-googling”, browsing and crude trend tracking to keep us going… and learning… for some time to come.

Page 38

Page 39

Dr Nick Buckley

Social Media Director

GfK NOP

M: 07958 516967 T: @grimbold

E: [email protected]

[from August 2012. E: [email protected]]