Text analytics for Google Spreadsheets using the dataTXT add-on

DESCRIPTION

This add-on lets Google Spreadsheet users enrich the textual content of their spreadsheets by automatically extracting named entities (such as places, persons, events or concepts) and linking them to Wikipedia through the dataTXT semantic API.

Doing text analysis inside Google Spreadsheets, using the dataTXT add-on

food for thought

wait, wait: what’s text analysis?

turn text into data for analysis

why it’s useful

• Enterprise Business Intelligence/Data Mining, Competitive Intelligence
• E-Discovery, Records Management
• National Security/Intelligence
• Scientific discovery, especially Life Sciences
• Sentiment Analysis Tools, Listening Platforms
• Natural Language/Semantic Toolkit or Service
• Publishing
• Automated ad placement
• Search/Information Access
• Social media monitoring

#textanalysis #dataTXT #gdrive

usually you have to be a developer, but now you can do a lot directly inside a Google Spreadsheet, thanks to the dataTXT add-on

http://bit.ly/dataTXT-googleSheets

but why is it useful?

infographics, tag clouds, mind maps, graphs, charts…

let’s start from an example…

extract useful information from a news article published on CNN:

http://edition.cnn.com/2014/09/10/world/rosetta-philae-landing-site/index.html?hpt=hp_t3

copy & paste this text into a Google Sheet…

this is just text: we call it “unstructured data”

if we select the cell, launch the dataTXT add-on, and click “Analyze text”…

… we are performing named entity extraction with the dataTXT-NEX API, inside the Google Sheet

now we find something new: a sheet titled “Analysis” with a lot of useful stuff…

TEXT -> the original content

SPOT -> the label of an “entity”, taken from the original text

CONFIDENCE -> a quality score for the match

WIKIPEDIA URL -> the URL of the entity on Wikipedia

TYPES -> the types of the entity, extracted from DBpedia

CATEGORIES -> extracted from DBpedia, useful as tags
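
For readers who later want to script on top of the “Analysis” sheet, one row can be modelled as a small record. This is only a sketch: the field names mirror the column headers listed here, `AnalysisRow` and `keep_confident` are hypothetical names, and the sample values (including the 0.6 threshold) are illustrative, not real add-on output.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AnalysisRow:
    """One row of the "Analysis" sheet (hypothetical model, not the add-on's API)."""
    text: str                # TEXT: the original content
    spot: str                # SPOT: entity label as found in the text
    confidence: float        # CONFIDENCE: quality score of the match
    wikipedia_url: str       # WIKIPEDIA URL: entity page on Wikipedia
    types: List[str] = field(default_factory=list)       # TYPES (from DBpedia)
    categories: List[str] = field(default_factory=list)  # CATEGORIES (from DBpedia)


def keep_confident(rows, threshold=0.6):
    """Return only the rows whose confidence is at or above the threshold."""
    return [row for row in rows if row.confidence >= threshold]


# Illustrative values only, invented for this sketch.
rows = [
    AnalysisRow(
        text="(article text)",
        spot="67P/Churyumov-Gerasimenko",
        confidence=0.9,
        wikipedia_url="https://en.wikipedia.org/wiki/67P/Churyumov%E2%80%93Gerasimenko",
        types=["CelestialBody"],
        categories=["Comets"],
    ),
    AnalysisRow(
        text="(article text)",
        spot="site",
        confidence=0.3,
        wikipedia_url="https://en.wikipedia.org/wiki/Website",
    ),
]

confident = keep_confident(rows)
print([row.spot for row in confident])
```

Filtering on CONFIDENCE first is a simple way to drop weak matches before building anything on top of the data.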

so why is it useful?

before: it’s only text

now it’s contextual data

in other words:

the text “67P/Churyumov-Gerasimenko” now has some structured details, like “categories”, a very useful kind of tag set

you can do a lot of things with dataTXT add-on for Google Sheets

make a tag cloud using concept labels (typed concepts)

extract persons cited in a lot of content

build graphs or charts using the types found inside the content

extract some data from a lot of tweets (useful for social media consultants working with not-so-big data)

find useful keywords to enrich your content (a better SEO?)

enrich your content with useful links to contextual Wikipedia pages

and all of this without programming :) and inside your own Google Spreadsheet!
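
None of this requires code, but if you do want to go one step further, two of the ideas above (tag-cloud weights and a list of persons) reduce to simple counting and filtering. The sketch below assumes each row has been reduced to a `(spot, types, categories)` tuple; the helper names and the sample rows are hypothetical and purely illustrative.

```python
from collections import Counter

# Illustrative rows only: (SPOT, TYPES, CATEGORIES), invented for this sketch.
rows = [
    ("67P/Churyumov-Gerasimenko", ["Place"], ["Comets", "Rosetta mission"]),
    ("Rosetta", ["Spacecraft"], ["Rosetta mission"]),
    ("Person A", ["Person"], ["Scientists"]),
]


def tag_weights(rows):
    """Count how often each category appears; the counts can size a tag cloud."""
    counts = Counter()
    for _spot, _types, categories in rows:
        counts.update(categories)
    return counts


def spots_of_type(rows, wanted):
    """Return the distinct entity labels whose types include `wanted`, sorted."""
    return sorted({spot for spot, types, _cats in rows if wanted in types})


weights = tag_weights(rows)
persons = spots_of_type(rows, "Person")
print(weights.most_common(1), persons)
```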

democratizing text analytics!

and if you are a smart guy, or a data journalist for example, you can do something better…

use your Google Spreadsheet as a little database, to build smart interactive web pages

Google Spreadsheet (unstructured data) + dataTXT -> Google Spreadsheet (structured data)
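
One possible way to feed a web page from the sheet, assuming the “Analysis” sheet has been downloaded as CSV with the column headers in the first row, is to turn it into JSON. This is a minimal sketch: `sheet_csv_to_json` is a hypothetical helper and the sample CSV is illustrative, not real add-on output.

```python
import csv
import io
import json


def sheet_csv_to_json(csv_text):
    """Convert CSV text (header row first) into a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return json.dumps([dict(row) for row in reader], indent=2)


# Illustrative export with a subset of the columns; values are invented.
sample_csv = (
    "SPOT,CONFIDENCE,WIKIPEDIA URL\n"
    "67P/Churyumov-Gerasimenko,0.9,"
    "https://en.wikipedia.org/wiki/67P/Churyumov%E2%80%93Gerasimenko\n"
)

as_json = sheet_csv_to_json(sample_csv)
print(as_json)
```

Every value comes out as a string, so numeric columns such as CONFIDENCE would need converting before use in a chart.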

and don’t forget: you are using data taken from the Linked Open Data Cloud without needing to know anything about it!

How to install the dataTXT add-on for Google Sheets

inside a Google Sheet, look for “dataTXT” in the add-on store…

or use this link:

http://bit.ly/dataTXT-googleSheets

there is a tutorial on dandelion.eu that explains how to set it up

http://bit.ly/howto-dataTXT-on-google-sheet

Unleash your creativity, give it a try!

http://bit.ly/dataTXT-googleSheets

@SpazioDati
