“The use of visual analytics tools for unstructured content analysis” David Whitehead TBA, Vancouver Visual Analytics

Proposal 12 - Visual Analytics


Page 1: Proposal 12 - Visual Analytics

“The use of visual analytics tools for unstructured content analysis”

David Whitehead
TBA, Vancouver

Visual Analytics

Page 2: Proposal 12 - Visual Analytics

Visual Analytics

Page 3: Proposal 12 - Visual Analytics

My Proposal

Project
• Investigate adopting visual analytics tools for unstructured content analysis and provide these tools as a service for TBA's and IS's in support of client and sector needs.

Stage 1: needs analysis
• What unstructured content analysis does CISTI perform?
• What types of client questions could/should we use VA tools to help answer?
• How would different CISTI teams make use of VA tools?
• What experience does CISTI have with VA and unstructured content analysis tools?

Stage 2: survey available tools
• What tools are available?
• What are the tradeoffs for each tool?
• How well do they meet CISTI's needs?

Stage 3: pilot study
• Select and deploy 1 or 2 VA tools for use in a pilot study
• Select and train up to 3 IS/TBA teams in the use of Visual Analysis techniques and tools
• Each team to use VA tools to work with at least 2 client projects
• Teams to report on the effectiveness, usability, strengths and weaknesses of the selected VA tools
• Clients to give feedback on the usefulness of the VA tool analysis/findings

Deliverables
1. Unstructured content analysis tools needs assessment
2. Survey of available visual analytics tools
3. Train 1 or 2 IS/TBA teams on Visual Analytics techniques
4. Pilot Study report

Page 4: Proposal 12 - Visual Analytics

My Proposal

OVERLY AMBITIOUS

Page 5: Proposal 12 - Visual Analytics

Reality

One Visual Analytics tool:

• “Starlight” from Future Point Systems

Two projects:

1. Catalog of the BC Wireless Industry

2. Finding an appropriate partner for an enterprise USB Flash drive solution

Page 6: Proposal 12 - Visual Analytics

Reality

I tried one Visual Analytics tool out on two projects:

1. Catalog of the BC Wireless Industry

2. Finding an appropriate partner for an enterprise USB Flash drive solution

Practical

Page 7: Proposal 12 - Visual Analytics

Why Visual Analytics

I don’t look for terrorists, but I do reduce risk

Page 8: Proposal 12 - Visual Analytics

Visual Analytics

“People use visual analytics tools and techniques to synthesize information and derive insight from massive, dynamic, ambiguous, and often conflicting data; detect the expected and discover the unexpected; provide timely, defensible, and understandable assessments; and communicate assessment effectively for action.”

From: http://en.wikipedia.org/wiki/Visual_analytics

Page 9: Proposal 12 - Visual Analytics

Visual Analytics

Visual Analytics sounds a lot like CISTI!

• synthesize information
• derive insight from massive, dynamic, ambiguous, conflicting data
• detect the expected and discover the unexpected
• provide timely, defensible, and understandable assessments
• communicate assessment effectively for action

Page 10: Proposal 12 - Visual Analytics

Why explore Visual Analytics

• Clients come to IS’s and TBA’s expecting "the magic answers" and "a crystal ball"

• Our success depends on our ability to meet that unwritten brand promise, as unrealistic as it may be.
  – IS’s and TBA’s need to be the experts in information retrieval and analysis, including sophisticated analysis of unstructured content.
  – IS’s and TBA’s have the expertise to interpret and convey the results to business clients.

Page 11: Proposal 12 - Visual Analytics

Visual Analytics

Turns This: 1 - 10 of about 3,660,000 for enterprise usb content distribution

Into This: [visualization of the same results]

Page 12: Proposal 12 - Visual Analytics

Demo

Incredibly Amazing Demo

Page 13: Proposal 12 - Visual Analytics

Multiple Views

Page 14: Proposal 12 - Visual Analytics

Multiple Views

Page 15: Proposal 12 - Visual Analytics

Subject Views

Page 16: Proposal 12 - Visual Analytics

Good

Clustering of Data
• Helps identify key topics for analysis
• Quickly eliminates results of no value
• Provides some confidence in the uniqueness of a solution
• Allows broader coverage than manual approaches
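
To make the clustering idea concrete, here is a minimal sketch of grouping pages by shared vocabulary. It is an illustration only and not how Starlight works internally: the Jaccard similarity measure, the `threshold` value, and the sample pages are all assumptions chosen for the example.

```python
def tokens(text):
    """Lower-case word set, ignoring very short words (a crude stop-word filter)."""
    return {w.strip(".,").lower() for w in text.split() if len(w) > 3}

def jaccard(a, b):
    """Share of words two pages have in common (0.0 = nothing, 1.0 = identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster(docs, threshold=0.2):
    """Greedy one-pass clustering: join the first cluster whose seed page is similar enough."""
    clusters = []  # each cluster is a list of (name, token set) pairs
    for name, text in docs.items():
        toks = tokens(text)
        for group in clusters:
            if jaccard(toks, group[0][1]) >= threshold:
                group.append((name, toks))
                break
        else:
            clusters.append([(name, toks)])  # nothing similar: start a new cluster
    return [[name for name, _ in group] for group in clusters]

# Hypothetical sample pages, not data from the projects.
docs = {
    "page1": "Encrypted USB flash drive for enterprise deployment",
    "page2": "Enterprise USB flash drive management software",
    "page3": "Fresh sourdough bread baked daily",
}
print(cluster(docs))
```

The two flash-drive pages end up together and the unrelated page sits alone, which is the effect the slide describes: key topics surface as groups, and results of no value are easy to set aside.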

Page 17: Proposal 12 - Visual Analytics

Bad

• Deeper Analysis requires a lot of preparation and data manipulation

• Getting the data from here to something consumable is hard

• Requires a lot of learning to get meaningful results

• Best used by experts in a domain

Page 18: Proposal 12 - Visual Analytics

Awesome!

Data input tools
• Automatically crawl sites
• Convert unstructured documents into data
• Look for multiple parameters simultaneously
• Automatically extract entities like people, places and companies
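
As a rough illustration of what "convert unstructured documents into data" and "look for multiple parameters simultaneously" can mean in practice, the sketch below turns raw page text into one structured record per page. It is a simplified stand-in, not the tool's actual method: the term list, the `page_to_record` helper, and the example URLs are hypothetical, and real entity extraction is far more sophisticated than keyword counting.

```python
import re
from collections import Counter

# Hypothetical watch list of terms to look for at the same time.
TERMS = ["usb", "flash drive", "encryption", "content distribution"]

def page_to_record(url, text, terms=TERMS):
    """Turn one unstructured page into a structured row of term counts."""
    lowered = text.lower()
    counts = Counter()
    for term in terms:
        # Word-boundary match so "usb" is not counted inside longer words.
        pattern = r"\b" + re.escape(term) + r"\b"
        counts[term] = len(re.findall(pattern, lowered))
    return {"url": url, "words": len(lowered.split()), **counts}

# Hypothetical crawled pages, not data from the projects.
pages = {
    "http://example.com/a": "Secure USB flash drive with hardware encryption. USB management suite.",
    "http://example.com/b": "Our bakery sells bread.",
}

records = [page_to_record(url, text) for url, text in pages.items()]
# Keep only pages that mention at least one term of interest.
relevant = [r for r in records if any(r[t] for t in TERMS)]
print(relevant)
```

Once every page is a row like this, the rows can be sorted, filtered, and plotted, which is what makes the visual views possible.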

Page 19: Proposal 12 - Visual Analytics

Test 1

Enterprise USB Flash Application Test

Traditional Approach
• 7 hours
• Read 53 web pages
• Identified 20 potential partners
• Narrowed to 4 likely partners
• Identified 2 key competing approaches

Visual Analytics Approach
• 2 hours
• Analyzed about 600 web pages
• Identified 1 likely partner
• Identified 7 competing approaches
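
The Test 1 figures can be put on a common footing by working out pages covered per hour. The short calculation below uses only the numbers on the slide; the variable and function names are chosen for the example, and "about 600" is treated as exactly 600.

```python
# Figures taken from the Test 1 slide.
traditional = {"hours": 7, "pages": 53}
visual_analytics = {"hours": 2, "pages": 600}  # "about 600" on the slide

def pages_per_hour(run):
    """Pages covered per hour of analyst time."""
    return run["pages"] / run["hours"]

trad_rate = pages_per_hour(traditional)
va_rate = pages_per_hour(visual_analytics)

print(f"Traditional:      {trad_rate:.1f} pages/hour")
print(f"Visual analytics: {va_rate:.1f} pages/hour")
print(f"Coverage ratio:   {va_rate / trad_rate:.0f}x")
print(f"Time saved:       {traditional['hours'] / visual_analytics['hours']:.1f}x faster")
```

This works out to roughly 7.6 versus 300 pages per hour, about a 40-fold difference in coverage. Note that pages "read" by a person and pages "analyzed" by a tool are not the same kind of attention, so the ratio measures breadth of coverage rather than quality of findings.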

Page 20: Proposal 12 - Visual Analytics

Test 2

Wireless Industry Profile

Traditional Approach
• 1 month
• Read about 1,000 web pages
• Classified 93 pre-defined technologies across 277 mobile industry firms in 2 cities (Vancouver & Ottawa)

Visual Analytics Approach
• 8 hours
• 1st web crawler pass
• Analyzed about 11,000 web pages starting with links to 137 companies
• Identified 123 separate technologies

Page 21: Proposal 12 - Visual Analytics

General Observations

• Analytics result in much faster generation of garbage out
  – Many trials needed to get the automation right
  – Once right, automation dramatically helps in large environmental scans.

• The two techniques complement each other well and help to double-check findings
  – Helps find more outliers
  – Helps identify useful items to look for

• Investigative techniques are essential to drawing real conclusions
  – Very hard to communicate the process and the comparative value of results using the tools alone.

Page 22: Proposal 12 - Visual Analytics

Visual Analytics

“Visual Analytics is the integration of interactive visualization with analysis techniques to answer a growing range of questions in science, business, and analysis. It can attack certain problems whose size, complexity, and need for closely coupled human and machine analysis may make them otherwise intractable.”

From: http://en.wikipedia.org/wiki/Visual_analytics

Page 23: Proposal 12 - Visual Analytics

Why learn more…

Anyone can use Google…

1. VA tools are the tools used by leaders in large-scale information analysis, such as government security forces

2. Visual Analytics helps make sense of information overload
  – Making sense of information overload is the essential skill for information analysts.

3. Information overload is every industry's problem
  – VA tools are moving out of the security world into the realm of business and scientific analysis
  – Expertise in applying tools like VA to analysis tasks in the fields of business and science is what will set CISTI IS’s and TBA’s apart.