1 Peter Fox Xinformatics 4400/6400 Week 11, April 16, 2013 Information Audit and dealing with Unstructured Information

Peter Fox

Xinformatics 4400/6400

Week 11, April 16, 2013

Information Audit and dealing with Unstructured Information

Reading

• Information Discovery
– Information discovery graph (IDG)
– Projects using information discovery
– Information discovery and Library Sciences
– Information Discovery and retrieval tools
– Social Search

• Metadata
– http://en.wikipedia.org/wiki/Metadata
– http://www.niso.org/publications/press/UnderstandingMetadata.pdf
– http://dublincore.org/

Contents

• Information Audit

• Unstructured Information

Businessdictionary.com

• Analysis and evaluation of a firm's information system (whether manual or computerized) to detect and rectify blockages, duplication, and leakage of information.

Objective?

• The objectives of this audit are to improve accuracy, relevance, security, and timeliness of the recorded information.

What is an information audit?

• An information audit is a process that effectively determines the current information environment within an organization by identifying and mapping:
– What information is currently available?
– Where does the information live?

Results/ format (e.g.)

• The results of an information audit are twofold: there is a detailed report which includes:

– What information do staff acquire? Where from? At what cost? How is it used?

– What information do staff create? What happens to it? Where does it go?

Results/ format (e.g.)

– What information is stored and why? What purpose will it serve?

– What information is passed on or delivered? To whom? For what purpose? In what form?

Results/ format (e.g.)

– Is there a gap, or a match, between that which is available and that which is needed?

– What are the skills and responsibilities of the people who carry out these tasks?

– What equipment and tools do they have available (hardware, software, filing cabinets, web sites, etc.)?

Results/ format (e.g.)

– Are there any control documents, such as policy statements, guidelines, service level agreements, procedures, manuals?

– Is any of the information (produced, acquired, processed, re-delivered, or stored) superfluous to needs?

– Are any of the information-handling activities non-productive?
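
The report questions above (gap or match, superfluous information) can be sketched as a small inventory check. This is only an illustration under stated assumptions: the `InformationItem` fields, the `audit_summary` helper, and the sample rows are all hypothetical and are not part of any audit standard or of the course material.

```python
from dataclasses import dataclass, field


@dataclass
class InformationItem:
    """One row of a hypothetical audit inventory."""
    name: str
    location: str  # where the information lives
    owner: str     # who creates or acquires it
    used_by: set = field(default_factory=set)  # known consumers, if any


def audit_summary(inventory, needed):
    """Compare what is available with what is needed."""
    available = {item.name for item in inventory}
    return {
        # Needed but not available anywhere in the inventory.
        "gaps": sorted(needed - available),
        # Needed and available.
        "matches": sorted(needed & available),
        # Held, but neither needed nor used by anyone.
        "superfluous": sorted(
            item.name for item in inventory
            if item.name not in needed and not item.used_by
        ),
    }


# Made-up sample data, for illustration only.
inventory = [
    InformationItem("budget reports", "finance share", "finance", {"managers"}),
    InformationItem("old newsletters", "filing cabinet", "admin"),
]
needed = {"budget reports", "customer feedback"}
summary = audit_summary(inventory, needed)
```

Here `summary["gaps"]` lists what is needed but missing, and `summary["superfluous"]` lists what is stored without a stated need or user. A real audit would gather these rows through interviews and surveys rather than hard-coding them.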

Results/ format (e.g.)

• There is also a detailed flow chart:

– A visual map that shows the areas, processes, functions and activities through which information passes, clarifying gaps or fault-lines that need to be plugged or bottlenecks and overflows that need to be unblocked

• Sound familiar?
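
One way to make the flow chart checkable is to store it as a directed graph and look for suspicious nodes. The sketch below is a heuristic under assumptions: the node names, the `flow_findings` helper, and the "three or more inputs, at most one output" threshold are invented for illustration and would need tuning for a real organization.

```python
from collections import defaultdict

# Hypothetical information flows as (from, to) pairs.
flows = [
    ("field staff", "team inbox"),
    ("web form", "team inbox"),
    ("partners", "team inbox"),
    ("team inbox", "analyst"),
    ("analyst", "archive"),
    ("sensor logs", "shared drive"),
]


def flow_findings(flows, consumers):
    """Flag candidate bottlenecks and dead ends in a flow map."""
    inbound = defaultdict(set)
    outbound = defaultdict(set)
    for source, target in flows:
        outbound[source].add(target)
        inbound[target].add(source)
    nodes = set(inbound) | set(outbound)
    # Bottleneck heuristic: several inputs funnel into at most one output.
    bottlenecks = sorted(
        n for n in nodes if len(inbound[n]) >= 3 and len(outbound[n]) <= 1
    )
    # Dead end: information arrives but is neither passed on
    # nor declared as intentionally consumed at that node.
    dead_ends = sorted(
        n for n in nodes
        if inbound[n] and not outbound[n] and n not in consumers
    )
    return bottlenecks, dead_ends


bottlenecks, dead_ends = flow_findings(flows, consumers={"archive"})
```

With this sample data the team inbox is flagged as a candidate bottleneck and the shared drive as a dead end. Both are prompts for a follow-up question, not conclusions.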

How to use?

• An information audit can be used as a baseline for making major improvements to the business process of an organization.

• It is extremely helpful in identifying, buying, and implementing enterprise systems:
– finance systems, portfolio management systems, document management systems, learning and knowledge management systems, etc.

Developed for NASA TIWG

Remember the use case doc?

Developed for NASA TIWG

Event/application

Remember

• It never hurts to know what you have

• Build it into the routine and do not leave it as an after-thought (yep, just like documenting your code!)

Sources and uses of unstructured information

– Audio, video, graphics, social media messages, etc.: information that falls outside the purview of traditional databases

Data<->Information<->Knowledge

• Where is the structure?

[Figure: Data, Information, Knowledge diagram; labels: Context, Presentation, Organization, Integration, Conversation, Creation, Gathering, Experience]

Informatics

• Oh, wait – people structure information!

• Cognitive processes
– Semiotics
– Mental representation
– Intuition
– Expertise

• But not in the same way computers can!

So what happens?

• If a structured representation of fundamentally unstructured information is useless?
– Why would it be?

• What role does visual representation play in structuring information? Hint:

More than 10 years ago…

• Unstructured Information Management Architecture (UIMA) from IBM

– “Unstructured information management (UIM) applications are software systems that analyze unstructured information (text, audio, video, images, and so on) to discover, organize, and deliver relevant knowledge to the user. In analyzing unstructured information, UIM applications make use of a variety of analysis technologies, including statistical and rule-based Natural Language Processing (NLP), Information Retrieval (IR), machine learning, and ontologies.

– IBM's Unstructured Information Management Architecture (UIMA) is an architectural and software framework that supports creation, discovery, composition, and deployment of a broad range of analysis capabilities and the linking of them to structured information services, such as databases or search engines.

– The UIMA framework provides a run-time environment in which developers can plug in and run their UIMA component implementations, along with other independently-developed components, and with which they can build and deploy UIM applications.”
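
The plug-in idea in the quotation (independently developed analysis components run over the same unstructured input and leave structured results behind) can be illustrated with a toy pipeline. This is not the UIMA API, which is a separate framework with its own interfaces; the `Document`, `Annotation`, and `run_pipeline` names and the two rule-based annotators below are assumptions made up for this sketch.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """A labelled span of the text, given as character offsets."""
    kind: str
    start: int
    end: int


@dataclass
class Document:
    """Unstructured text plus the structure that components add to it."""
    text: str
    annotations: list = field(default_factory=list)


def year_annotator(doc):
    """Rule-based component: mark four-digit years (19xx or 20xx)."""
    for m in re.finditer(r"\b(?:19|20)\d{2}\b", doc.text):
        doc.annotations.append(Annotation("year", m.start(), m.end()))


def acronym_annotator(doc):
    """Rule-based component: mark words made of two or more capitals."""
    for m in re.finditer(r"\b[A-Z]{2,}\b", doc.text):
        doc.annotations.append(Annotation("acronym", m.start(), m.end()))


def run_pipeline(text, components):
    """Run each plugged-in component over one shared document, in order."""
    doc = Document(text)
    for component in components:
        component(doc)
    return doc


doc = run_pipeline(
    "Week 11 of Xinformatics, held in 2013, covered UIMA from IBM.",
    [year_annotator, acronym_annotator],
)
found = [(a.kind, doc.text[a.start:a.end]) for a in doc.annotations]
```

Each annotator knows nothing about the others and only reads the text and appends annotations, which is what lets components be swapped or reordered. The resulting `found` list is the kind of structured output that could then be loaded into a database or search index.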

From way back…

Data<->Information<->Knowledge

• Future?

[Figure: Data, Information, Knowledge diagram; labels: Context, Presentation, Organization, Integration, Conversation, Creation, Gathering, Experience]

Reading for this week

• http://en.wikipedia.org/wiki/Information_audit

• http://www.librijournal.org/pdf/2003-1pp23-38.pdf

• UIMA - http://www.ibm.com/developerworks/data/downloads/uima/

• SPAR - http://tw.rpi.edu/web/inside/ideas/SPAREvaluation

What is next

•Today – project group meetings/ check in

•April 23 – TBD

•April 30 – written part of group project due

•May 7 – final project presentations (BE ON TIME, i.e. 5-10 mins BEFORE 9AM)

– Be prepared to be asked (and to answer) questions