

Peter Fox

Xinformatics 4400/6400

Week 11, April 16, 2013

Information Audit and dealing with Unstructured Information

Reading
• Information Discovery
– Information discovery graph (IDG)
– Projects using information discovery
– Information discovery and Library Sciences
– Information Discovery and retrieval tools
– Social Search
• Metadata
– http://en.wikipedia.org/wiki/Metadata
– http://www.niso.org/publications/press/UnderstandingMetadata.pdf
– http://dublincore.org/

Contents

• Information Audit

• Unstructured Information


Businessdictionary.com

• Analysis and evaluation of a firm's information system (whether manual or computerized) to detect and rectify blockages, duplication, and leakage of information.


Objective?
• The objectives of this audit are to improve accuracy, relevance, security, and timeliness of the recorded information.

What is an information audit?

• An information audit is a process that effectively determines the current information environment within an organization by identifying and mapping:
– What information is currently available?
– Where does the information live?

Results/ format (e.g.)
• The results of an information audit are twofold: there is a detailed report, which includes:
– What information do staff acquire? Where from? At what cost? How is it used?
– What information do staff create? What happens to it? Where does it go?
– What information is stored and why? What purpose will it serve?
– What information is passed on or delivered? To whom? For what purpose? In what form?
– Is there a gap, or a match, between that which is available and that which is needed?
– What are the skills and responsibilities of the people who carry out these tasks?
– What equipment and tools do they have available (hardware, software, filing cabinets, web sites, etc.)?
– Are there any control documents, such as policy statements, guidelines, service level agreements, procedures, manuals?
– Is any of the information (produced, acquired, processed, re-delivered, or stored) superfluous to needs?
– Are any of the information-handling activities non-productive?
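The report questions above can be captured as a simple inventory that is then checked for gaps and superfluous items. The following is a minimal sketch only: the `InformationAsset` fields, the helper names, and the sample data are all hypothetical and are not part of any audit standard.

```python
from dataclasses import dataclass, field


@dataclass
class InformationAsset:
    """One row of a hypothetical audit inventory."""
    name: str
    source: str         # where it is acquired from, or "internal" if staff create it
    location: str       # where the information lives
    annual_cost: float  # cost of acquiring or maintaining it
    used_by: set = field(default_factory=set)  # groups that actually use it


def find_gaps(assets, needed):
    """Return needed items that no inventoried asset supplies."""
    available = {asset.name for asset in assets}
    return sorted(set(needed) - available)


def find_superfluous(assets, needed):
    """Return held items that nobody needs and nobody uses."""
    return sorted(
        asset.name
        for asset in assets
        if asset.name not in needed and not asset.used_by
    )


# Made-up example data, for illustration only.
inventory = [
    InformationAsset("market reports", "external vendor", "shared drive", 12000.0, {"sales"}),
    InformationAsset("meeting minutes", "internal", "filing cabinet", 0.0),
    InformationAsset("customer contacts", "internal", "CRM database", 500.0, {"sales", "support"}),
]
needed = {"market reports", "customer contacts", "service level agreements"}

gaps = find_gaps(inventory, needed)
superfluous = find_superfluous(inventory, needed)
print("Gaps:", gaps)
print("Superfluous:", superfluous)
```

A real audit would gather these rows through interviews and surveys; the point here is only that "gap or match" and "superfluous to needs" become set comparisons once the inventory exists.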


Results/ format (e.g.)
• There is also a detailed flow chart:
– A visual map that shows the areas, processes, functions, and activities through which information passes, clarifying gaps or fault lines that need to be plugged, or bottlenecks and overflows that need to be unblocked

• Sound familiar?
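One way to reason about such a flow chart is to treat it as a directed graph of information flows. The sketch below is an assumption-laden illustration: the flow list is invented, and the rules used for "bottleneck" (many inputs, at most one output) and "dead end" (no onward flow) are simple heuristics chosen for this example, not definitions from the audit literature.

```python
from collections import defaultdict

# Hypothetical information flows as (from, to) pairs.
flows = [
    ("field staff", "data entry"),
    ("vendor feeds", "data entry"),
    ("email inbox", "data entry"),
    ("data entry", "analysts"),
    ("analysts", "report archive"),
    ("analysts", "management"),
]


def summarize_flows(flows):
    """Build incoming and outgoing neighbour sets for every node."""
    incoming = defaultdict(set)
    outgoing = defaultdict(set)
    for src, dst in flows:
        outgoing[src].add(dst)
        incoming[dst].add(src)
    nodes = set(incoming) | set(outgoing)
    return nodes, incoming, outgoing


def bottlenecks(flows, min_inputs=3):
    """Nodes fed by many sources that pass information on to at most one place."""
    nodes, incoming, outgoing = summarize_flows(flows)
    return sorted(
        n for n in nodes
        if len(incoming[n]) >= min_inputs and len(outgoing[n]) <= 1
    )


def dead_ends(flows):
    """Nodes where information arrives but is not passed on."""
    nodes, _, outgoing = summarize_flows(flows)
    return sorted(n for n in nodes if not outgoing[n])


print("Possible bottlenecks:", bottlenecks(flows))
print("Dead ends to review:", dead_ends(flows))
```

A dead end is not necessarily a fault; an archive or a decision-maker may be a legitimate endpoint. The output is a list of places to look, not a verdict.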


How to use?
• An information audit can be used as a baseline for making major improvements to the business process of an organization.
• It is extremely helpful in identifying, buying, and implementing enterprise systems
– finance systems, portfolio management systems, document management systems, learning and knowledge management systems, etc.

Remember the use case doc?

[Figures: Developed for NASA TIWG; label: Event/application]

Remember
• It never hurts to know what you have
• Build it into the routine and do not leave it as an afterthought (yep, just like documenting your code!)

Sources and uses of unstructured information

– Audio, video, graphics, social media messages, etc.: that which falls outside the purview of traditional databases

Data<->Information<->Knowledge
• Where is the structure?

[Figure: Data, Information, Knowledge; labels: Context, Presentation, Organization, Integration, Conversation, Creation, Gathering, Experience]

Informatics
• Oh, wait – people structure information!
• Cognitive processes
– Semiotics
– Mental representation
– Intuition
– Expertise
• But not in the same way computers can!

So what happens?
• If a structured representation of fundamentally unstructured information is useless?
– Why would it be?
• What role does visual representation play in structuring information? Hint:

More than 10 years ago…
• Unstructured Information Management Architecture (UIMA) from IBM
– “Unstructured information management (UIM) applications are software systems that analyze unstructured information (text, audio, video, images, and so on) to discover, organize, and deliver relevant knowledge to the user. In analyzing unstructured information, UIM applications make use of a variety of analysis technologies, including statistical and rule-based Natural Language Processing (NLP), Information Retrieval (IR), machine learning, and ontologies.

– IBM's Unstructured Information Management Architecture (UIMA) is an architectural and software framework that supports creation, discovery, composition, and deployment of a broad range of analysis capabilities and the linking of them to structured information services, such as databases or search engines.

– The UIMA framework provides a run-time environment in which developers can plug in and run their UIMA component implementations, along with other independently-developed components, and with which they can build and deploy UIM applications.”
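The plug-in idea in the quote can be illustrated with a toy pipeline: independent analysis components each add annotations to a shared document, and the results are then handed to a structured index. This is a sketch of the general pattern only. It does not use the UIMA API, and every name in it (`Document`, `Annotation`, `run_pipeline`, the two rule-based annotators) is hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    kind: str   # e.g. "date" or "url"
    start: int  # character offset where the span begins
    end: int    # character offset just past the span
    text: str   # the covered text


@dataclass
class Document:
    text: str
    annotations: list = field(default_factory=list)


MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)


def date_annotator(doc):
    """Rule-based component: mark dates written like 'April 16, 2013'."""
    pattern = rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b"
    for m in re.finditer(pattern, doc.text):
        doc.annotations.append(Annotation("date", m.start(), m.end(), m.group()))


def url_annotator(doc):
    """Rule-based component: mark http(s) URLs."""
    for m in re.finditer(r"https?://[^\s,]+", doc.text):
        doc.annotations.append(Annotation("url", m.start(), m.end(), m.group()))


def run_pipeline(doc, annotators):
    """Run each plugged-in component over the same document, in order."""
    for annotate in annotators:
        annotate(doc)
    return doc


def to_index(doc):
    """Group annotation text by kind, as a stand-in for a structured store."""
    index = {}
    for a in doc.annotations:
        index.setdefault(a.kind, []).append(a.text)
    return index


doc = Document("Week 11, April 16, 2013: see http://dublincore.org/ for metadata terms.")
run_pipeline(doc, [date_annotator, url_annotator])
index = to_index(doc)
print(index)
```

Because each annotator only reads the text and appends annotations, components written by different people can be combined or reordered without knowing about each other, which is the property the quoted description emphasizes.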


From way back…

Data<->Information<->Knowledge
• Future?

[Figure: Data, Information, Knowledge; labels: Context, Presentation, Organization, Integration, Conversation, Creation, Gathering, Experience]

Reading for this week
• http://en.wikipedia.org/wiki/Information_audit
• http://www.librijournal.org/pdf/2003-1pp23-38.pdf
• UIMA - http://www.ibm.com/developerworks/data/downloads/uima/
• SPAR - http://tw.rpi.edu/web/inside/ideas/SPAREvaluation

What is next

• Today – project group meetings/check-in
• April 23 – TBD
• April 30 – written part of group project due
• May 7 – final project presentations (BE ON TIME, i.e. 5-10 mins BEFORE 9AM)
– Be prepared to be asked (and answer) questions