Module 2a: Information Systems IMT530: Organization of Information Resources Winter, 2007 Michael Crandall


Module 2a: Information Systems

IMT530: Organization of Information Resources

Winter, 2007

Michael Crandall


IMT530A- Organization of Information Resources 2

Recap

• Information objects need representation to be used in a context, whether physical or digital

• For storage and retrieval, some characteristics need to be agreed upon

• How those characteristics are defined determines the usability of the information

• In a digital environment, many views are possible, but not all are economically feasible

• How do you decide what to do?

• That’s what we’ll be looking at for the rest of the quarter


Exercise Results

• Fit in the backpack
• Use during the day
• Use daily
• Lightweight
• Circle/semi-circle shape
• Manufacturing products
• Giving some type of information
• More than one color
• Able to make unique noise
• Having line/parallel line
• No overwhelming smell
• Hand oriented
• Inedible
• Require light to use

• Shape
• Weight
• Size
• Color
• Texture
• Scent
• Edible
• Price
• Disposability
• Durability
• Functionality
• Perishability
• Number of components
• Rigidity


Module 2a Outline

• Tacit vs. explicit information

• Information systems and functions

• Concepts of constructing IR systems

• Traditional IR systems

• Content vs. users

• The impact of the web on traditional approaches


The Big Picture

[Figure from Selamat & Choudrie, 2004, annotated "We’re here this quarter" and "But don’t forget the rest"]


Information Systems

• Three functions of an information system
– Storage
– Retrieval
– Presentation

• As size of collection grows, tools for retrieval become more complex

• Deciding what approach to take in designing and implementing a system requires a methodology to be successful

• We’ll look at traditional methods today and see how the web has impacted them


Information Systems

Soergel, 1985


Constructing IR Systems

• The simplest case is to search all objects for every query, which is not very effective at scale

• Soergel suggests four economy measures
– Cutting examination time
– Batching queries
– Proactively collecting queries and matching entity descriptions to anticipated needs
– Providing a retrieval mechanism
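To make the contrast concrete, here is a minimal Python sketch of the simplest case (scanning every object for every query) next to two of the economy measures. It is only one possible reading of the slide, not Soergel's own formulation: the toy `COLLECTION`, the function names, and the choice of an inverted index as the "retrieval mechanism" are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy collection: object id -> short entity description.
COLLECTION = {
    1: "lightweight blue water bottle",
    2: "paper notebook with lined pages",
    3: "blue ballpoint pen",
}


def scan_search(collection, term):
    """Simplest case: examine every object for every query."""
    return sorted(oid for oid, text in collection.items() if term in text.split())


def batch_scan(collection, terms):
    """Batching queries: read each object once and test it against every pending query."""
    results = {term: [] for term in terms}
    for oid in sorted(collection):
        words = set(collection[oid].split())
        for term in terms:
            if term in words:
                results[term].append(oid)
    return results


def build_index(collection):
    """Retrieval mechanism (assumed here to be an inverted index): term -> object ids."""
    index = defaultdict(set)
    for oid, text in collection.items():
        for word in text.split():
            index[word].add(oid)
    return index


def index_search(index, term):
    """Answer a query with one lookup instead of a full scan."""
    return sorted(index.get(term, set()))
```

Both `scan_search(COLLECTION, "blue")` and `index_search(build_index(COLLECTION), "blue")` return the same object ids; the difference is that the index is built once and reused, while the scan repeats the full examination for every query.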


The Workings of the Black Box

Soergel, 1985


Traditional IR Systems

• Operate on closed domains of information

• Rely on surrogates for access to materials
– Physical objects are not embedded in the system, and often are not accessible digitally

• Are supported by humans
– Intermediaries provide access
– Intermediaries interpret results

• Are generally developed based on collections
– Reflect intermediary needs, not end-user needs


Content vs. Users

• Two types of metadata: extrinsic and intrinsic
– Extrinsic comes from outside the object; used primarily for management and administration
– Intrinsic is derived from the object itself; used for description, identification and discovery

• Soergel further breaks down intrinsic
– Descriptive indexing is used to define the object
– Subject indexing is used to place the object in relation to other objects in the system’s representation of the world

• All types of metadata are important for retrieval, but the most complicated is that used to represent subjects

• Constructing this subject representation is where user needs become key
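One way to picture the extrinsic/intrinsic split is as a record structure. The Python sketch below is illustrative only: the class names, the field names (`owner`, `date_acquired`, `descriptive`, `subjects`), and the sample records are hypothetical and are not taken from Soergel or from any metadata standard.

```python
from dataclasses import dataclass, field


@dataclass
class ExtrinsicMetadata:
    """Comes from outside the object: management and administration."""
    owner: str
    date_acquired: str


@dataclass
class IntrinsicMetadata:
    """Derived from the object itself: description, identification, discovery."""
    descriptive: dict = field(default_factory=dict)  # descriptive indexing: defines the object
    subjects: set = field(default_factory=set)       # subject indexing: relates it to other objects


@dataclass
class Record:
    object_id: str
    extrinsic: ExtrinsicMetadata
    intrinsic: IntrinsicMetadata


def related_by_subject(record, records):
    """Return ids of other records that share at least one subject term."""
    return sorted(
        other.object_id
        for other in records
        if other.object_id != record.object_id
        and record.intrinsic.subjects & other.intrinsic.subjects
    )


# Hypothetical sample records.
RECORDS = [
    Record("a", ExtrinsicMetadata("library", "2007-01-09"),
           IntrinsicMetadata({"title": "Indexing basics"}, {"indexing", "retrieval"})),
    Record("b", ExtrinsicMetadata("library", "2007-01-10"),
           IntrinsicMetadata({"title": "Search engines"}, {"retrieval", "web"})),
    Record("c", ExtrinsicMetadata("archive", "2007-01-11"),
           IntrinsicMetadata({"title": "Bookbinding"}, {"preservation"})),
]
```

Calling `related_by_subject(RECORDS[0], RECORDS)` uses only the subject terms, which is the part of the record that places an object in relation to others; the descriptive and extrinsic fields play no role in that comparison.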


The Effect of the Web

• Users become king

• Requires a shift in thinking about how objects are represented in a system

• Removal of intermediaries means that the system now has to fulfill that function

• Means that representation of the objects from a user’s viewpoint is critical to success of the system

• Interoperability has become a major concern as systems begin to integrate
– Requires agreement on standards and models
– We’ll look at some of these approaches on Thursday


Another View of an ISAR

[Figure: block diagram of an ISAR system. Component labels: User, Other Users, User Interface, Query Preprocessing, Searching Index(es), Result Set Manipulation, Result Refining, Indexing, Indexer, Data Analysis, Data Stores, Independent Metadata, Index Metadata, User Metadata. Example labels: database schemas, thesauri; file system, http, messaging stores, document store, databases, directory stores; string manipulation, synonym sets & thesauri, stemming, word breaking; adaptive crawling, word breaking, word stemming, NLP; deduping, concatenation, ranking.]
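The query-side stages named in the diagram (query preprocessing, searching the index, result set manipulation) can be sketched in a few lines of Python. This is a toy under stated assumptions, not the system the slide depicts: the `SYNONYMS` table stands in for a thesaurus, `stem` is a deliberately crude plural stripper rather than a real stemmer, and the sample `DOCUMENTS` and the ranking rule (count of matching terms) are invented for illustration.

```python
from collections import defaultdict

# Hypothetical synonym set standing in for a thesaurus.
SYNONYMS = {"car": {"automobile"}}

# Hypothetical document store: document id -> text.
DOCUMENTS = {
    1: "Used cars for sale",
    2: "Automobile repair manual",
    3: "Bicycle repair",
}


def word_break(text):
    """Word breaking: lowercase, split on whitespace, strip surrounding punctuation."""
    words = (w.strip(".,;:!?") for w in text.lower().split())
    return [w for w in words if w]


def stem(word):
    """Crude stemming: drop a trailing plural 's' (illustrative only)."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def preprocess_query(query):
    """Query preprocessing: break, stem, then expand each term with its synonyms."""
    terms = set()
    for word in word_break(query):
        base = stem(word)
        terms.add(base)
        terms.update(SYNONYMS.get(base, set()))
    return terms


def build_index(documents):
    """Indexing: map each stemmed term to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in word_break(text):
            index[stem(word)].add(doc_id)
    return index


def search(index, query):
    """Search the index, then dedupe and rank hits by number of matching terms."""
    scores = defaultdict(int)
    for term in preprocess_query(query):
        for doc_id in index.get(term, set()):
            scores[doc_id] += 1  # keying by id also dedupes repeated hits
    return sorted(scores, key=lambda d: (-scores[d], d))
```

With the sample data, `search(build_index(DOCUMENTS), "cars repair")` ranks document 2 first, because synonym expansion lets it match on both "automobile" and "repair", while documents 1 and 3 each match a single term.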


Questions?

• If not, take a break!!!


Exercise 2a

• Find your groups

• Spend the next 45 minutes exploring the examples in Exercise 2a

• Ask questions and talk!!!

• Be sure to hand in completed work at the end of class for credit!!!


IR Systems Wrapup

• Examples showed different approaches
– Based on content AND audience
– Emphasis may be different based on origins

• Did you find commonalities among the examples?

• What were the main differences?
– What do you think these were driven by?
– Where do you think metadata fits in these systems?


Next Time

• We’ll look at modeling information objects and relationships

• Remember to read the assignments BEFORE class

• Complex reading, but don’t panic: try to get a sense of what the articles are about, and come with questions

• See you Thursday!!