Introduction to Data, Information and Knowledge Management Dr. Bhavani Thuraisingham The University of Texas at Dallas Data, Information and Knowledge Management January 2014


Introduction to Data, Information and Knowledge Management

Dr. Bhavani Thuraisingham

The University of Texas at Dallas

Data, Information and Knowledge Management

January 2014


Data Management

Concepts in database systems

Types of database systems

Distributed Data Management

Heterogeneous database integration

Federated data management


An Example Database System

Figure: Users, Application Programs, Database Management System, Database

Adapted from C. J. Date, Addison Wesley, 1990


Metadata

Metadata describes the data in the database

- Example: Database D consists of a relation EMP with attributes SS#, Name, and Salary

Metadatabase stores the metadata

- Could be physically stored with the database

Metadatabase may also store constraints and administrative information

Metadata is also referred to as the schema or data dictionary
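As a rough illustration of the points above, the following sketch creates the EMP relation from the slide's example in an in-memory SQLite database and then reads its metadata back from SQLite's own catalog. SQLite is only a convenient stand-in here, and the column types (INTEGER, TEXT) are assumptions, since the slide names the attributes but not their types.

```python
import sqlite3

# In-memory database standing in for "Database D" from the slide.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE EMP ("SS#" INTEGER PRIMARY KEY, Name TEXT NOT NULL, Salary INTEGER)'
)

# SQLite stores its schema (the metadata) in the sqlite_master catalog table,
# physically inside the same database file as the data.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]

# PRAGMA table_info yields one row per attribute:
# (cid, name, type, notnull, default_value, pk)
attributes = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(EMP)")]

print(tables)
print(attributes)
```

Here the catalog plays the role of the metadatabase: it is queried with the same interface as ordinary data, and it also records constraints such as NOT NULL and the primary key.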


Functional Architecture

Figure: functional architecture of a DBMS. Components: User Interface Manager; Query Manager; Transaction Manager; Schema (Data Dictionary) Manager (metadata); Security/Integrity Manager; File Manager; Disk Manager. Layer labels: Data Management; Storage Management.


DBMS Design Issues

Query Processing

- Optimization techniques

Transaction Management

- Techniques for concurrency control and recovery

Metadata Management

- Techniques for querying and updating the metadatabase

Security/Integrity Maintenance

- Techniques for processing integrity constraints and enforcing access control rules

Storage Management

- Access methods and index strategies for efficient access to the database
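Two of these design issues, integrity constraints and transaction recovery, can be sketched together with SQLite. The CHECK constraint on Salary and the helper name `insert_all` are hypothetical, chosen only for illustration; the point is that a constraint violation inside a transaction undoes the whole transaction, not just the offending row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMP ("
    " SS INTEGER PRIMARY KEY,"
    " Name TEXT NOT NULL,"
    " Salary INTEGER CHECK (Salary >= 0))"  # hypothetical integrity constraint
)
conn.commit()


def insert_all(rows):
    """Insert rows as one transaction; undo every insert if any row is rejected."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.executemany("INSERT INTO EMP VALUES (?, ?, ?)", rows)
        return True
    except sqlite3.IntegrityError:
        return False


ok = insert_all([(1, "John", 20), (2, "Paul", 30)])
# The second row violates the CHECK constraint, so the first row is rolled back too.
rejected = insert_all([(3, "James", 40), (4, "Jill", -50)])

names = [row[0] for row in conn.execute("SELECT Name FROM EMP ORDER BY SS")]
print(ok, rejected, names)
```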


Types of Database Systems

Relational Database Systems

Object Database Systems

Deductive Database Systems

Other

- Real-time, Secure, Parallel, Scientific, Temporal, Wireless, Functional, Entity-Relationship, Sensor/Stream Database Systems, etc.


Relational Database: Example

Relation S:

S#  SNAME  STATUS  CITY
S1  Smith  20      London
S2  Jones  10      Paris
S3  Blake  30      Paris
S4  Clark  20      London
S5  Adams  30      Athens

Relation P:

P#  PNAME  COLOR  WEIGHT  CITY
P1  Nut    Red    12      London
P2  Bolt   Green  17      Paris
P3  Screw  Blue   17      Rome
P4  Screw  Red    14      London
P5  Cam    Blue   12      Paris
P6  Cog    Red    19      London

Relation SP:

S#  P#  QTY
S1  P1  300
S1  P2  200
S1  P3  400
S1  P4  200
S1  P5  100
S1  P6  100
S2  P1  300
S2  P2  400
S3  P2  200
S4  P2  200
S4  P4  300
S4  P5  400
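To show how the three relations are used together, here is a small sketch that loads them into SQLite and runs two queries: a join (which suppliers supply part P2) and an aggregate (total quantity shipped per part). The column names SNO and PNO replace S# and P# purely for convenience, and the choice of queries is illustrative rather than taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE S  (SNO TEXT PRIMARY KEY, SNAME TEXT, STATUS INTEGER, CITY TEXT);
CREATE TABLE P  (PNO TEXT PRIMARY KEY, PNAME TEXT, COLOR TEXT, WEIGHT INTEGER, CITY TEXT);
CREATE TABLE SP (SNO TEXT, PNO TEXT, QTY INTEGER, PRIMARY KEY (SNO, PNO));
""")
conn.executemany("INSERT INTO S VALUES (?, ?, ?, ?)", [
    ("S1", "Smith", 20, "London"), ("S2", "Jones", 10, "Paris"),
    ("S3", "Blake", 30, "Paris"), ("S4", "Clark", 20, "London"),
    ("S5", "Adams", 30, "Athens"),
])
conn.executemany("INSERT INTO P VALUES (?, ?, ?, ?, ?)", [
    ("P1", "Nut", "Red", 12, "London"), ("P2", "Bolt", "Green", 17, "Paris"),
    ("P3", "Screw", "Blue", 17, "Rome"), ("P4", "Screw", "Red", 14, "London"),
    ("P5", "Cam", "Blue", 12, "Paris"), ("P6", "Cog", "Red", 19, "London"),
])
conn.executemany("INSERT INTO SP VALUES (?, ?, ?)", [
    ("S1", "P1", 300), ("S1", "P2", 200), ("S1", "P3", 400), ("S1", "P4", 200),
    ("S1", "P5", 100), ("S1", "P6", 100), ("S2", "P1", 300), ("S2", "P2", 400),
    ("S3", "P2", 200), ("S4", "P2", 200), ("S4", "P4", 300), ("S4", "P5", 400),
])

# Join S with SP: names of the suppliers that supply part P2.
p2_suppliers = [row[0] for row in conn.execute(
    "SELECT S.SNAME FROM S JOIN SP ON SP.SNO = S.SNO "
    "WHERE SP.PNO = 'P2' ORDER BY S.SNO"
)]

# Aggregate over SP: total quantity shipped for each part.
qty_per_part = dict(conn.execute(
    "SELECT PNO, SUM(QTY) FROM SP GROUP BY PNO ORDER BY PNO"
))

print(p2_suppliers)
print(qty_per_part)
```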


Example Class Hierarchy

Figure: class hierarchy.

Document Class: attributes ID, Name, Author, Publisher; Method1: Print-doc-att(ID); Method2: Print-doc(ID); instances D1, D2

Book Subclass: attribute # of Chapters; instance B1

Journal Subclass: attribute Volume #; instance J1
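The hierarchy on this slide maps naturally onto classes with inheritance. The sketch below is one possible reading: Book and Journal inherit the Document attributes and methods and each adds one attribute of its own. The method bodies and all attribute values are hypothetical, since the slide gives only names.

```python
from dataclasses import dataclass


@dataclass
class Document:
    id: str
    name: str
    author: str
    publisher: str

    def print_doc_att(self) -> str:
        """Method1 on the slide: describe the document's attributes."""
        return f"{self.id}: {self.name} by {self.author} ({self.publisher})"

    def print_doc(self) -> str:
        """Method2 on the slide: hypothetical stand-in for rendering the document."""
        return f"[contents of {self.id}]"


@dataclass
class Book(Document):
    chapters: int  # "# of Chapters" on the slide

    def print_doc_att(self) -> str:
        # Extend the inherited description with the subclass attribute.
        return f"{super().print_doc_att()}, {self.chapters} chapters"


@dataclass
class Journal(Document):
    volume: int  # "Volume #" on the slide

    def print_doc_att(self) -> str:
        return f"{super().print_doc_att()}, volume {self.volume}"


# Placeholder attribute values, not taken from the slides.
d1 = Document("D1", "Title A", "Author A", "Publisher A")
b1 = Book("B1", "Title B", "Author B", "Publisher B", chapters=12)
j1 = Journal("J1", "Title C", "Author C", "Publisher C", volume=3)

for doc in (d1, b1, j1):
    print(doc.print_doc_att())
```

Because B1 and J1 are also Documents, code written against the Document class works on them unchanged, while each subclass can refine inherited behavior.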


Example Composite Object

Figure: Composite Document Object; Section 1 Object; Section 2 Object; Paragraph 1 Object; Paragraph 2 Object
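A composite object is an object built from other objects. The sketch below models the document from the slide as a tree of sections and paragraphs. Placing both paragraphs under Section 1 is an assumption, because the extracted slide does not show which section contains them, and the text values are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Paragraph:
    text: str


@dataclass
class Section:
    title: str
    paragraphs: List[Paragraph] = field(default_factory=list)


@dataclass
class CompositeDocument:
    title: str
    sections: List[Section] = field(default_factory=list)

    def paragraph_count(self) -> int:
        """Count paragraphs by walking the component objects."""
        return sum(len(section.paragraphs) for section in self.sections)

    def render(self) -> str:
        """Flatten the composite into text, one line per component."""
        lines = [self.title]
        for section in self.sections:
            lines.append(f"  {section.title}")
            lines.extend(f"    {p.text}" for p in section.paragraphs)
        return "\n".join(lines)


doc = CompositeDocument("Composite Document", [
    Section("Section 1", [Paragraph("Paragraph 1"), Paragraph("Paragraph 2")]),
    Section("Section 2"),
])
print(doc.render())
```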


Distributed Database System

Figure: three sites connected by a Communication Network.

Site 1: Distributed Processor 1, DBMS 1, Database 1

Site 2: Distributed Processor 2, DBMS 2, Database 2

Site 3: Distributed Processor 3, DBMS 3, Database 3


Data Distribution

SITE 1

EMP1

SS#  Name   Salary  D#
1    John   20      10
2    Paul   30      20
3    James  40      20
4    Jill   50      20
5    Mary   60      10
6    Jane   70      20

DEPT1

D#  Dname    MGR
10  C. Sci.  Jane
30  English  David
40  French   Peter

SITE 2

EMP2

SS#  Name    Salary  D#
9    Mathew  70      50
7    David   80      30
8    Peter   90      40

DEPT2

D#  Dname    MGR
50  Math     John
20  Physics  Paul
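The fragments on this slide can be used to sketch what a distributed query has to do. In the example below, each site holds its own EMP and DEPT fragment, the global relation is taken to be the union of the fragments, and a join across them finds each employee's department even when the two rows live at different sites. The rows are a best-effort reading of the slide's tables, and the function names are hypothetical.

```python
# Fragments per site: EMP rows are (SS#, Name, Salary, D#), DEPT rows are (D#, Dname, MGR).
SITE1 = {
    "EMP": [(1, "John", 20, 10), (2, "Paul", 30, 20), (3, "James", 40, 20),
            (4, "Jill", 50, 20), (5, "Mary", 60, 10), (6, "Jane", 70, 20)],
    "DEPT": [(10, "C. Sci.", "Jane"), (30, "English", "David"), (40, "French", "Peter")],
}
SITE2 = {
    "EMP": [(9, "Mathew", 70, 50), (7, "David", 80, 30), (8, "Peter", 90, 40)],
    "DEPT": [(50, "Math", "John"), (20, "Physics", "Paul")],
}


def global_relation(name):
    """Global view of a relation: the union of its fragments across sites."""
    return SITE1[name] + SITE2[name]


def employees_with_department():
    """Join global EMP with global DEPT on D#, regardless of where the rows live."""
    dept_names = {d_no: dname for d_no, dname, _mgr in global_relation("DEPT")}
    return {name: dept_names[d_no]
            for _ss, name, _salary, d_no in global_relation("EMP")}


result = employees_with_department()
print(result["Paul"])   # EMP row at Site 1, DEPT row at Site 2
print(result["David"])  # EMP row at Site 2, DEPT row at Site 1
```

A real distributed DBMS would avoid shipping whole fragments between sites; this sketch only shows why the user's view has to be assembled from more than one site.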


Interoperability of Heterogeneous Database Systems

Figure: Database System A (Relational), Database System B (Object-Oriented), and Database System C (Legacy) connected by a Network.

Transparent access to heterogeneous databases, for both users and application programs; query and transaction processing


Different Data Models

Figure: four nodes connected by a Network, each with its own database.

Node A: Relational Model

Node B: Network Model

Node C: Object-Oriented Model

Node D: Hierarchical Model

Developments: Tools for interoperability; commercial products

Challenges: Global data model


Federated Database Management

Figure: Database System A, Database System B, and Database System C, grouped into Federation F1 and Federation F2.

Cooperating database systems yet maintaining some degree of autonomy


Federated Data and Policy Management

Figure: Agency A, Agency B, and Agency C each hold Component Data/Policy and provide Export Data/Policy, which together form the Data/Policy for Federation.
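The split between component data and export data can be sketched as a projection: each agency keeps all of its component data locally and shares only what its export policy allows. The class, the attribute names, and the sample records below are all hypothetical; the slides do not specify how a policy is represented.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Agency:
    name: str
    component_data: List[Dict[str, object]]  # everything the agency holds locally
    export_policy: Set[str]                  # attributes it agrees to share

    def export_data(self) -> List[Dict[str, object]]:
        """Project each record onto the attributes the export policy allows."""
        return [
            {key: value for key, value in record.items() if key in self.export_policy}
            for record in self.component_data
        ]


def federation_view(agencies: List[Agency]) -> Dict[str, List[Dict[str, object]]]:
    """The federation sees only exported data, never the full component data."""
    return {agency.name: agency.export_data() for agency in agencies}


# Hypothetical records and policies for two agencies.
agency_a = Agency("Agency A",
                  [{"id": 1, "name": "John", "salary": 20}],
                  export_policy={"id", "name"})
agency_b = Agency("Agency B",
                  [{"id": 7, "name": "David", "salary": 80}],
                  export_policy={"id"})

view = federation_view([agency_a, agency_b])
print(view)
```

Each agency retains its autonomy because the policy is applied on its side, before any data reaches the federation.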


Outline of Part I: Information Management

Information Management Framework

Information Management Overview

Some Information Management Technologies

Knowledge Management


What is Information Management?

Information management essentially analyzes the data and makes sense out of the data

Several technologies have to work together for effective information management

- Data Warehousing: Extracting relevant data and putting this data into a repository for analysis

- Data Mining: Extracting previously unknown information from the data

- Multimedia: Managing different media including text, images, video and audio

- Web: Managing the databases and libraries on the web


Data Warehouse

Figure: three source systems feed the data warehouse.

Oracle DBMS for Employees

Sybase DBMS for Projects

Informix DBMS for Medical

Data Warehouse: Data correlating Employees with Medical Benefits and Projects

- Could be any DBMS; usually based on the relational data model

Users Query the Warehouse
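A minimal sketch of the idea, assuming the extracts from the three operational systems have already been pulled into Python lists: the rows are loaded into a single relational warehouse (SQLite here, as a stand-in for whichever DBMS hosts it) and then correlated with one query. All table names, column names, and sample rows are hypothetical.

```python
import sqlite3

# Hypothetical extracts from the three source systems on the slide.
employees = [(1, "John"), (2, "Paul")]           # from the employee DBMS
projects = [(1, "Project X"), (2, "Project Y")]  # (employee id, project)
medical = [(1, "Plan A"), (2, "Plan B")]         # (employee id, benefit plan)

# Load step: copy the extracts into one relational warehouse.
warehouse = sqlite3.connect(":memory:")
warehouse.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE project  (emp_id INTEGER, project TEXT);
CREATE TABLE medical  (emp_id INTEGER, plan TEXT);
""")
warehouse.executemany("INSERT INTO employee VALUES (?, ?)", employees)
warehouse.executemany("INSERT INTO project VALUES (?, ?)", projects)
warehouse.executemany("INSERT INTO medical VALUES (?, ?)", medical)
warehouse.commit()

# Users query the warehouse: correlate employees with benefits and projects.
rows = warehouse.execute("""
    SELECT e.name, m.plan, p.project
    FROM employee AS e
    JOIN medical AS m ON m.emp_id = e.emp_id
    JOIN project AS p ON p.emp_id = e.emp_id
    ORDER BY e.emp_id
""").fetchall()
print(rows)
```

The analysis query never touches the operational systems; it runs entirely against the warehouse copy.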


Data Mining

Also known as: Data Mining, Knowledge Mining, Knowledge Discovery in Databases, Data Archaeology, Data Dredging, Database Mining, Knowledge Extraction, Data Pattern Processing, Information Harvesting, Siftware

The process of discovering meaningful new correlations, patterns, and trends, often previously unknown, by sifting through large amounts of data using pattern recognition technologies and statistical and mathematical techniques (Thuraisingham 1998)
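As a toy instance of pattern discovery, the sketch below counts which pairs of parts tend to be supplied by the same supplier, using the SP relation from the relational example earlier in these slides. Counting frequent pairs is only one simple technique, picked here for illustration, and the support threshold of 2 is an arbitrary assumption.

```python
from collections import Counter
from itertools import combinations

# (supplier, part) pairs from relation SP; quantities are not needed here.
SP = [
    ("S1", "P1"), ("S1", "P2"), ("S1", "P3"), ("S1", "P4"), ("S1", "P5"), ("S1", "P6"),
    ("S2", "P1"), ("S2", "P2"), ("S3", "P2"),
    ("S4", "P2"), ("S4", "P4"), ("S4", "P5"),
]

# Group parts by supplier, so each supplier acts as one "transaction".
parts_by_supplier = {}
for supplier, part in SP:
    parts_by_supplier.setdefault(supplier, set()).add(part)

# Count how many suppliers supply each pair of parts together.
pair_counts = Counter()
for parts in parts_by_supplier.values():
    pair_counts.update(combinations(sorted(parts), 2))

MIN_SUPPORT = 2  # assumed threshold: the pair must co-occur for at least 2 suppliers
frequent_pairs = sorted(pair for pair, count in pair_counts.items()
                        if count >= MIN_SUPPORT)
print(frequent_pairs)
```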


Multimedia Information Management

Figure: Broadcast News Editor (BNE) and Broadcast News Navigator (BNN).

Stages: Video Source; Segregate Video Streams (Imagery, Audio, Closed Caption Text); Analyze and Store Video and Metadata; Multimedia Database Management System; Web-based Search/Browse by Program, Person, Location, ...

Analysis components: Scene Change Detection; Frame Classifier; Key Frame Selection; Speaker Change Detection; Silence Detection; Commercial Detection; Closed Caption Preprocess; Token Detection; Broadcast Detection; Correlation; Story Segmentation; Named Entity Tagging; Story GIST Theme


Image Processing Example: Change Detection

Trained a neural network to predict the “new” pixel from the “old” pixel

- Neural networks are good for multidimensional continuous data

- Multiple nets give a range of “expected values”

Identified pixels where the actual value is substantially outside the range of expected values

- Anomaly if three or more bands (of seven) are out of range

Identified groups of anomalous pixels
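The decision rule on this slide can be sketched without the neural networks themselves. In the code below, the predictions of several nets are taken as given, the per-band expected range is their minimum and maximum, and a pixel is flagged when at least three of its seven bands fall outside that range. The `margin` parameter is an assumed way of expressing "substantially outside", and all numeric values are made up for illustration.

```python
from typing import List, Sequence, Tuple

MIN_BANDS_OUT = 3  # from the slide: anomaly if three or more bands (of seven) are out of range


def expected_range(predictions: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Per-band (low, high) range spanned by the predictions of several nets."""
    return [(min(band), max(band)) for band in zip(*predictions)]


def is_anomalous(actual: Sequence[float],
                 ranges: Sequence[Tuple[float, float]],
                 margin: float = 0.0) -> bool:
    """Flag a pixel when enough bands lie outside the expected range by more than margin."""
    bands_out = sum(
        1 for value, (low, high) in zip(actual, ranges)
        if value < low - margin or value > high + margin
    )
    return bands_out >= MIN_BANDS_OUT


# Hypothetical predictions for one pixel from three nets, seven bands each.
predictions = [[10.0] * 7, [12.0] * 7, [11.0] * 7]
ranges = expected_range(predictions)

unchanged_pixel = [11, 11, 11, 11, 11, 30, 30]  # only two bands out of range
changed_pixel = [30, 30, 30, 11, 11, 11, 11]    # three bands out of range

print(is_anomalous(unchanged_pixel, ranges, margin=5.0))
print(is_anomalous(changed_pixel, ranges, margin=5.0))
```

Grouping the flagged pixels into regions, the last step on the slide, would follow as a separate pass over the image.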


Semantic Web

Some Challenges: Security and Privacy cut across all layers

Figure: Semantic Web layers. Layers: URI, UNICODE; XML, XML Schemas; RDF, Ontologies; Rules/Query; Logic, Proof and Trust; Other Services. Side labels: TRUST, PRIVACY.

Adapted from Tim Berners Lee’s description of the Semantic Web


Knowledge Management Components

Components of Knowledge Management: Components, Cycle and Technologies

Components: Strategies, Processes, Metrics

Cycle: Knowledge Creation, Sharing, Measurement and Improvement

Technologies: Expert systems, Collaboration, Training, Web