Concise Preservation by combining Managed Forgetting and Contextualized Remembering

Managed Forgetting (WP3 - ForgetIT 1st year review)

Page 1: Managed Forgetting (WP3 - ForgetIT 1st year review)

Concise Preservation by combining Managed Forgetting and Contextualized Remembering

Page 2: Managed Forgetting (WP3 - ForgetIT 1st year review)

Nattiya Kanhabua

L3S Research Center

WP 3 Presentation

Managed Forgetting

ForgetIT 1st Review Meeting, April 29-30, 2014

Kaiserslautern, Germany

Page 3: Managed Forgetting (WP3 - ForgetIT 1st year review)

WP3 Objectives

• Conceptual model for managed forgetting

  • Foundations of human-brain-inspired managed forgetting

• Development of managed forgetting methods

  • Information value assessment

  • Set of methods for Preserve-or-Forget

  • Policy-driven approach to managed forgetting (Y2)

Focus of Year 1

• Conceptual model for managed forgetting

• Design and implement the core managed forgetting process

• Exploratory research of information value assessment

ForgetIT Project GA600826, 1st Review Meeting, Kaiserslautern, April 2014

Objectives of WP3 and Year 1 Focus

Page 4: Managed Forgetting (WP3 - ForgetIT 1st year review)

Role in Preserve-or-Forget Architecture

Page 5: Managed Forgetting (WP3 - ForgetIT 1st year review)

Research questions and first ideas for complementing human memory (joint work with WP2, D3.1)

• Episodic memory: reconstruct lifetime memories and support reminiscence

• Working memory: better focus in current information use

Information value assessment (joint work with WP9, D3.2)

• Data model and a computation method based on Semantic Web technologies

• Integration into the PIMO semantic desktop and the Preserve-or-Forget middleware

Exploratory studies (D3.2)

• Analyzing collective memory of public events in Wikipedia

• Analyzing high-impact features for content retention in the Social Web

• Feature selection for efficiency and scalability

Achievements in Year 1

Page 6: Managed Forgetting (WP3 - ForgetIT 1st year review)

Goal: understand how to complement human memory processes

Focus on two types of memories:

• Episodic memory: support reminiscence of long-term autobiographical events

• Working memory: better focus in current information use, e.g. de-cluttering personal information spaces

Two information values: memory buoyancy and preservation value

Complementing Human Memory: Our First Ideas

Page 7: Managed Forgetting (WP3 - ForgetIT 1st year review)

Memory buoyancy

• Information objects sink down as their importance, usage, etc. decrease

Preservation value

• Used to decide which information object will be preserved or archived

Information Value Assessment

Memory buoyancy: short-/mid-term current interests, e.g. meeting or travel documents

Preservation value: long-term need for future use, e.g. important life events

Subjective metrics:

• usage logs (views, edits, modifications)

• time, e.g., aging or recency

• social context, external influences

Objective metrics:

• diversity, coverage, quality

Page 8: Managed Forgetting (WP3 - ForgetIT 1st year review)

Rapidly forget details -> “less redundancy”

Reconstruct from similar events, context

Rely on common patterns -> “false memory”

Our first ideas:

• Store the details that differ among similar event types, since these are forgotten in human memory

• Event-centric organization of digital items can play an important role

Forgetting in Episodic Memory

Page 9: Managed Forgetting (WP3 - ForgetIT 1st year review)

Memory bumps or peaks in the forgetting curve

The original memory is recalled or triggered by:

• A physical object (e.g. a printed photo)

• A digital memory system

• Different subsequent events

Our ideas:

• Propagate increased interest in an event to related events

• Consider common things, e.g., same entities, or similar event types

• Increase relevance level or use of memory buoyancy
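The propagation idea above can be sketched in a few lines. This is only an illustration under stated assumptions: relatedness is measured as the Jaccard overlap of entity sets, the boost is scaled linearly by that overlap, and the event names, entities, and values are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Share of entities that two events have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def propagate_interest(entities, buoyancy, triggered, boost):
    """Return a new buoyancy map after propagating `boost` from `triggered`.

    Assumption: a related event receives `boost` scaled by its entity
    overlap with the triggered event; events with no overlap are unchanged.
    """
    updated = dict(buoyancy)
    for event, event_entities in entities.items():
        if event == triggered:
            continue  # its own increase is assumed to be recorded already
        similarity = jaccard(entities[triggered], event_entities)
        updated[event] = buoyancy[event] + boost * similarity
    return updated


# Hypothetical events and entity sets, for illustration only.
entities = {
    "trip_2012": {"Edinburgh", "Alice"},
    "trip_2009": {"Edinburgh", "Bob"},
    "meeting_2013": {"Kaiserslautern"},
}
buoyancy = {"trip_2012": 0.9, "trip_2009": 0.2, "meeting_2013": 0.4}

updated = propagate_interest(entities, buoyancy, "trip_2012", boost=0.3)
```

Here only `trip_2009` shares an entity with the triggered event, so it is the only event whose value rises.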

Triggering of Memories

Page 10: Managed Forgetting (WP3 - ForgetIT 1st year review)

Analyzing Collective Memory in Wikipedia

Identify catalysts for reviving memories

Analyze re-visiting behaviors

• Page views of a large set of events

• Time series analysis

11 Wikipedia categories

• Number of triggering events

• Number of events possibly triggered

Page 11: Managed Forgetting (WP3 - ForgetIT 1st year review)

Temporal and spatial distributions

• Strong focus on more recent events

• Better coverage with increasing popularity

• Most frequent locations depending on event types

Temporal and Spatial Distributions

Page 12: Managed Forgetting (WP3 - ForgetIT 1st year review)

Our Approach and Results

Remembering score: a function of revisiting behavior (e.g., detecting co-peaks in page views)

Correlate remembering scores vs. time and location similarities

[Figure panel: Hurricane Sandy]

Findings:

• Hurricane Sandy triggers the 1991 Perfect Storm, which initially formed around the Canada area, and high-impact storms (the most destructive and costly ones)

• The 2011 Christchurch earthquake triggers recent events in the same region, such as the 2010 Canterbury earthquake
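A minimal sketch of a co-peak based remembering score follows. The slides do not define the score precisely, so everything here is an assumption: a peak is a day whose page views lie more than `z` standard deviations above the series mean, and the score is the fraction of peaks of the triggering event that coincide (within `window` days) with a peak of the past event. The function names and sample series are hypothetical.

```python
from statistics import mean, pstdev


def peak_days(views, z=1.5):
    """Indices of days whose views exceed the mean by more than z std devs."""
    mu, sigma = mean(views), pstdev(views)
    if sigma == 0:
        return set()  # a flat series has no peaks
    return {day for day, v in enumerate(views) if (v - mu) / sigma > z}


def co_peak_score(trigger_views, past_views, window=1, z=1.5):
    """Fraction of the trigger's peaks that have a nearby peak in the past event."""
    trigger_peaks = peak_days(trigger_views, z)
    past_peaks = peak_days(past_views, z)
    if not trigger_peaks:
        return 0.0
    hits = sum(
        1
        for day in trigger_peaks
        if any(abs(day - other) <= window for other in past_peaks)
    )
    return hits / len(trigger_peaks)


# Hypothetical daily page views: the past event peaks one day after the trigger.
trigger = [10, 10, 10, 10, 10, 100, 10, 10, 10, 10]
past = [5, 5, 5, 5, 5, 5, 50, 5, 5, 5]
score = co_peak_score(trigger, past)
```

Scores computed this way for many event pairs could then be correlated with the time and location similarity of the pairs, as described above.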

Page 14: Managed Forgetting (WP3 - ForgetIT 1st year review)

Memory Buoyancy: Simplified Computation

[Figure: Memory Buoyancy vs. Time and Access Logs vs. Time (marked t1, t2), with the label "Compute: MB(D, t)"]

Page 15: Managed Forgetting (WP3 - ForgetIT 1st year review)

Proposed MB assessment framework:

• Initialize the MB values of resources using a time-decay forgetting function:

$mb^{(t)}(r) = \mathit{DecayRate}^{|t - t'|}$

• Incrementally update MB using a random walk on the resource graph

[Figure: example resource graph with nodes e1 to e6. Node labels: Shortcut folder @ desktop, Folder @ computer, Photos @ iPhone, Edfringe photo (2011), Whiskey photo (2012), Whiskey Tour (2009), Photo @ ForgetIT Meeting (2013). Edge labels: contains, hasSamePlace, hasEntity. Annotations on the update equation: "Averaged value over two inlinked resources"; "Less propagation: account for two outlinks".]

Memory Buoyancy Assessment
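The two steps of the framework can be sketched as follows. This is an illustration, not the project's implementation. It assumes the decay takes the form DecayRate^|t - t'| with t' read as the time of last access, and it reduces the random walk to a single propagation step that follows the two annotations on the slide: a resource splits its value evenly over its outlinks, and a resource with several inlinks takes the average of what it receives. Node names and values are hypothetical.

```python
from collections import defaultdict


def decay_mb(decay_rate: float, t: float, t_last: float) -> float:
    """Initial memory buoyancy: DecayRate ** |t - t'| (t' = last access, assumed)."""
    return decay_rate ** abs(t - t_last)


def propagate_once(mb, edges):
    """One simplified propagation step over a directed resource graph.

    mb:    dict mapping resource -> current memory buoyancy
    edges: iterable of (source, target) links
    Resources without inlinks keep their current value.
    """
    out_degree = defaultdict(int)
    incoming = defaultdict(list)
    for source, target in edges:
        out_degree[source] += 1
        incoming[target].append(source)

    updated = dict(mb)
    for target, sources in incoming.items():
        # Each source passes on less when it has more outlinks.
        shares = [mb[s] / out_degree[s] for s in sources]
        # The target averages over its inlinked resources.
        updated[target] = sum(shares) / len(shares)
    return updated


# Hypothetical graph: e4 links to e6 and e1, e5 links to e6.
mb = {"e1": 0.5, "e4": 0.8, "e5": 0.4, "e6": 0.1}
edges = [("e4", "e6"), ("e4", "e1"), ("e5", "e6")]
mb_next = propagate_once(mb, edges)
```

In this toy graph `e6` has two inlinks, so its new value is the average of half of `e4`'s value (two outlinks) and all of `e5`'s value (one outlink).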

Page 16: Managed Forgetting (WP3 - ForgetIT 1st year review)

Social Web apps gain popularity

Personal Web archives

Study: Identifying memorable content

• 20 participants, 15 male and 5 female

• Participants rated 3,330 posts by their relevance for the future

Content Retention in Social Web Applications

Year in Review: photo from the Internet

Page 17: Managed Forgetting (WP3 - ForgetIT 1st year review)

Machine learning techniques

• Support vector machine, Bayesian network, and decision tree (J48)

80 features from categories:

• Content types + meta data

• Social interactions

• Temporal

• Privacy

• Graph

Correlation-based feature selection (CFS)

• Temporal: highest impact features

• Graph: low impact for memorable posts
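To illustrate how correlation-based feature selection ranks feature subsets, the sketch below computes the standard CFS merit heuristic, k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation of a subset of size k. Using absolute Pearson correlation as the correlation measure is an assumption made for simplicity, and the feature names and values are hypothetical, not taken from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def cfs_merit(features, labels, subset):
    """CFS merit of a feature subset: high class correlation, low redundancy."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(features[f], labels)) for f in subset) / k
    pairs = list(combinations(subset, 2))
    r_ff = (
        sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
        if pairs
        else 0.0
    )
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


# Hypothetical feature columns for four posts; labels mark memorable posts.
features = {
    "post_age": [1, 2, 3, 4],
    "post_age_copy": [1, 2, 3, 4],  # fully redundant with post_age
    "noise": [1, -1, -1, 1],        # uncorrelated with the labels
}
labels = [0, 0, 1, 1]

merit_single = cfs_merit(features, labels, ["post_age"])
merit_redundant = cfs_merit(features, labels, ["post_age", "post_age_copy"])
merit_noisy = cfs_merit(features, labels, ["post_age", "noise"])
```

Adding the redundant copy leaves the merit unchanged and adding the uncorrelated feature lowers it, which is why a search over subsets guided by this score tends to return a small set of informative features.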

Learning to Classify Memorable Content

Page 18: Managed Forgetting (WP3 - ForgetIT 1st year review)

Classification results:

• Baseline Features (CS): No. of likes, comments, and shares

• Baseline 69% (F-Measure)

• Top 9 features 79% (F-Measure)
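For reference, the F-measure used to report these results combines precision and recall into one number. The sketch below computes it from confusion counts; the counts in the usage line are invented for illustration and are not the study's data.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-measure from true positives, false positives, and false negatives.

    With beta = 1 this is the harmonic mean of precision and recall.
    """
    if tp == 0:
        return 0.0  # no correct positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Invented counts: 8 memorable posts found, 2 false alarms, 2 missed.
example = f_measure(tp=8, fp=2, fn=2)
```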

Classification Results

Page 19: Managed Forgetting (WP3 - ForgetIT 1st year review)

1. M. Georgescu, D. D. Pham, N. Kanhabua, S. Zerr, S. Siersdorfer and W. Nejdl, Temporal Summarization of Event-Related Updates in Wikipedia (demo), Proceedings of the 22nd International World Wide Web Conference (WWW'13), May, 2013.

2. M. Georgescu, N. Kanhabua, D. Krause, W. Nejdl and S. Siersdorfer, Extracting Event-Related Information from Article Updates in Wikipedia, Proceedings of the 35th European conference on Advances in Information Retrieval (ECIR'13), March, 2013.

3. N. Kanhabua and C. Niederée, Preservation and Forgetting: Friends or Foes?, In the First International Workshop on Archiving Community Memories (in conjunction with iPRES'2013), September, 2013.

4. N. Kanhabua, C. Niederée and W. Siberski, Towards Concise Preservation by Managed Forgetting: Research Issues and Case Study, Proceedings of the 10th International Conference on Preservation of Digital Objects (iPRES'2013), September, 2013.

5. K. D. Naini and I.S. Altingovde, Exploiting Result Diversification Methods for Feature Selection in Learning to Rank, Proceedings of the 36th European conference on Advances in Information Retrieval (ECIR'2014), April, 2014.

6. A. Ceroni and M. Fisichella, Towards an Entity-based Automatic Event Validation, Proceedings of the 36th European conference on Advances in Information Retrieval (ECIR'2014), April, 2014.

7. T. N. Nguyen and N. Kanhabua, Leveraging Dynamic Query Subtopics for Time-aware Search Result Diversification, Proceedings of the 36th European conference on Advances in Information Retrieval (ECIR'2014), April, 2014.

8. K. D. Naini, R. Kawase, N. Kanhabua and C. Niederée, Characterizing High-impact Features for Content Retention in Social Web Applications (poster), Proceedings of the 23rd International World Wide Web Conference (WWW'2014), Seoul, Korea, April, 2014.

9. T. A. Tran, M. Georgescu, X. Zhu and N. Kanhabua, Ars longa, vita brevis: Analysing the Duration of Trending Topics in Twitter Using Wikipedia (poster), (To appear) Proceedings of the ACM Web Science 2014 Conference (WebSci'2014), Bloomington, USA, June, 2014.

Publications

Page 20: Managed Forgetting (WP3 - ForgetIT 1st year review)

Thank you for your attention!