Natural Computing: The Grand Challenges and Two Case Studies Leandro Nunes de Castro [email protected] @lndecastro Computing and Informatics Faculty & Graduate Program in Electrical Engineering Natural Computing Laboratory (LCoN) www.mackenzie.br/lcon.html

2012: Natural Computing - The Grand Challenges and Two Case Studies


DESCRIPTION

Talk presented at BRACIS 2012. A discussion about the Grand Challenges in Natural Computing Research and two real-world applications, one in Social Media Mining and another in E-Commerce.


Page 1: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Natural Computing: The Grand Challenges and

Two Case Studies

Leandro Nunes de Castro [email protected]

@lndecastro

Computing and Informatics Faculty &

Graduate Program in Electrical Engineering

Natural Computing Laboratory (LCoN)

www.mackenzie.br/lcon.html


Page 2: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Summary

• Natural Computing
– An Overview
– The Grand Challenges in Natural Computing Research
• Case Studies
– Social Media Mining
– Mining Association Rules for Recommender Systems
• Discussion

Page 3: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Natural Computing

An Overview*


* de Castro, L. N. (2007), “Fundamentals of Natural Computing: An Overview”, Physics of Life Reviews, 4(1), pp. 1-36.

Page 4: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Computing: Yesterday, Today and Tomorrow*

• 1940s: Study of automatic computing;
• 1950s: Study of information processing;
• 1960s: Study of phenomena surrounding computers;
• 1970s: Study of what can be automated;
• 1980s: Study of computation;
• 2000s: Study of information processes, both natural and artificial.

* Denning, P. (2008), “Computing Field: Structure”, In B. Wah (Ed.), Wiley Encyclopedia of Computer Science and Engineering, Wiley Interscience.

Page 5: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Since the early days of computer science in the 1940s, researchers have been interested in drawing parallels with natural phenomena and in designing computational models and abstractions of them.

Page 6: 2012: Natural Computing - The Grand Challenges and Two Case Studies

The Grand Challenges (GCs)

The GCs aim to define research questions that are likely to remain important in the long term, identifying and characterizing potential grand research problems. These may allow the formulation of projects capable of producing major scientific advances, with practical applications for society and technology. The emphasis is on advancing science, a vision beyond specific projects, a clear and objective evaluation of success, and great ambition.

Page 7: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Natural Computing: The Old View

[Diagram labels: Theoretical Works; Empirical Works; Mathematical Models; Natural Computing; Bioinspiration; Computational Synthesis of Natural Phenomena; Computing with Natural Materials]

Page 8: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Natural Computing: The New Perspective

[Diagram labels: Natural Computing; Computer Modeling of Nature; Nature-Inspired Computing; Computer Synthesis of Natural Phenomena; Computing with New Materials]

Natural computing is a science concerned with the investigation and design of information processing in natural and computational systems.

Page 9: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Natural Computing

The Grand Challenges*


* de Castro, L. N.; Xavier, R. S.; Pasti, R.; Maia, R. D.; Szabo, A.; Ferrari, D. G. (2012), "The Grand Challenges in Natural Computing Research: The Quest for a New Science", Int. J. Nat. Comp. Res., 2(4), p. 16.

Page 10: 2012: Natural Computing - The Grand Challenges and Two Case Studies

[Figure: two diagrams relating Natural Computing to Biology, Physics, Chemistry and Computer Science, one labeled Multidisciplinarity and the other Interdisciplinarity.]

Page 11: 2012: Natural Computing - The Grand Challenges and Two Case Studies

[Figure: Natural Computing in relation to Biology, Physics, Chemistry and Computer Science.]

GC 1: How to transpose Natural Computing into a transdisciplinary context?

Page 12: 2012: Natural Computing - The Grand Challenges and Two Case Studies

“Computer science differs from physics in that it is not actually a science. It does not study natural objects. Neither is it mathematics. It’s like engineering – about getting something to do something, rather than dealing with abstractions.”*

“Biology is today an information science.”**

* Feynman, R. P. (1996), “The Feynman Lectures on Computation”, In A. J. G. Hey and R. W. Allen (Ed.), (Reading, MA: Addison-Wesley).
** Denning, P. J., (2001) (Ed.), The Invisible Future: The Seamless Integration of Technology in Everyday Life, McGraw-Hill.

Page 13: 2012: Natural Computing - The Grand Challenges and Two Case Studies

GC 2: What is the role of Natural Computing in this Informational Natural Sciences Era?

Overcoming this challenge will bring two important benefits to Computing and Nature:
• A Rethinking (and probably Redesign) of Computing
• A New Form of Interacting With and Using Nature

Page 14: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Natural systems are open systems that communicate with their environment and exhibit complex, emergent behavior. Complex biological systems must be modeled as self-referential, self-organizing, and auto-generative systems whose computational behavior goes far beyond the Turing machine/von Neumann (TM/VN) paradigm. The system restructures itself through an inseparable hardware-software interaction: the hardware defines the software, and the software defines the hardware.

Page 15: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Are there standards to design (engineer) natural computing systems?*

GC 3: To what degree is defining standards for the engineering of Natural Computing systems a limiting factor for the creative development of the field?

* Brueckner, S. A.; Serugendo, G. D. M.; Karageorgos, A.; Nagpal, R. (2005), Engineering Self-Organizing Systems, Lecture Notes in Artificial Intelligence, 3464, Springer.
* de Castro, L. N. (2001), Immune Engineering: Development and Application of Computational Tools Inspired by Artificial Immune Systems, Ph.D. Thesis presented at the Computer and Electrical Engineering School, Unicamp, Brazil.
* Fernandez-Marquez, J. L.; Serugendo, G. D. M.; Montagna, S.; Viroli, M.; Arcos, J. L. (2012), “Description and Composition of Bio-Inspired Design Patterns: A Complete Overview”, Natural Computing, Online, DOI 10.1007/s11047-012-9324-y.
* Nagpal, R.; Mamei, M. (2004), “Engineering Amorphous Computing Systems”, Multiagent Systems, Artificial Societies, and Simulated Organizations, 11, Part V, pp. 303-320.

Page 16: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Case Studies

Applied Research


Page 17: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Web Mining

Social Media


Page 18: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Data and Social Media

• 110 billion minutes spent in social networks (Nielsen, 2011)
• 13 years = 50 million people (Alé, 2012)
• 9 months = 100 million users (Alé, 2012)
• 250 million tweets/day (Datasift, 2012)

Page 19: 2012: Natural Computing - The Grand Challenges and Two Case Studies


Qualitative analysis of tweets.

Methodology based on text mining, natural language processing and ontologies for Sentiment Analysis (SA).

Word Sense Disambiguation (WSD).

Research Focus

Social Media Analysis Tool

Text Mining; NLP; Web Semantics

Context Twitter

Page 20: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Twitter Features

Social media and microblog.
Messages (tweets) with up to 140 characters.
Stimulates simultaneous activities.
Informal; allows the creation of new terms, slang, mixed languages, and irony.

Page 21: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Unstructured Data Analysis

Text Mining: semi- or unstructured data
Data Mining: structured data

• Tokens • Stopword removal • Stemming • Representation • Term (feature) selection
• Association • Classification • Clustering
• APIs • Crawlers
• Confusion Matrix • Accuracy • Precision • Recall • F-measure
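The preprocessing steps named on this slide (tokens, stopword removal, stemming) can be sketched as follows. This is only a minimal illustration, not the tool used in the talk: the stopword list, the token pattern, and the suffix-stripping helper `crude_stem` are hypothetical stand-ins for a real stopword list and a real stemmer.

```python
import re

# Tiny illustrative stopword list (assumption; real lists are much longer).
STOPWORDS = {"the", "a", "is", "of", "and", "to", "in"}

def crude_stem(token):
    """Strip a few common English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Tokenize, drop stopwords, then stem the remaining tokens."""
    tokens = re.findall(r"[a-z0-9#@']+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOPWORDS]

print(preprocess("The show is trending in the ratings"))
```

The resulting token lists are what a representation step (such as the vector space model) would take as input.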

Page 22: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Text Analysis: Vector Space Model

       t1     t2     ...    tc
d1     w11    w12    ...    w1c
d2     w21    w22    ...    w2c
...    ...    ...    ...    ...
dN     wN1    wN2    ...    wNc
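A small sketch of how such a document-term matrix could be built, with rows d1..dN as documents and columns t1..tc as terms. The slide does not say how the weights w_ij are computed; raw term frequency is used here purely as an assumption, and `document_term_matrix` is a hypothetical helper name.

```python
from collections import Counter

def document_term_matrix(docs):
    """Return (vocabulary, matrix) where matrix[i][j] is the count of term j in document i.

    `docs` is a list of token lists. Raw term frequency is an assumed weighting.
    """
    vocab = sorted({term for doc in docs for term in doc})
    matrix = []
    for doc in docs:
        counts = Counter(doc)
        matrix.append([counts[term] for term in vocab])
    return vocab, matrix

docs = [["good", "show"], ["bad", "show", "show"]]
vocab, matrix = document_term_matrix(docs)
print(vocab)
print(matrix)
```

Other weightings, such as binary presence or TF-IDF, would fill the same N x c layout.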

Page 23: 2012: Natural Computing - The Grand Challenges and Two Case Studies

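[Figure: semantic graph (in Portuguese) for the polysemic word “Chaves”. Node labels, with English glosses:
– Objeto (object): Entrar (to enter), Trancar (to lock), Porta (door), Molho (bunch, as in a bunch of keys), Guardar (to keep), Abrir (to open)
– Pessoa (person): Presidente (president), Ditador (dictator), Hugo, Venezuela
– TV: SBT, Pessoa (person), Madruga, Kiko, Chiquinha, Bruxa do 71, Girafales]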

Page 24: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Research Focus

Sentiment Analysis:
Text classification based on the author’s opinion.

Word Sense Disambiguation (WSD):
Polysemic word: a word with different meanings in different contexts.
WSD assigns the appropriate meaning to each polysemic word in a text.
Words are classified according to a predefined set of meanings.

Page 25: 2012: Natural Computing - The Grand Challenges and Two Case Studies

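Interest Measures

                    Predicted Positive   Predicted Negative
Correct Positive    TP                   FN
Correct Negative    FP                   TN

$$\mathrm{TPR} = \frac{TP}{P} = \frac{TP}{TP + FN} \qquad \mathrm{FPR} = \frac{FP}{N} = \frac{FP}{FP + TN}$$

$$\mathrm{ACC} = \frac{TP + TN}{TP + FP + TN + FN}$$

$$\mathrm{Pr} = \frac{TP}{TP + FP} \qquad \mathrm{Re} = \frac{TP}{TP + FN}$$

$$\mathrm{Precision} = \frac{|\mathrm{Relevant} \cap \mathrm{Recovered}|}{|\mathrm{Recovered}|} \qquad \mathrm{Recall} = \frac{|\mathrm{Relevant} \cap \mathrm{Recovered}|}{|\mathrm{Relevant}|}$$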
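The measures on this slide follow directly from the four confusion-matrix counts. The sketch below is illustrative: the function name is hypothetical, the F-measure is assumed to be the usual harmonic mean of precision and recall, and the example counts are those of the first confusion matrix reported in the Social TV results later in the talk.

```python
def classification_measures(tp, fn, fp, tn):
    """Compute the measures from confusion-matrix counts (no zero-division guard)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # identical to the true positive rate
    return {
        "TPR": recall,
        "FPR": fp / (fp + tn),
        "ACC": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        # Assumed definition: harmonic mean of precision and recall.
        "F": 2 * precision * recall / (precision + recall),
    }

m = classification_measures(tp=2877, fn=0, fp=133, tn=164)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

A production version would need to handle empty classes, where some denominators are zero.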

Page 26: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Case Study: Social TV

Context-Based Word Sense Disambiguation (CBWSD):
Polysemic words: e.g. Chaves, Estrelas, Na Brasa, Agora é tarde.
Context (semantic graph): OntoGeneral; OntoSpecific.
Classification based on the semantic graph.

Sentiment analysis based on Emoticons, Ontologies and Natural Computing:
Need to train the classifier.
Emoticon: graphic representation of a facial expression.
Example: :) :( :| :D
Ontology: concepts and their relations within a domain.
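One plausible way emoticons can help with the need to train the classifier is to use them as noisy labels for training tweets. The sketch below illustrates that idea only; the talk does not describe its labeling procedure, and the mapping of each emoticon to a class (including `:|` as neutral) is an assumption.

```python
# Assumed emoticon-to-class mapping, built from the emoticons shown on the slide.
EMOTICON_CLASSES = {
    "positive": (":)", ":D"),
    "negative": (":(",),
    "neutral": (":|",),
}

def emoticon_label(tweet):
    """Return the class suggested by the tweet's emoticons, or None if absent or conflicting."""
    found = {
        label
        for label, emoticons in EMOTICON_CLASSES.items()
        if any(e in tweet for e in emoticons)
    }
    return found.pop() if len(found) == 1 else None

print(emoticon_label("Adorei o programa :)"))
print(emoticon_label("que chato :("))
```

Tweets with no emoticon, or with emoticons from more than one class, are left unlabeled rather than guessed.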

Page 27: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Case Study: Social TV

Materials and Methods: CBWSD
Tweets about “Agora é tarde”:
Total: 6030 tweets
Period: 6-7 July 2012 (24 hours).
Generation of the Semantic Graph.

Page 28: 2012: Natural Computing - The Grand Challenges and Two Case Studies


Partial Results Without the Neutral Class

                    Predicted Positive   Predicted Negative
Correct Positive    2877                 0
Correct Negative    133                  164

Measure   Result
ACC       0.9580
TPR       1
FPR       0.4478

Measure     Positive   Negative
Precision   0.9558     0.0544
Recall      1          0.5521
F-measure   0.9774     0.0991

Total: 142766 ms - Per tweet: 36 ms

Neutral as Positive

                    Predicted Positive   Predicted Negative
Correct Positive    5015                 33
Correct Negative    133                  164

Measure   Result
ACC       0.9689
TPR       0.9934
FPR       0.4478

Measure     Positive   Negative
Precision   0.9741     0.0318
Recall      0.9934     0.5521
F-measure   0.9837     0.0602

Total: 118310 ms - Per tweet: 30 ms

Page 29: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Mining Association Rules for Recommender Systems

Artificial Immune Systems


Page 30: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Association Rules

• Discovery of association relations between items (attributes) in transactional databases.

Milk Bread

Cereals Butter

Milk Biscuit

Cereals Chocolate

Bread Coffee

Eggs Sugar Bread Coffee

Yogurt Sweetener

Page 31: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Association Rules

• Given a set of transactions, where each transaction is a set of items, an association rule is a rule $X \Rightarrow Y$ in which X and Y are itemsets.
• Concepts:
– Coverage or support: the transactions for which the rule is correct, as a fraction of all transactions.
– Accuracy or confidence: the objects that the rule predicts correctly, as a proportion of the instances to which it applies.

$$\mathrm{support}(A \Rightarrow B) = P(A \cup B) = \frac{\text{Freq. of } A \text{ and } B}{\text{Total of } T}$$

$$\mathrm{confidence}(A \Rightarrow B) = P(B \mid A) = \frac{\text{Freq. of } A \text{ and } B}{\text{Freq. of } A}$$

Here $P(A \cup B)$ denotes the probability that a transaction contains every item of both A and B.
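The two definitions can be checked on a toy example. The transactions below are hypothetical (they are not the data used in the talk), and the function names are illustrative.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) as support(A and B) / support(A)."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)

# Hypothetical market-basket transactions.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"milk", "biscuit"},
    {"bread", "coffee"},
]

print(support(transactions, {"milk", "bread"}))        # 2 of 4 transactions
print(confidence(transactions, {"milk"}, {"bread"}))   # 2 of the 3 milk transactions
```

Mining association rules then amounts to keeping the rules whose support and confidence both reach user-chosen minimum values.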

Page 32: 2012: Natural Computing - The Grand Challenges and Two Case Studies

The problem of mining association rules corresponds to finding all the rules that satisfy a minimum support and a minimum confidence.

Page 33: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Evolutionary Design of ARs

• Approaches:
– Pittsburgh: each individual represents the whole set of rules.
– Michigan: each individual represents a single rule, and the whole population composes the set of rules.
• Encoding scheme:

A   B   C   D   E   F   G   H
11  00  01  10  00  11  10  00

00: antecedent; 11: consequent; 01 or 10: not part of the rule
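The encoding can be made concrete with a small decoder that reads two bits per attribute. This is a sketch of the scheme as described on the slide; the function name and the string representation of the chromosome are assumptions.

```python
def decode_rule(chromosome, attributes):
    """Decode a two-bits-per-attribute chromosome into (antecedent, consequent).

    "00" marks an antecedent attribute, "11" a consequent attribute,
    and "01" or "10" mean the attribute is not part of the rule.
    """
    antecedent, consequent = [], []
    for i, name in enumerate(attributes):
        gene = chromosome[2 * i: 2 * i + 2]
        if gene == "00":
            antecedent.append(name)
        elif gene == "11":
            consequent.append(name)
    return antecedent, consequent

# The example chromosome from the slide, with spaces removed.
antecedent, consequent = decode_rule("1100011000111000", "ABCDEFGH")
print(antecedent, "=>", consequent)   # ['B', 'E', 'H'] => ['A', 'F']
```

Under the Michigan approach each such chromosome is one individual, so the population as a whole is the rule set.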

Page 34: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Interest Measures and Operators

• Comprehensibility:

$$C_1(R) = \frac{\log(1 + |C|)}{\log(1 + |A \cup C|)}$$

$$C_2(R) = \log(1 + |C|) + \log(1 + |A \cup C|)$$

• Interestingness:

$$I(R) = \frac{|A \cup C|}{|A|} \cdot \frac{|A \cup C|}{|C|} \cdot \left(1 - \frac{|A \cup C|}{|D|}\right)$$

• Operators:
– Binary encoding allows the use of standard operators, such as single-point mutation and crossover.
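As a worked illustration of C1 and I for a rule with antecedent A and consequent C, the sketch below makes two interpretive assumptions that the slide does not state: in C1, |·| counts the attributes in the rule, while in I, |·| counts the transactions that contain the itemset and |D| is the number of transactions. The data and function names are hypothetical.

```python
import math

def comprehensibility_1(antecedent, consequent):
    """C1: log(1 + |C|) / log(1 + |A u C|), with |.| as attribute counts (assumption)."""
    return math.log(1 + len(consequent)) / math.log(1 + len(antecedent | consequent))

def interestingness(transactions, antecedent, consequent):
    """I: product of three factors, with |.| as transaction counts (assumption)."""
    def count(items):
        return sum(items <= t for t in transactions)
    both = count(antecedent | consequent)
    return (
        (both / count(antecedent))
        * (both / count(consequent))
        * (1 - both / len(transactions))
    )

# Hypothetical transactions and rule {milk} => {bread}.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"milk", "biscuit"},
    {"bread", "coffee"},
]
A, C = {"milk"}, {"bread"}
print(comprehensibility_1(A, C))            # log(2) / log(3)
print(interestingness(transactions, A, C))  # (2/3) * (2/3) * (1 - 2/4)
```

Both functions divide by counts that can be zero for rules that never occur, so a real fitness function would need to guard those cases.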

Page 35: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Algorithms Evaluated

Evolutionary:

procedure [P] = eGA(pc,pm,pe,D)
  initialize P
  t := 1;
  f := evaluate(P,D);
  P := select(P,f,pe);
  while not_stopping_criterion do,
    P := reproduce(P,f,pc);
    P := variate(P,pm);
    f := evaluate(P,D);
    P := select(P,f,pe);
    t := t + 1;
  end while
end procedure

Immune:

procedure [P] = CLONALG1-2(D,max_it,n1,n2)
  initialize P
  t := 1;
  while t <= max_it do,
    f := evaluate(P);
    P1 := select(P,n1,f)**;
    C := clone(P1,f);
    C1 := mutate(C,f);
    f1 := evaluate(C1);
    P1 := select(C1,n1,f1);
    P := replace(P,n2);
    t := t + 1;
  end while
end procedure
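To make the immune loop concrete, here is a runnable CLONALG-style sketch on bit strings. It is not the CLONALG1 or CLONALG2 variant evaluated in the talk: the cloning and mutation schedules, the default parameters, and the choice to keep parents alongside their mutated clones are all simplifying assumptions, and the toy fitness (number of ones) merely stands in for a rule-quality measure.

```python
import random

def clonalg(fitness, n_bits, pop_size=20, n1=5, n2=3, max_it=30, seed=0):
    """Maximize `fitness` over bit lists with a select/clone/mutate/replace loop."""
    rng = random.Random(seed)

    def random_cell():
        return [rng.randint(0, 1) for _ in range(n_bits)]

    P = [random_cell() for _ in range(pop_size)]
    history = []
    for _ in range(max_it):
        P.sort(key=fitness, reverse=True)       # evaluate and rank the population
        P1 = P[:n1]                             # select the n1 best cells
        C1 = []
        for rank, cell in enumerate(P1):
            n_clones = n1 - rank                # better cells receive more clones
            rate = (rank + 1) / n_bits          # better cells mutate less
            for _ in range(n_clones):
                C1.append([bit ^ (rng.random() < rate) for bit in cell])
        # Re-select the n1 best among parents and mutated clones (keeps the elite).
        P1 = sorted(P1 + C1, key=fitness, reverse=True)[:n1]
        # Replace the n2 worst cells with new random ones to maintain diversity.
        P = P1 + P[n1:pop_size - n2] + [random_cell() for _ in range(n2)]
        history.append(fitness(max(P, key=fitness)))
    return max(P, key=fitness), history

best, history = clonalg(sum, n_bits=16)
print(best, history[-1])
```

Because the selected parents are kept when the clones are re-selected, the best fitness in `history` never decreases from one iteration to the next.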

Page 36: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Case Study: Recommendation for a Synthetic Dataset

• SPECT Heart database from UCI.

Measure               Apriori         eGA             CLONALG1        CLONALG2
Support               0.35 ± 0.04     0.37 ± 0.03     0.46 ± 0.02     0.37 ± 0.02
Confidence            0.65 ± 0.16     0.86 ± 0.05     0.94 ± 0.01     0.92 ± 0.01
Comprehensibility 1   0.54 ± 0.06     0.50 ± 0.05     0.50 ± 0.01     0.46 ± 0.02
Comprehensibility 2   0.14 ± 0.03     0.14 ± 0.01     0.13 ± 0.00     0.14 ± 0.01
Interestingness       0.35 ± 0.08     0.35 ± 0.08     0.30 ± 0.00     0.26 ± 0.03
Unique Rule           17 ± 0.00       1.60 ± 0.60     1.50 ± 1.50     6.40 ± 2.30
Processing Time       6.5s ± 0.00     4.5s ± 1.01     9.3s ± 1.13     9.3s ± 1.16

Page 37: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Case Study: Recommendation for E-Commerce

Measure               Apriori     eGA                             CLONALG1                        CLONALG2
Support               0.024       0.009 ± 0.002 (0.006; 0.014)    0.013 ± 0.002 (0.011; 0.016)    0.012 ± 0.003 (0.007; 0.016)
Confidence            1.000       1.000 ± 0.000 (1.000; 1.000)    1.000 ± 0.000 (1.000; 1.000)    1.000 ± 0.000 (1.000; 1.000)
Comprehensibility 1   0.800       0.770 ± 0.028 (0.744; 0.826)    0.787 ± 0.021 (0.747; 0.822)    0.811 ± 0.022 (0.774; 0.843)
Comprehensibility 2   0.030       0.684 ± 0.001 (0.682; 0.685)    0.087 ± 0.030 (0.035; 0.136)    0.110 ± 0.024 (0.059; 0.139)
Interestingness       0.994       0.997 ± 0.000 (0.997; 0.997)    0.982 ± 0.018 (0.941; 0.997)    0.997 ± 0.000 (0.997; 0.997)
Processing Time       639.026 s   82.281 s                        112.636 s                       99.116 s

Page 38: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Discussion

Natural Computing: The Past, Present and Future


Page 39: 2012: Natural Computing - The Grand Challenges and Two Case Studies

The Past and Present

• Focus on:
– Designing novel nature-inspired algorithms.
– Synthesizing natural phenomena.
– Using natural materials for computing.
• Real-world applications are unquestionable, but the field seems to be stuck on the same types of algorithms.
• Researchers are making efforts to examine and formalize information processing in natural and computational systems.*

* Zenil, H. (2012) (Ed.), A Computable Universe: Understanding Computation & Exploring Nature as Computation, World Scientific.

Page 40: 2012: Natural Computing - The Grand Challenges and Two Case Studies

And the Future?

• Grand Challenges for the field:
– Transforming Natural Computing into a Transdisciplinary Discipline.
– Unveiling and Harnessing Information Processing in Natural Systems.
– Engineering Natural Computing Systems.

Page 41: 2012: Natural Computing - The Grand Challenges and Two Case Studies

Thank You! Questions? Comments?

Leandro Nunes de Castro

[email protected]

http://slideshare.net/lndecastro

@lndecastro

www.mackenzie.br/lcon.html

www.computacaonatural.com.br
