Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation

Niloy Ganguly <[email protected]>

Immune System and Search Technology

Designing a Fast Search Algorithm for P2P Networks Using Concepts from Immune Systems

Overview of the Presentation

● P2P Network

– Paradigm for Decentralised Computing

● Immune System Features

● Experimental Setup

● Simulation Results

Peer To Peer Network

● Most Direct Method of Connecting Computers

– Simple

– Inexpensive

– No Boss

– No Regulation

Peer To Peer Network

● PCs at the edge of the network are called “Peers”

● Peers can retrieve objects directly from each other

Advantages of a P2P Network

● A large collection of peers may be available for content distribution, sometimes millions

● Users take advantage of whatever resources the network currently has available

Peer To Peer Network

● Problem of Scale

– Emergence of Protocols

● Centralized Directory

– Napster

● Decentralized Directory

– KaZaA

● Query Flooding

– Gnutella

P2P: Centralized Directory (Napster)

● When a peer connects, it informs the central server of its:

– IP address

– content

● Alice queries the directory server for “Das Wunder von Bern”

● Alice requests the file from Bob

[Figure: peers, including Alice and Bob, connected to a centralized directory server; numbered arrows mark the steps]

While file transfer is decentralized, locating content is highly centralized
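The directory lookup described on this slide can be sketched as follows. This is a minimal illustration, not Napster's actual protocol: the class `DirectoryServer`, its method names, and the peer addresses are assumptions made for the example.

```python
from collections import defaultdict


class DirectoryServer:
    """Central index: maps each title to the peers that announced it."""

    def __init__(self):
        self.index = defaultdict(set)   # title -> set of peer addresses

    def register(self, peer_addr, titles):
        # A connecting peer reports its address and its content.
        for title in titles:
            self.index[title].add(peer_addr)

    def unregister(self, peer_addr):
        # Remove a departing peer from every entry it appears in.
        for peers in self.index.values():
            peers.discard(peer_addr)

    def query(self, title):
        # Look up which peers hold the requested title.
        return sorted(self.index.get(title, ()))


# Hypothetical peers; the file transfer itself would then happen peer to peer.
server = DirectoryServer()
server.register("bob:6699", ["Das Wunder von Bern"])
server.register("carol:6699", ["Das Wunder von Bern", "Lola rennt"])
holders = server.query("Das Wunder von Bern")
```

Every lookup goes through the single `server` object, which is why the directory is both fast and a single point of failure.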

P2P: Centralized Directory (Napster)

● Fast

● Single point of failure

– Application crash

● Performance bottleneck

● Huge database to maintain

● Copyright infringement

– Legal proceedings may result in the company having to shut down the directory server

P2P: Intermediate Arrangement (Kazaa)

Feature

● Has a centralized server that

– maintains user registrations,

– logs users into the system to keep statistics,

– provides downloads of client software.

● Two client types are supported:

– Supernodes (fast CPUs + high-bandwidth connections)

– Nodes (slower CPUs and/or connections)

● Supernode addresses are provided in the initial download. Supernodes also maintain searchable indexes and proxy search requests for users.

P2P: Totally Decentralized (Gnutella)

Basic Feature

● No hierarchy; peers have similar responsibilities: no group leader

● No peer maintains directory info

● Highly decentralized

Joining Algorithm

● Use a bootstrap node to learn about others

● Join message

P2P: Totally Decentralized (Gnutella)

Message Query

● Send query to neighbors

● Neighbors forward query

● If a queried peer has the object, it sends a message back to the querying peer

● The queried peer also forwards the query to its immediate neighbors

● The results are carried back to the user

● Message flooding occurs
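The query flooding described on this slide can be sketched as a breadth-first traversal. This is a simplified illustration, not the Gnutella wire protocol: the `ttl` hop limit, the function name `flood_query`, and the toy graph are assumptions, and for simplicity a peer also sends the query back to the neighbor it came from.

```python
from collections import deque


def flood_query(graph, holdings, source, item, ttl):
    """Breadth-first flood; returns (peers holding item, messages sent)."""
    seen = {source}
    hits, messages = [], 0
    frontier = deque([(source, ttl)])
    while frontier:
        node, hops_left = frontier.popleft()
        if node != source and item in holdings.get(node, ()):
            hits.append(node)              # this peer would answer the query
        if hops_left == 0:
            continue                       # hop budget used up: stop forwarding
        for neighbor in graph[node]:
            messages += 1                  # every forward costs one message
            if neighbor not in seen:       # duplicates are dropped on arrival
                seen.add(neighbor)
                frontier.append((neighbor, hops_left - 1))
    return hits, messages


# Hypothetical five-peer network; only E holds the requested object.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
holdings = {"E": {"song.mp3"}}
hits, messages = flood_query(graph, holdings, "A", "song.mp3", ttl=3)
```

Even in this tiny network the message count grows much faster than the number of peers reached, which is the traffic problem the later slides set out to reduce.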

P2P: Totally Decentralized (Gnutella)

Pros

● Totally decentralized query

● Robust: a query doesn't stop when one of the nodes breaks down

● Fresh results: no outdated index

Cons

● Query radius: the query radius can be long

● Excessive query traffic: 25% of the total traffic is query traffic

Courtesy: Limewire

P2P: Totally Decentralized (Gnutella)

Challenges Ahead

● Reduce query time

● Stop flooding; use an intelligent search method to avoid network congestion

[Figure: Topology of Gnutella Network]

Total traffic in the Gnutella network is 1.7 Gbps, or 1.7% of total traffic in the US Internet backbone

P2P: Totally Decentralized (Gnutella)

Perspective

● Introduce intelligence in the system through bio-inspired techniques

● Ants, Immune System

Artificial Immune System

● Relatively new branch of computer science

– Using natural immune system as a metaphor for solving computational problems

– Not modelling the immune system

● Variety of applications so far …

– Fault diagnosis (Ishida)

– Computer security (Forrest, Kim)

– Novelty detection (Dasgupta)

– Robot behaviour (Lee)

– Machine learning (Hunt, Timmis, de Castro)

● AIS are computational systems, inspired by theoretical immunology and observed immune functions, which are applied to complex problem domains (Timmis, 2001)

Why the Immune System?

● Recognition

– Anomaly detection

– Noise tolerance

● Robustness

● Feature extraction

● Diversity

● Reinforcement learning

● Memory

● Distributed

● Adaptive

Role of the Immune System

● Protect our bodies from infection

● Primary immune response

– Launch a response to invading pathogens

● Secondary immune response

– Remember past encounters

– Faster response the second time around

[Figure: immune response, steps (I) to (VII). Labels: antigen, APC, MHC protein, peptide, T-cell, activated T-cell, lymphokines, B-cell, activated B-cell (plasma cell)]

Role of the Immune System

● Remembers encounters

– No need to start from scratch

– Memory cells

[Figure: lymphoid organs, grouped into primary and secondary lymphoid organs. Labels: lymphatic vessels, lymph nodes, thymus, spleen, tonsils and adenoids, bone marrow, appendix, Peyer’s patches]

[Figure: B-cell receptors binding to the epitopes of an antigen]

Immune recognition is based on the complementarity between the binding region of the receptor and a portion of the antigen called the epitope.
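One common way to turn receptor-epitope complementarity into a number is to compare bit strings position by position. The sketch below assumes that representation; the bit-string encoding, the function names, and the 0.75 threshold are illustrative choices, not taken from the slides.

```python
def complementarity(receptor, epitope):
    """Fraction of positions where the two bit strings are complementary."""
    if len(receptor) != len(epitope):
        raise ValueError("receptor and epitope must have equal length")
    complementary = sum(r != e for r, e in zip(receptor, epitope))
    return complementary / len(receptor)


def recognizes(receptor, epitope, threshold=0.75):
    # Binding is declared when affinity reaches the (assumed) threshold.
    return complementarity(receptor, epitope) >= threshold


epitope = "1100"
perfect = complementarity("0011", epitope)   # every position complementary
partial = complementarity("0010", epitope)   # three of four positions
```

A threshold below 1.0 lets one receptor bind several slightly different epitopes, which is one way to model the noise tolerance listed earlier.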

Role of the Immune System

● Antibodies present a single type of receptor, whereas antigens might present several epitopes

● This means that different antibodies can recognize a single antigen

Clonal Selection (Burnet, 1978)

[Figure: clonal selection. Labels: foreign antigens, selection, proliferation (cloning), differentiation, plasma cells, memory cells (M), antibody, self-antigen, clonal deletion (negative selection)]

● Elimination of lymphocytes that react to self-antigens (clonal deletion)

● Proliferation and differentiation on contact of mature lymphocytes with antigen

● Restriction of one pattern to one differentiated cell and retention of that pattern by clonal descendants

● Generation of new random genetic changes, subsequently expressed as diverse antibody patterns by a form of accelerated somatic mutation
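The selection, proliferation, and mutation steps listed above can be sketched as a generic clonal-selection loop. This is not the presenter's algorithm: the bit-tuple encoding, the matching-bits affinity, and all parameter values are assumptions, and negative selection against self-antigens is left out.

```python
import random


def affinity(antibody, antigen):
    # Assumed measure: number of matching bits.
    return sum(a == g for a, g in zip(antibody, antigen))


def mutate(antibody, rate, rng):
    # Flip each bit independently with probability `rate`.
    return tuple(b ^ 1 if rng.random() < rate else b for b in antibody)


def clonal_selection(antigen, pop_size=20, n_select=5, n_clones=4,
                     generations=30, seed=0):
    rng = random.Random(seed)
    length = len(antigen)
    population = [tuple(rng.randint(0, 1) for _ in range(length))
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the antibodies that bind the antigen best.
        population.sort(key=lambda ab: affinity(ab, antigen), reverse=True)
        selected = population[:n_select]
        # Proliferation with somatic mutation: weaker cells mutate more.
        clones = []
        for ab in selected:
            rate = max(1.0 - affinity(ab, antigen) / length, 1.0 / length)
            clones += [mutate(ab, rate, rng) for _ in range(n_clones)]
        # Memory: parents survive, so the best affinity never decreases.
        pool = selected + clones
        pool.sort(key=lambda ab: affinity(ab, antigen), reverse=True)
        population = pool[:pop_size]
    return population[0]


antigen = (1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1)
best = clonal_selection(antigen)
```

Tying the mutation rate to affinity is a design choice in this sketch: poorly matching clones explore widely while good matches change little.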

General Framework for AIS

● Application Domain: P2P Network Search

● Representation: Search Item - Antigen

● Affinity Measures: Similarity (message, search item)

● Immune Algorithms: ImmuneSearch Algorithm

● Solution
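The slides name the affinity measure Similarity (message, search item) without defining it. As a stand-in, the sketch below uses Jaccard overlap between keyword sets; both the measure and the keyword-set representation are assumptions chosen for illustration.

```python
def similarity(message, search_item):
    """Jaccard overlap between two keyword collections, in [0, 1]."""
    a, b = set(message), set(search_item)
    if not a and not b:
        return 0.0                      # nothing to compare
    return len(a & b) / len(a | b)


# Hypothetical keyword sets for a message packet and a stored item.
message = {"immune", "system", "search"}
search_item = {"immune", "search", "p2p", "network"}
score = similarity(message, search_item)   # 2 shared of 5 distinct keywords
```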

Reiterating the Perspective

[Figure: the AIS framework applied to P2P network search: search item as antigen, Similarity (message, search item) as affinity measure, ImmuneSearch Algorithm as immune algorithm]

Design Search Algorithm

● Stop Flooding

● Reduce Query Time

Modelling the Network

User

● Information Profile – Immune System

● Search Profile – Football

Modelling the Network

● Zipf Law (Information and Search Profile)

[Figure: example network whose nodes carry profile values 0 to 3]
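Assigning profiles according to Zipf's law can be sketched as weighted sampling, where the topic of rank r is drawn with probability proportional to 1/r. The exponent of 1, the function names, and the one-topic-per-profile simplification are assumptions for this example.

```python
import random


def zipf_weights(n_topics, exponent=1.0):
    # Rank r (1-based) gets weight proportional to 1 / r**exponent.
    raw = [1.0 / rank ** exponent for rank in range(1, n_topics + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def assign_profiles(n_nodes, n_topics, seed=0):
    rng = random.Random(seed)
    weights = zipf_weights(n_topics)
    topics = range(n_topics)
    # Each node draws one topic it holds and one topic it searches for.
    return [
        {
            "information": rng.choices(topics, weights=weights)[0],
            "search": rng.choices(topics, weights=weights)[0],
        }
        for _ in range(n_nodes)
    ]


weights = zipf_weights(4)
profiles = assign_profiles(n_nodes=16, n_topics=4)
```

With four topics the most popular one is twice as likely as the second, so a few topics dominate both what nodes hold and what they search for.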

Search the Network – Flooding

Flooding means sending the message packet to all neighboring nodes

Search the Network – Random Walk

A message packet moves to a randomly chosen neighbor at each step
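A random-walk search can be sketched with a single packet that hops to a uniformly chosen neighbor until it finds the item or exhausts its step budget. The function name, the `max_steps` budget, and the toy graph are assumptions for illustration.

```python
import random


def random_walk_search(graph, holdings, source, item, max_steps, seed=0):
    """Walk one packet at random; return the step at which item is found."""
    rng = random.Random(seed)
    node = source
    for step in range(1, max_steps + 1):
        node = rng.choice(graph[node])     # hop to a uniformly random neighbor
        if item in holdings.get(node, ()):
            return step
    return None                            # not found within the step budget


# Hypothetical line network A - B - C; B holds the requested object.
graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B"],
}
holdings = {"B": {"song.mp3"}}
found_at = random_walk_search(graph, holdings, "A", "song.mp3", max_steps=50)
```

Each step costs exactly one message, so traffic stays low, but the packet may take a long time to reach a distant item.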

Search the Network – Immune Search

The algorithm consists of two parts:

1. The movement of message packets

2. Rearrangement of topology

[Figure: movement of message packets. Labels: proliferation, mutation, high concentration of packets, homing antibodies]
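The slides name proliferation and mutation as ingredients of packet movement but give no rules. The sketch below is one possible reading, not the ImmuneSearch algorithm itself: packets random-walk, and a packet that matches a node's profile well spawns mutated copies there. The bit-tuple patterns, the threshold, the clone count, and the packet cap are all assumptions.

```python
import random


def match(packet, profile):
    # Assumed affinity: fraction of positions with equal bits.
    return sum(p == q for p, q in zip(packet, profile)) / len(packet)


def mutate(packet, rng):
    # Flip one randomly chosen bit of the packet pattern.
    i = rng.randrange(len(packet))
    return packet[:i] + (1 - packet[i],) + packet[i + 1:]


def immune_walk(graph, profiles, source, query, steps,
                threshold=0.75, max_clones=2, max_packets=64, seed=0):
    rng = random.Random(seed)
    packets = [(source, query)]            # (current node, packet pattern)
    hits = set()
    for _step in range(steps):
        moved = []
        for node, pattern in packets:
            node = rng.choice(graph[node])             # random-walk hop
            moved.append((node, pattern))
            score = match(pattern, profiles[node])
            if score >= threshold:
                hits.add(node)
                # Proliferation: a closer match spawns more mutated copies.
                for _copy in range(int(score * max_clones)):
                    moved.append((node, mutate(pattern, rng)))
        packets = moved[:max_packets]                  # cap total traffic
    return hits, len(packets)


# Hypothetical two-node network; B's profile equals the query pattern.
graph = {"A": ["B"], "B": ["A"]}
profiles = {"A": (0, 0, 0, 0), "B": (1, 1, 1, 1)}
hits, n_packets = immune_walk(graph, profiles, "A", (1, 1, 1, 1), steps=1)
```

Packets multiply only where matches occur, so the packet concentration rises near relevant nodes instead of everywhere, as it would with flooding.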

Search the Network – Immune Search

Part 2: Rearrangement of Topology

● Aim: cluster similar nodes (similar in information and search profile)

● Algorithm: move nodes similar to the user node closer to the user

Search the Network – Immune Search

Movement depends on:

1. The distance from the user node

2. Amount of matching

3. Age

[Figure: topology example labeled “No Movement”]
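The slides list three factors for node movement (distance from the user node, amount of matching, age) but not how they combine. The sketch below is purely illustrative: it assumes that better matches move more readily, that older nodes are more settled, and that nodes already adjacent to the user do not move. The formulas and `max_age` are assumptions, not the presenter's rule.

```python
def move_probability(distance, matching, age, max_age=10):
    """Assumed rule: p = matching * (1 - age / max_age), zero if adjacent."""
    if distance <= 1:
        return 0.0                       # already a neighbor: no movement
    youth = max(0.0, 1.0 - age / max_age)
    return matching * youth


def new_distance(distance, matching):
    # Assumed: stronger matches travel a larger share of the way to the user.
    return max(1, int(distance * (1.0 - matching)))


# A node 4 hops away, matching half of the user's profile, never moved before.
p_move = move_probability(distance=4, matching=0.5, age=0)
target = new_distance(distance=4, matching=0.5)
```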

Experimental Results

Experiment

● Run for 100 generations, without changing the participating nodes

● In each generation, 100 searches are made by randomly selected users

Efficiency

● Number of search items found in 50 time steps

[Figure: result plot; the only surviving axis labels are 100 and 100]
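The experimental protocol above can be sketched as a small harness. The `search` callable stands in for whichever search algorithm is being measured; its signature (user, time steps) and the function name `run_experiment` are assumptions made for this example.

```python
import random


def run_experiment(search, nodes, generations=100, searches=100,
                   time_steps=50, seed=0):
    """Return the number of search items found in each generation."""
    rng = random.Random(seed)
    history = []
    for _generation in range(generations):
        found = 0
        for _search in range(searches):
            user = rng.choice(nodes)             # randomly selected user
            found += search(user, time_steps)    # items found within budget
        history.append(found)
    return history


# Dummy search that always finds exactly one item, to show the bookkeeping.
history = run_experiment(lambda user, steps: 1, ["A", "B"],
                         generations=3, searches=4)
```

Plotting `history` against the generation index gives an efficiency curve of the kind the result slides show.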

Experimental Results (Clustering)

[Figure: clustering result plot; the only surviving axis labels are 100 and 100]

Experimental Results

Experiment: change 20% of the nodes

[Figure: result plot; the only surviving axis labels are 100 and 100]

Experimental Results

Experiment: change 5% of the nodes at each generation

[Figure: result plot; the only surviving axis labels are 100 and 100]
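Changing a fraction of the nodes, whether 20% once or 5% at every generation, can be sketched as replacing the profiles of randomly chosen nodes. The dictionary representation, the function name `churn`, and the `new_profile` factory are assumptions; the slides do not say how replacement nodes are generated.

```python
import random


def churn(profiles, fraction, new_profile, rng):
    """Replace the profiles of a random `fraction` of nodes, in place."""
    count = round(len(profiles) * fraction)
    replaced = rng.sample(sorted(profiles), count)
    for node in replaced:
        profiles[node] = new_profile(rng)    # a fresh node takes this slot
    return replaced


# 100 hypothetical nodes, all starting with profile 0; newcomers get profile 1.
rng = random.Random(0)
profiles = {node: 0 for node in range(100)}
replaced = churn(profiles, 0.05, lambda r: 1, rng)
```

Calling `churn` once per generation with `fraction=0.05` models the continuous-change experiment; a single call with `fraction=0.2` models the one-off change.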

Experimental Results

Amount of Change in Neighborhood

Future Work

● Simulate the results in a real network

● Take into account the important concept of network traffic

● Test the algorithm with more sophisticated information and search profiles

● Build a mathematical framework through which the simulation results can be analytically justified

Questions and Answers