123
Social-based Data Routing Strategies in Delay Tolerant Networks Dissertation for the award of the degree “Doctor of Philosophy” Ph.D. Division of Mathematics and Nature Science of the Georg-August-Universität Göttingen within the doctoral program in Computer Science (PCS) of the Georg-August University School of Science (GAUSS) submitted by Konglin Zhu from Shandong, China Göttingen, 2014

Social-based Data Routing Strategies in Delay Tolerant

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Social-based Data Routing Strategies in Delay Tolerant

Social-based Data Routing Strategies inDelay Tolerant Networks

Dissertationfor the award of the degree

“Doctor of Philosophy” Ph.D. Division of Mathematics and Nature Scienceof the Georg-August-Universität Göttingen

within the doctoral program in Computer Science (PCS)of the Georg-August University School of Science (GAUSS)

submitted byKonglin Zhu

from Shandong, ChinaGöttingen, 2014

Page 2: Social-based Data Routing Strategies in Delay Tolerant

Thesis Committee

Prof. Dr. Xiaoming Fu

(Institute of Computer Science/Computer Networks Group, University of

Göttingen)

Prof. Dr. Dieter Hogrefe

(Institute of Computer Science/Telematics Group, University of Goettingen)

Prof. Dr. Wenzhong Li

(Department of Computer Science/State Key Laboratory for Novel Software and

Technology, Nanjing University)

Members of the Examination Board

Reviewer: Prof. Dr. Xiaoming Fu

(Institute of Computer Science/Computer Networks Group, University of

Göttingen)

Second Reviewer: Prof. Dr. Mario Gerla

(Department of Computer Science/Network Research Lab, UCLA)

Further members of the Examination Board

Prof. Dr. Dieter Hogrefe

(Institute of Computer Science/Telematics Group, University of Göttingen)

Prof. Dr. Jens Grabowski

(Institute of Computer Science/Software Engineering for Distributed Systems,

University of Göttingen)

Prof. Dr. Carsten Damm

(Institute of Computer Science/Theory and Algorithmic Methods, University of

Göttingen)

Prof. Dr. Stephan Waack

(Institute of Computer Science/Theory and Algorithmic Methods, University of

Göttingen)

Date of the oral examiniation: 25.02.2014

Page 3: Social-based Data Routing Strategies in Delay Tolerant

Abstract

Delay Tolerant networks (DTNs) are intermittently connected mobile networks in

which the end-to-end paths do not exist. Data delivery in such networks relies

upon the contacts that use “store-carry-and-forward” paradigm to forward message

from one node to another. However, such intuitive methodology encounters low

message delivery ratio and high data transmission delay when applying to different

data routing strategies. The design of effective and efficient data routing strategies

based on limited knowledge of mobile nodes in DTNs is challenging.

In this dissertation, we explore several aspects of social information that can

be applied for data routing in DTNs. We discuss the problems of data routing in

DTNs and study the using of different social information andnetwork features to

facilitate data routing in DTNs. Specifically, we propose three different data routing

strategies relying on different types of social information obtained from mobile

nodes: (a) a location-based social routing strategy applying different aspects of

location-based social information; (b) an encounter-based social routing strategy

relying on several encounter-based social factors of mobile nodes in the network;

(c) a community-based routing strategy combining social and mobile factors as well

as community structure.

The proposed location-based social routing strategy is motivated by the fact

that location information can provide the geographic distance and the direction of

information propagation, which can guide the data to the destination effectively.

The proposed location-based social strategy considers both geographic distance and

user mobility pattern as factors and combines them into one utility function for data

forwarding.

We propose the encounter-based social routing strategy based on the fact that

users in DTNs are interactively connected by encountering events. The design

of encounter-based social strategy involves social centrality and social similarity.

Compared to location-based social strategy, the usage of encounter-based social

information is much less sensitive than location-based social information. By con-

voluting two social factors into utility function, the proposed algorithm can achieve

competitive performance with location-based routing strategy.

i

Page 4: Social-based Data Routing Strategies in Delay Tolerant

Abstract

The design of the community-based strategy is motivated by the observation

that the mobility of people concentrates on a local area and the communication

occurs in the form of communities. To apply such characteristics for elevating

data routing performance in DTNs, we propose a Social and Mobile Aware Rout-

ing sTrategy (SMART). It exploits a distributed community partitioning algorithm

to divide the DTN into communities regarding user locationsand interaction rou-

tines. For intra-community communication, a decayed routing metric convoluting

social similarity and social centrality is calculated, which is used to decide for-

warding node efficiently while avoiding the newly identifiedblind spot and dead

end problems. Meanwhile, to enable efficient inter-community communication, we

choose the fringe nodes which travel remotely as relays, andpropose the node-to-

community utilities for routing decision across communities.

The major contribution of the thesis is to compose comprehensive routing met-

ric to overcome the situation that is not addressed by using single routing metric,

and then identify and tackle the blind spot and dead end problem, which are severe

but not noticed in the existing studies. The proposed location-based strategy and

encounter-based strategy are to construct comprehensively routing metric in geo-

graphic and encountering perspectives, and the proposed SMART is to tackle the

blind spot and dead end problems.

Among all three strategies, the objective is to enhance the data delivery ratio,

reduce the average delay and meanwhile maintain the low costfor data delivery. We

present the simulation results regarding to the performance of the proposed routing

strategies with the state-of-the-art data routing strategies in DTNs. By comprehen-

sively consider multiple aspects of routing metrics, the proposed location-based and

encounter-based routing strategies outperforms the previous studies around 10% in

terms of different evaluation metrics. Through identifying and solve blind spot and

dead end problems, the proposed SMART resolves both of them and thus outper-

forms previous studies over 20%.

ii

Page 5: Social-based Data Routing Strategies in Delay Tolerant

Acknowledgements

This dissertation is the last journey of my Ph.D study in Computer Science at

Geoge-August-Universität Göttingen. I appreciate many people who have helped

me to go through this colorful and meaningful four years.

First of all, I would like to show my deepest gratitude to my supervisor Profes-

sor Xiaoming Fu. I appreciate him for his continuous supportand encouragement

throughout my research. I admire his open, rigorous and dedicated attitude towards

the scientific research and life. At every step of my thesis, he guided me with his

profound knowledge and insight. I sincerely thank him.

I also appreciate my co-supervisor Dr. Wenzhong Li. I am grateful for every

step that he has taught and guided me in the four years. I cannot accomplish my

Ph.D study without his help.

I would also like to show my thanks to Professor Mario Gerla, Professor Di-

eter Hogrefe, Professor Jens Grabowski, Professor CarstenDamm and Professor

Stephan Waack for being members of my committee. Thanks for their advice and

comments through forming my thesis work.

I wish to thank all the research staff in Computer Networks (NET) Group. I

really appreciate people who have studied or worked in NET group for their help

during my Ph.D study. We had wonderful discussions about academics and lives. I

learnt a lot from them.

Finally, I show my deep gratitude to my parents. Thanks for their support and

give me whatever they can. They always stand by me and encourage me. I appre-

ciate they provide me the life and let me live in a happily family. I love them and

would like to dedicate this thesis to them.

iii

Page 6: Social-based Data Routing Strategies in Delay Tolerant
Page 7: Social-based Data Routing Strategies in Delay Tolerant

Table of Contents

Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . i

Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii

Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v

List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix

List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.1 Delay Tolerant Network . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Data Routing Problem in DTNs: in Social Perspective . . . .. . . 2

1.3 Our Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.4 Thesis Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

2 Related Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2.1 Location-based DTN Routing . . . . . . . . . . . . . . . . . . . . 7

2.1.1 Geographic distance . . . . . . . . . . . . . . . . . . . . . 7

2.1.2 Mobility pattern . . . . . . . . . . . . . . . . . . . . . . . 9

2.2 Encounter-based DTN Routing . . . . . . . . . . . . . . . . . . . 10

2.2.1 Direct encountering . . . . . . . . . . . . . . . . . . . . . 10

2.2.2 Social information derived from encountering . . . . . .. 12

2.3 Community-based DTN Routing . . . . . . . . . . . . . . . . . . . 14

3 Conceptual Framework . . . . . . . . . . . . . . . . . . . . . . . . . . 19

3.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . 19

3.1.1 Motivation and challenges . . . . . . . . . . . . . . . . . . 19

3.1.2 Research statement . . . . . . . . . . . . . . . . . . . . . . 20

3.2 Network Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

3.2.1 Modeling social graph . . . . . . . . . . . . . . . . . . . . 22

v

Page 8: Social-based Data Routing Strategies in Delay Tolerant

Table of Contents

3.2.2 Data sets . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

3.3 Basic Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

4 Location-based Routing Strategy . . . . . . . . . . . . . . . . . . . . 27

4.1 Location-based Social Information . . . . . . . . . . . . . . . . .. 27

4.1.1 Location-based graph . . . . . . . . . . . . . . . . . . . . 27

4.1.2 Geographical distance . . . . . . . . . . . . . . . . . . . . 28

4.1.3 Similarity of mobility pattern . . . . . . . . . . . . . . . . 29

4.2 Strategy Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

4.3 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . 33

4.3.1 Experiment setup . . . . . . . . . . . . . . . . . . . . . . 33

4.3.2 Strategies in comparison . . . . . . . . . . . . . . . . . . . 33

4.3.3 Performance analysis . . . . . . . . . . . . . . . . . . . . 35

4.4 Summary of Contributions . . . . . . . . . . . . . . . . . . . . . . 38

5 Encounter-based Routing Strategy . . . . . . . . . . . . . . . . . . . 39

5.1 Encounter-based social information . . . . . . . . . . . . . . . .. 39

5.1.1 Encounter-based social graph . . . . . . . . . . . . . . . . 39

5.1.2 Social similarity . . . . . . . . . . . . . . . . . . . . . . . 40

5.1.3 Social centrality . . . . . . . . . . . . . . . . . . . . . . . 40

5.2 Strategy Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

5.3 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . 43

5.3.1 Experiment setup . . . . . . . . . . . . . . . . . . . . . . 43

5.3.2 Strategies in comparison . . . . . . . . . . . . . . . . . . . 44

5.3.3 Performance analysis . . . . . . . . . . . . . . . . . . . . 46

5.3.4 Compare with the Loc strategy . . . . . . . . . . . . . . . 50

5.4 Summary of Contributions . . . . . . . . . . . . . . . . . . . . . . 55

6 Community-based Routing Strategy . . . . . . . . . . . . . . . . . . . 57

6.1 Mobile and Social Characteristics of DTN . . . . . . . . . . . . .. 59

6.1.1 Distributed community partitioning . . . . . . . . . . . . . 61

6.1.2 Locality of user contacts . . . . . . . . . . . . . . . . . . . 64

6.2 Strategy Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

6.2.1 Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . 65

6.2.2 Intra-community communication . . . . . . . . . . . . . . 66

6.2.3 Inter-community communication . . . . . . . . . . . . . . 69

6.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

vi

Page 9: Social-based Data Routing Strategies in Delay Tolerant

6.3.1 Tackling blind spot and dead end problems . . . . . . . . . 71

6.3.2 Efficiency of inter-community communication . . . . . . .72

6.4 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . 74

6.4.1 Experiment setup . . . . . . . . . . . . . . . . . . . . . . 74

6.4.2 Impact of community numbers . . . . . . . . . . . . . . . 74

6.4.3 Impact of community partitioning algorithms . . . . . . .. 75

6.4.4 Performance analysis . . . . . . . . . . . . . . . . . . . . 77

6.5 Summary of Contributions . . . . . . . . . . . . . . . . . . . . . . 82

7 Discussion and Future Works . . . . . . . . . . . . . . . . . . . . . . 85

7.1 A Comparison of Three Strategies . . . . . . . . . . . . . . . . . . 85

7.2 Future Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

8 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

vii

Page 10: Social-based Data Routing Strategies in Delay Tolerant
Page 11: Social-based Data Routing Strategies in Delay Tolerant

List of Figures

1.1 Three snapshots a Delay Tolerant Network. A solid line suggests

the connectivity between two nodes. . . . . . . . . . . . . . . . . . 2

1.2 Location-based routing vs. Encounter-based routing. .. . . . . . . 4

4.1 An example of visiting locations . . . . . . . . . . . . . . . . . . . 30

4.2 Performance comparison of location-based strategies on MIT Real-

ity data trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

4.3 Performance comparison of location-based strategies on DieselNet

data trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

4.4 Performance comparison of location-based strategies on Cabspot-

ting data trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

4.5 Performance comparison of location-based strategies on synthetic

data trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

5.1 Cumulative distribution function of shortest path lengths . . . . . . 44

5.2 Performance of encounter-based social schemes on MIT Reality

data trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

5.3 Performance of encounter-based social schemes on DieselNet data

trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

5.4 Performance of encounter-based social schemes on Cabspotting data

trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

5.5 Performance of encounter-based social schemes on synthetic data

trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

5.6 Performance comparison of Soc and Loc strategies on MIT Reality

data trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

5.7 Performance comparison of Soc and Loc strategies on DieselNet

data trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

5.8 Performance comparison of Soc and Loc strategies on Cabspotting

data trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

ix

Page 12: Social-based Data Routing Strategies in Delay Tolerant

List of Figures

5.9 Performance comparison of Soc and Loc strategies on synthetic

data trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

5.10 Performance comparison of Soc and Loc strategies on synthetic

data traces as a function of node speed . . . . . . . . . . . . . . . . 53

5.11 Performance comparison of Soc and Loc on synthetic datatraces as

a function of network size . . . . . . . . . . . . . . . . . . . . . . . 54

6.1 Proportion of blinds spots and dead ends in utility-based strategies . 58

6.2 Two taxi trajectories in Cabspotting trace . . . . . . . . . . .. . . 60

6.3 The CDF and PDF of node movements (Cabspotting). . . . . . . .. 61

6.4 The number of encounters vs. distance (Cabspotting). . .. . . . . . 62

6.5 Local contact and remote contact . . . . . . . . . . . . . . . . . . . 64

6.6 Percentage of blind spot and dead end in SMART. . . . . . . . . .. 72

6.7 Delivery ratio of community-based strategies . . . . . . . .. . . . 73

6.8 The performance metrics as a function of community number and

time (MIT Reality). . . . . . . . . . . . . . . . . . . . . . . . . . . 75

6.9 The performance metrics as a function of community number and

time (DieselNet). . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

6.10 The performance metrics as a function of community number and

time (Cabspotting). . . . . . . . . . . . . . . . . . . . . . . . . . . 77

6.11 The performance of SMART under different community partition-

ing algorithms (MIT Reality). . . . . . . . . . . . . . . . . . . . . 78

6.12 The performance of SMART under different community partition-

ing algorithms (DieselNet). . . . . . . . . . . . . . . . . . . . . . 79

6.13 The performance of SMART under different community partition-

ing algorithms (Cabspotting). . . . . . . . . . . . . . . . . . . . . 80

6.14 The performance comparison of various strategies on MIT Reality

Mining trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

6.15 The performance comparison of various strategies on DieselNet trace 82

6.16 The performance comparison of various strategies on Cabspotting

trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

7.1 Performance comparison of three social-based strategies on MIT

Reality data trace . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

7.2 Performance comparison of three social-based strategies on Diesel-

Net data trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

x

Page 13: Social-based Data Routing Strategies in Delay Tolerant

7.3 Performance comparison of three social-based strategies on Cab-

spotting data trace . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

xi

Page 14: Social-based Data Routing Strategies in Delay Tolerant
Page 15: Social-based Data Routing Strategies in Delay Tolerant

List of Tables

2.1 Comparison of three categories routing strategies . . . .. . . . . . 17

3.1 Statistics of three real data traces . . . . . . . . . . . . . . . . .. . 23

3.2 Parameters of synthetic data traces . . . . . . . . . . . . . . . . .. 23

3.3 Summary of three proposed routing strategies . . . . . . . . .. . . 25

4.1 Table of visiting history . . . . . . . . . . . . . . . . . . . . . . . . 31

5.1 Table of encounter history . . . . . . . . . . . . . . . . . . . . . . 41

6.1 Routing utilities in DTNs . . . . . . . . . . . . . . . . . . . . . . . 57

6.2 Proportion of local and remote contacts . . . . . . . . . . . . . .. 65

6.3 Local contact table . . . . . . . . . . . . . . . . . . . . . . . . . . 66

6.4 Remote contact table . . . . . . . . . . . . . . . . . . . . . . . . . 66

xiii

Page 16: Social-based Data Routing Strategies in Delay Tolerant
Page 17: Social-based Data Routing Strategies in Delay Tolerant

Chapter 1

Introduction

A delay tolerant network (DTN) is a sparse dynamic wireless network where mobile

nodes work on ad hoc mode and forward data opportunisticallyupon contacts [27].

Since the DTN is sparse and nodes in the network are dynamic, the end-to-end path

rarely exists. The communication of nodes can only be conducted when they are

in the communication range of each other. When a node has a copy of message,

it will store the message and carry it until forwarding the message to a node in the

communication range which is more appropriate for the message delivery.

Since DTNs allow people to communicate without network infrastructure, they

are widely used in battlefield, wildlife tracking, and vehicular communication etc.

where setting up network infrastructure is hard and costly [56, 15, 7]. In recent

years, with the proliferation of social network applications and mobile devices, peo-

ple tend to share texts, photos and videos with others via mobile devices in DTNs

[86, 62, 55, 101].

1.1 Delay Tolerant Network

Delay Tolerant Networks (DTNs) are described as a kind of network where the

nodes in the network are mobile and the connections among nodes are changing

over time thus the communication between nodes is opportunistically occurs only

when they are in communication range. Due to network structure of DTNs, they

are characterized by large delays, frequent disruptions and lack of stationary paths

between nodes. Such network can be constructed by human beings [62, 34, 37, 66],

wildlife [43, 82], or even vehicles [99, 74, 81].

We use an example to illustrate the main characteristics of DTNs. Fig. 1.1

shows a sample delay tolerant network. It depicts the network topology snapshots

over three different time periodst1, t2 andt3 (t1 < t2 < t3). The movements of

nodes lead to the positions of them different from one snapshot to another. Node

mobility leads to several pairs of nodes moving into communication range (e.g.,

nodeA andB cannot communicate att1, but they run into communication range at

1

Page 18: Social-based Data Routing Strategies in Delay Tolerant

Chapter 1. Introduction

A

FC

D

E

B

A

F

C

D

E

B

A

F

C

D

E

B

t1 t2 t3

Figure 1.1: Three snapshots a Delay Tolerant Network. A solid line suggests theconnectivity between two nodes.

t2) or moving out of communication range (e.g., nodeC andD are in communica-

tion range att1 andt2, but they becomes unreachable att3). Therefore, the stable

end-to-end path does not exist between any couple of nodes. The communications

between a pair of nodes are often disrupted due to unstable connections. Besides,

if a node wants to send a message to another node, it may sufferfrom large delay.

This is because the data transmission between any pair of nodes needs that they

are in communication range. However, delay tolerant network does not guarantee

that two nodes are in communication range permanently. It may spend a long time

period for two nodes to move into communication range. Thus the communication

delay between two nodes is longer than wired networks like Internet. For instance,

if nodeA needs to send a message to nodeE in the sample DTN, it can only deliver

the data to nodeE at t3 when they are in communication range at this time period.

1.2 Data Routing Problem in DTNs: in Social

Perspective

Although the end-to-end path rarely exists in delay tolerant networks, the commu-

nication among nodes in such network is still desirable. Therefore, an effective and

efficient data routing strategy is needed to enable the communication in the inter-

mittent connected networks. Although there are numerous data routing schemes

designed for wireless network, they cannot be directly applied to DTNs.

In the well-connected wireless network, the data routing relies on end-to-end

path. Each node maintains routing table according to specific routing policy for

the selection of next data relay. According to a specific routing scheme, the entries

in the routing table can be maintained prior to the arrival ofdata. Also, since the

network is relatively stable, the routing entries are reliable and data routing in such

2

Page 19: Social-based Data Routing Strategies in Delay Tolerant

1.2. Data Routing Problem in DTNs: in Social Perspective

networks can achieve significantly high data delivery ratio. In contrast, the connec-

tion in DTN is transient. It is difficult to maintain a complete path during the data

forwarding procedure. Therefore, the probability of successful data delivery and

time used for data delivery are not guaranteed.

To achieve effective communication without setting up end-to-end communica-

tion paths, data transmission in DTNs employs the “store-carry-forward” manner,

where a node stores and carries data while moving, forwards the data to a relay

node on encountering, and propagates the data to further relays until the destination

is reached. The main concern of data routing strategies is todecide whether to for-

ward the data to the counterpart when two nodes encounter. Different schemes are

devised for the relay selection.

The most naive strategy such as Epidemic routing [89] is proposed to send data

epidemically as long as two nodes encounter until the destination is reached. Based

on such epidemic principle, many routing schemes [83, 59, 64, 14, 43, 3] using

limited copies of messages are developed. Such epidemic based routing strategies

suffer from extremely high network cost.

Since nodes in DTNs are mostly controlled by humans, such as mobile phones

and vehicles, there are plenty of social relationships and properties which may be

used to reveal the network characteristics and facilitate the data routing. For ex-

ample, people with similar social properties may spend longtime together, and be

willing to sharing information and resources with each other [96]. By exploring

social features in DTNs, the prediction of encountering opportunities of nodes will

be more effective. Therefore, many routing schemes are developed based on social

information. Generally speaking, there are two kinds of social information widely

used for data routing in DTNs: location-based social information and encounter-

based social information. The location-based social information refers to as the

geographical related data, including the geographical coordinates [87], the distance

between individuals [45] and etc., which represents the physical property of human

activity. The encounter-based social information is defined as the inferred human

relationship from encountering events. It can be the contacts of individuals [56],

social centrality [40], social similarity [24] and etc., which represents the logical

information of human interactions. Accordingly, routing strategies in DTNs can be

divided into two categories: the location-based routing strategies [18, 49, 45, 87]

and the encounter-based routing strategies [7, 40, 33, 24, 94, 83].

There are advantages and disadvantages of both kinds of strategies. On the

one hand, location-based strategies forward data to the nodes geographically closer

3

Page 20: Social-based Data Routing Strategies in Delay Tolerant

Chapter 1. Introduction

A CB

S D

GF

E

S

Location-based

routing

Encounter-based

routing

D

Figure 1.2: Location-based routing vs. Encounter-based routing.

to the destination, which tend to achieve geographical shortest routing path. An

example is shown in Fig. 1.2, where nodeS wants to send a message to nodeD.

The lower layer indicates the physical locations of the mobile devices. Based on

the measurement of geographical distance, it tends to choose S → A → B →

C → D as the shortest routing path. However, since encounter opportunity is not

taken into account, such routing path may not be efficient andthe delivery ratio is

not guaranteed. On the other hand, the encounter-based strategies forward data to

the nodes logically closer to the destination, which tend toachieve logical shortest

routing path. As in the example of Fig. 1.2, the upper layer indicates the encounter-

based social connections (in solid lines) of nodes. Based onthe measurement of

their connections,S tends to forward data via the pathS → E → F → G → D

(since the number of connections areS < E < F < G). Such forwarding strategy

seems to enhance the chance of data delivery, but due to unawareness of location, it

may also lead to a longer routing path and higher delay.

Besides, the social structure, such as community, is also important for data rout-

ing in DTNs. People from the same community may contact each other more fre-

quently. The community-based strategies forward data according community struc-

ture based on the fact that people tend to group into communities by their social

relationships. By dividing the network into multiple communities, the nodes within

a community have strong connections, while their links across communities are

weak ties. The community structure favors intra-communitycommunication where

nodes are closely connected, but also encounters the difficulty of inter-community

communication via weak links.

In this dissertation, we apply different types of social information and structure

to devise data routing strategies in DTNs. In each strategy,we exert different routing

4

Page 21: Social-based Data Routing Strategies in Delay Tolerant

1.3. Our Contributions

mechanisms to enhance data delivery ratio, reduce average delay and meanwhile

maintain low cost.

1.3 Our Contributions

In this thesis, we apply social information to enhance performance of data routing

efficiency in delay tolerant networks. We firstly propose a location-based routing

scheme which comprehensively combines multiple factors ingeographic perspec-

tive to elevate routing performance. Considering the privacy of location-based so-

cial information, we then devise an encounter-based routing strategy. Since the

encountering event is the fundamental information for dataforwarding in DTN, the

leverage of encounter information is less sensitive. The proposed encounter-based

routing scheme considers several encounter-based social factors to achieve compet-

itive performance. Finally, to solve several existing problems in current DTN data

routing schemes, we propose a new algorithm called SMART, which relies on the

social properties and community structure of DTNs to facilitate data routing in such

networks.

Contributions of this dissertation are summarized as follows:

• To confront the challenge that routing metrics relying on one aspect of node or

network features cannot fully adapt to diverse network situations, we devise a

comprehensive location-based data routing scheme in DTNs.We model DTN

using a location-based social graph and define geographic distance and simi-

larity of mobility pattern relying on the graph. The proposed location-based

social strategy utilizes the combination of similarity of mobility pattern and

geographic distance to enhance the data routing efficiency.The simulation re-

sults show that the proposed location-based social strategy outperforms other

location-based strategies by around 10%.

• Due to the fact that location-based social information is sensitive to users in

DTNs, while encounter-based social information is less sensitive, the encounter-

based social information is applied for data routing in DTN.We model DTN

using an encounter-based social graph and define social similarity and social

centrality based on the graph. We propose an encounter-based routing scheme

which comprehensively combines social similarity and social centrality by

convolution. The proposed strategy outperforms other encounter-based rout-

5

Page 22: Social-based Data Routing Strategies in Delay Tolerant

Chapter 1. Introduction

ing schemes by 15% and has competitive performance with location-based

routing strategy.

• We identify blind spot and dead end problems that exist in most of utility-

based routing schemes in DTNs. By measuring the blind spot and dead end

problems in several existing routing strategies, we discover that the propor-

tion of these two problems reaches about 20%.

• We design a distributed community detection algorithm based on node en-

counters and carry out a community-based routing strategy called SMART. It

divides a DTN into several communities, and exploits different principles for

data routing depending on whether the source and the destination are in the

same community. Routing utilities integrating different metrics with convo-

lution and decay function are explored to overcome the difficulties of intra-

and inter-community communications.

1.4 Thesis Structure

The remainder of the thesis is constructed as follows. We summarize the related

works in literature in Chapter 2. We classify data routing schemes into differ-

ent kinds of groups according to different aspects of socialinformation. Specif-

ically, the routing schemes are categorized into location-based routing strategies,

encounter-based routing strategies and community-based routing strategies. In Chap-

ter 3, we discuss the motivation and challenges of this thesis. We also depict the ba-

sic approach of this work. In Chapter 4, we devise a location-based routing strategy

to enhance the efficiency of data routing. In Chapter 5, we propose an encounter-

based routing scheme based on the fact that encounter-basedsocial information

is less sensitive compared with location-based social information. We study the

problems in existing data routing schemes, carry out a distributed community de-

tection method and propose a community-based routing scheme named as SMART

in Chapter 6 to elevate the performance of data routing in DTNs. In Chapter 7,

we conduct a comprehensive comparison of three proposed routing strategies and

discuss the future works. We conclude the dissertation in Chapter 8.

6

Page 23: Social-based Data Routing Strategies in Delay Tolerant

Chapter 2

Related Works

Delay tolerant networks have been proposed for more than onedecade [27]. Re-

searchers focus on the data routing, one primary issue in DTNs, and many stud-

ies have been carried out to handle the data delivery in the intermittent connected

environment. This thesis mainly focuses on the routing strategies relying on so-

cial information. Specifically, we divide social information into three categories:

location-based social information, encounter-based social information and social

community. Hence, the corresponding routing strategies are divided as location-

based routing strategies, encounter-based strategies andcommunity-based strate-

gies.

2.1 Location-based DTN Routing

Geographic information, as one aspect of social information, is well applied for

data delivery in DTNs. The derived location-based routing strategies in wireless

networks were widely studied in the past decade. They make forwarding decision

according to the geographic information. Specifically, thelocation-based routing

strategies are categorized based on the exerted geographicinformation: (1) ge-

ographical distance, and (2) mobility pattern. In the following, we present two

groups of location-based routing strategies respectively.

2.1.1 Geographic distance

As one of the earliest works on wireless routing strategies using geographic in-

formation, B. Karp proposed Greedy Perimeter Stateless Routing (GPSR) [45] for

wireless mobile ad hoc networks. It makes greedy forwardingusing the geographic

positions of a router’s neighbors in the network topology. Specifically, a node ob-

tains its neighbors’ positions by information exchange. Then it locally chooses the

optimal next hop with the neighbor geographically closest to the packet’s destina-

tion. Forwarding in this regime follows successively closer geographic hops until

7

Page 24: Social-based Data Routing Strategies in Delay Tolerant

Chapter 2. Related Works

the destination is reached. When a packet reaches a region where greedy forward-

ing is impossible (i.e. packets reaches the local maxima), the algorithm recovers

by routing around the perimeter of the region. It exploits the right-hand rule which

traverses the interior of a closed polygonal region in clockwise edge order to seek

for the next hop. However, the application of GPSR needs planar graph.

Besides, authors in [57] proposed Greedy Perimeter Coordinator Routing (GPCR)

by taking advantage of the fact that streets and junctions form a natural planar graph

to handle data routing. It contains a restricted greedy forwarding procedure and a

repair strategy. In the greedy mode, the data packets are forwarded to a node at a

junction. Then junction node forwards packets by choosing one neighbor which has

the shortest distance to destination. In the perimeter mode, it also applies right-hand

rule [28] when the greedy forwarding is impossible. GPCR assumes that there is

always a node at a junction. When the junction node is missing, it causes routing

loops and packet loss.

A work [93] named Mobility-Centric Data Dissemination Algorithm for Vehi-

cle Network (MDDV) applies similar idea for data forwarding. It exploits Global

Position System (GPS) embedded in each node and the bundles are forwarded to

neighboring nodes that are geographically closest to the region containing the des-

tination. There is no route recovery mechanism in this work.It does not consider

the situation that the geographic routing is not applicable.

GeOpps [50] takes the advantages of the suggested routes by navigation systems

to seek for the data relay that is likely to move closer to the destination node. It

calculates the shortest path between the destination node and the nearest point of

the path and estimate the time for arriving at the destination. During the routing

process, if a node with shorter estimated arrival time appears, the data packet will

be forwarded to the node. The process is conducted repeatedly until the destination

is reached. Obviously, GeOpps requires navigation information and the estimation

of arrival time needs global view of the network. Both of themare difficult to be

accomplished in DTNs.

The hybrid routing strategy GeoDTN+Nav [21] is proposed with three different

modes: a greedy mode, a perimeter mode, and a DTN mode. It switches between

non-DTN mode and DTN mode by evaluating the network connectivity based on

the number of hops a packet has traveled, neighbor’s delivery quality and neighbor’s

direction to the destination. To achieve this, it uses Virtual Navigation Interface

(VNI) to abstract information from underlying hardware andprovides necessary in-

formation for the strategy to determine its routing mode. Besides, VNI provides the

8

Page 25: Social-based Data Routing Strategies in Delay Tolerant

2.1. Location-based DTN Routing

option to protect nodes’ private information. Such hybrid routing strategy brings

the longer delay and higher packet loss issues when it conducts the switching oper-

ation.

Locus [87] proposes a location-based overlay for DTNs. It utilizes devices

nearby to keep the data in a specific area. To enable the data routing, it defines a

utility function based on the geographic distance from a specific location to a data’s

home location, and finds the node having the closest geographic distance with the

data item for data forwarding. Locus requires there always some nodes in the data

home location and multiple copies of data are needed.

In summary, the data routing strategies rely on geographical distance mainly

initiate their relay selection by greedy forwarding. When greedy forwarding is

failed, different repair strategies are applied to reduce the chance of packet loss.

2.1.2 Mobility pattern

Compared with geographic distance, mobility pattern is defined in a more sophis-

ticated manner. It may refer to many different characteristics of user movements,

such as the node trajectories, visiting histories and etc.

The Utility-based Distributed routing algorithm with Multi-copies (UDM) [52]

defines “home community” where the nodes passed by or stayed close to most

frequently. It selects the data relay as the node with the highest utility value to

the “home community”. Besides, it exploits binary transmission, which means that

when it finds a proper relay, the node sends half number of its packet copies to the

new node. The process continues until the destination is reached.

A similar work named MV routing was proposed in [18]. It also proposes to

forward data packet to a stationary location. The difference is that MV routing uses

the meeting frequencies and visits to the stationary location to construct the utility

function. Both methods need multiple copies of data packetsand the relays used

for data delivery are also difficult to be determined.

Besides, both MoVe [48] and VeRo [44] apply movement vectorsfor data rout-

ing schemes design. Specifically, MoVe exerts nodes velocity and direction to esti-

mate the shortest path to the destination. When two nodes encounter, they exchange

the trajectory and bundles decide whether to be forwarded bythe direction and dis-

tance between the candidate node and destination. Similarly, nodes in VeRo records

their position and angle changes, and exchange the data to a node that is moving

away from it. The limitation of both movement vector based strategies is that they

9

Page 26: Social-based Data Routing Strategies in Delay Tolerant

Chapter 2. Related Works

need to exchange the trajectory of nodes and the load balanceis difficult to achieve.

Furthermore, CAR [65] uses the probability that two nodes visit the same loca-

tion (colocation) combining the degree change of the node tocalculate the proba-

bility that a node can successfully deliver a message to a destination. However, the

colocation information cannot represent the mobility pattern of a node well. For

example, a node may visit a location with time duration one hour, while another

node only comes to the same location in10 seconds. Although they are co-located,

the probability for successful data delivery is not high.

To represent mobility pattern more accurately in DTNs, Mobyspace [49] calcu-

lates the similarity of mobility pattern by Euclidean distance of two nodes’ visit-

ing history and chooses the node with shorter distance (i.e., more similar mobility

pattern) with the destination node as the relay for data delivery. Specifically, it

considers a node’s visiting history as a vector. Each value in the vector represents

the percentage of time that the node stays at the location. The distance between

two nodes is computed by the Euclidean distance between two vectors. Although

Mobyspace can represent mobility pattern of nodes more specifically, it ignores the

temporal factors that lead to the change of mobility patterns in DTNs.

Overall, the routing strategies based on mobility pattern construct the utility

value according to statistical results of the mobility characteristics. It may be more

accurate than directly using geographic distance. However, due to its requirement

for detailed location information, the privacy concern of users is still a great con-

cern.

2.2 Encounter-based DTN Routing

Generally, encounter-based routing strategies make forwarding decision relying on

the encounters of nodes. In this thesis, we investigate the encounter information

in social perspective. They are also mainly divided into twotypes: (1) directly

encountering, and (2) social information derived from encountering.

2.2.1 Direct encountering

There are some strategies directly using encounter information for data routing.

For instance, Prophet [56], RAPID [7], MaxProp [15] and etc.were studied in past

years. They forward data items according to node contacts, and choose the node

with higher contact probability as the relay for data delivery.

10

Page 27: Social-based Data Routing Strategies in Delay Tolerant

2.2. Encounter-based DTN Routing

The Probabilistic ROuting Protocol using History of Encounters and Transitiv-

ity (Prophet) [56] applies the predictability for data delivery as the metric for relay

selection. Specifically, the predictability is a probabilistic metric that is calculated

by encounter patterns. Each node calculates such predictability for the specified

destination. There are three major characteristics of the predictabilityP . First, the

value ofP is iteratively determined by the previous value ofP , denoted byP(a,b)old

for nodea andb:

P(a,b) = P(a,b)old+ (1− P(a,b)old

) ∗ Pinit, (2.1)

wherePinit is an initialized constant in [0,1]. Second, the value ofP decreases if

there is no encounter for a certain time interval, which is specified as:

P(a,b)old= P(a,b)old

∗ γκ, (2.2)

whereγ ∈ [0, 1] is a constant andκ is the time interval that have been elapsed

from last update. Finally, the transitivity ofP is explained as, ifa meetsb with pre-

dictability valueP(a,b) andb meetc with predictability valueP(b,c), the predictability

value betweena andc will be:

P(a,c) = P(a,c)old∗ P(a,b) ∗ P(b,c) ∗ β. (2.3)

The scheme works as follows. When two nodes encounter, they exchange pre-

dictability values as well as encounter vectors to evaluatethe quality of the node. If

the predictability value of the counterpart is higher for a destination specified by a

piece of data, the data will be transferred to the encounter node.

Jain et al. [42] presented a routing metric named as Minimum Expected Delay

(MED) by assuming future contact periods are known. They modify the Dijkstra

algorithm to compute the path for DTN with minimum delay. However, such calcu-

lation can only adapt to certain types of DTNs. To address this limitation, they pro-

pose a new metric, named as Minimum Estimated Expected Delay(MEED), which

is calculated by past encounter history and then floods the metric value within the

whole network. It introduces too much control overhead.

Spyropoulos et al. proposed a series of multi-copy data delivery schemes, such

as Spray and Wait [83] and Spray and Focus [84]. Spray and Waitsimply spread

the messages to nodes it encounters and each data carrier waits until it meets des-

tination. Obviously, it has significant waste of data and also the there is no any

11

Page 28: Social-based Data Routing Strategies in Delay Tolerant

Chapter 2. Related Works

criteria for the selection of data relays. To address this issue, the Spray and Focus

is proposed to limit the data carriers. The spray phase is similar as Spray and Wait

and simply forwards the data to nodes encountered. In the focus phase, a utility

value is used to determine whether the node is a good relay fordata delivery. If its

utility value is larger than the data carrier, then the bundles will be forwarded.

The MaxProp [15] is proposed based on prioritizing both the schedule of packets

transmitted to other nodes and the schedule of packets that will be deleted from the

buffer. Specifically, the packets are transmitted to other nodes when node meetings

are addressed by ranking the packets. The packets will be deleted if the buffer is full

according to the packet ranking. The ranking mechanism is initialized by a certain

value. When two nodes meet, the ranking value will be increased by 1 and it will

be exchanged when nodes encounter. Afterwards, a cost for the possible path is

calculated, and the path with the lowest cost will be selected for the data delivery.

The Resource Allocation Protocol for Intentional DTN (RAPID) [7] is proposed

by taking the constraint resource into account. It calculates utility functions ac-

cording to different resource constraints. The bundles areforwarded to nodes with

higher utility.

In summary, the directly encounter based routing schemes enhance the perfor-

mance for data delivery by calculating encounter-based utilities. However it re-

quires exchanging encounter information of nodes in the network, which introduces

large amount of control overhead.

2.2.2 Social information derived from encountering

Another group of routing strategies rely on the social information derived from

encounter-based social graph. Although they do not directly use encounter infor-

mation, most of these works are based on the encounter-basedsocial graph.

SimBet [24] takes the linear combination of social similarity and social central-

ity as the forwarding utility to construct the data forwarding path. Instead of only

considering single social property, the SimBet scheme considers the utility func-

tion as the sum of social similarity and social centrality, which measures both the

social closeness with destination node and social positionof the node in the net-

work. In this work, the social similarity is represented by the number of common

friends. The social centrality is calculated by local betweenness. Two separated

12

Page 29: Social-based Data Routing Strategies in Delay Tolerant

2.2. Encounter-based DTN Routing

utility functions are formulated in the following:

SimUtiln(d) =Simn(d)

Simn(d) + Simm(d), (2.4)

BetUtiln =Betn

Betn +Betm. (2.5)

The overall utility is combined as:

SimBetUtiln(d) = αSimUtiln(d) + βBetUtiln, (2.6)

whereα andβ are two parameters defined by authors andα + β = 1. The scheme

chooses the node with higher combination utility value as the relay for data for-

warding. The similar idea that uses the concept of social centrality can also be

found in [32].

An et al. believe people with similar interest have more likelihood to access

the same content. Based on this assumption, they proposed a social relation aware

routing protocol [4]. It uses the similarity of users’ interest as the routing metric

and chooses the node with higher similarity of interest as the data relay to increases

the utilization of content replication in intermediate nodes.

Zhang et al. proposed a data diffusion strategy based on “homophily” [100].

The “homophily” phenomenon is explained as the trend that two nodes share com-

mon characteristics (i.e. interest). It utilizes the friendship and “homophily” to

diffuse data pieces. Specifically, it spreads most similar data items among friends

and most different data items to strangers. In this way, datacan be diffused in a

further wide area, thus achieve shorter data delivery delay.

Social greedy [41] proposed by Jahanbakhsh et al. makes the data forwarding

decisions by comparing the social distance between two nodes. The social distance

is calculated by the similarity of attributes (such as address, affiliation, school, city,

country, etc.) between two nodes. Two nodes with more commonattributes, they

are closer to each other, and more likely to be chosen as relays for data delivery.

Social feature-based algorithm [94] takes the multi-dimension social properties

and chooses the node with most similar social features as thedestination for data

forwarding. Specifically, it conducts hypercube by varioussocial properties and

uses the feature distance to measure the closeness between two nodes. The node

with the closest social features will be selected as the relay for data delivery.

Alternatively, PeopleRank [63] tries to ranks nodes in a social graph in a dis-

13

Page 30: Social-based Data Routing Strategies in Delay Tolerant

Chapter 2. Related Works

tributed manner. It measures the relative importance of a node in the network and

the message is forwarded to nodes with same or higher rankings.

Fabbri and Verdone proposed a sociability-based routing strategy in [26]. It

exerts the nodes with high degrees of sociability as data relays. The sociability

indicator is defined by counting the number of encounters with other nodes in the

network. The message will be forwarded to the node with higher sociability.

A social-aware and stateless routing (SANE) [60] is proposed by the observa-

tion that people with similar interest are more often to meeteach other. It uses

k-dimension vector to represent the interests of nodes and calculate the similarity

of interest by a cosine function. The cosine similarity calculates the interest similar-

ity between data and the node. Data will only be forwarded to the node if the cosine

similarity between them is larger than a threshold. Compared with state routing

strategies, SANE does not need to store additional information for the calculation

of cosine similarity.

Li and Shen proposed a duration utility-based social routing scheme named

SEDUM [53]. It exploits both contact frequency and durationin node mobility pat-

terns of social networks to define the duration utility. It increases routing throughput

and reduces routing delay by building an effective buffer scheme which maintains

the messages by their life time. Those messages with longer lifetime have higher

priority to be sent out from buffers. In this scheme, it discovers the minimum num-

ber copies of messages to achieve a desired routing delay by using an optimal tree

replication algorithm.

In summary, social routing based on different kinds of social properties derived

from encounter-based graph. It enhances the performance from social perspective.

However, the enhancement is still limited only based on these social properties. In

the last section of this chapter, we will review routing schemes relying on another

important structure feature in social networks: community-based routing strategies.

2.3 Community-based DTN Routing

Community as a very important social structure is applied toenhance the perfor-

mance of data routing in DTNs. Community-based strategies make data forward-

ing decision according to the community structure of the network. By dividing

the network into multiple communities, they use different routing strategies to han-

dle the intra-community and inter-community data deliverydue to the fact that the

connections within a community are rich while the connections between different

14

Page 31: Social-based Data Routing Strategies in Delay Tolerant

2.3. Community-based DTN Routing

communities are weak. There are several routing strategies[39, 40, 33, 51, 13, 1]

exploiting community structure for data routing in DTNs.

One of the earliest works named label routing using community structure for

DTN routing is proposed by Hui and Crowcroft [39]. The data routing mechanism

is built on Pocket Switched Networks (PSNs) [38], a type of DTN in which the

mobile devices are portable by human beings and two devices can communicate

when the carriers meet each other. The proposed routing strategy exploits the label

affiliated to people to select forwarding relay. The label isassigned according to the

community where a person belongs. The general idea of the label routing works as

follows. Each person in the network is assigned with a label based on community

structure. When people meet, they exchange the label information. For the selection

of the relay, it chooses the node with the same label as the destination node until

the destination is reached.

Later, they devised the Bubble Rap algorithm. Bubble Rap [40] considers the

data routing in PSN which consists of several communities and there are social

relationships among users. It uses k-clique percolation asthe basic community

detection method. There are two steps of routing in Bubble Rap. The first step is

to forward data to the destination community. It delivers data items from outside of

the destination’s community according to a node’s global social centrality. A node

with higher global social centrality will be selected as therelay for data forwarding.

Within the destination’s community, the forwarding utility is based on a node’s local

social centrality. The data item will be forwarded to a node with higher local social

centrality.

A work related to social-based data multicasting was proposed by Gao et al.

[33]. It presents multicasting path selection based on social centrality and social

community. In the case of single data multicasting, it measures the social centrality,

and chooses the node with higher centrality value as the successor for data forward-

ing. In the case of multiple data multicasting, it takes the community structure into

consideration. It finds the nodes with destination awareness and forwards the data

to the node with highest delivery probability within the community. It continues the

forwarding procedure by social properties to find the destination.

LocalCom [51] uses the degree sum of a node and its neighbors as the metric for

community detection. It considers that nodes with high degree sum should belong

to the same community. The intra-community routing takes the single hop source

routing to forward data. The packet will be directly forwarded along a proposed vir-

tual link. This scheme is based on the high similarity and short hop-count distance

15

Page 32: Social-based Data Routing Strategies in Delay Tolerant

Chapter 2. Related Works

within the community. For inter-community data routing, itdefines nodes can reach

other communities as bridges. Then the marking and pruning schemes are used:

static pre-pruning schemes and dynamic pruning. In case that the source and desti-

nation nodes of a packet reside in different communities, the source first forwards

the packet to the bridges of the current community by intra-community forwarding

mechanism. Each bridge is decided by the pre-pruning process and then further

forwards the packet based on the dynamic information. It needs multiple replicas

for the inter-community data forwarding.

A work taking the friendship community for information propagation was pro-

posed as Friendship-based routing (FBR) [13]. It clusters the nodes which can com-

municate with short delays as one community. FBR considers the friendship com-

munity of varied periods of time. For intra-community communication, it sprays

several copies of messages to a number of nodes in the community. For inter-

community communication, the data is forwarded only when the destination is in

the same periodical community as the relay, which uses the temporal direct connec-

tion between communities to tackle the relay selection issue.

Homing spread [95] is a zero-knowledge multi-copy routing algorithms. It de-

fines community homes, which are considered as the common locations visited by a

group of people with same interest. The messages are spread to community homes

at the first place. Then the copies of messages are spread to other homes and mobile

nodes. The destination fetches the message when it meets anymessage holder.

Community-aware opportunistic routing [97] uses similar community home

concept for single-copy routing algorithm design. It chooses the community home

by calculating the centrality of nodes. The node with the highest centrality is con-

sidered as the community home. The messages then are forwarded to those homes.

By maintaining an optimal set of relays, each home determines the best relay and

meanwhile computes the minimum excepted delivery delay. Afterwards, the home

nodes send the messages to the optimal selected relays untilthe destination home

is reached.

Abdelkader et al. proposed a routing protocol named as SGBR using social

grouping for DTNs [1]. It assumes that there is a global observer which can collect

the information from the entire network. SGBR uses social relations to build groups

and spreads message copies to those nodes with higher metricvalues to the message

carrier. By this manner, it reduces the need of collecting network wide information,

maximizes the delivery ratio and meanwhile minimizes the overhead.

In summary, community-based routing strategies try to improve data forwarding

16

Page 33: Social-based Data Routing Strategies in Delay Tolerant

2.3. Community-based DTN Routing

efficiency by community structure. However, most existing community partitioning

is complicated and static when applied to DTNs. Furthermore, data transmission

between communities is difficult task due to rare efficient routing schemes are pro-

posed for inter-community communication.

Overall, the major characteristics of three categories aresummarized in Table

2.1.

Table 2.1: Comparison of three categories routing strategies

Metrics Routing Strategies EncounterInforma-tion

LocationInforma-tion

Feature

Location-based:geographicdistance

GPSR [45], GPCR[57], MDDV [93],GeOpps [50],GeoDTN+Nav [21],Locus [87]

Yes Yes Forward data tonode with closerdistance to des-tination node orlocation

Location-based:mobilitypattern

UDM [52], MV [18],MoVe [48], VeRo [44],CAR [65], Mobyspace[49]

Yes Yes Forward data tonode with moresimilar mobil-ity pattern withdestination

Encounter-based: directencounter

Prophet [56], RAPID[7], MaxProp [15],MED [42], Spray andFocus [84]

Yes No Forward data tonode with higherencounter fre-quency or durationwith destination

Encounter-based: de-rived socialinformation

SimBet [24], socialrelation aware rout-ing protocol [4], SDM[33], Social greedy[41], PeopleRank [63],SANE [60], SEDUM[53]

Yes No Forward data tonode more so-cially similar withdestination

Communitystructure

LABEL [39], BubbleRap [40], MDM [33],LocalCom [51], FBR[13], Homing spread[95], Community-aware opportunisticrouting [97], SGBR[1]

Yes No Forward data ac-cording to commu-nity structure

17

Page 34: Social-based Data Routing Strategies in Delay Tolerant
Page 35: Social-based Data Routing Strategies in Delay Tolerant

Chapter 3

Conceptual Framework

In this chapter, we describe the motivation and challenges for data routing in de-

lay tolerant networks. We present the research statement ofthe thesis and give an

overview of the network model and basic approach that we use for routing strategies

designing.

3.1 Problem Statement

We discuss the motivation and challenges for data dissemination in DTNs and out-

line the research statement in this section.

3.1.1 Motivation and challenges

A key problem in DTN is data dissemination. The accomplishment of data dis-

semination requires effective data routing strategies that can address the following

challenges in DTN:

• dynamic network. Nodes in the network are mobile. The movements of nodes

are not controlled. Network topology changes from time to time. The con-

tinuous changing topology leads to arbitrary disconnections. Thus, the end-

to-end path is difficult to be maintained, which results in large delays and

unpredictable data dissemination paths.

• limited network information. Due to the fact of dynamic network and unstable

connections among nodes, they cannot obtain all network information from

DTN. It makes the traditional mobile ad hoc routing protocols (such as AODV

[76], DSDV [75] and etc.) cannot adapt to DTN directly. The limited network

information leads to the static routes not applicable for dynamic topologies.

Besides, the lack of updated and whole information of the network makes the

calculation of best paths for different destinations become challenging.

19

Page 36: Social-based Data Routing Strategies in Delay Tolerant

Chapter 3. Conceptual Framework

• uncertain connection duration and limited resources. Data dissemination in

DTNs also refers to the size of the data. Due to node movements, the con-

nection duration between two nodes is unknown and difficult to be predicted.

Long connection duration can help to transmit a large numberof messages

or messages with large sizes. Therefore, to enhance the capability of data

delivery, node needs to decide how much data will be delivered or which

piece of data needs to be delivered when it encounters another peer. In delay

tolerant networks, deciding the number of messages and the size of data for

transmission is also affected by the resource of nodes. Nodes in DTNs are

portable mobile devices (such as mobile phones), which normally have lim-

ited energy supply, storage, CPU and etc. that directly affect the efficiency of

data dissemination.

We use an example in workplace to show these challenges in DTNs. Consider

the DTN scenario that people with mobile devices working in the same company.

They move from one place to another, which makes the network become dynamic.

The connection between two nodes may keep connecting when they stay in the same

office while the connection is disrupted when they go to otherplaces, which makes

the end-to-end path be difficult to maintain. From the point view of each node, it

only has partial information about other peers. Due to the movements of nodes,

the changing connection status makes two nodes exist no constant route between

them. Any developed routing strategies need to rely on the encountering events.

Besides, due to the movement of nodes, the encountering duration is unpredictable.

The delivery of data is determined by the size of data and the technology applied

for data transmission, as well as the routing policy. Moreover, each mobile device

held by people has limited battery, storage and etc. When theenergy or the storage

is about to run out, people will consider which message should carry for the further

data transmission.

3.1.2 Research statement

In this dissemination, we investigate data routing strategies for data dissemination

in DTNs from social perspective. The previous proposed routing strategies are

proposed to address the above-mentioned challenges in DTNs. They are divided

into three main categories based on the social information they used for data rout-

ing, as explained in Chapter 2: location-based, encounter-based and community-

based. Location-based routing schemes make the forwardingdecision based on the

20

Page 37: Social-based Data Routing Strategies in Delay Tolerant

3.1. Problem Statement

location-based social information, such as geographic distance, colocation and etc.

Encounter-based routing schemes construct the social graph based on encounter

information and the derived social properties for data routing. Community-based

strategies exert community structure to facilitate data routing in DTNs.

The key issue of existing social-based routing schemes is lack of researches

that utilize comprehensive social information for data routing. Thus, the perfor-

mance needs further enhancement. Specifically, current researches construct the

utility metric based on one aspect of information. It may notadapt to different

situations of network topologies and dynamics. For example, the location-based

routing schemes relying on geographic distance (e.g., GPSR[45]) only consider

the distance between nodes temporally, which is not efficient when the update of

distance information is not frequent, while location-based routing strategies relying

on mobility pattern (e.g., Mobyspace [49]) take statistical mobility patterns as the

major concern. Due to lack of distance information, the delivery ratio cannot be

guaranteed. Similar situation can also be found in encounter-based social routing

schemes that only consider one aspect of encounter information thus cannot fully

represent the situation of the network. In this thesis, we propose two comprehensive

routing strategies from the geographical and encounteringsocial perspective.

Furthermore, the existing routing schemes rely on utility to make forwarding

decision. That is, for a certain destination, a node calculates a utility value based

on network structures or node features. When the node encounters another node

in DTN, it compares the utility value with the encountering node. If the encoun-

tering node has higher utility value, data will be forwardedto it. However, such

greedy forwarding schemes run into two issues: blind spot and dead end. Blind

spot results from the utility value of a node so close to utility values of its neighbors

that the node cannot find the next data relay. Dead end is because of the highest

local utility value that the data is stuck into the node untilit expires. By dividing

the network into multiple communities, the nodes within a community have strong

connections, while their links across communities are weakties [35]. The com-

munity structure favors intra-community communication where nodes are closely

connected. Although community structure is applied to reduce the chance of blind

spot and dead end, it brings new issue that the communicationamong communi-

ties becomes difficult. We propose a social and mobile aware routing strategy that

addresses both blind spot and dead end problem, and meanwhile achieves efficient

inter-community communication.

21

Page 38: Social-based Data Routing Strategies in Delay Tolerant

Chapter 3. Conceptual Framework

3.2 Network Model

In this section, we provide the basic network model for design of various social-

based routing schemes. We also describe the data sets we use throughout the thesis.

3.2.1 Modeling social graph

Delay tolerant networks can be described as a graph according to different charac-

teristics of nodes and network structure [61, 42, 19, 73, 5].In this dissertation, we

use social features to characterize the network graph. We model a DTN as a social

graphG = (V,E,W ) whereV is the set of mobile nodes in the network, the set of

social links is represented byE and the set of links’ weights is depicted byW . The

social links indicate the social relations between two nodes and the weight of a link

suggests the social strength.

Delay tolerant networks possess two basic elements: the encountering events

between nodes and the geographic information of each node. These two elements

describe the fundamental channel for communication as wellas the dynamics of

the network. In social perspective, people moves in the network leading to encoun-

ters. Both location information and encountering events are characterized as social

information. According to the location-based and encounter-based social informa-

tion, nodes in the network are grouped into different communities. This community

structure makes nodes in one community are highly social related while nodes in

different communities are less socially connected. In the rest of the thesis, we dis-

cuss the social graph in aspects of geographical locations and encountering events,

as well as the community structure in the network.

3.2.2 Data sets

We use two types of data sets for evaluating the proposed routing strategies: real

data traces and synthetic data traces.

Real data traces

We use the MIT Reality [25], DieselNet [16] and Cabspotting [77] three real data

traces to characterize delay tolerant networks. The MIT Reality data trace consists

of 97 users equipped with smart phones at MIT over the course of the 2004-2005

academic year. It records the information such as call logs,Bluetooth devices in

proximity, cell tower IDs, application usage and phone status. Over the whole

22

Page 39: Social-based Data Routing Strategies in Delay Tolerant

3.2. Network Model

experimental period, it covers more than30 thousand cellular towers. DieselNet

logs mobility traces of 34 buses in Amherst, covering an areaof more than350

km2. Each bus is equipped with a computer and a GPS. It records theGPS logs for

all the buses during the 20 days experiment time period from October to November

in 2007. Cabspotting is a mobility trace of taxi cabs in San Francisco. Each taxi

is outfitted with a GPS tracking device. It contains GPS coordinates of 536 taxis

collected over 30 days in San Francisco Bay Area, which covers over 2,000km2.

The statistics of the three mobility traces are summarized in Table 3.1. The three

traces cover a large diversity of mobility patterns and environment, from human

movements on campus (MIT Reality) to vehicles mobility in cities (DieselNet and

Cabspotting), with experimental periods from a few days to several months. All

three data sets present the human being activity, includinguser location information

and encountering events between nodes. We consider them as representatives of

delay tolerant networks.

Table 3.1: Statistics of three real data traces

Traces MIT Reality DieselNet Cabspotting

No. of devices 97 34 536No. of contacts 54,667 2,284 111,153Duration (days) 246 20 30

Contact rate 0.024 0.10 0.013Field size (km2) N/A 358 2,367

Synthetic data traces

To provide general assessing of routing strategies, we produce a group of synthetic

data sets to conduct the comprehensive comparisons.

Table 3.2: Parameters of synthetic data traces

No. of nodes 20 to 100

Node speed 0.5m/s to 2.5m/s

Duration 14 daysField size 48km2

We use SUMO simulator [9] to mimic nodes’ movements by generating ran-

dom trips during a period of two weeks. The experiment area ofthe synthetic trace

is chosen as MIT campus and its surroundings with a rectanglecovering 48km2

(6km ∗ 8km). The node speed (ns) (by walking) in one trace is constant and starts

23

Page 40: Social-based Data Routing Strategies in Delay Tolerant

Chapter 3. Conceptual Framework

from 0.5 m/s with a 0.5 m/s increment for each trace. Therefore, we generate 5 syn-

thetic data traces with different node speeds. Meanwhile, we generate 5 synthetic

data traces with different number of nodes (nn) ranging from20 to 100. For each

synthetic trace, we record the movements of each node in the experiment area and

extract the encounter-based and location-based social information. The parameters

of synthetic data traces are summarized as Table 3.2.

3.3 Basic Approach

In order to cope with intermittent connectivity, and use predicted and opportunistic

connectivity to serve for data routing in Delay Tolerant Networks, known as Bundle

Protocol [17], we proposed three social-based data routingstrategies. Specifically,

to enhance the performance of DTN routing, we propose two comprehensive rout-

ing strategies with one exploiting location-based social information and the other

one exerting encounter-based social information. Then we propose a third data

routing strategy using community structure to solve blind spot and dead end prob-

lems as mentioned in the above section, and meanwhile it achieves efficient inter-

community communication.

The location-based social routing strategy [103], named asLoc, combines two

metrics, similarity of mobility pattern and geographic distance to construct the com-

prehensive location-based data routing scheme. For the design of Loc, we assume

that each node needs to know the realtime position of its own,which means that

every node in the network is equipped with additional devices (such as Geographic

Positioning System (GPS)) to aware of its position. They exchange location in-

formation when they encounter. The efficient manner for location information ex-

change can be found in [36].

However, the utilization of location-based social information must be very care-

ful since it is much sensitive and private concern to users inthe network. Using

location information may violate user privacy. Malicious node can apply the col-

lected location information to realize the mobility patterns of others in the network,

which may be used for node tracking. Meanwhile, collecting location information

needs additional equipments, such as GPS. Compared with location-based social in-

formation, encounter-based social information is less sensitive and easy to obtain.

Therefore, we propose the encounter-based social routing strategy [103], called

Soc, consisting of two social properties, social centrality and social similarity as

basic factors for data routing.

24

Page 41: Social-based Data Routing Strategies in Delay Tolerant

3.3. Basic Approach

Additionally, blind spot and dead end problems lead to the message expired

before reaching the destination, which significantly decreases the delivery ratio of

routing strategies. To reduce the chance of blind spot and dead end and meanwhile

achieve efficient inter-community communication. We devise a community-based

social routing strategy named as SMART [102]. The proposed community-based

strategy works as follows. It first introduces a distributedcommunity partition-

ing method based on the observation that movements of DTN nodes are regular

and restricted in local areas where more encounters occur than that in remote ar-

eas. With distributed community partitioning, mobile nodes can flexibly adjust

their community IDs to assign with the group they most frequently encounter, and

the community structure is formed by exchanging only local information, which is

easy to be implemented in DTNs. For intra-community communications, the rout-

ing utility is calculated by integrating the convolution ofsocial similarity and social

centrality with a decay function. For inter-community communications, nodes fre-

quently traveling across communities are chosen as “fringenodes”, and the utilities

of communicating between fringe nodes and communities are measured for routing

decision, which enhances the delivery ratio effectively. Table 3.3

Table 3.3: Summary of three proposed routing strategies

Strategy Metric Location orEncounter

Blind Spot& DeadEnd

Remarks

Loc Geographicdistance +Mobilitypattern

Location No Achieves higher performancethan single location-based utilitymetric

Soc Socialcentrality+ Socialsimilarity

Encounter Partially Achieves better performancethan single encounter-basedutility metric and while less sen-sitive and easier for collectionthan location information

SMART Social in-formation +Community

Encounter Significantly Resolves blind spot and deadend problems by efficient intraand inter community communi-cation strategy

We use the following metrics to evaluate the performance of various data routing

strategies [33, 13].

• Delivery ratio: the ratio of the number of destinations that have received the

delivered data to the total number of destinations.

25

Page 42: Social-based Data Routing Strategies in Delay Tolerant

Chapter 3. Conceptual Framework

• Average delay: the average time delay used for destinations to receive the

data.

• Average cost: the average number of relays used for the messages success-

fully delivered to destinations.

A good routing strategy is supposed to have high delivery ratio, low average delay

and low average cost. The objective of our devised strategies is to enhance data

delivery ratio, reduce average delay and meanwhile maintain low cost.

26

Page 43: Social-based Data Routing Strategies in Delay Tolerant

Chapter 4

Location-based Routing Strategy

Location-based social information provides accurate position of nodes in the net-

work. If such data is used properly, it can be applied to enhance data routing per-

formance in DTN. In this chapter, we propose a comprehensiverouting metric that

combines several aspects of location-based features. The previous literatures con-

sider only one aspect of node characteristics or network features for constructing

routing metric, which cannot fully describe the network situation along the entire

period of routing process. For example, routing strategiesbased on geographic

distance cannot make accurate forwarding decision in the case that the change of

the network topology is frequent and unpredicted, or it cannot forward data farther

when nodes are with similar geographic distance with destinations. Similarly, rout-

ing strategies relying on mobility pattern also need to consider the situation that

the data forwarding choice when nodes with similar mobilitypattern. For instance,

two nodes often stay in the same location do not suggest that they meet each other

frequently. The sole mobility pattern metric cannot represent the routing criteria in

such situation.

To overcome above-mentioned problems in location-based routing, the pro-

posed routing scheme combines both geographic distance andmobility pattern to

comprehensively select data relay for further data delivery. The detailed design of

the routing scheme is represented in the following sections.

4.1 Location-based Social Information

We consider a network consisting of multiple mobile nodes that may travel in dif-

ferent locations. Each node is equipped with GPS device for awareness of geo-

graphical coordinates.

4.1.1 Location-based graph

We have presented the basic social graph model in Chapter 3. The nodes in the

graph represent the mobile nodes in DTNs. The link between two nodes suggests

27

Page 44: Social-based Data Routing Strategies in Delay Tolerant

Chapter 4. Location-based Routing Strategy

the social relationship of them and the weight on the link indicates the strength of

the relationship. The involving of location information enriches the graph in spatial

aspect. Specifically, the network area is with geographic coordinates and we convert

it to a grid with squares. Each square is considered as a location and is assigned with

a unique location ID. The network area is represented byL = l1, l2, ..., ln wherelimeans locationi.

To describe location information and mobility of nodes, we consider their move-

ments as discrete time-varying events. Each event suggeststhe location of a node

with time label. In particular, an event is described by fourelements: node ID,

location ID, start time and time duration. The start time is the time when the node

enters the location and the time duration is the time length that the node stays at

the location. Based on the event sequences, we can obtain thestatistical location

information such as the similarity of mobility pattern which is described as time

proportion that a node stays at a location, and temporal location information like

the distance between two nodes at a certain time. The statistical location informa-

tion suggests the user mobility pattern. For example, if a user stays at a location

for a large proportion of time, he will likely to visit the same location in the future.

The temporal location information, on the other hand, represents the instant user

behavior at a certain time. We describe two types of locationinformation in the

following.

4.1.2 Geographical distance

Thegeographical distance between two nodes measures their physical separation.

Several schemes [87, 45] take the distance as utility for data forwarding based on

the fact that the closer geographical distance two nodes have, the more likely they

will meet with each other. We calculate the distance of a pairof locations(x, y)

whereni andnj have visited as:

gxy = ||sx − sy||,

wheresx is the GPS coordinates of locationx, andgxy is the distance betweenx

andy. In line with the characteristics of utility value that is the larger the better, we

make a conversion as:

dxy =1

1 + gxy,

28

Page 45: Social-based Data Routing Strategies in Delay Tolerant

4.1. Location-based Social Information

The largergxy, the smallerdxy is. Each node visits several locations. To measure

the distance between two nodes, we use a distance matrix to represent it. Suppose

nodeni has visited a series of locationsLi = {li1, li2, · · · , lim}, and nodenj has

visitedLj = {lj1, lj2, · · · , ljn}. The distance matrix betweeni andj therefore is

written as:

Dij =

dli1,lj1 · · · dli1,ljn...

. . ....

dlim,lj1 · · · dlim,ljn

.

4.1.3 Similarity of mobility pattern

The similarity of mobility pattern measures the extent that different nodes visiting

the same places. It is calculated by the time proportion thattwo nodes spend in the

same locations [49]. The larger time proportion that nodes stay at the same places,

the more similar their mobility patterns are. We show the measurement of similarity

of mobility pattern as follows.

For each nodeni, it spends different proportions of time at various locations

over a defined time interval. Similar to [49], we use a vectormi to present its time

proportions of different locations:

mi = (c1i, c2i , ..., cni), with

n∑

k=1

cki = 1,

wherecki is the time proportion that nodeni at locationk. The product of time

proportion between nodeni andnj in a pair of locations(x, y) is:

sxi,yj = cxi∗ cyj .

It suggests that the probability that nodeni stays at locationx and nodenj stays at

locationy. Suppose nodeni has visited a series of locationsLi = {li1, li2, · · · , lim},

and nodenj has visitedLj = {lj1, lj2, · · · , ljn}, we use a matrix to denote the

product of time proportion betweenni andnj in different locations.

Sij =

sli1,lj1 · · · sli1,ljn...

. . ....

slim,lj1 · · · slim,ljn

.

The matrix depicts the probability that nodeni and nodenj stay at various locations.

29

Page 46: Social-based Data Routing Strategies in Delay Tolerant

Chapter 4. Location-based Routing Strategy

It reveals the similarity of their mobility patterns.

4.2 Strategy Design

Location-based data forwarding strategies are usually determined by the location

information among different nodes. In this section, we propose a comprehensive

location-based social routing strategy calledLoc in delay tolerant networks.

B

A

D

Figure 4.1: An example of visiting locations

The design of Loc is inspired by the observation that two nodes physically stay

close to each other and commonly visit same locations are more likely to meet each

other. Such situation occurs between colleagues, neighbors and etc. That is, two

nodes with similar mobility pattern and close in geographical locations (i.e. they

share many common visited places and their visited places are close) are more likely

to meet each other. For example, as shown in Fig. 4.1, supposenodeA andB are

possible relays to deliver data to nodeD. Three squares represent the visiting areas

of three nodes respectively. The circles inside of squares are several locations they

visit. Overall, the average distance between visiting locations of nodeA (circles

in the left square) and visiting locations of nodeD (circles in the middle square)

is longer than the distance between nodeB’s visiting locations (circles in the right

square) and nodeD’s visiting locations. Besides, nodeA shares one location with

nodeD as shown in grey circle, whereas nodeB shares 4 locations with nodeD as

shown in black circles. Based on the similarity of mobility pattern and geographical

distance, nodeB will most likely be selected for data delivery to nodeD. Accord-

ing to this observation, we model Loc scheme by incorporating the similarity of

mobility pattern and geographical distance to the destination node.

To measure the similarity of mobility pattern and the geographical distance,

30

Page 47: Social-based Data Routing Strategies in Delay Tolerant

4.2. Strategy Design

Table 4.1: Table of visiting history

Location Visiting time Duration Time propagationx t ∆t cx

each node needs to maintain a list of locations they have visited as shown in Table

4.1, which contains the time of visiting the location as wellas the time proportion

of staying at it. When two nodes encounter, they exchange thelist of locations

and calculate corresponding metrics. From the point view oflocation information,

on the one hand, the similarity of mobility pattern only measures the extent that

different nodes staying at the common places. It cannot reflect the spatial distance

of nodes. The geographical distance, on the other hand, onlyshows the temporal

value of distance, which cannot present the geographical closeness of two nodes.

Therefore, we propose Loc scheme by combining similarity ofmobility pattern

and geographical distance. To present this compound measurement, we take the

Hadamard product of the two matrices. The Hadamard product produces a matrix

Hij that each elementpq is the product ofpq elements inSij andDij . The operation

is depicted in the following:

Hij = Sij ◦Dij =

sli1,lj1dli1,lj1 · · · sli1,ljndli1,ljn...

. . ....

slim,lj1dlim,lj1 · · · slim,ljndlim,ljn

.

The matrix presents both similarity of mobility pattern andthe geographical dis-

tance of two nodes. We take the average of the sum of all elements in the matrix

as the geographical metric between nodeni andnj . The larger of the average, the

closer geographical relation two nodes have, and thereforethe more chance they

will encounter. When two nodes (i.e.ni andnj) encounter, for messages carried

by ni (nj), they decide whether to takenj (ni) as the next relay by comparing their

utilities to the destination, which is the average of the sumof elements inHid (Hjd).

If the average value ofni (nj) is smaller than that ofnj (ni), the message will be

forwarded fromni (nj) to nj (ni). The detail of Loc algorithm is illustrated in

Algorithm 1.

31

Page 48: Social-based Data Routing Strategies in Delay Tolerant

Chapter 4. Location-based Routing Strategy

Algorithm 1: Loc algorithminput : visiting location information and encounter events in DTNsoutput: utility and data routing decision

beginassume the initial time ist0 ;if ni encounter with nj at time t then

foreach data x in node i doupdateSidx;updateDidx;Hidx = Sidx ◦Didx ;updateSjdx;updateDjdx;Hjdx = Sjdx ◦Djdx ;if avg(Hidx(t)) < avg(Hjdx(t)) then

forward data tonj ;

foreach data y in node j doupdateSidy ;updateDidy ;Hidy = Sidy ◦Didy ;updateSjdy ;updateDjdy ;Hjdy = Sjdy ◦Djdy ;if avg(Hjdy(t)) < avg(Hidy(t)) then

forward data toni ;

32

Page 49: Social-based Data Routing Strategies in Delay Tolerant

4.3. Performance Evaluation

4.3 Performance Evaluation

We conduct experiments to study the performance of location-based social routing

strategies.

4.3.1 Experiment setup

We use HaggleSim simulator [39] to launch our experiments, which uses encounter

entries as inputs to estimate data delivery path according to different data routing

strategies. We extract 14-day session from three read traces and synthetic traces.

We utilize the GPS location information as the basic location-based social informa-

tion as the input to construct corresponding location-based social graph and related

compound location information. The simulator generates1, 000 messages for each

round of simulations. Each message is assigned with a randomsource and desti-

nation. For each message, we keep three copies in the networkto make the data

delivery with higher chance reach destinations. The message keeps alive until the

experimental session (14 days) is end. We do each simulation20 times and take the

average value of results for statistical convergence.

To measure the location information of nodes in the network,we divide the

experiment area of data traces into discrete locations. Thenetwork area is marked

with geographic coordinates. According to the coordinates, we calculate the size of

network area and convert it to a grid with multiple blocks. Each block in the grid

is defined as a location. It is distinguished by a unique ID. Specifically, we divide

the MIT Reality experiment area by cellular towers. Since MIT Reality data trace

does not contain the GPS coordinates of node mobility and a location is marked by

detected cellular tower ID, we consider the area of a cellular tower is a location. In

contrast, we convert the experiment area of the other two data sets to a grid with

adjacent squares. The size of each square is 1km2 and each of them is assigned with

a unique ID. If a node’s mobility range falls into a location,we record the visiting

history of the node. We measure the distance of two locationsby the coordinates of

their centers.

4.3.2 Strategies in comparison

We compare the Loc scheme with other two location-based datarouting strategies:

Mobyspace [49], and GPSR [45].

Mobyspace calculates the similarity of mobility pattern byEuclidean distance

33

Page 50: Social-based Data Routing Strategies in Delay Tolerant

Chapter 4. Location-based Routing Strategy

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

Time [day]

Del

iver

y ra

tio

MobyspaceLoc

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2

2.5x 105

Time [day]

Ave

rage

del

ay [s

]

MobyspaceLoc

(b) Average delay

0.5 1 2 4 6 8 10 12 140.9

1

1.1

1.2

1.3

1.4

1.5

Time [day]

Ave

rage

cos

t

LocMobyspace

(c) Average cost

Figure 4.2: Performance comparison of location-based strategies on MIT Realitydata trace

of two nodes’ visiting history and chooses the node with shorter distance (i.e., more

similar mobility pattern) with the destination node as the relay for data delivery.

Specifically, it considers a node’s visiting history as a vector. Each value in the

vector represents the percentage of time that the node staysat the location. The

distance between two nodes is computed by the Euclidean distance between two

vectors.

GPSR routes data based on geographical distance. It makes greedy forward-

ing using the geographic positions of a router’s neighbors in the network topology.

Specifically, a node obtains its neighbors’ positions by information exchange. Then

it locally chooses the optimal next hop with the neighbor geographically closest

to the packet’s destination. Forwarding in this regime follows successively closer

geographic hops until the destination is reached. When a packet reaches a region

where greedy forwarding is impossible (i.e. packets reaches the local maxima), the

34

Page 51: Social-based Data Routing Strategies in Delay Tolerant

4.3. Performance Evaluation

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

Time [day]

Del

iver

y ra

tio

MobySpaceGPSRLoc

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2

2.5

3x 105

Time [day]

Ave

rage

del

ay [s

]

MobySpaceGPSRLoc

(b) Average delay

0.5 1 2 4 6 8 10 12 140

0.2

0.4

0.6

0.8

1

1.2

1.4

Time [day]

Ave

rage

cos

t

MobySpaceGPSRLoc

(c) Average cost

Figure 4.3: Performance comparison of location-based strategies on DieselNet datatrace

algorithm recovers by routing around the perimeter of the region. It exploits the

right-hand rule which traverses the interior of a closed polygonal region in clock-

wise edge order to seek for the next hop.

4.3.3 Performance analysis

To implement the Loc scheme, we set the period for GPS coordinates refreshment

as one day. We use the same settings as the original paper for the implementation

of Mobyspace and GPSR.

Fig. 4.2 shows the results of delivery ratio, average delay and cost of Mobyspace

and Loc on MIT Reality data trace. Since MIT Reality trace does not provide co-

ordinates information, the GPSR scheme cannot be evaluatedand the utility of Loc

can only reflect the similarity of mobility pattern on the data trace. The presenta-

tion of Loc expresses the same meaning as Mobyspace, though their calculation is

35

Page 52: Social-based Data Routing Strategies in Delay Tolerant

Chapter 4. Location-based Routing Strategy

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

Time [day]

Del

iver

y ra

tio

MobyspaceGPSRLoc

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2

2.5

3x 105

Time [day]

Ave

rage

del

ay [s

]

MobyspaceGPSRLoc

(b) Average delay

0.5 1 2 4 6 8 10 12 140.8

1

1.2

1.4

1.6

1.8

Time [day]

Ave

rage

cos

t

MobySpaceGPSRLoc

(c) Average cost

Figure 4.4: Performance comparison of location-based strategies on Cabspottingdata trace

slightly different. Therefore, they get very similar performance in terms of various

aspects.

The performance on DieselNet data trace is shown in Fig. 4.3.Fig. 4.3a shows

that the Loc scheme has similar delivery ratio with Mobyspace on DieselNet data

trace throughout the experiment period. Nevertheless, GPSR has 20% degradation

compared with the Loc scheme in terms of delivery ratio sincethe 4th day of the

experiment. This degradation attributes to the distance latency of GPSR algorithm,

which does not show the real time distance between current node and the desti-

nation, thus makes the relay selection inefficient. Similarresults as shown in Fig.

4.3b, the average delays of three schemes are similar before6 days. Afterwards, the

average delay of the Loc scheme and Mobyspace is around 80% ofthat of GPSR.

The results of average cost as shown in Fig. 4.3c depict that the Loc scheme has

0.1 hops lower cost than Mobyspace. The cost of GPSR is the lowest, which only

36

Page 53: Social-based Data Routing Strategies in Delay Tolerant

4.3. Performance Evaluation

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

Time [day]

Del

iver

y ra

tio

MobySpaceGPSRLoc

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2x 105

Time [day]

Ave

rage

del

ay [s

]

MobySpaceGPSRLoc

(b) Average delay

0.5 1 2 4 6 8 10 12 141

1.5

2

2.5

Time [day]

Ave

rage

cos

t

MobySpaceGPSRLoc

(c) Average cost

Figure 4.5: Performance comparison of location-based strategies on synthetic datatrace

takes 0.1 hops of average cost.

The performance on Cabspotting data trace is depicted as Fig. 4.4. The result of

delivery ratio is shown in Fig. 4.4a. It shows that Loc outperforms Mobyspace by

2% throughout the experiment. The delivery ratio of GPSR has10% lower delivery

ratio in the first half experiment period and then 5% higher delivery ratio than that

of the Loc scheme. Fig. 4.4b presents the average delay of Mobyspace and Loc is

60% of that of GPSR from 6th day of the experiment. Similar results as shown in

Fig. 4.4c, the average cost of the Loc scheme is very similar with Mobyspace, and

it is around 0.6 hops lower than that of GPSR throughout the experiment.

In a summary, the experiment results show that Loc outperforms the other

location-based routing strategies in most cases on three real world data traces.

Similarly, we conduct the comparison of location-based routing strategies on

synthetic traces and show the comparison results on one datatrace (ns = 1.5m/s,

37

Page 54: Social-based Data Routing Strategies in Delay Tolerant

Chapter 4. Location-based Routing Strategy

nn = 100). The evaluation results are shown in Fig. 4.5. The deliveryratio of Loc

is 8% higher than that of Mobyspace and 15% higher than that ofGPSR during the

entire experiment period, as depicted in Fig. 4.5a. Three location-based routing

strategies have similar delay for data delivery as shown in Fig. 4.5b. Additionally,

as shown in Fig. 4.5c the Loc scheme takes 0.1 hops more than Mobyspace at

the end of the experiment. It costs 1 hop less than GPS in the beginning of the

experiment, and reaches similar cost in the end. The Loc scheme performs not

worse than other location-based routing strategies. The performance on synthetic

traces confirms our results on the real traces.

4.4 Summary of Contributions

In this chapter, we propose a location-based routing strategy incorporating geo-

graphic distance and mobility pattern as two aspects of location information to con-

duct routing utility. The comprehensive metric is adaptable for more network situ-

ations compared with the metric relying on sole aspect of location information. We

conduct the performance evaluation on both real and synthetic data traces. The re-

sults show that the proposed comprehensive location-basedsocial routing strategy

outperforms other location-based strategy around 10% in terms of delivery ratio. It

takes less decay and cost for data delivery.

38

Page 55: Social-based Data Routing Strategies in Delay Tolerant

Chapter 5

Encounter-based Routing Strategy

We discuss the location-based routing strategy in the last chapter. However, the

utilization of location-based social information is very sensitive and needs more

privacy concern [10, 11, 72, 98, 47]. In contrast, encounter-based social information

refers to the fundamental element of DTNs. That is, the data delivery in DTNs relies

on the encountering event of nodes. Applying such encounter-based information

reveals much less sensitive than location information.

Compared with location-based routing, encounter-based routing has different

characteristics. Location-based strategies forward datato the nodes geographically

closer to the destination, which tend to achieve geographical shortest routing path.

In contrast, encounter-based social strategies forward data to the nodes logically

closer to the destination. It looks for the logical shortestrouting path.

Inspired by the motivation of privacy preserving as well as the characteristics of

encounter-based routing strategies, we propose a comprehensive encounter-based

social routing scheme.

5.1 Encounter-based social information

We consider two nodes have an encounter if they are in the communication range of

each other. The encounter-based social graph is modeled by the following method.

5.1.1 Encounter-based social graph

As we described in Chapter 3, we model DTN as a social graphG = (V,E,W )

whereV is the set of mobile nodes in the network, the set of social links is rep-

resented byE and the set of weights of links is depicted byW . The social links

indicate the social relations between two nodes and the weight of a link suggests

the social strength. Involving encountering events, two nodes have a social link if

the number of their encounters exceeds a threshold. The weight of the edges is the

number of encounters between two nodes.

39

Page 56: Social-based Data Routing Strategies in Delay Tolerant

Chapter 5. Encounter-based Routing Strategy

Social information in an encounter-based social graph refers as the information

that either directly obtained from encounter-based socialgraph, or social analysis

results of contact graph. It indicates the structural status of a node in the constructed

weighted graph. Typical encounter-based social information includes: social degree

of a node, which is known as the degree of a node in encounter-based social graph,

representing the number of friends (encountering nodes), social strength between

two nodes, denoted by the weight of an edge between the pair ofnodes in encounter-

based social graph, and social similarity, which represented by the common friends

of two nodes in encounter-based social graph and etc.

Specifically, we address two widely used social properties inferred from encounter-

based social graph: social similarity and social centrality.

5.1.2 Social similarity

Social similarity evaluates the number of common friends of two nodes, which

indicates the trustiness and cohesive of social links [23, 22]. We define the social

similarity as follows:

Si,j(τ) = 1 + |Fi(τ)⋂

Fj(τ)|, (5.1)

whereFi(τ) (Fj(τ)) is the set of friends of nodeni (nj) at timeτ and plus 1 is

to avoid 0 values. Thus, the social similarity betweenni and the destinationnd

is Si,d(τ). Intuitively, if a node has higherSi,d(τ) value, it shares more common

friends with the destination, thus more likely to transmit the message successfully.

5.1.3 Social centrality

Social centrality is the quantification of the relative importance of nodes in the so-

cial network. There are various definitions of centrality, such as edge betweenness

[31, 30] or closeness centrality [30], which cannot very easily be exploited in DTNs

since they need global information to estimate the centrality value. Therefore, we

use the Freeman’s degree centrality [30], which only needs the neighbor informa-

tion, to define social centrality in the context of DTN. For a nodeni, its centrality

is defined as follows:

Ci(τ) =

∑N

k=1 dik(τ)

N,

40

Page 57: Social-based Data Routing Strategies in Delay Tolerant

5.2. Strategy Design

wheredik(τ) = 1 if a direct link exists betweenni andnk at timeτ andN is the

number of nodes in the network. Structurally, a central nodehas strong connection

with other nodes, and it is suitable to serve as a hub for information exchange.

Previous research [80] shows that there is a strong statistical dependency between

delivery ratio and social centrality.

5.2 Strategy Design

Encounter-based data routing schemes handle data based on either direct encounter

information or various social information derived from encounter-based social graph,

such as social degree and the strength of social ties. We propose a social-based rep-

resentative data routing scheme in this section, which is named asSoc.

Previous research has shown that the status of nodes in a social network is un-

even: some nodes are in the central positions of the network while the others are in

the edges [92]. An example is that a small fraction of nodes occupy most of degrees

in the social graph structure. Generally speaking, forwarding data to the node who

is more social active will increase the probability of data delivery. Based on this

consideration, we propose the Soc scheme combining social centrality and social

similarity to find the most feasible social path for data forwarding.

Table 5.1: Table of encounter history

Encounters Time Similarity Centralitynj tj Si,j(tj) Ci(tj)

To determine the value of social centrality and social similarity, for a nodeni, it

records the encounter history for a period of∆T as shown in Table 5.1, which con-

tains the encounter time and its social properties accordingly. Meanwhile, it also

maintains a list of friends. When two nodes encounter, they exchange their friend

lists to compute the social similarity. Noticing that social similarity and social cen-

trality only reflect the features of network structure, we also need to consider the

dynamics of social networks. Since the encounters of mobilenodes change dy-

namically, the comprehensive utility should be a time-varying function. To address

the dynamic feature and avoid the accumulative effects, we define the comprehen-

sive utility as the convolution of social similarity and social centrality with a factor

41

Page 58: Social-based Data Routing Strategies in Delay Tolerant

Chapter 5. Encounter-based Routing Strategy

Algorithm 2: Soc algorithminput : encounter events in DTNs and encounter-based social graphoutput: utility and data routing decision

beginassume the initial time ist0 ;if ni encounter with nj at time t then

foreach data x in ni doYi,dx(t) = Ci(t0);Yj,dx(t) = Cj(t0);foreach encounter between ni and ndx at ti,dx(ti,dx < t) do

Yi,dx(t) = Yi,dx(t) + Si,dx(ti,dx) ∗ Ci(ti,dx)/(t− ti,dx) ;

foreach encounter between nj and ndx at tj,dx(tj,dx < t) doYj,dx(t) = Yj,dx(t) + Sj,dx(tj,dx) ∗ Cj(tj,dx)/(t− tj,dx) ;

if Yi,dx(t) < Yj,dx(t) thenforward data tonj ;

foreach data y in nj doYj,dy(t) = Cj(t0);Yi,dy(t) = Ci(t0);foreach encounter between nj and ndy at tj,dy(tj,dy < t) do

Yj,dy(t) = Yj,dy(t) + Sj,dy(tj,dy) ∗ Cj(tj,dy)/(t− tj,dy) ;

foreach encounter between ni and ndy at ti,dy(ti,dy < t) doYi,dy(t) = Yi,dy(t) + Si,dy(ti,dy) ∗ Ci(ti,dy)/(t− ti,dy) ;

if Yi,dy(t) > Yj,dy(t) thenforward data toni ;

42

Page 59: Social-based Data Routing Strategies in Delay Tolerant

5.3. Performance Evaluation

decaying as time.

Yi,d(T ) = Si,d(T )⊗ (Ci(τ)/T ).

=

ˆ T

τ=0

Si,d(τ) ∗Ci(τ)

T − τ.

The convolution operation provides a time-decaying description of all prior val-

ues of social similarity and social centrality. The utilityis updated each time by

accumulation of social similarity when a new encounter occurs. The decay func-

tion suggests that the most recent encounters typically have the more influence, and

the impacts of previous encounters decrease as elapsed timeand social centrality.

When two nodes (i.e. nodeni andnj) encounter at timet, for messages carried by

nodeni and messages carried by nodenj, they are determined whether to be trans-

mitted tonj (ni) by comparing the utilities ofni andnj to destinations. IfYi,d(t)

(Yj,d(t)) is smaller thanYj,d(t) (Yi,d(t)), the message will be transmitted fromni

(nj) to nj (ni). The algorithm of Soc is outlined in Algorithm 2.

5.3 Performance Evaluation

We conduct experiments to study the performance of the encounter-based social

strategies.

5.3.1 Experiment setup

We use HaggleSim simulator [39] to launch our experiments, which uses encounter

entries as inputs to estimate data delivery path according to different data rout-

ing strategies. We extract 14-day session from three data traces and synthetic data

traces. We utilize the encountering events as the input to construct corresponding

encounter-based social graph and related social information. The simulator gener-

ates1, 000 messages for each round of simulations. Each message is assigned with

a random source and destination. For each message, we keep three copies in the

network to make the data delivery with higher chance reaching destinations. The

message keeps alive until the experimental session (14 days) is end. We do each

simulation 20 times and take the average value of results forstatistical convergence.

To show the social features of composed encounter-based social graph, we in-

vestigate the structural properties of the graph from a random selected periodical

social graph. We first measure the cluster coefficient of three data traces. The clus-

43

Page 60: Social-based Data Routing Strategies in Delay Tolerant

Chapter 5. Encounter-based Routing Strategy

1 2 3 4 5 60

0.2

0.4

0.6

0.8

1

Number of hops

CD

F (

pair

of n

odes

)

MIT RealityDieselNetCabspotting

Figure 5.1: Cumulative distribution function of shortest path lengths

tering coefficient for MIT Reality is 0.5137, DieselNet has the coefficient value

0.7365 and the clustering coefficient of Cabspotting is 0.4544. We also measure the

shortest path length of three composed encounter-based social graphs. By randomly

sampling pairs of nodes in the social graph and calculating their average shortest

path lengths, we draw the cumulative distribution function(CDF) in Fig. 5.1. As

shown in the figure, over 95% of the shortest paths in the threedata sets are below 5

hops. And almost 100% of the shortest paths are below 6 hops. Thus for a random

pair of nodes in the DTNs, they are connected by a shortest path lower than 5 hops

with high probability. The two measurements suggest that the encounter-based so-

cial graphs are all small world networks [91] and they are as clustered as social

networks [92], which suggests the application of social-based properties, such as

social centrality and social similarity are feasible for encounter-based social rout-

ing strategies.

5.3.2 Strategies in comparison

We compare Soc with two other encounter-based social routing strategies: Bubble

Rap [40] and SimBet [24].

Bubble Rap considers the data routing in pocket switched network (PSN) which

consists of several communities and there are social relationships among users. It

uses k-clique percolation as the basic community detectionmethod. There are two

steps of routing in Bubble Rap. The first step is to forward data to the destina-

tion community. It delivers data items from outside of the destination’s community

according to a node’s global social centrality. If a node with higher global social

44

Page 61: Social-based Data Routing Strategies in Delay Tolerant

5.3. Performance Evaluation

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

Time [day]

Del

iver

y ra

tio

SocSimBetBubble Rap

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2

2.5

3x 105

Time [day]

Ave

rage

del

ay [s

]

SocSimBetBubble Rap

(b) Average delay

0.5 1 2 4 6 8 10 12 140.5

1

1.5

2

2.5

3

Time [day]

Ave

rage

cos

t

SocSimBetBubble Rap

(c) Average cost

Figure 5.2: Performance of encounter-based social schemeson MIT Reality datatrace

centrality, it will be selected as the relay for data forwarding. Within the destina-

tion’s community, the forwarding utility is based on a node’s local social centrality.

The data item will be forwarded to a node with higher local social centrality.

SimBet takes the linear combination of social similarity and social centrality

as the forwarding utility to construct the data forwarding path. Instead of only

considering single social property, the SimBet scheme considers the utility function

as the sum of social similarity and social centrality, whichmeasures both the social

closeness with destination node and social position of the node in the network. In

this work, the social similarity is represented by the number of common friends.

The social centrality is calculated by local betweenness. The scheme chooses the

node with higher combination utility value as the relay for data forwarding.

45

Page 62: Social-based Data Routing Strategies in Delay Tolerant

Chapter 5. Encounter-based Routing Strategy

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

Time [day]

Del

iver

y ra

tio

SocSimBetBubble Rap

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2

2.5

3x 105

Time [day]

Ave

rage

del

ay [s

]

SocSimBetBubble Rap

(b) Average delay

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2

Time [day]

Ave

rage

cos

t

SocSimBetBubble Rap

(c) Average cost

Figure 5.3: Performance of encounter-based social schemeson DieselNet data trace

5.3.3 Performance analysis

To implement the Loc scheme, we set the rate for refreshment of social links as

once per day. We use the same settings as the original paper for the implementation

of Bubble Rap and SimBet.

The performance of encounter-based social schemes on MIT Reality is shown

in Fig. 5.2. It shows the dynamics of delivery ratio, averagedelay and average cost

as a function of time. Due to the low contact rate in DTNs, the node may not reach

the destination if the time that the data is sent out from the source is short. Thus, the

delivery ratio shows the tight relation with time and increases as time passing. The

average delay changes accordingly. Regarding the performance of delivery ratio on

MIT Reality, as shown in Fig. 5.2a, due to the social characteristics in MIT Reality

trace is apparent, the Soc schemes performs similar as Bubble Rap. Specifically,

the Soc scheme outperforms Bubble Rap by 4%, but SimBet performs not as well

as the others. The Soc scheme achieves more than 15% higher that SimBet in the

46

Page 63: Social-based Data Routing Strategies in Delay Tolerant

5.3. Performance Evaluation

0.5 1 2 4 6 8 10 12 140

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

Time [day]

Del

iver

y ra

tio

SocSimBetBubble Rap

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2

2.5

3

3.5x 105

Time [day]

Ave

rage

del

ay [s

]

SocSimBetBubble Rap

(b) Average delay

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2

Time [day]

Ave

rage

cos

t

SocSimBetBubble Rap

(c) Average cost

Figure 5.4: Performance of encounter-based social schemeson Cabspotting datatrace

end of the experiment period. Similar results are shown in Fig. 5.2b in terms of

average delay. The average delay of Soc is a little longer than that of Bubble Rap,

but 40% shorter than that of SimBet in the end of the experiment. Accordingly, in

the end of the experiment, Bubble Rap takes the longest average cost, which is 1.2

hops longer than that of Soc, while SimBet takes 0.5 hops shorter average cost than

the Soc scheme.

Fig. 5.3 is the performance on DieselNet trace. Although DieselNet is a bus

trace, it still shows strong social characteristics in its encounter-based social graph.

As shown in Fig. 5.3a, Soc has slightly lower delivery ratio than Bubble Rap

throughout the experiment, which is about 1%. In contrast, the delivery ratio of

Soc is 10% higher than that of SimBet in the end of the experimental period. The

comparison of average delay is shown in Fig. 5.3b, where the three schemes take

similar delays before 6 days experiment time. Afterwards, the average delay of Soc

47

Page 64: Social-based Data Routing Strategies in Delay Tolerant

Chapter 5. Encounter-based Routing Strategy

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

Time [day]

Del

iver

y ra

tio

SocSimBetBubble Rap

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2x 105

Time [day]

Ave

rage

del

ay [s

]

SocSimBetBubble Rap

(b) Average delay

0.5 1 2 4 6 8 10 12 141

1.5

2

2.5

Time [day]

Ave

rage

cos

t

SocSimBetBubble Rap

(c) Average cost

Figure 5.5: Performance of encounter-based social schemeson synthetic data trace

is 2% longer than that of Bubble Rap, but 10% shorter than thatof SimBet. Ad-

ditionally, Bubble Rap takes the highest average cost whileSoc has the lowest, as

shown in Fig. 5.3c. The average cost of Soc is 0.8 hops lower than that of Bubble

Rap and 0.3 hops higher than that of SimBet in the end of the experiment.

The performance of encounter-based social schemes on Cabspotting is shown

in Fig 5.4. Fig. 5.4a shows that throughout the entire experiment, the Soc scheme

outperforms SimBet by 10% in terms of delivery ratio, while Bubble Rap achieves

very low delivery ratio. The poor performance of Bubble Rap suggests that the so-

cial centrality metric does not opt to Cabspotting data trace for data routing. Similar

results as shown in Fig. 5.4b and Fig. 5.4c. The average delayof the Soc scheme is

50% of that of SimBet and 25% of that of Bubble Rap respectively in the end point

of the experiment. The average cost of Soc is almost the same as the cost of SimBet

throughout the experiment. It has 0.5 hops higher than Bubble Rap in the first two

days of the experiment. The cost of Bubble Rap keeps growing and it has 0.7 hops

48

Page 65: Social-based Data Routing Strategies in Delay Tolerant

5.3. Performance Evaluation

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

Time [day]

Del

iver

y ra

tio

SocLoc

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2

2.5x 105

Time [day]

Ave

rage

del

ay [s

]

SocLoc

(b) Average delay

0.5 1 2 4 6 8 10 12 140.9

1

1.1

1.2

1.3

1.4

1.5

Time [day]

Ave

rage

cos

t

SocLoc

(c) Average cost

Figure 5.6: Performance comparison of Soc and Loc strategies on MIT Reality datatrace

higher than others in the end of the experiment.

Overall, the experiment results show that the Soc scheme hascompetitive per-

formance compared with other encounter-based social schemes on three real data

traces.

To provide general assessing for encounter-based social routing schemes, we

compare the performance of encounter-based strategies on synthetic data traces.

Here we show the comparison results on one instance of traces(ns = 1.5m/s,

nn = 100), as presented in Fig. 5.5. Fig. 5.5a suggests that the Soc scheme has

5% higher delivery ratio than that of Bubble Rap and SimBet during the entire ex-

periment period. Similarly, the Soc scheme has outstandingperformance in terms

of delay and average cost as shown in Fig. 5.5b and Fig. 5.5c. In particular, Soc

takes 6% lower delay than both Bubble Rap and SimBet in the endof the experi-

ment. At the same time, throughout the experiment, Soc uses slightly higher cost

49

Page 66: Social-based Data Routing Strategies in Delay Tolerant

Chapter 5. Encounter-based Routing Strategy

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

Time [day]

Del

iver

y ra

tio

MobySpaceGPSRLoc

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

Time [day]

Del

iver

y ra

tio

SocLoc

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2

2.5x 105

Time [day]

Ave

rage

del

ay [s

]

SocLoc

(b) Average delay

0.5 1 2 4 6 8 10 12 14

0.7

0.8

0.9

1

1.1

1.2

Time [day]

Ave

rage

cos

t

SocLoc

(c) Average cost

Figure 5.7: Performance comparison of Soc and Loc strategies on DieselNet datatrace

(less than 0.1 hops) than Bubble Rap. It uses 0.1 hops less cost than SimBet. The

evaluation of encounter-based social routing strategies on synthetic traces is in line

with the results on the real traces that the Soc scheme has competitive performance

with other encounter-based social schemes.

5.3.4 Compare with the Loc strategy

In this subsection, we compare Soc to Loc in terms of various performance metrics

on three real data traces and synthetic data trace.

The performance comparison of Soc and Loc on MIT Reality is shown in Fig.

5.6. Fig. 5.6a presents the comparison of delivery ratio. Itreveals that both Soc and

Loc have very close achievement in terms of delivery ratio, albeit the delivery ratio

of Soc is slightly higher (about 1%) than that of Loc throughout the experiment.

Two schemes also perform similarly regarding average delayand average cost, as

50

Page 67: Social-based Data Routing Strategies in Delay Tolerant

5.3. Performance Evaluation

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

Time [day]

Del

iver

y ra

tio

SocLoc

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 142

4

6

8

10

12

14

16x 104

Time [day]

Ave

rage

del

ay [s

]

SocLoc

(b) Average delay

0.5 1 2 4 6 8 10 12 140.8

0.85

0.9

0.95

1

1.05

1.1

1.15

1.2

Time [day]

Ave

rage

cos

t

SocLoc

(c) Average cost

Figure 5.8: Performance comparison of Soc and Loc strategies on Cabspotting datatrace

shown in Fig. 5.6b and Fig. 5.6c, respectively. Loc takes a little higher average

delay than Soc. Loc take 0.05 hops less cost than Soc.

The delivery ratio on DieselNet as shown in Fig. 5.7a presents that the Loc

scheme has 1% degradation compared with the Soc scheme during the entire exper-

iment period. Meanwhile, the Soc scheme outperforms the Locscheme slightly in

terms of average cost as shown in Fig. 5.7c. They have very similar average delay

as shown in Fig. 5.7b.

The performance comparison on Cabspotting is shown in Fig. 5.8. The Loc

scheme outperforms the Soc scheme by 2% in terms of delivery ratio as shown in

Fig. 5.8a throughout the experiment. At the same time, the Soc scheme takes 2%

higher average delay than that of the Loc scheme as shown in Fig. 5.8b. Regarding

the average cost, as shown in Fig. 5.8c, the Soc scheme costs 0.1 hops less than the

Loc scheme.

51

Page 68: Social-based Data Routing Strategies in Delay Tolerant

Chapter 5. Encounter-based Routing Strategy

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

Time [day]

Del

iver

y ra

tio

SocLoc

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2x 105

Time [day]

Ave

rage

del

ay [s

]

SocLoc

(b) Average delay

0.5 1 2 4 6 8 10 12 141

1.5

2

2.5

Time [day]

Ave

rage

cos

t

MobySpaceGPSRLoc

0.5 1 2 4 6 8 10 12 141

1.5

2

2.5

Time [day]

Ave

rage

cos

t

SocLoc

(c) Average cost

Figure 5.9: Performance comparison of Soc and Loc strategies on synthetic datatrace

One instance of comparison of Soc and Loc on the synthetic traces (ns =

1.5m/s, nn = 100) is shown in Fig. 5.9. Similar as other data traces in the real

world, both Soc and Loc have closely performance. Specifically, Soc outperforms

Loc by about 3% in terms of delivery ratio as shown in Fig. 5.9ain the entire ex-

periment period. The average delay of Loc and Soc as presented in Fig. 5.9b is

twisted together at the beginning of the experiment and thenthe Soc scheme has

lower delay than Loc. The average cost as shown in Fig. 5.9c presents that Soc has

0.2 hops higher cost than Loc in the beginning of the experiment and the cost of

Soc is 0.1 hops lower than that of Loc in the end of the experiment.

Besides, we conduct a group of experiments on the synthetic data traces with

different node speeds ranging from0.5m/s to 2.5m/s (nn = 100). We record the

experimental results in the end of each experiment as shown in Fig. 5.10. The

delivery ratio of two schemes as a function of node speed on synthetic data traces

52

Page 69: Social-based Data Routing Strategies in Delay Tolerant

5.3. Performance Evaluation

0.5 1.0 1.5 2.0 2.50.5

0.52

0.54

0.56

0.58

0.6

Speed [m/s]

Del

iver

y ra

tio

SocLoc

(a) Delivery ratio

0.5 1.0 1.5 2.0 2.51.5

1.6

1.7

1.8

1.9

2x 105

Speed [m/s]

Ave

rage

del

ay [s

]

SocLoc

(b) Average delay

0.5 1.0 1.5 2.0 2.52.4

2.45

2.5

2.55

Speed [m/s]

Ave

rage

cos

t

SocLoc

(c) Average cost

Figure 5.10: Performance comparison of Soc and Loc strategies on synthetic datatraces as a function of node speed

is shown in Fig. 5.10a. The delivery ratio has slightly increment as the increase of

speed. This attributes to the fact that faster speed increases the chance of encounters

among nodes in the network. For each node speed, the Soc scheme outperforms Loc

by about 3% in terms of delivery ratio. The average delay as shown in Fig. 5.10b

suggests that the higher speed leads to the lower average delay. The Soc scheme

takes 15% less delay than the Loc scheme in all speeds. Similar to average delay,

the average cost of Soc and Loc is presented in Fig. 5.10c. Thecost of both schemes

decreases as the increase of the speed. Besides, in each speed, Soc has about 0.1

hops less cost than Loc.

Additionally, we compare Soc with Loc in terms of different network size on

the synthetic data traces. The results are recorded in the end of each experiment as

shown in Fig. 5.11. The number of nodes in five different synthetic traces is 20,

40, 60, 80 and 100 respectively (ns = 1.5m/s). The delivery ratio increases as

53

Page 70: Social-based Data Routing Strategies in Delay Tolerant

Chapter 5. Encounter-based Routing Strategy

20 40 60 80 1000.1

0.2

0.3

0.4

0.5

0.6

Number of nodes

Del

iver

y ra

tio

SocLoc

(a) Delivery ratio

20 40 60 80 1001.4

1.5

1.6

1.7

1.8

1.9x 105

Number of nodes

Ave

rage

del

ay [s

]

SocLoc

(b) Average delay

20 40 60 80 1000.5

1

1.5

2

2.5

Number of nodes

Ave

rage

cos

t

SocLoc

(c) Average cost

Figure 5.11: Performance comparison of Soc and Loc on synthetic data traces as afunction of network size

the increase of the number of nodes in the network. Accordingly, the general trend

of average delay and average cost declines as the number of nodes increases. The

delivery ratio of two schemes is shown in Fig. 5.11a. In line with our previous re-

sults, the delivery ratio of Soc is 3% higher than that of Loc with respect to different

network sizes. The average delay as shown in Fig. 5.11b for both schemes suggests

that it is very low in the case of the number of nodes equal to 20. It increases when

nodes number is 40 and then decreases. The Soc scheme takes more average delay

when the network size is small while it uses less when the network size becomes

large. The average cost as shown in Fig. 5.11c presents that the Soc scheme has

slightly (0.1 hops) better performance than the Loc scheme.

Overall, our comparison of the Soc scheme and the Loc scheme on both syn-

thetic traces and real data traces suggests that they have similar performance in data

routing, whose difference is within 5% in most cases. Nowadays user location infor-

54

Page 71: Social-based Data Routing Strategies in Delay Tolerant

5.4. Summary of Contributions

mation is considered to be highly private and people usuallyare not willing to share

it over the network, while encounter-based social information like social degree is

much less sensitive and easy to obtain. Therefore, the usageof encounter-based so-

cial information is more privacy preserving than location-based social information

for data routing in DTNs.

5.4 Summary of Contributions

The proposed encounter-based scheme integrates social metrics including social

centrality and social similarity to calculate a comprehensive routing metric. We pro-

vide comprehensive performance comparisons of Soc together with other encounter-

based social schemes. The proposed encounter-based strategy outperforms with

other encounter-based social strategies up to 15% in terms of delivery ratio. Our

experiment results also show that routing strategies usinglocation-based social in-

formation and encounter-based social information have no significant difference in

performance: they perform closely in delivery ratio, delayand cost with a slight dif-

ference within 5% in most cases. Our analysis indicates thatlocation-based social

information is not critical in designing routing strategies, although it can provide

accurate position information and mobility pattern of nodes. Due to the fact that

physical location information is sensitive and hard to collect, our work suggests

that encounter-based social information could be a good substitute for data routing

in DTNs for the sake of privacy preserving.

55

Page 72: Social-based Data Routing Strategies in Delay Tolerant
Page 73: Social-based Data Routing Strategies in Delay Tolerant

Chapter 6

Community-based Routing Strategy

One characteristic of two proposed social-based routing strategies along with other

previous routing schemes in DTN is that they apply a utility function for data for-

warding decision making and choose the nodes with higher utility values as relays

for data delivery. We call this kind of schemes as utility-based routing strategies.

The most widely used utility metric is the frequency of node encounters, where

messages are forwarded to a node who is more frequently meeting the destination

than the current node [56, 85, 15]. Another important utility metric is the inter-

contact time of node pairs, where massages are forwarded to the nodes with smaller

inter-contact time to reduce transmission delay [7, 20, 58]. Several strategies con-

sider geographic distance as a metric and try to forward the messages along the

shortest geographic distance path [49, 65, 18]. Inspired bythe research of social

network analysis, the utilities such as social similarity and social centrality are also

proposed to enhance data forwarding via social connections[33, 40, 63, 41, 94].

Most existing utility-based routing strategies employ a single or multiple utility

metrics to compose their utility functions. The commonly used utilities and their

representative strategies are listed in Table 6.1.

Table 6.1: Routing utilities in DTNs

Utility metric Definition Strategies

Encountering frequencyNumber of encoun-ters in a period oftime

PROPHET [56], FC [42], Seek andFocus [85], MaxProp [15], RAPID[7], and etc.

Inter-contact time The time interval be-tween two contacts

RAPID [7], Two-Hop-Relay [20],ASBIT [58], and etc.

Geographic distance The distance of userlocations

MobySpace [49], CAR [65], MV[18], and etc.

Social centrality The de-gree/betweenessof a node in thenetwork

SDM [33], SimBet [24, 12], Bub-bleRap [40], PeopleRank [63], andetc.

Social similarity The common socialfeatures between twonodes, such as com-mon friends

SimBet [24, 12], Social-Greedy[41], Social feature-based [94], andetc.

57

Page 74: Social-based Data Routing Strategies in Delay Tolerant

Chapter 6. Community-based Routing Strategy

Figure 6.1: Proportion of blinds spots and dead ends in utility-based strategies

There are two problems in utility-based routing strategies: blind spot anddead

end. Consider the scenario that the source nodeni wants to send a message to the

destination nodend. In the utility-based routing,ni needs to forward the message to

a node with higher utility. However, ifni and all its neighbors have the similar utility

near to zero, it will be difficult to decide which node should relay the message. Such

problem is called blind spot for the reason that the next hop route is hard to be seen

from the current node. The dead end problem occurs whenni has a higher utility

value than all its neighbors, in which case the message is stuck in ni and not able to

be delivered further.

The blind spot and dead end problems are rarely noticed in theprevious works,

but they commonly exist in utility-based strategies. Fig. 6.1 shows the percentage

of blind spot and dead end when applying five utility-based routing strategies in

three DTN datasets (MIT Reality [25], DieselNet [16] and Cabspotting [77]). Ac-

cording to the figure, in most utility-based strategies, there are more than 20% of

data transmissions encountering the blind spot and dead endproblems in the MIT

reality trace. Similar percentages are observed in the DieselNet and Cabspotting

traces, varying from 14% to 27%. Such problems will clearly affect the delivery

ratio of DTN routing, and they are not yet well addressed in the past.

The community-based strategies forward data according community structure

58

Page 75: Social-based Data Routing Strategies in Delay Tolerant

6.1. Mobile and Social Characteristics of DTN

based on the fact that people tend to group into communities by their social rela-

tionships. By dividing the network into multiple communities, the nodes within a

community have strong connections, while their links across communities are weak

ties. The community structure favors intra-community communication where nodes

are closely connected (thus reduces the chance of blind spotand dead end), but also

encounters the difficulty of inter-community communication via weak links. Exist-

ing community-based routing strategies employ naive inter-community mechanism

such as flooding [51], or rely on complicated operations to discover direct links

[13] or overlapping nodes [33] between communities, which are time-consuming

and inefficient. Generally speaking, community-based routing strategies confront

two challenges. On the one hand, the existing community partitioning algorithms

are complicated and static, which is hard to adapt to the dynamic and mobile DTN

environments, thus adistributed community partitioning mechanism is desired. On

the other hand, to overcome the blindness of data forwardingamong communities

via weak ties, it needs to measure theutilities across communities for a better rout-

ing decision making.

The above analysis reveals several weakness of existing DTNrouting strategies.

In this chapter, we propose a novel community-based routingstrategy called Social

and Mobile Aware Routing sTrategy (SMART) for DTNs. SMART tackles the

above problems to achieve the following objectives: (1) significantly alleviating

blind spot and dead end problems; (2) distributed communitypartitioning; and (3)

efficient inter-community communicating.

As we model DTN social graph in Chapter 3, we define the “encounter” between

two mobile devices as the event that they move into each other’s communication

range. We use aweighted social graph to formulate encounters among mobile

devices. Each device is denoted by a node in the social graph.If two devices

encounter, there is anedge between them, which builds a social link between two

nodes. We consider two nodes with social link as friends. Thenumber of friends

that a node has is calleddegree of the node. The weight of an edge corresponds to

the strength of social links, which can be represented by thenumber of encounters

over a fixed time interval.

6.1 Mobile and Social Characteristics of DTN

In this section, we utilize three real DTN data sets as described in Chapter 3 to

explore people mobile and social characteristics. The mobility of DTN users reveals

59

Page 76: Social-based Data Routing Strategies in Delay Tolerant

Chapter 6. Community-based Routing Strategy

A

(a) Taxi #108

B

(b) Taxi #352

Figure 6.2: Two taxi trajectories in Cabspotting trace

locality and their interactions revealcommunity structure.

People movements are not random and appear regularities restricting in some

geographic areas. For example, people move regularly from home to office and

vice versa; buses run along stations according to schedules; taxi drivers tend to pick

up guests in some popular areas, etc. We observe the mobilitytrajectories of two

taxis (#108 and #352) chosen from the Cabspotting trace, which are illustrated in

Fig. 6.2. It can be seen that most of the time taxi #108 moves inareaA, while taxi

#352 moves in areaB. We call such kind of movement of which the trajectory is

mostly restricted in a small area aslocality.

To further investigate the locality of movements, we analyze the distribution of

human mobility scope. We definegeographic mass point as the centroid of a node’s

trajectory, which is calculated by averaging the GPS coordinates of its trajectory.

We compute the geographic mass points of all nodes in the network and study the

geographic distance their trajectory coordinates away from their respective mass

points. Taking Cabspotting data trace as an example, Fig. 6.3 shows the probability

distribution function (PDF) and cumulative distribution function (CDF) of the tra-

jectory coordinates departing from their mass points in thetrace. According to the

figure, although the farthest coordinates is 30km away from the mass point, most

of the movements are nearby their mass points, e.g., about 80% of trajectory co-

ordinates within an area 5km from mass points. The majority of the coordinates

concentrate in areas 2km, 3km, and 4km away from mass points.This verifies

the locality property of nodes movements: most movements ofDTN nodes are re-

stricted within a range of small distance, and there are onlya small proportion of

60

Page 77: Social-based Data Routing Strategies in Delay Tolerant

6.1. Mobile and Social Characteristics of DTN

0 5 10 15 20 25 300

0.2

0.4

0.6

0.8

1

Distance from mass point [km]

CD

F

0 5 10 15 20 25 300

0.2

0.4

Distance from mass point [km]

PD

F

Figure 6.3: The CDF and PDF of node movements (Cabspotting).

long distant movements.

6.1.1 Distributed community partitioning

Community is defined as a social unit that shares common value. It is a tight and

cohesive social entity. Intuitively, communities are formed based on locations or

interests [29, 70, 8, 68]. People in the same geographic location or sharing the

same interest are likely to be in the same community. In the context of DTNs, we

use geographic locations to study community structure and investigate the relation

between encounters and geographic distances for the discovery of communities. It

is likely that there is correlation between encountering and geographic location: the

closer two nodes, the more often they meet each other. We calculate the number

of encounters between node pairs as a function of the distance of their geographic

mass points in Cabspotting, as shown in Fig. 6.4. It illustrates that the number of

encounters decreases rapidly when the distance increases,which implies that when

the distance becomes longer, the number of encounters becomes smaller. There is

no node pair with distance greater than4km having more than100 encounters. This

verifies the strong correlation between location and encountering.

Inspired by the above observation, we propose a dynamic community partition-

61

Page 78: Social-based Data Routing Strategies in Delay Tolerant

Chapter 6. Community-based Routing Strategy

0 2 4 6 8 100

100

200

300

400

500

Distance of mass point [km]

Enc

ount

ers

Figure 6.4: The number of encounters vs. distance (Cabspotting).

ing algorithm. Unlike existing community partitioning algorithms such as k-clique

percolation [71] and Girvan Newman algorithm [67] etc. where global network

topology information is needed, our algorithm only uses local encountering statis-

tics of nodes. It is more suitable for distributed implementation in DTNs. The basic

idea is adaptively grouping nodes into communities starting from a random parti-

tion (i.e. m communities) of the network. The detailed community partitioning

process is described as follows.

The community construction process is divided into two stages: the bootstrap

stage and the evolution stage. In the bootstrap stage,m nodes are randomly selected

and each node is assigned with a unique community ID. Node without community

affiliation will choose one community through encounters until every node in the

network is assigned with a community ID. After this stage, the network hasm com-

munities. In the evolution phrase, each node counts the affiliation parameters (APs),

which indicate the number of encounters with nodes in different communities. Then

it adjusts the community affiliation according to updated APs. We use a vector to

present the affiliation parameters of nodeni:

Ei = {ap1i, ap2i , · · · , apmi},

62

Page 79: Social-based Data Routing Strategies in Delay Tolerant

6.1. Mobile and Social Characteristics of DTN

whereapji is the AP thatni connects to communityCj denoting the number of

encounters betweenni andCj . Whenni encounters a node in communityCj, it

updates its AP value accordingly, and adaptively changes its community affiliation

to the community with maximal AP value in the vector. We call our algorithm

m-partition.

Algorithm 3: m-partition algorithminput : nodeNi and AP vectoroutput: the community ID of nodeNi

beginAssume there arem communities to be detected ;for Encounter with Nj do

if Ni.communityID = null thenNi.communityID = Nj .communityID;

elsey ← Nj .communityID;x← Ni.communityID;if y = x then

apxi= apxi

+ 1;

elseapyi = apyi + 1;

if apyi > apxithen

Ni.communityID = y;

The algorithm runs dynamically as each encounter occurs in the network in

a distributed fashion. Therefore, the community structuremay change from time

to time and is maintained dynamically. We show that the communication cost of

m-partition for maintaining community members is low in DTNs. Given two com-

munitiesA andB, there arem nodes inA andn nodes inB. Suppose a nodeni in

communityA needs to switch its community fromA to B, it first obtains the new

community list from the encountered node in communityB. If we consider the

traffic overhead for transmitting one node ID as1, the communication overhead for

obtaining community members will ben. It then floods its ID to its new community

B to make other nodes in communityB be aware ofni. The communication cost

will also ben. Furthermore, when a node in communityA meets a node in com-

munityB, it checks the community member list to see whether any node changes

their community identity. In this case,ni changes fromA to B. Suppose there are

k encounters between communityA andB. The cost for transmitting node ID of

63

Page 80: Social-based Data Routing Strategies in Delay Tolerant

Chapter 6. Community-based Routing Strategy

ni is k. The node in communityA floods this information to make the remaining

nodes (m − k nodes) inA exclude the membership ofni from their local commu-

nity, which needsm − k transmissions of ID ofni. Overall, the communication

overhead for maintaining community members caused by one community switch

action isO(m+ 2n).

6.1.2 Locality of user contacts

According to the above observation, it has high correlationbetween user move-

ments and their encounters. Thus the locality of user movements is also reflected

to their encounters. To investigate such locality, we definetwo types of encounters

after community partitioning: when two nodes move into eachother’s communica-

tion range, if both nodes are from the same community, we callsuch encounter a

local contact; if the encountering nodes are from different communities,it is called

a remote contact. Since a node tends to move in a local area, which results in fre-

quent local contacts. In contrast, only a small proportion of user movements are

long-distant, which yields remote contacts forming cross-community communica-

tions.

a

b

c

d

Community A

Community B

Figure 6.5: Local contact and remote contact

Fig. 6.5 shows an example of local and remote contacts. In thefigure, arrows

denote the movement of nodes. The pair< a, b > makes a local contact, and the

pair< c, d > makes a remote contact. We calculate the proportion of localcontacts

and remote contacts of the three DTNs after applying our community partitioning

algorithm (withm = 10). The results are shown in Table 6.2. It presents that local

contacts are majority for all three traces and remote contacts are minor. MIT Re-

ality has 19.7% remote contacts; DieselNet has 7.3%; and Cabspotting has 9.9%,

which confirms the locality of user contacts. Although remote contacts only take a

64

Page 81: Social-based Data Routing Strategies in Delay Tolerant

6.2. Strategy Design

small fraction, they play an important role for informationexchange since they are

the “bridges" between communities. The more remote contacts, the more active a

community interact with others. It enlightens the idea of using local contacts for

intra-community communication and remote contacts for inter-community commu-

nication. The detailed approach is presented in the subsequent section.

Table 6.2: Proportion of local and remote contacts

Traces MIT Reality DieselNet Cabspotting

Local contact (%) 80.3% 92.7% 90.1%

Remote contact (%) 19.7% 7.3% 9.9%

6.2 Strategy Design

In this section, we introduce the social and mobile aware routing strategy (SMART)

for DTNs. The basic idea of SMART is to divide the DTN into several communities

using community partitioning algorithms, deliver the message in the same commu-

nity via local contacts, and forward the message to other communities via remote

contacts. The details are described in the following.

6.2.1 Assumptions

Before presenting the detailed design of SMART, we introduce some basic concepts

and assumptions.

• Assume a dynamic community partitioning process (i.e.m-partition) is ap-

plied to cluster the social graph into a number of communities = {C1, C2, · · · , CM}.

• Each nodeni is assigned with a community IDCi and a setπ(Ci) = {nj|∀nj in Ci}

indicating the members in the same community. The communityID and com-

munity members are obtained and maintained from the result of community

partitioning process.

• Each node records the encounter history for a period of time∆T , where∆T

is a time window representing one day or one week interval. Each node

maintains the history of local contacts that occur within the local community

and remote contacts that occur across different communities.

65

Page 82: Social-based Data Routing Strategies in Delay Tolerant

Chapter 6. Community-based Routing Strategy

Table 6.3: Local contact table

n1 n2 · · · ni · · · nN

ζi1 ζi2 · · · −1 · · · ζiN

• Local contacts are recorded into alocal contact table, whose format is shown

in Table 6.3. It records the contact frequency of a node with other nodes in

the same community. In the table,nj (j = 1, 2, · · · , N) is the ID of a node

in the local community andζij (j = 1, 2, · · · , N) is the contact frequency

betweenni andnj in ∆T . If nj = ni, let ζij = −1 indicating that the contact

with itself is not countable.

The contact frequency is the number of encounters over the time period∆T .

It is calculated by:

ζij =

∑T

t=0X(t)ij∆T

,

whereX(t)ij = 1 if two nodes contact at timet, otherwise,X(t)ij = 0.

Table 6.4: Remote contact table

C1 C2 · · · Ci · · · CM

ηi1 ηi2 · · · −1 · · · ηiM

• Remote contacts are recorded into aremote contact table, whose format is

shown in Table 6.4. It records the contact frequency of a nodewith other

communities. In the table,Cj (j = 1, 2, · · · ,M) is the ID of a community

(there areM communities in the network) andηij (j = 1, 2, · · · ,M) is the

sum of encounters thatni with nodes inCj over∆T . Again,ηij = −1 when

Cj = Ci suggesting that the local contacts is not reflected from the remote

contact table.

When a source nodens sends a message to the destinationnd, ns checks the lo-

cal community members to see whethernd is in the same community. Ifns andnd

are in the same community, we apply anintra-community communication process.

If ns andnd are in different communities, we apply aninter-community communi-

cation process. The two processes are described as follows.

6.2.2 Intra-community communication

If a source nodens and a destination nodend are in the same community by check-

ing local community members, it is possible to apply traditional utility-based strate-

66

Page 83: Social-based Data Routing Strategies in Delay Tolerant

6.2. Strategy Design

gies for data forwarding. However, to avoid the problems of blind spot and dead

end as mentioned before, the intra-community routing scheme needs to be carefully

designed.

To against blind spot and dead end, we need a social featured routing metric that

accumulates encounter effects and decays according to the node’s social status and

time elapsed. We consider each encounter has an effect to theutility value, which is

positively correlated to the social relation (i.e. social similarity) between two nodes.

That is, two nodes with closer social relation lead to higherutility increase in each

encounter. Besides, the encounter effect decays dependingon its social status and

time elapsed. An earlier effect will have less effect remaining due to temporal

factor. Meanwhile, a node with high social status will motivate further encounters.

To represent this motivation, a node with higher social status in the network should

have a slower decaying speed on the encounter effect. Combining temporal and

social factor, each encounter effect decays as the social status of a node and the

time elapsed. To select a relay, the scheme will evaluate theaccumulative effects

produced by the encounters and the decaying speed of the effects. We provide the

formulation of our scheme as follows.

As the first step, we give the definition of social relation andsocial status. The

social relation denotes the social closeness between two nodes and social status

shows the relative importance of nodes in the social network. Although the social

relation and social status can be represented in many sophisticated manners [2, 24],

we choose two representative expressions to illustrate ourscheme. Namely, we use

social similarity to represent the social relation, andsocial centrality to represent

social status.

Social similarity: it is defined as the number of common friends between a pair

of nodes, indicating the trustiness and cohesiveness of social ties between them

[23, 22]. As explained in Chapter 5, social similarity can becalculated by the

following equation.

Si,j(τ) = |Fi(τ)⋂

Fj(τ)|+ 1, (6.1)

whereFi(τ) (Fj(τ)) is the set of friends of nodeni (nj) at timeτ . The intersection

operation is to obtain the common friends between two nodes and plus 1 is to elim-

inate the effect of0. When two nodes encounter, they exchange their friend liststo

calculate the social similarity.

Social centrality: it is a quantification of the relative importance of nodes in the

social network. There are various definitions of centrality. We use the Freeman’s

degree [30] to define social centrality as described in Chapter 5. For a nodeni, its

67

Page 84: Social-based Data Routing Strategies in Delay Tolerant

Chapter 6. Community-based Routing Strategy

centrality is defined as follows.

Ci(τ) =

∑N

k=1 dik(τ)

N, (6.2)

wheredik(τ) = 1 if a direct link exists betweenni andnk at timeτ andN is the

number of nodes in the community.

Both social similarity and social centrality only require local information, and

they can be calculated locally in DTNs by exchanging information with neighbors.

The encounter effect between two nodes is therefore denotedby the social sim-

ilarity. To model the decaying effect, we introduce a decay function with respect to

social centrality and time as follows.

Di(t− τ) =Ci(τ)

t− τ, (6.3)

whereτ is the time when the encounter occurs. The decay function relying on both

social centrality and elapsed time reduces the accumulative effects of the utility

value.

If an encounter occurs in each time unit, the accumulative effects of encounters

between two nodes with decay between nodeni to destinationnd can be formulated

as the convolution of Eq. 6.1 and Eq. 6.3.

Yi,d(T ) = Si,d(T )⊗Di(T ),

=

ˆ T

τ=0

Si,d(τ) ∗ Di(T − τ).(6.4)

However, the encounter only occurs in several time units. Therefore, the accu-

mulative effects of encounters are represented by a discrete convolution as:

Ui,d(T ) =

T∑

τ=0

X(τ)id ∗ Si,d(τ) ∗ Di(T − τ), (6.5)

whereX(τ)id = 1 when an encounter occurs at timeτ or whenτ = 0 (to initialize

the utility value), otherwise,X(τ)id = 0. The utility function describes that when

each encounter occurs, it yields an encounter effect represented by social similarity.

Each effect occurs at different time decays as a decay function composed by social

centrality and time, indicating the encounter effects of a node with higher social

status decays slower than a node with poor connection to the network and a recent

68

Page 85: Social-based Data Routing Strategies in Delay Tolerant

6.2. Strategy Design

encounter effect decays slower than a older encounter effect.

Based on the above utility function, we propose our intra-community routing

principle.

Intra-community routing principle: when two nodesni andnj in the same com-

munity encounter, the routing utility in Eq. 6.5 is calculated and the node with

higher value will be chosen to forward the message. The forwarding path will be

recorded to make it free from loop.

6.2.3 Inter-community communication

If a destination nodend does not belong to the community of source nodens, we

need to choose some relay nodes to forward the message among communities. The

idea is using “fringe nodes” to bridge the communication of inter-communities.

A fringe node is a node which is capable to remote contact withother communi-

ties. It is measured by the number of links that it connects toother communities. We

select nodes with higher number of links to outside of local community as fringe

nodes. Each fringe node is represented by its ID and the remote contact table as

mentioned in Table 6.4 to indicate its links to other communities. In our proposed

scheme, each community maintains a set of fringe nodesF . The setF is randomly

selected initially, and is updated periodically. During a period∆T , each node com-

pares its remote contact table with the fringe nodes. If a node ni finds that it has

closer connection with outside communities than a fringe nodenj, it will announce

itself as the new fringe node. The comparison is described asfollows.

Assumeηi1, ηi2, · · · , ηiM is the remote contact frequency ofni, andηj1, ηj2,

· · · , ηjM is the remote contact frequency ofnj . Define a functionφ(x, y) = 1 if

x ≥ y andφ(x, y) = −1 for the rest. The selection of fringe node is determined by

the value

Fc =

M∑

k=1

φ(ηik, ηjk).

If Fc is larger than 1, it meansni has better connection thannj , andni becomes the

new fringe node and announces to the other nodes in the local community.

According to the report in [91], a small fraction rewired links are enough to

create a small world network. Our analysis to the three traces shows that the fraction

of remote contacts varies from 7.3% to 19.7% (as shown in Table 6.2), so we set

the number of fringe nodes as 10% of the community size. If thecommunity size

is smaller than 10, we set the number of fringe nodes as 1.

69

Page 86: Social-based Data Routing Strategies in Delay Tolerant

Chapter 6. Community-based Routing Strategy

Due to dense network connection in the community, the set of fringe nodesF

and their remote contact tables can easily spread to nodes inthe same community.

Since there are more than one fringe nodes which can reach to other communities,

it needs to carefully choose the forwarding node for performance consideration.

Assume the source nodens in communityC wants to send a message to the

destinationnd in communityC ′ (C 6= C ′). We first decide whetherC andC ′ are

directly connected by looking up the setF in C to check whether there exists a

fringe node connecting toC ′. If there is a connection from fringe node set to the

destination community, we sayC andC ′ are directly connected. Otherwise, we say

they are indirectly connected.

If C andC ′ are directly connected, we need to forward the message to a fringe

node who can reachC ′. There might be more than one fringe nodes directly con-

necting toC ′, and the candidate set isC = {nj |∀ nj ∈ F and (nj connects to C′)}.

We need to decide which node in the candidate set as a relay. Our principle is to

send the message to a relay having more connections withC ′. By looking up the

remote contact tables of fringe nodes, the algorithm chooses the one with the max-

imal number of encounters toC ′ as the relay. IfC andC ′ are indirectly connected,

we select the fringe node with maximal number of encounters with outside ofC by

summing up entries in remote contact tables. The source nodeforwards the message

to the selected fringe node by intra-community routing strategy. After the data is

forwarded to the dedicated fringe node, the data transmission between communities

becomes a challenge.

To enable efficient inter-community communication, we propose a utility func-

tion extended from intra-community utility to forward datafrom the fringe node

to the destination community. Namely, we extend the utilityfunction from node-

to-node to node-to-community for inter-community communication. We build the

utility function between a fringe nodef to the destination communityC ′. To con-

struct such utility, we consider social relation betweenf andC ′ as similarity be-

tween the nodef and a set of nodes inC ′. However, knowing the friends of all

nodes inC ′ would suffer too much overhead in DTNs. In this case, we provide an

estimation that we only count the friends of nodes who have ever encountered with

f . The similarity is defined as:

Sf,C′(τ) = |Ff(τ)⋂

FC′(τ)|+ 1,

whereFC′(τ) indicates the friends of a set of nodes inC ′ that ever encountered with

70

Page 87: Social-based Data Routing Strategies in Delay Tolerant

6.3. Discussion

f until timeτ .

To formulate the social status of nodef , we extend the concept of centrality

from the local community to the entire network. We call it community centrality,

denoted byCΓ(τ), which is defined as the proportion of the number of communities

it connecting with (Mc(τ)) to the total number of communities (M(τ)) at timeτ . It

is defined as:

CΓ(τ) =Mc(τ)

M(τ).

The decay function in the node-to-community utility becomesDΓ(t − τ) = CΓ(τ)t−τ

.

The overall utility function from the nodef to communityC ′ thus is defined as:

Uf,C′(T ) =T∑

τ=0

X(τ)fC′ ∗ Sf,C′(τ) ∗ DΓ(T − τ), (6.6)

whereX(τ)fC′ = 1 when an encounter occurs between nodef and communityC ′

at timeτ or whenτ = 0 (to initializeUf,C′(T )). According the utility function, the

fringe node finds the next relay by choosing a node with higherutility value with

destination community. The procedure continues until the data reaches destination

community.

6.3 Discussion

In this section, we discuss how SMART tackles blind spot and dead end problems,

and the efficiency of inter-community routing by SMART.

6.3.1 Tackling blind spot and dead end problems

The blind spot and dead end problems result from the indecisive utility value of

utility-based data routing strategies. The blind spot occurs when utility values of

a node and its neighbors are close and nearly to0. It cannot decide which node

should be the next relay. The emergence of blind spot is because these nodes have

rare interaction with the destination, which generates similar utility values close to

0. The dead end arises with all neighbors of a node having lowerutility value than

it. The node cannot conduct the forwarding behavior in the network. The existence

of dead end comes from the fact that the utility value of the relay reaches the peak

locally.

We conduct experiments to show the percentage of blind spot and dead end

71

Page 88: Social-based Data Routing Strategies in Delay Tolerant

Chapter 6. Community-based Routing Strategy

8 9 10 11 12 13 140

0.002

0.004

0.006

0.008

0.01

Time [day]

Pro

port

ion

MIT RealityDieselNetCabspotting

Figure 6.6: Percentage of blind spot and dead end in SMART.

occurring in SMART. By definition, we consider a node encounters blind spot if it

has similar utility value closing to0 as its neighbors and it cannot find the next relay

until the data is expired. A node runs into dead end if its utility value is larger than

all its neighbors and the data is stuck into the node until it is expired. We look for

the data routing failures caused by blind spot and dead end inthree data sets (MIT

Reality, DieselNet and Cabspotting). We sum the two types offailures and draw

the curve as a function of time as shown in Fig. 6.6. Accordingto the figure, there

is only a tiny percentage of blind spot and dead end appearingin SMART, most of

the time lower than 1%. Compared to the experiment results shown in Fig. 6.1, the

percentage drops dramatically. Thus we claim that SMART significantly alleviates

the blind spot and dead end problems.

6.3.2 Efficiency of inter-community communication

The inter-community communication is a difficult task in community-based strate-

gies since it lacks of strong links between different communities. Existing community-

based strategies use naive routing mechanism such as flooding [51], or rely on

discovering the direct links [13] or overlapping nodes [33]between communities,

which are time-consuming and inefficient.

72

Page 89: Social-based Data Routing Strategies in Delay Tolerant

6.3. Discussion

SMART enhances the capability of inter-community communication by select-

ing fringe node and a node-to-community utility function. The fringe node is

selected by the criterion that has rich connection to the remaining network. To

bridge the communication between the fringe node and the destination commu-

nity, SMART introduces a node-to-community utility function, which considers the

destination community as an entity. Analogous to intra-community utility, we com-

pose utility function between fringe node and the destination community, and build

routing channels among different communities.

Figure 6.7: Delivery ratio of community-based strategies

We compare the performance of SMART with other community-based strate-

gies, including Bubble Rap [40] and Friendship Based Routing (FBR) [13] to show

its efficiency. The comparison of delivery ratio on three data traces (MIT Reality,

DieselNet and Cabspotting) is shown in Fig. 6.7. It is illustrated that for inter-

community communication, the delivery ratio of Bubble Rap and FBR is 32% and

33% respectively, while the delivery ratio is improved in SMART greatly, which

achieves 47%. For delivery ratio of intra-communication, SMART achieves 85%,

which also outperforms the other strategies (with 72% in Bubble Rap and 75% in

FBR) on MIT Reality. The performance comparison on DieselNet and Cabspotting

also show the efficient inter-community communication of SMART. Together with

our distributed community partitioning algorithm, the proposed SMART strategy

73

Page 90: Social-based Data Routing Strategies in Delay Tolerant

Chapter 6. Community-based Routing Strategy

shows its effectiveness in dealing with the community-based routing problems.

6.4 Performance Evaluation

In this section, we conduct experiments to evaluate the performance of SMART and

compare it with other existing DTN routing strategies.

6.4.1 Experiment setup

We still use HaggleSim simulator [39] for the experiment evaluation. It takes the

discrete sequential encounter events and the corresponding social graph as the in-

puts and makes data forwarding decision using various routing algorithms. For

each experiment, we emulate 1000 messages sent from a randomselected source to

destination. In this group of experiments, each message only keeps one copy in the

network. We run every experiment 20 times for statistical convergence. We extract

a 2-week session from MIT Reality, DieselNet and Cabspotting respectively and

run the simulator over the selected sessions with uniformlygenerated traffic. The

SMART algorithm is implemented and is compared to other existing DTN routing

algorithms.

6.4.2 Impact of community numbers

We first investigate the impact of the number of communities on the performance of

SMART. We apply the proposed m-partition algorithm for community partitioning

on the three DTN traces and then use SMART to route messages.

Fig. 6.8, Fig. 6.9 and Fig. 6.10 show the performance metricsas a function of

community numberm (varying from 1 to the size of the data sets) and time. The

delivery ratio of MIT Reality trace is shown in Fig. 6.8a. According to this figure,

when no community partitioning algorithm is applied (m = 1), the delivery ratio is

quite low and it increases slowly with time. As the communitynumber is set to an

appropriate value (e.g.m = 10), the delivery ratio increases dramatically, which is

almost 2 times as that whenm = 1. For10 ≤ m ≤ 90, the delivery ratio becomes

stable and has only small fluctuation. When the community number approaches to

the size of the data set (m = 97), the performance drops dramatically since the

impact of community structure disappears. The average delay is illustrated in Fig.

6.8b. It is seen that the average delay is almost the same for all community numbers

74

Page 91: Social-based Data Routing Strategies in Delay Tolerant

6.4. Performance Evaluation

(a) Delivery ratio

1 10203040506070809097

0.5124681012140

2

4

6x 10

5

CommunitiesTime [day]

Ave

rage

del

ay [s

]

(b) Average delay

(c) Average cost

Figure 6.8: The performance metrics as a function of community number and time(MIT Reality).

and it only varies with time. The average cost is shown in 6.8c. Similar to delivery

ratio, the average cost is influenced bym and it increases to a stable value when

10 ≤ m ≤ 90. Similar results are also found in DieselNet as shown in Fig.6.9

and Cabspotting as shown in Fig. 6.10. It suggests that SMARTperforms better

when the community structure is outlined, while the performance of SMART is low

when no community structure is indicated in the network. It also reveals that the

proper value ofm is within a wide range. We tend to choose a smaller value ofm

to reduce the cost of maintaining the community structure. Thus in the rest of our

experiments, we fix our community number tom = 10.

6.4.3 Impact of community partitioning algorithms

We will show that the performance of SMART routing scheme based on different

community partitioning algorithms. We evaluate the performance of SMART using

75

Page 92: Social-based Data Routing Strategies in Delay Tolerant

Chapter 6. Community-based Routing Strategy

(a) Delivery ratio (b) Average delay

(c) Average cost

Figure 6.9: The performance metrics as a function of community number and time(DieselNet).

different community partitioning algorithms, including m-partition, k-clique perco-

lation algorithm [71] (which considers the adjacent k-clique as communities), and

Girvan Newman algorithm [67] (which continues removing edges with the highest

betweenness until a certain threshold is reached).

Fig. 6.11 presents experimental results of MIT Reality trace. As shown in Fig.

6.11a, the m-partition method outperforms Girvan Newman by10% and k-clique

percolation by 2% in delivery ratio. In terms of average delay, as shown in Fig.

6.11b, m-partition performs slightly better than the othertwo algorithms. The three

algorithms takes similar average cost as shown in Fig. 6.11c. Similar results are

also observed on DieselNet data set as shown in Fig. 6.12 and Cabspotting data set

as shown in Fig. 6.13.

Despite the different algorithms used for community partitioning, the routing

performance is quite similar for all three DTN data sets. It indicates that our pro-

posed m-partition community detection method can adapt to the implementation

76

Page 93: Social-based Data Routing Strategies in Delay Tolerant

6.4. Performance Evaluation

(a) Delivery ratio (b) Average delay

(c) Average cost

Figure 6.10: The performance metrics as a function of community number and time(Cabspotting).

of SMART which performs similar as other sophisticated community partitioning

algorithms relying on global information. Since Girvan Newman and k-clique per-

colation need global network topology information which isvery difficult to obtain

in DTNs, the proposed m-partition algorithm is more suitable for distributed imple-

mentation in the real world.

6.4.4 Performance analysis

We compare SMART with five existing DTN routing strategies: PROPHET [56],

SimBet [24], Bubble Rap [40], Friendship Based Routing (FBR) [13], and Epi-

demic routing [89]. PROPHET is a utility-based strategy according to encounter

histories. It forwards data to the nodes with higher delivery rate based on contact

history. SimBet is a utility-based strategy according to social features. It consid-

ers social properties including similarity and centralityto make data forwarding

decision. Bubble Rap is a community-based strategy. It depends on community

77

Page 94: Social-based Data Routing Strategies in Delay Tolerant

Chapter 6. Community-based Routing Strategy

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

0.7

Time [day]

Del

iver

y ra

tio

Girvan Newmanm−partitionk−clique

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 140

1

2

3

4

5x 105

Time [day]

Ave

rage

del

ay [s

]

Girvan Newmanm−partitionk−clique

(b) Average delay

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2

2.5

3

Time [day]

Ave

rage

cos

t

m−partationGirvan Newmank−clique

(c) Average cost

Figure 6.11: The performance of SMART under different community partitioningalgorithms (MIT Reality).

structure and routes data based on rankings calculated fromsocial centrality. FBR

algorithm as another community-based algorithm presentedin [13], it constructs

temporal community and use the nodes with direct connectionto the destination

community for data delivery. Epidemic routing is a flooding strategy. It has high

delivery cost, but its delivery ratio and delay approach thetheoretical bound.

The performance comparison in three data sets is presented in Fig. 6.14, Fig.

6.15 and Fig. 6.16. Fig. 6.14 shows the performance of various algorithms as a

function of time on MIT Reality trace. The delivery ratio is compared in Fig. 6.14a.

It shows that SMART outperforms PROPHET, SimBet, FBR and Bubble Rap. The

delivery ratio of SMART is about 10% higher compared to Bubble Rap and FBR,

15% higher than that of SimBet and nearly 20% higher than thatof PROPHET. The

results also confirm that SMART outperforms utility-based strategies nearly 20%

by solving blind spot and dead end problems. The reason PROPHET performs the

worst is due to the strong community structure of MIT Realitytrace. When source

78

Page 95: Social-based Data Routing Strategies in Delay Tolerant

6.4. Performance Evaluation

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

0.7

Time [day]

Del

iver

y ra

tio

Girvan Newmanm−partitionk−clique

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2

2.5

3

3.5

4x 105

Time [day]

Ave

rage

del

ay [s

]

Girvan Newmanm−partitionk−clique

(b) Average delay

0.5 1 2 4 6 8 10 12 140.2

0.4

0.6

0.8

1

1.2

1.4

1.6

Time [day]

Ave

rage

cos

t

Girvan Newmanm−partitionk−clique

(c) Average cost

Figure 6.12: The performance of SMART under different community partitioningalgorithms (DieselNet).

and destination are inter-connected by a long path, PROPHETwill encounter the

blind spot and dead end problems, which degrade its performance. SimBet exploits

social properties to enhance the delivery ratio but it also encounters high proportion

of blind spot and dead end problems. Bubble Rap and FBR takes advantages of

community structure, so they perform better than PROPHET, but not as well as

SMART. Since Epidemic routing represents the theoretical upper bound of delivery

ratio, the performance of SMART is below the upper bound. Average delay is

compared in Fig. 6.14b. Again, the delay of SMART is lower than the other four

strategies (most of the time their performance are very close), but higher than the

lower bound (Epidemic routing). Average cost is compared inFig. 6.14c. The cost

of PROPHET is the highest. This indicates that “transitivity” in PROPHET is not

accurate enough to predict the relay selection, thus it has longer relay path than

others. The SMART is a little higher than others due to the decaying effect, which

makes SMART take more relays for data delivery.

79

Page 96: Social-based Data Routing Strategies in Delay Tolerant

Chapter 6. Community-based Routing Strategy

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

Time [day]

Del

iver

y ra

tio

Girvan Newmanm−partitionk−clique

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2x 105

Time [day]

Ave

rage

del

ay [s

]

Girvan Newmanm−partitionk−clique

(b) Average delay

0.5 1 2 4 6 8 10 12 141

1.2

1.4

1.6

1.8

2

Time [day]

Ave

rage

cos

t

Girvan Newmanm−partitionk−clique

(c) Average cost

Figure 6.13: The performance of SMART under different community partitioningalgorithms (Cabspotting).

Fig. 6.15 presents the performance results of various algorithms as a function

of time on DieselNet data set. The delivery ratio is depictedin Fig. 6.15a. SMART

outperforms Bubble Rap by 3%, FBR by 5% and PROPHET by 8%. It has nearly

20% higher of delivery ratio than SimBet. Regarding the average delay and the

average cost of each strategy as shown in Fig. 6.15b and Fig. 6.15c, SMART has

very close average delay with Epidemic, which is less than other strategies. The

average cost of SMART is about 50% of that of PROPHET and higher than FBR and

SimBet. DieselNet has very similar network structure with MIT Reality and thus

has similar trend on delivery ratio with MIT Reality. However, due to the regular

and repetition routine of buses in DieselNet, it makes the SimBet meet dead ends

quite often and takes more time to wait until destinations. Therefore, it has lower

delivery ratio and higher average cost. Since DieselNet hasmore tight clustering

structure, it makes Bubble Rap and FBR perform close to SMART. SMART has

similar cost with social-related strategies but much lowercost than PROPHET.

80

Page 97: Social-based Data Routing Strategies in Delay Tolerant

6.4. Performance Evaluation

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

Time [day]

Del

iver

y ra

tio

FBRSMARTProphetBubbleRapSimBetEpidemic

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 140

1

2

3

4

5x 105

Time [day]

Ave

rage

del

ay [s

]

FBRSMARTProphetBubbleRapSimBetEpidemic

(b) Average delay

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2

2.5

3

3.5

4

Time [day]

Ave

rage

cos

t

FBRSMARTProphetBubbleRapSimBet

(c) Average cost

Figure 6.14: The performance comparison of various strategies on MIT RealityMining trace

Comparison of different algorithms’ performance on Cabspotting trace is shown

in Fig. 6.16. Fig. 6.16a depicts the delivery ratio of variedalgorithms as a function

of time. The SMART has very similar performance as PROPHET. It outperforms

FBR by 5%. Bubble Rap algorithm is impacted by weak communitystructure,

which lowers down its delivery ratio around 10% compared to SMART. SimBet

has the lowest delivery ratio, which is much lower than otherstrategies. In terms of

average delay as shown in Fig. 6.16b, SMART costs as low as Epidemic algorithm

delay, which is much lower than others. The average costs of various algorithms

are similar as shown in Fig. 6.16c.

In a summary, the proposed SMART strategy outperforms the utility-based and

community-based strategies on various DTN data sets in mostof the performance

metrics.

81

Page 98: Social-based Data Routing Strategies in Delay Tolerant

Chapter 6. Community-based Routing Strategy

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

0.7

Time [day]

Del

iver

y ra

tio

FBRSMARTProphetBubbleRapSimBetEpidemic

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 140

1

2

3

4

5x 105

Time [day]

Ave

rage

del

ay [s

]

FBRSMARTProphetBubbleRapSimBetEpidemic

(b) Average delay

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2

2.5

3

Time [day]

Ave

rage

cos

t

FBRSMARTProphetBubbleRapSimBet

(c) Average cost

Figure 6.15: The performance comparison of various strategies on DieselNet trace

6.5 Summary of Contributions

Data routing in DTNs is challenging due to the fact that nodesare constantly mov-

ing and the opportunity of communication between node pairsis intermittent. Ex-

isting routing strategies encounter the problems of blind spot and dead end, and

also lack of efficient implementation in DTNs. In this Chapter, we first investi-

gate the characteristics of DTNs by analyzing three data sets collected from cell-

phones, buses and taxis. We reveal the social and mobile features of DTNs: they

have community structure and their movement shows locality. Based on these fea-

tures, we propose the social and mobile aware routing strategy called SMART.

In this strategy, a DTN is divided into a number of communities using an adap-

tive community partitioning algorithms. Two data routing processes are intro-

duced: intra-community communication and inter-community communication. For

intra-community communication, a utility function convoluting social similarity

82

Page 99: Social-based Data Routing Strategies in Delay Tolerant

6.5. Summary of Contributions

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

Time [day]

Del

iver

y ra

tio

FBRSMARTProphetBubbleRapSimBetEpidemic

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2x 105

Time [day]

Ave

rage

del

ay [s

]

FBRSMARTProphetBubbleRapSimBetEpidemic

(b) Average delay

0.5 1 2 4 6 8 10 12 141

1.5

2

2.5

Time [day]

Ave

rage

cos

t

FBRSMARTProphetBubbleRapSimBet

(c) Average cost

Figure 6.16: The performance comparison of various strategies on Cabspottingtrace

and social centrality with a decay factor is used to choose relay nodes. For inter-

community communication, the nodes moving frequently across communities are

chosen as relays to carry the data to destination efficiently. It is shown that such

routing strategy significantly alleviates the blind spot and dead end problems. It

adapts to the community structure by enhancing performancefor inter-community

communication. We conduct extensive experiments to compare the performance of

SMART with other DTN routing strategies. It presents that the proposed routing

strategy works well in various DTN traces.

83

Page 100: Social-based Data Routing Strategies in Delay Tolerant
Page 101: Social-based Data Routing Strategies in Delay Tolerant

Chapter 7

Discussion and Future Works

In this chapter, we have a fair comparison of three proposed social-based routing

strategies and discuss the future work that can be done in thearea of social-based

data routing in DTN.

7.1 A Comparison of Three Strategies

We conduct a comprehensive comparison of three proposed social-based routing

strategies on three real data traces in terms of delivery ratio, average delay and av-

erage cost. Similar as previous evaluation setup, we use HaggleSim for the simula-

tion. 1000 messages are generated with randomly selected sources and destinations.

In this group of experiment, each message only keeps one copyof message in the

network. We run each simulation 20 times for the result convergence.

Fig. 7.1 shows the performance of three social-based routing strategies on MIT

Reality data trace. The delivery ratio of three different social-based routing strate-

gies is presented in Fig. 7.1a. The location-based routing strategy Loc has simi-

lar performance with the encounter-based routing strategies Soc, although Soc has

slightly higher delivery ratio compared with Loc. This is consistent to the results

when comparing location-based strategy and encounter-based strategy in Chapter

5. The delivery ratio of SMART is 50% higher than that of Soc and Loc. The aver-

age delay of three social-based routing schemes is shown in Fig. 7.1b. It suggests

SMART has higher average delay than both Soc and Loc, whereasSoc and Loc

perform similar in terms of average delay. The average cost of SMART, as shown

in Fig. 7.1c, is 40% higher than both Soc and Loc. The cost of Soc is also higher

than that of Loc. However, the difference between them is still not significant.

Fig. 7.2 shows the performance of three proposed social-based routing strate-

gies on DieselNet data trace. The delivery ratio is shown in Fig. 7.2a. The

community-based strategy SMART outperforms both the location-based social strat-

egy Loc and encounter-based routing strategy Soc over 100%.The performance of

Soc and Loc are nearly the same with the delivery ratio of Soc slightly higher. The

85

Page 102: Social-based Data Routing Strategies in Delay Tolerant

Chapter 7. Discussion and Future Works

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

0.7

Time [day]

Del

iver

y ra

tio

SMARTSocLoc

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 140

1

2

3

4

5x 105

Time [day]

Ave

rage

del

ay [s

]

SMARTSocLoc

(b) Average delay

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2

2.5

Time [day]

Ave

rage

cos

t

SMARTSocLoc

(c) Average cost

Figure 7.1: Performance comparison of three social-based strategies on MIT Real-ity data trace

average delay, presented in Fig. 7.2b, shows differences among different strategies

as well. Specifically, Loc and Soc has similar average delay,which is consistent

with the comparison between Loc and Soc in Chapter 5. The average delay of

SMART is only 10% higher than that of Loc and Soc. Fig. 7.2c shows the average

cost of three social-based routing schemes on DieselNet. Itis clear that SMART

has much higher cost than both Soc and Loc (around 1 hop longer). In contrast, the

cost of Soc and Loc are still similar to each other.

Fig. 7.3 presents the performance of three social-based routing schemes on Cab-

spotting data trace. The delivery ratio of SMART, as shown inFig. 7.3a, reaches

0.7 in the end of the experiment, which is 133% higher than that of Loc and Soc.

The delivery ratio of Loc and Soc are low and similar to each other. In this experi-

ment, Loc has slightly higher delivery ratio than Soc. Fig. 7.3b shows the average

delay of three different routing strategies. In the figure, SMART has higher av-

86

Page 103: Social-based Data Routing Strategies in Delay Tolerant

7.1. A Comparison of Three Strategies

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

0.7

Time [day]

Del

iver

y ra

tio

SMARTSocLoc

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

2

2.5

3

3.5

4x 105

Time [day]

Ave

rage

del

ay [s

]

SMARTSocLoc

(b) Average delay

0.5 1 2 4 6 8 10 12 140

0.5

1

1.5

Time [day]

Ave

rage

cos

t

SMARTSocLoc

(c) Average cost

Figure 7.2: Performance comparison of three social-based strategies on DieselNetdata trace

erage delay, but compared with delivery ratio, the outstanding proportion is much

lower. Soc uses sightly higher average delay compared with Loc. The average cost

is shown in Fig. 7.3c. Again, Soc and Loc have similar cost after the second day

of the simulation. The cost of SMART varies from Soc and Loc significantly. It is

about 30% higher than others.

Overall, the community-based routing strategy SMART has much better perfor-

mance than both location-based routing strategy Loc and encounter-based routing

strategy Soc. Whereas, encounter-based routing strategy has similar performance

with location-based strategy. This is consistent with the comparison results that we

have presented in Chapter 5.

87

Page 104: Social-based Data Routing Strategies in Delay Tolerant

Chapter 7. Discussion and Future Works

0.5 1 2 4 6 8 10 12 140

0.1

0.2

0.3

0.4

0.5

0.6

0.7

Time [day]

Del

iver

y ra

tio

SMARTSocLoc

(a) Delivery ratio

0.5 1 2 4 6 8 10 12 142

4

6

8

10

12

14

16x 104

Time [day]

Ave

rage

del

ay [s

]

SMARTSocLoc

(b) Average delay

0.5 1 2 4 6 8 10 12 141.1

1.2

1.3

1.4

1.5

1.6

1.7

1.8

Time [day]

Ave

rage

cos

t

SMARTSocLoc

(c) Average cost

Figure 7.3: Performance comparison of three social-based strategies on Cabspottingdata trace

7.2 Future Works

In this thesis, we mainly focus on using social information for the enhancement of

data routing performance. For the future research, we will target on the following

future research directions: resource utilization and allocation, privacy preserving

and content centric routing.

One key future research direction for data dissemination indelay tolerant net-

works is the resource utilization and allocation. In a DTN, especially with human

involved, mobile nodes have limited energy, storage and computing capability and

etc. All these resources may run out during the routing process. One question is

how to allocate different resources. Social relationship in such case takes a very

important role. People may be willing to share more resourcewith his friends, but

unwilling to serve for strangers. Similarly, if the buffer of a node is full, a node

88

Page 105: Social-based Data Routing Strategies in Delay Tolerant

7.2. Future Works

may first drop the message for those nodes that it is not familiar with. The other

one is how to save the resource for the consideration of further usage of device and

environment concern. People with friendship may be cooperative for data storage

and computation sharing. For example, a node with very limited storage may apply

his friend’s storage for those data not urgently used. Similar scenario can also be

exploited for energy consumption and computing capability.

Another important future research direction for data dissemination in DTNs is

the privacy preserving. Although we consider location-based social information

may violate user privacy, there are few works targeting on privacy preserving in

terms of data routing in DTNs. Especially, different types of social information also

refer to privacy issue. The study of privacy concern in social networks [6, 78, 54,

79] have been prevailing recently. Besides, people with close friendship may have

less privacy issue since people are willing to share part of his privacy information

among friends. However, the privacy preserving bound between strangers is much

higher. A person may not share any personal information witha stranger. Therefore,

the future focus is that how to preserve user privacy during the routing process in

DTNs from social perspective.

Finally, as the proliferation of content centric network indelay tolerant net-

works [88, 69, 46, 90], data dissemination needs to adapt to such network struc-

ture. In content-centric delay tolerant networks, nodes donot need to care where

the date is stored. Data is cached on the path for data transmission. Therefore, the

source-destination data routing schemes cannot adapt to such kind of networks. The

main challenges of routing in content-centric delay tolerant network become to ad-

dress the following two questions: where is the data and how to cache data? Thus,

the routing for content-centric DTNs needs to be in end-to-data diagram. Since

the DTN is opportunistically connected, optimal caching mechanism for content-

centric DTNs also needs to be carried out.

89

Page 106: Social-based Data Routing Strategies in Delay Tolerant
Page 107: Social-based Data Routing Strategies in Delay Tolerant

Chapter 8

Conclusions

In this dissertation, we discuss several social-based routing strategies for DTNs.

We pointed out that data dissemination in DTNs is a challengesince the dynamic

network topology, limited network information and interrupted connectivity which

causes the end-to-end communication to become impractical. There are many of

routing strategies proposed to enhance the routing performance in DTNs, which

can be categorized into three groups from social perspective: location-based rout-

ing, encounter-based routing and community-based routing. One limitation of the

existing routing scheme is that they apply sole node or network feature to construct

the routing metric, which cannot adapt to various situations in the network. Be-

sides, most utility-based routing strategies meet blind spot and dead end problems

during the routing process. Lastly, rare efficient inter-community routing schemes

are proposed for community-based routing schemes.

This thesis models social graph based on geographic information and encounter

pattern. Based on location-based social information and encounter-based social in-

formation, a location-based social routing strategy comprehensively incorporating

location information is proposed in Chapter 4 to adapt to different network situa-

tions and thus enhance routing performance, and an encounter-based social routing

strategy combining multiple encounter-based social properties is devised in Chapter

5 to against sensitive location information and preserve privacy in the network.

Specifically, we discussed the location-based social routing in Chapter 4. Based

on the observation that the routing strategy relying on soleaspect of location infor-

mation cannot adapt to different network situation in DTN, we comprehensively

consider two important geographical aspects, geographical distance and mobil-

ity pattern, to compose the routing utility metric. The evaluation results show

that the proposed comprehensive location-based routing strategy outperforms other

location-based routing strategies that relying on sole aspect of location information

in terms of delivery ratio, average delay and average cost.

Due to the fact that location-based social information is sensitive and privacy

concern to nodes in DTN, while encounter-based social information is less sensi-

tive, we proposed the encounter-based routing strategy in Chapter 5. The design of

91

Page 108: Social-based Data Routing Strategies in Delay Tolerant

Chapter 8. Conclusions

the scheme combines multiple social properties by convolution. It presents the fea-

ture that the most recent encounters have more effects to therouting utility value.

The utility value decays with respect to social centrality and elapsed time which

provide different decay speeds for different nodes. The evaluation results show that

the proposed encounter-based routing strategy outperforms than other encounter-

based routing schems. Besides, we compared it with the location-based routing

strategy, the results suggest that they have very similar performance and therefore

encounter-based social information is a good subsitute of location-based social in-

formation to route data in DTN for the purpose of privacy preserving.

Finally, the thesis carried out a social and mobile aware routing strategy called

SMART in Chapter 6 that identified the blind spot and dead end problems and ineffi-

ciency of inter-community communication efficiency. SMARTintroduces a convo-

lutionary routing metric for intra-community routing to address blind spot and dead

end problems. It reduces the blind spot and dead end problemsbelow 1%. For effi-

cient inter-community communication, it selects fringe node and utilize utility func-

tion similar to intra-community routing to improve the inter-community communi-

cation efficiency. Overall, SMART achieves significant improvement compared

with other utility and community-based routing strategies. Furthermore, compre-

hensive comparisons of our proposed three routing strategies have been conducted

to outline the benefit of social information applying for data routing in DTNs.

92

Page 109: Social-based Data Routing Strategies in Delay Tolerant

Bibliography

[1] T. Abdelkader, K. Naik, A. Nayak, N. Goel, and V. Srivastava. Sgbr: A

routing protocol for delay tolerant networks using social grouping. IEEE

Trans. Parallel Distrib. Syst., 24(12):2472–2481, Dec. 2013.

[2] B. Adams, D. Phung, and S. Venkatesh. Sensing and using social context.

ACM Transactions on Multimedia Computing, Communications, and Appli-

cations (TOMCCAP), 5(2):11, 2008.

[3] E. Altman, G. Neglia, F. De Pellegrini, and D. Miorandi. Decentralized

stochastic control of delay tolerant networks. InINFOCOM ’09, pages 1134–

1142, 2009.

[4] J. An, Y. Ko, and D. Lee. A social relation aware routing protocol for mobile

ad hoc networks. InIEEE International Conference on Pervasive Computing

and Communications, PerCom ’09, pages 1–6, 2009.

[5] L. Arantes, A. Goldman, and M. V. dos Santos. Using evolving graphs to

evaluate dtn routing protocols.ExtremeCom 2009, 2009.

[6] R. Baden, A. Bender, N. Spring, B. Bhattacharjee, and D. Starin. Persona:

An online social network with user-defined privacy. InProceedings of the

ACM SIGCOMM 2009 Conference on Data Communication, SIGCOMM

’09, pages 135–146, New York, NY, USA, 2009. ACM.

[7] A. Balasubramanian, B. Levine, and A. Venkataramani. Dtn routing as a re-

source allocation problem. InProceedings of the 2007 conference on Appli-

cations, technologies, architectures, and protocols for computer communica-

tions (SIGCOMM ’07), pages 373–384, New York, NY, USA, 2007. ACM.

[8] B. Ball, B. Karrer, and M. Newman. Efficient and principled method for

detecting communities in networks.Physical Review E, 84(3):036103, 2011.

[9] M. Behrisch, L. Bieker, J. Erdmann, and D. Krajzewicz. Sumo - simulation

of urban mobility: An overview. InSIMUL 2011, The Third International

93

Page 110: Social-based Data Routing Strategies in Delay Tolerant

Bibliography

Conference on Advances in System Simulation, pages 63–68, Barcelona,

Spain, 2011.

[10] A. Beresford and F. Stajano. Location privacy in pervasive computing.Per-

vasive Computing, IEEE, 2(1):46–55, 2003.

[11] C. Bettini, X. Wang, and S. Jajodia. Protecting privacyagainst location-based

personal identification.Secure Data Management, pages 185–199, 2005.

[12] J. A. Bitsch Link, N. Viol, A. Goliath, and K. Wehrle. SimBetAge: utiliz-

ing temporal changes in social networks for pocket switchednetworks. In

Proceedings of the 1st ACM workshop on User-provided networking: chal-

lenges and opportunities (U-NET ’09), pages 13–18, New York, NY, USA,

2009. ACM Press.

[13] E. Bulut and B. Szymanski. Exploiting friendship relations for efficient

routing in mobile social networks.IEEE Transactions on Parallel and Dis-

tributed Systems, PP(99):1, 2012.

[14] E. Bulut, Z. Wang, and B. Szymanski. Cost-effective multiperiod spraying

for routing in delay-tolerant networks.IEEE/ACM Transactions on Network-

ing, 18(5):1530–1543, 2010.

[15] J. Burgess, B. Gallagher, D. Jensen, and B. Levine. Maxprop: Routing for

vehicle-based disruption-tolerant networks. InProceedings of 25th IEEE

International Conference on Computer Communications (INFOCOM ’06),

pages 1–11, 2006.

[16] J. Burgess, B. N. Levine, R. Mahajan, J. Zahorjan, A. Balasubrama-

nian, A. Venkataramani, Y. Zhou, B. Croft, N. Banerjee, M. Corner,

and D. Towsley. CRAWDAD data set umass/diesel (v. 2008-09-14).

http://crawdad.cs.dartmouth.edu/umass/diesel, 2008.

[17] S. Burleigh and K. Scott. Bundle protocol specification. IETF Request for

Comments RFC, 5050, 2007.

[18] B. Burns, O. Brock, and B. Levine. Mv routing and capacity building in dis-

ruption tolerant networks. InProceedings of 24th IEEE International Con-

ference on Computer Communications (INFOCOM ’05)., pages 398 – 408,

2005.

94

Page 111: Social-based Data Routing Strategies in Delay Tolerant

[19] V. Cerf, S. Burleigh, A. Hooke, L. Torgerson, R. Durst, K. Scott, K. Fall, and

H. Weiss. Delay-tolerant networking architecture.RFC4838, April, 2007.

[20] A. Chaintreau, P. Hui, J. Crowcroft, C. Diot, R. Gass, and J. Scott. Impact of

human mobility on opportunistic forwarding algorithms.IEEE Transactions

on Mobile Computing, 6(6):606–620, 2007.

[21] P.-C. Cheng, K. C. Lee, M. Gerla, and J. Härri. Geodtn+ nav: geographic dtn

routing with navigator prediction for urban vehicular environments.Mobile

Networks and Applications, 15(1):61–82, 2010.

[22] J. Coleman.Foundations of Social Theory. Belknap Press, 1990.

[23] J. S. Coleman. Social capital in the creation of human capital. American

Journal of Sociology, 94:S95–S120, 1988.

[24] E. M. Daly and M. Haahr. Social network analysis for routing in discon-

nected delay-tolerant manets. InProceedings of the 8th ACM international

symposium on Mobile ad hoc networking and computing (MobiHoc ’07),

pages 32–40, New York, NY, USA, 2007. ACM.

[25] N. Eagle and A. S. Pentland. CRAWDAD data set mit/reality (v. 2005-07-

01). http://crawdad.cs.dartmouth.edu/mit/reality, 2005.

[26] F. Fabbri and R. Verdone. A sociability-based routing scheme for delay-

tolerant networks.EURASIP Journal on Wireless Communications and Net-

working, 2011:1, 2011.

[27] K. Fall. A delay-tolerant network architecture for challenged internets. In

Proceedings of the 2003 conference on Applications, technologies, architec-

tures, and protocols for computer communications (SIGCOMM ’03), pages

27–34, New York, NY, USA, 2003. ACM.

[28] J. Fleming.Magnets and electric currents: An elementary treatise for the use

of electrical artisans and science teachers. E. & F.N. Spon, limited, 1902.

[29] S. Fortunato. Community detection in graphs.Physics Reports,

486(3šC5):75 – 174, 2010.

[30] L. Freeman. Centrality in social networks: Conceptualclarification. Social

Networks, 1(3):215–239, 1979.

95

Page 112: Social-based Data Routing Strategies in Delay Tolerant

Bibliography

[31] L. C. Freeman. A set of measures of centrality based on betweenness.So-

ciometry, 40(1):35–41, 1977.

[32] W. Gao and G. Cao. On exploiting transient contact patterns for data for-

warding in delay tolerant networks. In18th IEEE International Conference

on Network Protocols (ICNP ’10), pages 193–202, 2010.

[33] W. Gao, Q. Li, B. Zhao, and G. Cao. Multicasting in delay tolerant networks:

a social network perspective. InProceedings of the tenth ACM international

symposium on Mobile ad hoc networking and computing (MobiHoc ’09),

pages 299–308, New York, NY, USA, 2009. ACM.

[34] N. Glance, D. Snowdon, and J.-L. Meunier. Pollen: Usingpeople as a com-

munication medium.Comput. Netw., 35(4):429–442, Mar. 2001.

[35] M. S. Granovetter. The strength of weak ties.American journal of sociology,

pages 1360–1380, 1973.

[36] M. Grossglauser and M. Vetterli. Locating mobile nodeswith ease: Learning

efficient routes from encounter histories alone.IEEE/ACM Trans. Netw.,

14(3):457–469, June 2006.

[37] S. Guo, M. H. Falaki, E. A. Oliver, S. Ur Rahman, A. Seth, M. A. Zaharia,

U. Ismail, and S. Keshav. Design and implementation of the kiosknet system.

In International Conference on Information and Communication Technolo-

gies and Development(ICTD ’07), pages 1–10, 2007.

[38] P. Hui, A. Chaintreau, J. Scott, R. Gass, J. Crowcroft, and C. Diot. Pocket

switched networks and human mobility in conference environments. InPro-

ceedings of the 2005 ACM SIGCOMM Workshop on Delay-tolerant Network-

ing, WDTN ’05, pages 244–251, New York, NY, USA, 2005. ACM.

[39] P. Hui and J. Crowcroft. How small labels create big improvements. In

Fifth Annual IEEE International Conference on Pervasive Computing and

Communications Workshops, 2007 (PerCom Workshops ’07), pages 65 –70,

2007.

[40] P. Hui, J. Crowcroft, and E. Yoneki. Bubble rap: social-based forwarding

in delay tolerant networks. InProceedings of the 9th ACM international

symposium on Mobile ad hoc networking and computing (MobiHoc ’08),

pages 241–250, New York, NY, USA, 2008. ACM.

96

Page 113: Social-based Data Routing Strategies in Delay Tolerant

[41] K. Jahanbakhsh, G. C. Shoja, and V. King. Social-greedy: a socially-based

greedy routing algorithm for delay tolerant networks. InProceedings of the

Second International Workshop on Mobile Opportunistic Networking (Mo-

biOpp ’10), pages 159–162, New York, NY, USA, 2010. ACM.

[42] S. Jain, K. Fall, and R. Patra. Routing in a delay tolerant network. InPro-

ceedings of the 2004 conference on Applications, technologies, architectures,

and protocols for computer communications (SIGCOMM ’04), pages 145–

158, New York, NY, USA, 2004. ACM.

[43] P. Juang, H. Oki, Y. Wang, M. Martonosi, L. S. Peh, and D. Rubenstein.

Energy-efficient computing for wildlife tracking: Design tradeoffs and early

experiences with zebranet.SIGARCH Comput. Archit. News, 30(5):96–107,

Oct. 2002.

[44] H. Kang and D. Kim. Vector routing for delay tolerant networks. In IEEE

68th Vehicular Technology Conference (VTC ’08), pages 1–5, 2008.

[45] B. Karp and H. T. Kung. Gpsr: greedy perimeter statelessrouting for wireless

networks. MobiCom ’00, pages 243–254, New York, NY, USA, 2000. ACM.

[46] V. Kawadia, N. Riga, J. Opper, and D. Sampath. Slinky: Anadaptive protocol

for content access in disruption-tolerant ad hoc networks.In ACM MobiHoc

2011 International Workshop on Tactical Mobile Ad Hoc Networking, 2011.

[47] J. Krumm. A survey of computational location privacy.Personal Ubiquitous

Comput., 13(6):391–399, Aug. 2009.

[48] J. LeBrun, C.-N. Chuah, D. Ghosal, and M. Zhang. Knowledge-based op-

portunistic forwarding in vehicular wireless ad hoc networks. In IEEE 61st

Vehicular technology conference (VTC ’05), volume 4, pages 2289–2293.

IEEE, 2005.

[49] J. Leguay, T. Friedman, and V. Conan. Evaluating mobility pattern space

routing for dtns. InProceedings of 25th IEEE International Conference on

Computer Communications (INFOCOM ’06), pages 1 –10, 2006.

[50] I. Leontiadis and C. Mascolo. Geopps: Geographical opportunistic rout-

ing for vehicular networks. InIEEE International Symposium on a World

of Wireless, Mobile and Multimedia Networks, (WoWMoM ’07), pages 1–6.

IEEE, 2007.

97

Page 114: Social-based Data Routing Strategies in Delay Tolerant

Bibliography

[51] F. Li and J. Wu. Localcom: a community-based epidemic forwarding scheme

in disruption-tolerant networks. InProceedings of the 6th Annual IEEE com-

munications society conference on Sensor, Mesh and Ad Hoc Communica-

tions and Networks (SECON ’09), pages 574–582, Piscataway, NJ, USA,

2009. IEEE Press.

[52] Z. Li and H. Shen. Utility-based distributed routing inintermittently con-

nected networks. In37th International Conference on Parallel Processing,

2008 (ICPP ’08), pages 604–611. IEEE, 2008.

[53] Z. Li and H. Shen. Sedum: Exploiting social networks in utility–based dis-

tributed routing for dtns.IEEE Trans. Comput., 62(1):83–97, Jan. 2013.

[54] X. Liang, X. Li, K. Zhang, R. Lu, X. Lin, and X. Shen. Fullyanonymous

profile matching in mobile social networks.IEEE J Sel Areas Commun (to

appear), 2013.

[55] K. C.-j. Lin, C.-w. Chen, and C.-F. Chou. Preference-aware content dis-

semination in opportunistic mobile social networks. InProceedings of the

31st Annual IEEE International Conference on Computer Communications

(INFOCOM ’12), pages 1960–1968, 2012.

[56] A. Lindgren, A. Doria, and O. Schelén. Probabilistic Routing in Intermit-

tently Connected Networks. volume 3126 ofLecture Notes in Computer

Science, pages 239–254. Springer-Verlag GmbH, 2004.

[57] C. Lochert, M. Mauve, H. Füßler, and H. Hartenstein. Geographic routing in

city scenarios.ACM SIGMOBILE Mobile Computing and Communications

Review, 9(1):69–72, 2005.

[58] G. Luo, J. Zhang, H. Huang, K. Qin, and H. Sun. Exploitingintercontact time

for routing in delay tolerant networks.Transactions on Emerging Telecom-

munications Technologies, 2012.

[59] T. Matsuda and T. Takine. (p,q)-epidemic routing for sparsely populated mo-

bile ad hoc networks.IEEE Journal on Selected Areas in Communications,

26(5):783–793, 2008.

[60] A. Mei, G. Morabito, P. Santi, and J. Stefa. Social-aware stateless forwarding

in pocket switched networks. InINFOCOM ’11, pages 251–255, 2011.

98

Page 115: Social-based Data Routing Strategies in Delay Tolerant

[61] A. G. Miklas, K. K. Gollu, K. K. Chan, S. Saroiu, K. P. Gummadi, and

E. De Lara. Exploiting social interactions in mobile systems. In UbiComp

2007: Ubiquitous Computing, pages 409–428. Springer, 2007.

[62] M. Motani, V. Srinivasan, and P. S. Nuggehalli. Peoplenet: engineering a

wireless virtual social network. InProceedings of the 11th annual inter-

national conference on Mobile computing and networking (MobiCom ’05),

pages 243–257, New York, NY, USA, 2005. ACM.

[63] A. Mtibaa, M. May, C. Diot, and M. Ammar. Peoplerank: social oppor-

tunistic forwarding. InProceedings of the 29th conference on Information

communications (INFOCOM ’10), pages 111–115, Piscataway, NJ, USA,

2010. IEEE Press.

[64] P. Mundur, M. Seligman, and J. N. Lee. Immunity-based epidemic routing in

intermittent networks. In5th Annual IEEE Communications Society Confer-

ence on Sensor, Mesh and Ad Hoc Communications and Networks, SECON

’08,, pages 609–611, 2008.

[65] M. Musolesi and C. Mascolo. Car: Context-aware adaptive routing for

delay-tolerant mobile networks.IEEE Transactions on Mobile Computing,

8(2):246 –260, 2009.

[66] S. Naidu, S. Chintada, M. Sen, and S. Raghavan. Challenges in deploying

a delay tolerant network. InProceedings of the third ACM workshop on

Challenged networks, pages 65–72. ACM, 2008.

[67] M. Newman and M. Girvan. Finding and evaluating community structure in

networks.Physical Review E, 69:026113, 2004.

[68] M. E. Newman. Fast algorithm for detecting community structure in net-

works. Physical review E, 69(6):066133, 2004.

[69] A. D. Nguyen, P. Senac, V. Ramiro, and M. Diaz. Pervasiveintelligent rout-

ing in content centric delay tolerant networks. InIEEE Ninth International

Conference on Dependable, Autonomic and Secure Computing (DASC ’11),

pages 178–185, 2011.

[70] N. P. Nguyen, T. N. Dinh, Y. Xuan, and M. T. Thai. Adaptivealgorithms for

detecting community structure in dynamic social networks.In Proceedings

99

Page 116: Social-based Data Routing Strategies in Delay Tolerant

Bibliography

of 30th Annual IEEE International Conference on Computer Communica-

tions (INFOCOM ’11), pages 2282–2290, 2011.

[71] G. Palla, I. Derenyi, I. Farkas, and T. Vicsek. Uncovering the overlapping

community structure of complex networks in nature and society. Nature,

435:814, 2005.

[72] I. Parris and T. Henderson. The impact of location privacy on opportunistic

networks. WoWMoM ’11, pages 1 –6, 2011.

[73] F. D. Pellegrini, D. Miorandi, I. Carreras, and I. Chlamtac. A graph-based

model for disconnected ad hoc networks. InINFOCOM, pages 373–381.

IEEE, 2007.

[74] A. Pentland, R. Fletcher, and A. Hasson. Daknet: Rethinking connectivity in

developing nations.Computer, 37(1):78–83, 2004.

[75] C. E. Perkins and P. Bhagwat. Highly dynamic destination-sequenced

distance-vector routing (dsdv) for mobile computers. InACM SIGCOMM

Computer Communication Review, volume 24, pages 234–244. ACM, 1994.

[76] C. E. Perkins and E. M. Royer. Ad-hoc on-demand distancevector routing.

In Second IEEE Workshop on Mobile Computing Systems and Applications

(WMCSA’99), pages 90–100. IEEE, 1999.

[77] M. Piorkowski, N. Sarafijanovic-Djukic, and M. Gross-

glauser. CRAWDAD data set epfl/mobility (v. 2009-02-24).

http://crawdad.cs.dartmouth.edu/epfl/mobility, 2009.

[78] K. P. Puttaswamy, A. Sala, and B. Y. Zhao. Starclique: Guaranteeing user

privacy in social networks against intersection attacks. In Proceedings of

the 5th International Conference on Emerging Networking Experiments and

Technologies, CoNEXT ’09, pages 157–168, New York, NY, USA, 2009.

ACM.

[79] K. P. Puttaswamy and B. Y. Zhao. Preserving privacy in location-based mo-

bile social applications. InProceedings of the Eleventh Workshop on Mobile

Computing Systems & Applications, pages 1–6. ACM, 2010.

[80] N. Ristanovic, G. Theodorakopoulos, and J.-Y. Le Boudec. Traps and Pitfalls

of Using Contact Traces in Performance Studies of Opportunistic Networks.

INFOCOM ’12, 2012.

100

Page 117: Social-based Data Routing Strategies in Delay Tolerant

[81] A. Seth, D. Kroeker, M. Zaharia, S. Guo, and S. Keshav. Low-cost commu-

nication for rural internet kiosks using mechanical backhaul. In Proceedings

of the 12th Annual International Conference on Mobile Computing and Net-

working (MobiCom ’06), pages 334–345, New York, NY, USA, 2006. ACM.

[82] T. Small and Z. J. Haas. The shared wireless infostationmodel: A new ad

hoc networking paradigm (or where there is a whale, there is away). In

Proceedings of the 4th ACM International Symposium on Mobile Ad Hoc

Networking &Amp; Computing (MobiHoc ’03), pages 233–244, New York,

NY, USA, 2003. ACM.

[83] T. Spyropoulos, K. Psounis, and C. S. Raghavendra. Spray and wait: an effi-

cient routing scheme for intermittently connected mobile networks. WDTN

’05, pages 252–259, New York, NY, USA, 2005. ACM.

[84] T. Spyropoulos, K. Psounis, and C. S. Raghavendra. Spray and focus: Ef-

ficient mobility-assisted routing for heterogeneous and correlated mobility.

In Fifth Annual IEEE International Conference on Pervasive Computing and

Communications Workshops, (PerCom Workshops’ 07), pages 79–85. IEEE,

2007.

[85] T. Spyropoulos, K. Psounis, and C. S. Raghavendra. Efficient routing in

intermittently connected mobile networks: the single-copy case.IEEE/ACM

Trans. Netw., 16(1):63–76, 2008.

[86] J. Su, J. Scott, P. Hui, J. Crowcroft, E. De Lara, C. Diot,A. Goel, M. H.

Lim, and E. Upton. Haggle: seamless networking for mobile applications.

In Proceedings of the 9th international conference on Ubiquitous computing

(UbiComp ’07), pages 391–408, Berlin, Heidelberg, 2007. Springer-Verlag.

[87] N. Thompson, R. Crepaldi, and R. Kravets. Locus: a location-based data

overlay for disruption-tolerant networks. CHANTS ’10, pages 47–54. ACM

Press, 2010.

[88] G. Tyson, J. Bigham, and E. Bodanese. Towards an information-centric

delay-tolerant network. InProc. IEEE INFOCOM Workshop on Emerging

Design Choices in Name-Oriented Networking (NOMEN), 2013.

[89] A. Vahdat, D. Becker, et al. Epidemic routing for partially connected ad hoc

networks. Technical report, CS-200006, Duke University, 2000.

101

Page 118: Social-based Data Routing Strategies in Delay Tolerant

Bibliography

[90] L. Wang, R. Wakikawa, R. Kuntz, R. Vuyyuru, and L. Zhang.Data naming in

vehicle-to-vehicle communications. In2012 IEEE Conference on Computer

Communications Workshops (INFOCOM WKSHPS), pages 328–333. IEEE,

2012.

[91] D. J. Watts and S. H. Strogatz. Collective dynamics of ’small-world’ net-

works. Nature, 393(6684):440–442, 1998.

[92] C. Wilson, B. Boe, A. Sala, K. P. Puttaswamy, and B. Y. Zhao. User interac-

tions in social networks and their implications. EuroSys ’09, pages 205–218,

New York, NY, USA, 2009. ACM.

[93] H. Wu, R. Fujimoto, R. Guensler, and M. Hunter. Mddv: a mobility-centric

data dissemination algorithm for vehicular networks. InProceedings of the

1st ACM international workshop on Vehicular ad hoc networks, pages 47–

56. ACM, 2004.

[94] J. Wu and Y. Wang. Social feature-based multi-path routing in delay tolerant

networks. InProceedings of the 31st Annual IEEE International Conference

on Computer Communications (INFOCOM ’12), 2012.

[95] J. Wu, M. Xiao, and L. Huang. Homing spread: Community home-based

multi-copy routing in mobile social networks. InINFOCOM, pages 2319–

2327. IEEE, 2013.

[96] F. Xia, L. Liu, J. Li, J. Ma, and A. Vasilakos. Socially aware networking: A

survey.IEEE Systems Journal, PP(99):1–18, 2013.

[97] M. Xiao, J. Wu, and L. Huang. Community-Aware Opportunistic Routing in

Mobile Social Networks.IEEE Transactions on Computers, PP(99):1, 2013.

[98] H. Zang and J. Bolot. Anonymization of location data does not work: a

large-scale measurement study. MobiCom ’11, pages 145–156, New York,

NY, USA, 2011. ACM.

[99] M. Zarafshan-Araki and K.-W. Chin. Trainnet: A transport system for deliv-

ering non real-time data.Comput. Commun., 33(15):1850–1863, Sept. 2010.

[100] Y. Zhang and J. Zhao. Social network analysis on data diffusion in delay

tolerant networks. InProceedings of the tenth ACM international symposium

102

Page 119: Social-based Data Routing Strategies in Delay Tolerant

on Mobile ad hoc networking and computing (MobiHoc ’09), pages 345–346.

ACM, 2009.

[101] K. Zhu, P. Hui, Y. Chen, X. Fu, and W. Li. Exploring user social behaviors

in mobile social applications. InProceedings of the 4th Workshop on So-

cial Network Systems, SNS ’11, pages 3:1–3:6, New York, NY, USA, 2011.

ACM.

[102] K. Zhu, W. Li, and X. Fu. Smart: A social and mobile awarerouting strategy

for disruption tolerant networks.IEEE Transactions on Vehicular Technology

(to appear), 2013.

[103] K. Zhu, W. Li, and X. Fu. Rethinking routing information in mobile social

networks: Location-based or social-based?Elsevier Computer Communica-

tions (to appear), 2014.

103

Page 120: Social-based Data Routing Strategies in Delay Tolerant
Page 121: Social-based Data Routing Strategies in Delay Tolerant

Curriculum Vitae

Personal Data

Konglin Zhu

Albrecht-Thaer-Weg 10b

37075, Goettingen Germany

Telephone: +4955139172047

Email: [email protected]

Research Interest

Mobile social networks

Online social networks

Routing in Delay Tolerant Networks

Education

2010-present Ph.D. (Computer Science), University of Goettingen

2008-2009 M.S. (Computer Science), University of California Los Angeles

2006-2008 B.S. (Computer Science), Southern Polytechnic State University

2004-2008 B.S. (Computer Science), North China Universityof Technology

105

Page 122: Social-based Data Routing Strategies in Delay Tolerant

Curriculum Vitae

Publications

1. Dongsheng Wei, Konglin Zhu, Xin Wang, Fairness-Aware Cooperative Caching

Scheme for Mobile Social Networks,accepted in IEEE International Confer-

ence on Communications (ICC) 2014.

2. Konglin Zhu, Wenzhong Li, Xiaoming Fu, Rethinking Routing Information

in Mobile Social Networks: Location-based or Social-based? accepted in

Computer Communications. January 2014.

3. Konglin Zhu, Wenzhong Li, Xiaoming Fu, SMART: A Social andMobile

Aware Routing Strategy for Disruption Tolerant Networks,accepted in IEEE

Transactions on Vehicular Technology, October 2013.

4. Konglin Zhu, Wenzhong Li, Xiaoming Fu, Jan Nagler, How Do Online Social

Networks Grow?PLOS ONE (under second review).

5. Konglin Zhu Wenzhong Li and Xiaoming Fu, Modeling population growth in

online social networks,Springer Complex Adaptive Systems Modeling, 2013.

6. Sufian Hameed, Alexander Wolf, Konglin Zhu, Xiaoming Fu, Evaluation of

Human Altruism Using a DTN- based Mobile Social Network Application,

5th workshop on Digital Social Networks (DSN 2012) in conjunction with

Gesellschaft fuer Informatik (GI 2012), Braunschweig, Germany, September

2012.

7. Konglin Zhu, Wenzhong Li , Xiaoming Fu, Exploring Regional and Global

Population Growth in Online Social Networks (extended abstract),Praxis der

Netzwerkforschung 2012 (PdN 2012), Frankfurt am Main, Germany, May

2012.

8. Narisu Tao, KonglinZhu, Xiaoming Fu, Thumb: A Real-Time Resource In-

formation Sharing Application over Mobile Phones,IEEE International Con-

ference on Pervasive Computing and Communication (PerCom) 2012, demo

session, Lugano, Switzerland, March 2012.

9. Konglin Zhu, Kelvin (Biao) Zhou, Xiaoming Fu, and Mario Gerla, Geo-

Assisted Multicast Inter-Domain Routing (GMIDR) Protocolfor MANETs,

106

Page 123: Social-based Data Routing Strategies in Delay Tolerant

Publications

IEEE International Conference on Communications (ICC 2011), Kyoto, Japan,

IEEE, June 2011.

10. Konglin Zhu, Pan Hui, Yang Chen, Xiaoming Fu, Wenzhong Li, Exploring

User Social Behaviors in Mobile Social Applications,4th ACM Workshop on

Social Network Systems (SNS 2011), in conjunction with ACM EuroSys 2011,

Salzburg, Austria, April 2011.

11. Kelvin Biao Zhou, Abhishek Tiwari, Konglin Zhu, You Lu, Mario Gerla,

Anurag Ganguli, Bao-hong Shen, David Krzysiak, Geo-Based Inter-Domain

Routing (GIDR) Protocol for MANETs,IEEE Military Communications Con-

ferences (MILCOM 2009), Boston, MA, October 2009.

January 23rd, 2014 Goettingen

107