2009:015 CIV MASTER'S THESIS REDS - Redundant and Expandable Distributed file Storage system for a serverless network Paula Arenas Lindmark Luleå University of Technology MSc Programmes in Engineering Computer Science and Engineering Department of Computer Science and Electrical Engineering Division of Computer Communication 2009:015 CIV - ISSN: 1402-1617 - ISRN: LTU-EX--09/015--SE

2009:015 CIV

MASTER'S THESIS

REDS - Redundant and Expandable Distributed file Storage system for a serverless network

Paula Arenas Lindmark

Luleå University of Technology

MSc Programmes in Engineering
Computer Science and Engineering

Department of Computer Science and Electrical Engineering
Division of Computer Communication

2009:015 CIV - ISSN: 1402-1617 - ISRN: LTU-EX--09/015--SE

Redundant and Expandable Distributed file Storage system for a serverless network

Paula Arenas Lindmark

Luleå University of Technology
Department of Computer Science and Electrical Engineering

September 2008


ABSTRACT

Peer-to-peer (P2P) storage systems are an interesting and emerging field, providing new
possibilities for distributed applications. P2P storage systems allow a network of
collaborating nodes to increase the availability of their data by storing replicas of the
data on other nodes in the network. They are preferred over traditional client-server
systems for reasons that include fault tolerance, availability and scalability. Despite
their significant potential, current peer-to-peer storage systems lack adequate defenses
against cheating nodes that attempt to use more storage space than they provide.

This thesis addresses this deficiency and presents "REDS - Redundant and Expandable
Distributed file Storage system for a serverless network", which takes a novel approach
to the problem of cheating nodes that try to falsify information in a peer-to-peer
storage system. REDS uses a ranking system based on different kinds of requests to
locate and suspend malicious nodes in the system. A working prototype of the proposed
system has been implemented in Java on top of the FreePastry implementation of the
Pastry routing layer. Furthermore, a graphical user interface for the prototype has been
implemented. Simulation results, based on FreePastry's own simulator, indicate that
REDS scales well and is able to efficiently support a large number of nodes.

PREFACE

This Master's Thesis is the final part of my Master of Science in Computer Science
and Computer Communication at Luleå University of Technology. The project was
developed at the Department of Computer Science at the University of Colorado at
Boulder, USA.

Many people have contributed to the realization of this Master's Thesis. First of all I
would like to thank my supervisor at the Department of Computer Science at the
University of Colorado at Boulder, Dr. Douglas C. Sicker, for making this Master's Thesis
possible and for all the help during the development of this project.

Furthermore I would like to thank Mr. Kevin S. Bauer at the Department of Computer
Science at the University of Colorado at Boulder, for help and advice regarding the
theoretical aspects of anonymous networks. My thanks also go to my examiner, Håkan
Jonsson at the Department of Computer Science and Electrical Engineering at Luleå
University of Technology.

I would also like to thank Staffan Backen at the Department of Computer Science and
Electrical Engineering at Luleå University of Technology, for contributing valuable
ideas and feedback throughout the work and also for standing by my side and having
faith in me. You have been invaluable.

A special thanks goes to my family, who have been a source of constant support and have
always been there when I needed them. Finally, I would like to thank my friends Jessica
Hilb and Suzanne Short in Boulder, USA, for making my stay in Boulder really great.

Paula Arenas Lindmark

CONTENTS

Chapter 1: Introduction
1.1 Background and problem description
1.2 Thesis outline

Chapter 2: Anonymous Networks
2.1 David Chaum's Mix Network
2.2 Onion Routing
2.3 Tor
2.4 Cashmere
2.5 Anonymous Remailers
2.6 Measuring Anonymity
2.7 Attacks against Anonymous Networks
2.8 Users of Anonymous Networks

Chapter 3: Peer-to-peer Systems
3.1 Description of P2P Systems
3.2 Napster
3.3 Gnutella
3.4 BitTorrent
3.5 MorphMix, A P2P Anonymous Communications System
3.6 Freenet

Chapter 4: Peer-to-peer Storage Systems
4.1 OceanStore
4.2 Eternity Service
4.3 PAST
4.4 Scribe
4.5 Anonymity in P2P Storage Systems

Chapter 5: REDS
5.1 Introduction
5.2 REDS Design
5.3 Implementation

Chapter 6: Conclusions and Future Work

CHAPTER 1

Introduction

1.1 Background and problem description

The need for and use of large-scale distributed storage have increased rapidly in the
last few years, and much research has therefore been done in this area. Storage has
traditionally been a localized task requiring specific backup hardware and
administration. In the early 1960s, the first backup practices and strategies started to
emerge. Tape backup is very reliable and scalable, which makes it an attractive solution
even today, but its high cost is a big disadvantage.

The first hard drive was introduced by IBM [1] in 1956. Over the years, hard drive
technology has improved rapidly while its cost has decreased significantly. An important
event was the introduction of redundant array of inexpensive disks (RAID) technology [2].
This data storage scheme uses multiple hard drives to share or replicate data among
them. In 1969 the first floppy disk was introduced and was considered a revolutionary
medium for transporting data from one computer to another. Floppy disks could not store
as much data as hard drives but, being much cheaper and more flexible, they became very
widespread. However, the floppy disk had a very low capacity and was replaced by the
next step in the development of storage media, the CD and DVD. Hard drive capacity has
increased rapidly over the years, and studies suggest that the majority of hard drive
space today often remains underused.

Peer-to-peer (P2P) systems are an exciting and emerging field and have in recent years
become one of the most popular Internet applications. This has led to a significant
growth in research on P2P systems. The novelty of these systems is that the traditional
client-server network is replaced by a decentralized network where peers play the part
of both client and server. P2P systems like Napster [3], Gnutella [4] and BitTorrent [5],
to name just a few, are often associated with illegal file sharing. However, P2P systems
have many more uses than simply enabling illegal file sharing, such as offering a
decentralized, scalable, self-sustaining and fault-tolerant network of peers that provide
access to some of their resources. Because peers join and leave the network, ensuring
high availability of the stored data is an interesting and challenging problem, as is
preventing "freeloaders" who use disproportionately more storage on other peers than
they contribute to the network.

Anonymous networks are a large and growing field of research. They are used to preserve
anonymity and privacy during communication over public networks. Technological support
for anonymity has been the subject of much research, starting with Chaum's seminal
paper [6]. Recently, much of the research has focused on P2P anonymous systems. P2P
systems are designed to scale well, which is an advantage for anonymous systems that
benefit from large user populations.

Instead of purchasing extra hardware or leasing online storage, the unused gigabytes
of hard drive space could be used to store replicas of data within a collaborating
group of networked peers. Though a large amount of research has already been invested in
P2P storage systems, a successful file storage system has yet to be found. This master's
thesis presents the design and implementation of "REDS - Redundant and Expandable
Distributed file Storage system for a serverless network", a new P2P file storage system
that inherits characteristics from previous storage, P2P and anonymous systems. REDS
tries to overcome the problems with cheating nodes using an innovative ranking system.

1.2 Thesis outline

In Section 2, anonymous networks are introduced and some well-known anonymous systems
are described. Section 3 briefly defines what P2P is and then presents the basic
concepts of well-known P2P systems. Section 4 presents major research on distributed
P2P storage systems. In Section 5, the design and implementation of REDS are presented,
as well as an analysis of the performance of REDS.

CHAPTER 2

Anonymous Networks

As the online world expands, preserving anonymity and privacy during communication over
public networks has become increasingly important. With this in mind, there is a need
for solutions that ensure anonymity by hiding the identity of the user during, for
example, online payments and voting, but also email and web browsing. Much effort has
therefore been expended to achieve anonymity in these kinds of electronic interactions,
which has led to the development of different protocols and applications for anonymous
communication.

2.1 David Chaum’s Mix Network

The idea of anonymous communication can be traced back to David Chaum, who in 1981
proposed a system for anonymous email [6]. The proposed system used a computer proxy,
called a Mix, to process emails. The Mix, located between the sender and the receiver,
used public key cryptography to act as a clearinghouse for anonymous emails. To send an
email with Chaum's Mix, the sender must build a specially formatted message for
transmission to the Mix. While the encapsulated and encrypted content of the email
remains unchanged, the source and destination addresses as well as the cryptographic
keys are specific to each Mix. Each Mix only knows the identity of the previous and next
hop in the route, which makes it difficult to identify the complete communication route.
Chaum's Mix network further improves anonymity by processing messages in batches at
random intervals. The random delay makes coincidence and timing attacks more difficult.
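
The batching behaviour described above can be illustrated with a minimal sketch. It is
not Chaum's design: the class name ToyMix and its methods are hypothetical, decryption is
left out, and a batch is released when a size threshold is reached rather than at a
random interval, which keeps the example deterministic enough to test.

```python
import random


class ToyMix:
    """Collects messages and releases them in shuffled batches (hypothetical sketch)."""

    def __init__(self, batch_size, rng=None):
        self.batch_size = batch_size
        self.rng = rng or random.Random()
        self.pool = []

    def receive(self, message):
        """Queue a message; return a shuffled batch once the pool is full, else []."""
        self.pool.append(message)
        if len(self.pool) < self.batch_size:
            return []
        batch, self.pool = self.pool, []
        # Shuffling breaks the link between arrival order and departure order.
        self.rng.shuffle(batch)
        return batch
```

An observer who sees messages enter and leave such a Mix cannot match them by order or
by arrival time alone, which is the property the batching is meant to provide.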

2.2 Onion Routing

Goldschlag, Reed, and Syverson developed a technique for anonymous communication over a
computer network, called Onion Routing [7]. This technique is based on Mix networks, but
was extended to work with real-time communications such as web browsing. Onion Routing
is an infrastructure for private communication over a public network. The system works
in the following way: encrypted messages, or packets of information, are sent through a
distributed network of randomly selected servers, called Onion Routers, each of which
knows only its predecessor and successor. Messages flowing through the network are
unwrapped with symmetric encryption keys: each onion router peels off one layer, which
reveals instructions for the next downstream node. The content of the message can only
be revealed by the final receiver. An onion is a data structure that the onion routers
treat as the destination address. Onion routing does not guarantee perfect anonymity,
but it helps protect users from eavesdroppers who are not watching both the initiator
and the recipient of the message at the time of the transaction. It is also strongly
resistant to traffic analysis.
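
The layering can be sketched as follows. This is a toy illustration under stated
assumptions, not the Onion Routing protocol: the functions keystream_xor, build_onion and
peel are hypothetical names, the "cipher" is a SHA-256 keystream XOR that is not secure,
and the payload is treated as opaque bytes (in practice it could itself be encrypted for
the final receiver).

```python
import hashlib


def keystream_xor(key, data):
    """Toy symmetric cipher: XOR data with a SHA-256 based keystream (NOT secure)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(4, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def build_onion(route, destination, payload):
    """Wrap payload in one layer per router; route is [(name, key), ...] in path order."""
    onion = payload
    next_hop = destination
    for name, key in reversed(route):
        # Each layer holds the next hop and the inner onion, readable only with this key.
        onion = keystream_xor(key, next_hop.encode() + b"|" + onion)
        next_hop = name
    return next_hop, onion  # first router to contact and the outermost onion


def peel(key, onion):
    """Remove one layer: return the next hop and the inner onion."""
    layer = keystream_xor(key, onion)
    next_hop, _, inner = layer.partition(b"|")
    return next_hop.decode(), inner
```

Each router learns only the next hop from its own layer, which mirrors the statement
that a router knows only its predecessor and successor.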

2.3 Tor

In [8], Tor, the second-generation onion router, is described as a circuit-based
low-latency anonymous communication service that addresses limitations in the original
design of onion routing. This is done by adding forward secrecy, congestion control,
directory servers, integrity checking, configurable exit policies, and a practical design
for location-hidden services via rendezvous points.

Tor can anonymize applications that use TCP, such as web browsing and publishing, Instant
Messaging (IM), Internet Relay Chat (IRC) and Secure Shell (SSH). The main goal of the
Tor project is to stop traffic analysis, a form of network monitoring that threatens
anonymity and privacy. Tor uses TCP for transport and Transport Layer Security (TLS) [9]
for encryption. The design of Tor, as described in [10][8][11], has some key elements:
the Onion Router (the server component of the network), the Onion Proxy (the client part
of the network), and the circuit, which is a path of three onion routers through the Tor
network from the onion proxy to the destination server. The first onion router is called
the entrance router, the second is called the mix router and the final one is called the
exit router. Finally, the unit of transmission through the Tor network is called a cell,
which is a fixed-size 512-byte packet that is padded if necessary. Each onion router
maintains two keys: a long-term identity key, used to sign TLS certificates and the
Onion Router's router description (a summary of its keys, address, bandwidth etc.), and
an onion key, used to decrypt requests from users to set up a circuit and negotiate
ephemeral keys. This approach improves forward secrecy by eliminating the risk that
nodes along the circuit could be compromised and forced to decrypt captured traffic.
Once a circuit is torn down, the keys are discarded, which also offers protection from
replay attacks. There is also a short-term link key, established by the TLS protocol,
for communication between Onion Routers. Tor uses the Tor Authentication Protocol (TAP),
described and analyzed in [12]. In their paper, Bauer and McCoy conclude that TAP is
secure in the random oracle model. This means that there is only a small chance of
success for a man-in-the-middle attack.
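
The idea of fixed-size, padded cells can be shown with a short sketch. Only the 512-byte
cell size comes from the text above; the 2-byte length prefix and the function names
to_cells and from_cells are assumptions made for illustration and do not reproduce Tor's
actual cell header.

```python
CELL_SIZE = 512    # fixed cell size stated in the text
LENGTH_FIELD = 2   # hypothetical length prefix, not Tor's real header layout


def to_cells(data):
    """Split data into 512-byte cells, zero-padding the last one if necessary."""
    chunk = CELL_SIZE - LENGTH_FIELD
    cells = []
    for start in range(0, len(data), chunk):
        part = data[start:start + chunk]
        cell = len(part).to_bytes(LENGTH_FIELD, "big") + part
        cells.append(cell.ljust(CELL_SIZE, b"\x00"))
    return cells


def from_cells(cells):
    """Strip the padding again using the length prefix of each cell."""
    parts = []
    for cell in cells:
        length = int.from_bytes(cell[:LENGTH_FIELD], "big")
        parts.append(cell[LENGTH_FIELD:LENGTH_FIELD + length])
    return b"".join(parts)
```

Because every cell has the same size on the wire, an observer cannot use packet length
to tell different kinds of traffic apart.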

2.4 Cashmere

Cashmere [13] is a Mix-based, failure-resilient anonymous routing layer implemented on
structured overlays. Cashmere enables anonymous routing and provides both source
anonymity and unlinkability of source and destination. The key benefit of Cashmere over
traditional approaches is its increased resilience to node failures and node churn,
which generally degrade the performance of traditional anonymous routing protocols based
on Chaum Mixes. Traditionally, routing protocols based on Chaum Mixes achieve anonymity
by relaying the traffic through a sequence of nodes, such that any two nodes that are
not adjacent to each other along the path are unable to identify each other. Thus, if
the relay path contains more than two nodes, the destination cannot identify the source.
More specifically, no downstream node can identify the upstream nodes. Such anonymous
routing has several weaknesses. The most important one is that if any node along the
route fails or misbehaves, the message is never delivered to the destination. Moreover,
in an anonymous setting, it is difficult for the source to detect such failures and to
identify which node has failed. Another problem is that an attacker who can observe all
routers can use timing analysis to identify the route from the source to the
destination. Cashmere presents a novel anonymous routing protocol that attempts to
address these problems with traditional onion routing.

2.5 Anonymous Remailers

A remailer is a server that receives messages and forwards them without revealing the
identity of the sender. Remailers are not anonymous networks but are still interesting,
especially since remailers use some of the techniques found in the anonymous networks
mentioned above. There are four different kinds of remailers, each offering different
functionality.

1. Cypherpunk remailers remove the sender's identity from a message, which makes it
impossible to reply to the message.

2. Mixmaster remailers send a message in small fixed-size packets and reorder the
packets, which makes traffic sniffing harder.

3. With Mixminion remailers, the messages are sent through several mix nodes. The mix
nodes decrypt the messages and mix them before forwarding them. As in Onion Routing,
the messages are encrypted in layers with asymmetric encryption.

4. Pseudonymous remailers give the sender a pseudonym, an alias, which hides the real
identity of the sender but still makes it possible for the receiver to reply to the
message.

2.6 Measuring Anonymity

Measuring the actual anonymity a certain system provides can be hard. Nevertheless,
several papers have been published on how to quantify the degree of anonymity that an
anonymous communication system provides to a user. One definition of the degree of
anonymity, proposed by Reiter and Rubin [14], is 1 − p, where p is the probability that
an attacker, after observing the system, assigns to a user of being the originator of a
message. Berthold et al. [15] define the anonymity as log_2(N), where N is the number of
users of the system. However, both definitions are rejected by Diaz, Seys, Claessens,
and Preneel [16]. They claim that the first definition does not give information about
how distinguishable a user is within the anonymity set. Their comment on the second
definition is that this degree depends only on the number of users in the system and
does not take into account the information the attacker may obtain by observing the
system. Instead, they propose a method that measures the information the attacker
obtains, taking into account the whole set of users and the probabilistic information
the attacker obtains about them.
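
The three measures can be compared on small examples. The first two functions follow the
definitions given above. The third assumes that the method in [16] is the normalized
Shannon entropy d = H(X) / log_2(N), which the text above does not spell out; the
function names are hypothetical.

```python
import math


def reiter_rubin_degree(p):
    """Degree of anonymity 1 - p for a user the attacker suspects with probability p."""
    return 1.0 - p


def berthold_anonymity(n_users):
    """Anonymity log_2(N), which depends only on the number of users N."""
    return math.log2(n_users)


def entropy_degree(probabilities):
    """Normalized entropy H(X) / log_2(N) over the attacker's probability distribution.

    Assumed form of the entropy-based measure; 1.0 means all users look equally likely.
    """
    n = len(probabilities)
    if n < 2:
        return 0.0
    entropy = -sum(p * math.log2(p) for p in probabilities if p > 0)
    return entropy / math.log2(n)
```

With four users, a uniform distribution gives an entropy-based degree of 1, while a
distribution in which the attacker is certain about one user gives 0, even though
log_2(N) is the same in both cases. This is the shortcoming of the user-count measure
that the text describes.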

2.7 Attacks against Anonymous Networks

Øverlier and Syverson [17] presented an attack against Tor that revealed the location of
hidden servers. The attack used only a single hostile Tor node and was accomplished in
just a couple of minutes. According to Øverlier and Syverson, their results also apply
to other anonymous networks. The predecessor attack presented in [18], where the
attacker controls a subset of the nodes in the anonymous system to passively log
possible initiators of a stream of communications, was used to statistically identify
the hidden service as the server that appeared on these linked paths most often. The
Sybil attack [19], which applies to all open peer-to-peer systems, demonstrates that an
attacker is able to control a significant fraction of the network. Both the Tarzan and
MorphMix peer-to-peer anonymous communications systems, however, implement a partial
defense. In [10], "Low-Resource, End-to-End Anonymity Attacks Against Tor" is presented.
This attack shows that Tor is vulnerable to attacks from non-global adversaries that
control only a few high-resource nodes, or nodes that are perceived to be
high-resource. By controlling both the start and end node, it is possible to learn who
is communicating with whom through the Tor network. That paper also presents a few
defense methods.

2.8 Users of Anonymous Networks

Anonymity can of course be used for various purposes, both good and bad. Anonymity
allows people to express their views freely, without fear of repercussions. As an
example, people in countries with a repressive political regime may use anonymity to
avoid persecution for their political opinions. When people are anonymous in
discussions, they are more equal: factors like status and gender will not influence the
evaluation of what has been said. There has always, however, been a dark side to
anonymity, and anonymity on the Internet is unfortunately no exception. Criminals of
course want anonymity so that they can break the law without revealing their identity.
However, criminals practically have anonymity already, and the likelihood of these
individuals committing crimes is probably not tied to the availability of any one single
technology. Technologies for anonymity are a way for ordinary people to preserve their
anonymity and privacy. They keep a person's identity hidden and protected not only from
criminals and potential harassment, but also from unwanted marketing and junk email. An
Expectation Maximization (EM) algorithm for distribution reconstruction is described and
analyzed in [20] as a solution to the issue of privacy preservation. Governments can
also benefit from anonymous networking for many reasons, for example by using it to
create anonymous tip lines for whistle-blowers and anonymous means for citizens to
submit leads in criminal investigations.

CHAPTER 3

Peer-to-peer Systems

Peer-to-peer systems are an exciting and emerging field and have in recent years become
one of the most popular Internet applications. They are preferred over traditional
centralized systems for reasons that include fault tolerance, availability, scalability
and performance. P2P networks are widely used to share all kinds of digital content with
other users, both legal and illegal (i.e. copyrighted or subject to censorship). This
chapter provides an overview of P2P systems.

3.1 Description of P2P Systems

The idea behind P2P networks is to have a system that enables end-point resources to be
shared. The resources can be of various kinds, such as files, storage space and CPU
cycles. The defining feature of a P2P system is the ability of end systems (i.e. peers)
to communicate with each other. In this sense, a P2P system can be viewed as an overlay
network over the Internet. According to [21], there are three criteria that define a P2P
system:

• Self-organizing: Nodes organize themselves into a network through a discovery
process. There is no global directory of peers or resources.

• Symmetric communication: Peers are considered equals; they both request and offer
services, rather than being confined to either client or server roles.

• Decentralized control: Peers determine their level of participation and their course
of action autonomously. There is no central controller that dictates behavior to
individual nodes.

The P2P approach existed before file sharing systems like Napster, Gnutella and Freenet
made the idea of P2P popular. The early Internet (ARPANET) was itself based on the P2P
approach and can be seen as one of the first P2P systems.

3.2 Napster

In 1999 Napster [3], the best-known P2P system, was created. The idea behind Napster
was a program that allowed computer users to share and swap files, specifically music,
through a centralized file server. Napster has a server holding a central database of
all files made available by the connected peers (clients). Clients log in to this server
and send a list of the files they offer. To be able to use Napster, each client first
has to create an account on the Napster server. After creating an account, a client can
send file requests to the server and receive a list of clients that hold a matching
file. The requester can then choose clients from this list and request to download the
file directly from them. Because Napster uses a server to hold the information about all
files, it is not a pure P2P system. Napster is actually a combination of a client/server
and a P2P system, where the client/server technology is used for initiation and file
requests and the P2P technology for downloading files. Since Napster uses a centralized
database, it avoids many of the query and routing problems of other P2P systems. The
original Napster was forced to shut down after facing legal issues with the music
industry. Napster was taken over by Bertelsmann and remodeled into a pay service, but
has not regained its original popularity.
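
The division of labour between the central server and the peers can be sketched as
below. The class CentralIndex and its methods are hypothetical and only model the index
lookup; they are not Napster's actual protocol, and the file transfer itself, which
happens directly between clients, is not modelled.

```python
class CentralIndex:
    """Toy model of a central file index (hypothetical, not Napster's protocol)."""

    def __init__(self):
        self.files = {}  # file name -> set of client ids offering it

    def login(self, client, shared_files):
        """Register the list of files a client offers."""
        for name in shared_files:
            self.files.setdefault(name, set()).add(client)

    def logout(self, client):
        """Remove a client from every entry when it disconnects."""
        for holders in self.files.values():
            holders.discard(client)

    def search(self, name):
        """Return the clients holding the file; the download then goes peer to peer."""
        return sorted(self.files.get(name, ()))
```

A search is a single lookup on the server, which is why query routing is simple in this
design, and also why the server is a single point of failure.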

3.3 Gnutella

Gnutella [4] is a decentralized file-sharing system in which the participants form a
virtual network, communicating in a P2P fashion via the Gnutella protocol [4], a simple
protocol for distributed file search. The first step in participating in Gnutella is to
connect to a known Gnutella host. The Gnutella protocol consists of five basic message
types: Ping, Pong, Query, QueryHit and Push. The messages are routed by the peers using
a constrained broadcast mechanism: when a peer receives a message, it decrements the
message's time-to-live (TTL) field. If the TTL is greater than 0 and the peer has not
seen the message's identifier before, it resends the message to all peers it knows
about. Additionally, the peer checks whether it should respond to the message: if the
peer receives a Query message, it checks whether it can satisfy the request and, if so,
responds with a QueryHit message. The response is routed back along the same path as the
originating message.

Gnutella is a simple yet effective protocol. Hit rates for search queries are reasonably
high, it is fault tolerant, and it adapts to dynamic changes in the topology of the P2P
system. However, Gnutella comes with high bandwidth consumption, since search requests
are broadcast over the network and each peer receiving a search request scans its local
database to see if it can satisfy the request.
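
The constrained broadcast can be simulated in a few lines. This is a sketch of the
TTL rule described above, not an implementation of the Gnutella protocol: the function
flood_query, the adjacency-dictionary representation of the overlay and the set of file
holders are all assumptions made for illustration.

```python
from collections import deque


def flood_query(neighbors, origin, ttl, holders):
    """Return the peers that would answer a Query flooded from origin.

    neighbors maps each peer to the peers it knows about; holders is the set of
    peers that can satisfy the request (both are illustrative inputs).
    """
    seen = {origin}
    hits = []
    queue = deque((peer, ttl) for peer in neighbors[origin])
    while queue:
        peer, ttl_in = queue.popleft()
        if peer in seen:
            continue  # message identifier already seen: drop the duplicate
        seen.add(peer)
        if peer in holders:
            hits.append(peer)  # this peer would send a QueryHit back along the path
        ttl_out = ttl_in - 1  # decrement TTL on receipt
        if ttl_out > 0:
            queue.extend((n, ttl_out) for n in neighbors[peer])
    return hits
```

On a chain A-B-C-D where C and D hold the file, a query from A with TTL 2 reaches only
C, while TTL 3 also reaches D. The TTL therefore bounds both the search horizon and the
bandwidth cost of each query.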

3.4 BitTorrent

BitTorrent [5] is a file transfer protocol developed by Bram Cohen and implemented in a
client/server package of the same name. Using this software it is possible to distribute
files over unreliable networks. The transfers are coordinated by special text files
called torrent files, which help the BitTorrent clients coordinate the peer-based
transfers. This means that by connecting to other BitTorrent users it is possible to
upload and download a common file using the combined bandwidth of all users.

In BitTorrent, a shared file is divided into multiple small chunks. A user can download
different chunks concurrently from multiple users and at the same time upload the chunks
it already holds to other BitTorrent users. This means that users do not need to have
the complete file that is being transferred; everyone who has even a small piece of it
may participate in the transfer. Each peer is responsible for maximizing its own
download rate by connecting to suitable peers, and peers that supply a high upload rate
will probably also get a high download rate. When a peer has finished downloading a
file, it may become a seed by staying online and making the file available for other
peers to download.

3.5 MorphMix, A P2P Anonymous Communications System

Mix-based systems for anonymous communication on the Internet give users protection against eavesdropping and traffic analysis. These systems can suffer from scalability

problems, with potentially large bandwidth and system overhead costs, and the servers

themselves can be targets of an attack. These problems have been addressed by Rennhard

and Plattner in [22]. In their paper, they introduce a Peer-to-Peer network for anonymous

communication called MorphMix. In this system, as well as in Tarzan [23] and Freenet

[24], every client is also a proxy at the same time. This makes MorphMix scale very well

and makes it the first system that enables anonymous low-latency Internet access for a

large number of users. MorphMix allows users to tunnel out through a low-latency network and includes some interesting collusion detection algorithms. The most significant difference compared to traditional Mix-based systems is that, because of its network size, MorphMix does not have to send cover traffic to prevent traffic analysis and therefore incurs less overhead.

3.6 Freenet

Freenet [24] is a P2P system for the publication, replication and retrieval of data files. Its

central goal is to provide an infrastructure that protects the anonymity of authors and

readers of the data. It is designed in a way that makes it infeasible to determine the origin


of files or the destination of requests. It is also difficult for a node to determine what

it stores, since the files are encrypted when they are stored and sent over the network.

Because Freenet can potentially contain illegal content, this encryption gives the owner of a node deniability: the owner cannot know what is stored on the node. Thus, following the reasoning of the designers of Freenet, no user can be sued for storing illegal content. Besides anonymity protection, Freenet implements another interesting concept: an adaptive routing scheme for efficiently routing requests to the physical locations where the requested data is most likely to reside. In

order to improve search efficiency Freenet maintains routing tables that are dynamically

updated as searches and insertions of data occur. Freenet also uses dynamic replication

of more popular files such that files can migrate to peers where they are more likely to

be found. Thus Freenet does not require a central server, as Napster does, and, unlike Gnutella, it avoids inefficient message broadcasts.


CHAPTER 4

Peer-to-peer Storage Systems

Traditional storage systems rely on robust servers and magnetic tapes where the data are stored. This is a reliable storage solution, but it is expensive and hence aimed mostly at companies. The growth in storage volume and bandwidth, together with greater demand from private individuals, has fundamentally changed the way applications are constructed and has led to a new type of storage system that uses distributed P2P infrastructures. Compared to traditional storage systems, these systems are inexpensive but pose problems of reliability, confidentiality, availability, routing and more. This chapter briefly presents some well-known distributed storage systems.

4.1 OceanStore

OceanStore [25] is a proposed system for an Internet-based, distributed, global storage

infrastructure. It consists of many cooperating servers provided by different companies.

OceanStore is designed using a cooperative utility model in which consumers pay a fee to

the service providers to ensure access to persistent storage. OceanStore is a storage utility composed of untrusted servers and therefore encrypts data before storing it in the network. The data is split into fragments that can be stored redundantly anywhere in the network, which provides high availability and protection against denial-of-service attacks. Files are uniquely identified by a globally unique identifier (GUID). To locate files, OceanStore uses either a fast but nondeterministic algorithm or a slower, deterministic one. To achieve high fault tolerance, OceanStore provides self-monitoring, a mechanism that continually monitors and repairs neighbor links. OceanStore uses access control lists (ACLs) to restrict write access to data, while read access requires a key.


4.2 Eternity Service

The basic idea behind The Eternity Service [26] is the use of redundancy and distribution

techniques for the replication of data over a large set of nodes. An anonymity mechanism

is also used to try to prevent selective denial attacks.

The user uploads data along with a requested file duration. The user has to pay for this service, and the cost is based on the size of the data and the desired duration. When a user uploads data to, for example, 100 servers, the user only has to remember 10 of these servers for the purpose of auditing their performance.

Because the user does not record most of the servers to which the data has been distributed, there is no way to identify which of the participating eternity servers are storing the data. Data queries are done via broadcast, and data delivery is achieved through one-way anonymous remailers.

4.3 PAST

PAST [27] is a large scale P2P persistent storage management utility built on top of

the Pastry [28] lookup system. It consists of a self-organizing, Internet-based overlay network of storage nodes which cooperatively route file queries, store replicas and cache files. Each PAST node is identified by a 128-bit node identifier. Every file has a fileId that is a SHA-1 hash of the file name and the public key of the node. To retrieve a file in PAST, the fileId is used and, in some cases, the decryption key. In PAST, there are three main operations:

• Insert: insert a file to be stored in the network, replicated k times, where k is a

user specified number.

• Lookup: reliably retrieve a copy of the requested file identified by a fileId.

• Reclaim: reclaim the storage occupied by k copies of the file.

PAST uses smart cards that produce a signed endorsement of a node's request to consume remote storage. The consumed space is charged to an internal counter, and when storage is reclaimed the counter is credited.

4.4 Scribe

Scribe [29] is a decentralized and scalable publish/subscribe system built on top of Pastry.

Users create topics to which other users can subscribe. Each Scribe group has a 160-bit groupId, which serves as the address of the group. The nodes subscribed to each group form a multicast tree, consisting of the union of Pastry routes from all group members


to the node with nodeId numerically closest to the groupId. Users create and insert

messages into the system. The messages are encrypted before being inserted. To send a message

to another user of the group, the notification service is used to provide the recipient with

the necessary information to locate and decrypt the message. The recipient may then

modify their personal metadata to incorporate the message into their view (e.g., into a

private mail folder).

4.5 Anonymity in P2P Storage Systems

Anonymity is a way of dissociating actions from identities. It is a key privacy technol-

ogy, since the value of private information is greatly diminished if it cannot be tied to a

particular identity. Privacy can sometimes be used to hide actions made, but is mostly

used to prevent observers from knowing the identity of a user. Moreover, there are some

kinds of private information that can only be protected with anonymity technologies, as

the actions themselves cannot be otherwise hidden, but the association with the actors

is sensitive information.

Technological support for anonymity has been the subject of much research, starting

with Chaum's seminal paper. Recently, much of the research has focused on P2P anonymous systems. P2P, in this context, means a dynamic, decentralized network comprising a large number of peers, each of whom both provides services for others in the network

and uses the services provided by others. The interest in P2P anonymous systems is

motivated by several factors.

The anonymity community has long been concerned about central points of failure, and research into mix networks, starting with Chaum's original paper, has aimed to defend against attacks on a particular server. The decentralized nature of P2P systems provides a mechanism to distribute trust among a very large population. At the same time, P2P systems are designed to scale to very large numbers of users, in part by using scalable algorithms for network maintenance, and in part by exploiting the capacity scaling that comes from users also providing services to others. Anonymity systems benefit greatly from large user populations, since users can hide their actions among a larger crowd of potential actors. Finally, there is an extra level of deniability that can be achieved by contributing to an anonymous system, rather than being just a user. Therefore, users have

an incentive to provide services to other users, mitigating the free rider problem of P2P

networks.

When considering a P2P publishing system, the need to dissociate actions from identities can be a very important aspect; hence these kinds of systems implement author/publisher anonymity, reader anonymity, or both. This means that the system prevents an adversary from linking an author/publisher to a document, as well as preventing a


document from being linked with its readers. In a P2P storage system that encrypts its data, these forms of anonymity are not really needed.


CHAPTER 5

REDS

This chapter presents REDS - Redundant and Expandable Distributed file Storage sys-

tem for a serverless network, a large scale peer-to-peer persistent storage application.

REDS is based on a self-organizing network of peers.

While traditional network storage systems rely on a central server machine, a serverless system utilizes computers cooperating as peers to provide different services. A serverless system can also provide better performance and scalability than traditional server-based systems. Furthermore, the design of REDS provides high availability via redundant data storage. To demonstrate the functionality of REDS, an implementation has been made in Java, together with a Graphical User Interface (GUI).

5.1 Introduction

REDS is a data storage application designed to provide persistent access to the user’s

files within a P2P network. Inserted files are replicated on multiple nodes to ensure

persistence and availability. With high probability, the set of nodes over which a file is

replicated is diverse in terms of geographic location, ownership, network connectivity etc.

To ensure file content anonymity (i.e., only the owner of a file can read its original content), files are encrypted before distribution. Files are split into chunks, and each node keeps a list of the nodes storing its chunks. The user chooses how many replicas of a file should be distributed on the network, and hence how much disk space this user must allocate to store other users' files. A main problem in current P2P storage systems is preventing cheating nodes, which use disproportionately more storage on other peers than they contribute to the network, or which falsify information about either themselves or some innocent node. To overcome the problems with cheating nodes, REDS uses an innovative approach based on a ranking system for the nodes.


REDS builds on the Pastry [28] P2P routing scheme, ensuring that messages are reliably

routed to the appropriate nodes. Pastry provides fault tolerance and scalability. A request to retrieve a file within the REDS network is routed to a node numerically close to the node that issued the request, among all live nodes that store the file. In [28], it is shown that the maximum expected number of REDS nodes traversed while performing such a request is $\lceil \log_{2^b} N \rceil$, where N is the number of nodes in the network and b is a configuration parameter.
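To give a feeling for the size of this bound, the following small Python sketch evaluates $\lceil \log_{2^b} N \rceil$ for given N and b. The function name `max_expected_hops` is hypothetical, and integer arithmetic is used instead of floating-point logarithms to avoid rounding problems at exact powers of $2^b$.

```python
def max_expected_hops(num_nodes, b=4):
    """Return ceil(log_{2^b}(num_nodes)) using exact integer arithmetic."""
    if num_nodes < 1:
        raise ValueError("the network must contain at least one node")
    base = 2 ** b
    hops, reach = 0, 1
    # Each additional hop multiplies the number of reachable nodes by 2^b.
    while reach < num_nodes:
        reach *= base
        hops += 1
    return hops
```

With the typical value b = 4, a network of 10,000 nodes gives a bound of 4 hops, and a network of one million nodes gives a bound of 5 hops.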

5.1.1 Project Ambitions and Challenges

In REDS, each node will play the role of both provider (each node provides disk space

to other nodes) and consumer (each node will be able to store data at other nodes).

A transaction between nodes can be a request to store, retrieve, delete or verify a file. After a transaction, a node gets credits (in the form of ranking points) for replying to the request. The decentralized REDS system should satisfy the following requirements:

Cheating proof: the system should be resistant to abuse by "cheating" nodes. A decentralized storage system has no central authority that can control the nodes' storage records. In such a system it is reasonable to assume that some nodes will try to behave unfairly. This thesis considers two possible ways of cheating:

Freeloading - when a node, acting on its own, is trying to store more data in the network

than it stores locally for other nodes.

Node collaboration - when a group of nodes collaborates to falsify information of either

themselves or some innocent node.

Adaptable: the system should be efficient, scalable and able to handle the dynamic

nature of P2P systems, such as nodes joining and leaving the network. If a node crashes

and is unable to restore itself, it should be possible to set up a new node and retrieve its

old stored data. Furthermore, unlike in other systems of this type, file deletion should be supported, since typical nodes have finite local storage space.

Content Anonymous: the system should provide file content anonymity, meaning

that the original content of a file stored in the network should not be readable by users other than the owner of the file.

Meeting these requirements poses several challenges in a decentralized environment. The

main challenges are ensuring that information about transactions is accurately recorded,

and making this information available to other nodes upon request. In decentralized

systems it is not obvious who should record this type of information, why other peers

should trust this entity and how nodes are storing the required information.


5.1.2 Contribution

The primary contribution of this thesis is the design and implementation of REDS. This

thesis presents a novel approach to overcome the problems with cheating nodes who

attempt to falsify information in a P2P storage system. In comparison to previous work,

the design of REDS tolerates nodes which are occasionally offline. An analysis of the REDS system has been performed using FreePastry's built-in simulator. The simulations provide a number of performance-critical characteristics, even for a large number of nodes in the network.

5.1.3 Related Work

A number of P2P storage systems have preceded REDS. Some of the earliest systems include The Eternity Service and Freenet, designed to provide uncensorable storage. Like REDS, these systems use cryptography and redundancy to protect data, but unlike REDS, neither The Eternity Service [26] nor Freenet [24] has a defense against cheating nodes that try to use more storage space than they provide. In Freenet, data which is not frequently accessed is deleted to make room for newer data. The Eternity Service uses redundancy and secret sharing to replicate data, and adds anonymity to prevent denial-of-service attacks. Queries in the Eternity Service are broadcast, and delivery is achieved through anonymous remailers.

More recent peer-to-peer storage systems include PAST [27] and OceanStore [25]. These systems do not attempt to provide uncensorability, and are thus simpler than the previous systems. PAST aims to provide a global-scale storage system that uses data replication for durability. PAST uses smart cards issued by trusted third parties for user quota management, so that users cannot use more remote storage than they provide locally. A drawback of this solution is that the smart cards have to be re-issued after a certain time, which increases the cost for the user.

OceanStore is a federated system where utility companies pool their resources to pro-

vide storage to users. Each user contracts with a single company, the responsible party, to

receive storage for a fee. That company then exchanges storage with the other companies

for greater reliability and geographic range. OceanStore, because of its need to support

concurrent updates, is very complicated and requires a great deal of central resources.

Unlike REDS, both PAST and OceanStore involve a cost for the users of these systems.

P2P file sharing systems, like Napster [3] and Gnutella [4], are in wide use and provide

a mechanism for file search and retrieval among a large set of peers. A centralized index

server is used in Napster to handle the searches, while broadcast queries are used in the

Gnutella system. These systems provide anonymity through encrypted search keys, data

caching, source-node spoofing and time-to-live values.


5.2 REDS Design

Any host connected to the Internet can act as a node in REDS by installing the REDS software. A REDS network is formed from a network of nodes, connected to the Internet, storing each other's data. Each node has a unique, randomly chosen 128-bit identifier. For the purpose of routing, the node identifiers are thought of as a sequence of digits with base $2^b$ (b is a configuration parameter with typical value 4). When a node joins the system, a node identity (nodeId) is randomly assigned to it. The identity of the node is then used by Pastry to route messages between the nodes in the network. The node identity indicates a node's position in a circular namespace, which ranges from 0 to $2^{128} - 1$. The identity of the node is generated by computing a cryptographic hash of the node's IP address, and node identities are assumed to be uniformly distributed in the circular namespace. As a result, with high probability there is no correlation between the value of the node identity and the node's geographic location, network connectivity, or ownership.

Files are encrypted and split into data chunks of a specified size before being inserted into the network. Each data chunk has an associated chunk identity computed as an MD5 (Message-Digest algorithm 5) [30] checksum of the chunk's content. A chunk is then stored in the network at the node whose node identity is numerically closest to the chunk identity. An index file is created that describes each file as the ordered list of chunks of which it is composed. To retrieve a file in REDS, the user needs to know the chunk identities, the file's decryption key and the node identity of the storing node.
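The chunking and indexing step can be sketched in Python as follows. This is only an illustration under stated assumptions, not the REDS implementation (which is written in Java): the function names, the in-memory `store` dictionary and the omission of the encryption step are all choices made here for brevity.

```python
import hashlib


def split_into_chunks(data, chunk_size):
    """Split a byte string into chunks of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_identity(chunk):
    """Chunk identity: the MD5 checksum of the chunk's content (128 bits)."""
    return hashlib.md5(chunk).hexdigest()


def build_index(data, chunk_size):
    """Return (index, store): the ordered list of chunk identities and a
    dictionary standing in for the network that holds the chunks."""
    chunks = split_into_chunks(data, chunk_size)
    index = [chunk_identity(c) for c in chunks]
    store = dict(zip(index, chunks))
    return index, store


def reassemble(index, store):
    """Rebuild the (still encrypted) file from its ordered chunk identities."""
    return b"".join(store[cid] for cid in index)
```

Because an MD5 checksum is 128 bits long, a chunk identity falls in the same 128-bit namespace as the node identities, which is what allows a chunk to be placed at the numerically closest node.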

In the design of REDS, each node will have a ranking list containing the ranking status

of all the nodes storing its data. In essence, this approach allows a previously centralized environment to be decentralized, so that no central server or node is needed. REDS does not provide facilities for searching or key distribution. Unlike many other P2P systems, REDS is intended as a storage and content distribution utility and not as a file sharing system.

5.2.1 Routing with Pastry

All distributed systems need a routing layer to get messages to their intended recipients.

Pastry is a P2P routing layer used by REDS. It is a scalable, decentralized and self-

organizing overlay network that automatically adapts to arrival, departure and failure of

nodes. A short overview of the Pastry design will be presented to provide the necessary

background for understanding REDS.

Each Pastry node has a unique, randomly chosen 128-bit identifier, called a node identity. For the purpose of routing, the node identities are thought of as a sequence of digits with base $2^b$ (b is a configuration parameter with typical value 4).
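As an illustration of this digit view, the sketch below splits an integer identifier into its base-$2^b$ digits, most significant digit first. The helper name `node_id_digits` and the `bits` parameter (which allows shorter identifiers in examples) are assumptions made for this sketch.

```python
def node_id_digits(node_id, b=4, bits=128):
    """Return the digits of node_id in base 2^b, most significant first."""
    if bits % b != 0:
        raise ValueError("bits must be a multiple of b")
    if not 0 <= node_id < 2 ** bits:
        raise ValueError("node_id does not fit in the identifier space")
    mask = (1 << b) - 1
    n_digits = bits // b
    # Shift the wanted digit down to the lowest b bits, then mask it out.
    return [(node_id >> (b * (n_digits - 1 - i))) & mask for i in range(n_digits)]
```

With b = 4 the digits are simply the hexadecimal digits of the identifier, so a 128-bit node identity has 32 digits.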


5.2.1.1 Node State

A Pastry node's state consists of a routing table R, a leaf set L, and a neighborhood set M.

The routing table is organized into rows, and the maximum number of rows is $\log_{2^b} N$ (N is the number of nodes in the overlay). Each row can have $2^b - 1$ entries, where each entry maps a node identity to the associated node's IP address. Each entry in row n refers to a node whose node identity matches the present node's node identity in the first n digits, followed by the column number and the rest of the node identity. A routing table entry is left empty if no node with the appropriate node identity prefix is known. Table 1 shows an example of a routing table for a node with node identity 10233102; the associated IP addresses are not shown in the table.

Routing Table

0          1          2          3
-          -          -          -
-          11301233   12230203   13021022
10031203   10132102   -          10323302
10200230   10211302   10222302   -
10230322   10231000   10232121   -
10233001   -          10233232   -
-          -          10233120   -
-          -          -          -

Table 1: Routing table for the node with node identity 10233102. Entry format: matched digits (colored red), column number (colored orange), rest of node identity.
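The position of an entry in Table 1 follows directly from the rule above: the row is the number of leading digits the entry shares with the present node, and the column is the entry's next digit. A small Python sketch, with hypothetical helper names and identities written as equal-length digit strings, makes this concrete.

```python
def shared_prefix_length(a, b):
    """Number of leading digits that two identity strings have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def table_position(own_id, other_id):
    """Return (row, column) of other_id in the routing table of own_id.

    Rows are counted from 0; the column is the first digit that differs.
    """
    row = shared_prefix_length(own_id, other_id)
    if row == len(own_id):
        raise ValueError("a node does not appear in its own routing table")
    return row, int(other_id[row], 16)
```

For the node 10233102 of Table 1, the entry 10233232 shares the five digits 10233 and continues with the digit 2, so it belongs in row 5, column 2.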

Each node also maintains a leaf set (see Table 2) of the l/2 numerically closest larger node identities and the l/2 numerically closest smaller node identities, relative to the present node (l is a configuration parameter with typical value of 16 or 32). The leaf set is used during message routing, as described later on. Each node also maintains a neighborhood set, listing the nodes that are closest to the present node according to the proximity metric. The proximity metric is a scalar value that represents the distance between a pair of nodes, such as the round-trip time. The neighborhood set is not normally used in routing messages, but is useful in maintaining locality properties.

Leaf set

Smaller                 Greater
10233033   10233021     10233120   10233122
10233001   10233000     10233230   10233232

Table 2: Leaf set for node 10233102
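How a leaf set such as the one in Table 2 can be derived from a list of known node identities is sketched below. The function `leaf_set` is hypothetical, and for brevity the sketch assumes a linear identifier space, that is, it ignores the wrap-around of the circular namespace.

```python
def leaf_set(own_id, known_ids, l=8, base=4):
    """Return (smaller, larger): the l/2 closest smaller and the l/2 closest
    larger identities relative to own_id, each ordered closest first."""
    own = int(own_id, base)

    def value(node_id):
        return int(node_id, base)

    smaller = sorted((i for i in known_ids if value(i) < own), key=value, reverse=True)
    larger = sorted((i for i in known_ids if value(i) > own), key=value)
    half = l // 2
    return smaller[:half], larger[:half]
```

Applied to node 10233102 with l = 8, the sketch selects the four smaller and four larger identities listed in Table 2 and discards more distant nodes such as 10231000 or 13021022.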


5.2.1.2 Routing

Messages in Pastry are routed according to a longest-prefix-match principle on the destination's Pastry key. If the key of a message is in the range of the present node's leaf set, the message is routed to the node whose node identity is closest to the key. If the key is not covered by the leaf set, the node looks up in its routing table a node whose node identity shares a longer prefix with the key than its own node identity does, and routes the message to this node. If there is no such node, the message is routed to a node that shares a prefix of the same length as the present node does, but is numerically closer to the destination address. Assuming a REDS network consisting of N nodes, Pastry can route to the node numerically closest to a given chunk identity in fewer than $\lceil \log_{2^b} N \rceil$ steps on average [28]. With concurrent node failures, eventual delivery is guaranteed unless $\lfloor l/2 \rfloor$ nodes with adjacent node identities fail simultaneously (l is a configuration parameter with typical value 16).
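The three routing cases can be summarized in a short Python sketch. It is a simplified illustration and not the FreePastry implementation: the function `next_hop`, the representation of the routing table as a dictionary keyed by (row, digit), and the linear treatment of the identifier space (no wrap-around) are all assumptions made here.

```python
def shared_prefix_length(a, b):
    """Number of leading digits that two identity strings have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def next_hop(own_id, key, leaf_set, routing_table, base=16):
    """Return the identity to forward to, or own_id if this node is the target."""
    def dist(node_id):
        return abs(int(node_id, base) - int(key, base))

    if own_id == key:
        return own_id
    # Case 1: the key lies within the range covered by the leaf set.
    candidates = list(leaf_set) + [own_id]
    values = [int(n, base) for n in candidates]
    if leaf_set and min(values) <= int(key, base) <= max(values):
        return min(candidates, key=dist)
    # Case 2: a routing table entry shares a longer prefix with the key.
    p = shared_prefix_length(own_id, key)
    entry = routing_table.get((p, key[p]))
    if entry is not None:
        return entry
    # Case 3: fall back to any known node with an equally long prefix
    # that is numerically closer to the key than this node.
    known = list(leaf_set) + list(routing_table.values())
    closer = [n for n in known
              if shared_prefix_length(n, key) >= p and dist(n) < dist(own_id)]
    return min(closer, key=dist) if closer else own_id
```

Calling `next_hop` repeatedly, each time at the node returned by the previous call, traces the path of a message until a node returns its own identity.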

Figure 5.1 shows the steps that a query takes, while being routed through the Pastry

routing substrate. In the example below, node with identity 65A1FC is trying to locate

data using the key D46A1C.

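[Figure: the circular node identity space from 0 to $2^{128} - 1$, showing nodes D13DA3, D4213F, D462BA, D467C4 and D471F1; node 65A1FC issues route(D46A1C) towards the key D46A1C.]

Figure 5.1: Routing in Pastry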


5.2.1.3 Node Arrival and Departure

When a new node joins the Pastry network, it has to initialize a routing table, a leaf set and a neighborhood set. After the initialization, the node informs other nodes of its presence. If a node with identity X wants to join the Pastry network, it initially locates a node A in the network. Node A can be located by performing an expanding ring search using a multicast mechanism [31]. Node X sends a "join message" with its node identity X as the key to node A. Node A routes the message to node Z, which is the node numerically closest to X. Each node along the path to node Z sends a row from its routing table to node X: the i-th node sends its i-th row. The neighborhood set is taken from node A, because A is very likely to be close to node X. The leaf set is taken from Z, because it is numerically the nearest node to X. Finally, node X informs any node that needs to be aware of its arrival.

As nodes may fail or depart without warning, nodes in the leaf set periodically exchange keep-alive messages. A Pastry node is considered failed when its immediate neighbors in the node identity space can no longer communicate with it. If a node fails, all the members of the failed node's leaf set update their leaf sets. Stale entries in the routing table are repaired by first trying to find a new route via the downstream node and, if that is not successful, by starting the route table management mechanism.

The main functions in the Pastry API are route and deliver (Tables 3 and 4). Route is

used to send a message to a node which is responsible for a given key, and deliver is a

call-back invoked at the target node when the message has arrived. The method forward

(Table 5) is also a call-back invoked on each node on the routing path.


VOID route(KEY key, MESSAGE msg, NODEHANDLE hint)

Routes the message msg to the node which is responsible for key. The call is asynchronous, no acknowledgment is sent, and no quality of service is guaranteed. The optional parameter hint may contain the address of a node which will be used as the first routing hop. If the hint is good (e.g., it contains the target node's address), the message can be delivered with only one hop. Conversely, a bad hint may introduce a further unnecessary routing step.

Table 3

VOID deliver(KEY key, MESSAGE msg)

This method is a call-back which has to be provided by the application. The P2P routing mechanism calls this method upon arrival of a new message for this node.

Table 4

VOID forward(KEY key, MESSAGE msg, NODEHANDLE nextHopNode)

This is a call-back method which is called at each node on the routing path of message msg, including the source node and the target node. It is called just before the message is forwarded to the next hop, which has already been determined as nextHopNode. Within the method, each parameter may be modified. When key or nextHopNode are modified, the routing behavior is altered.

Table 5

5.2.2 Operation Logic

At present REDS supports a very basic set of operations.

• Insert: Stores a file with the user-specified level of replication, k, which determines how many copies of the file are stored in the network. The file is encrypted and split into fixed-size data chunks. Each data chunk has an associated chunkID. A chunk is stored in the network at the node whose node identity is numerically closest to the chunkID.

• Retrieve: Retrieves a copy of the specified file by retrieving all chunks belonging to the file, identified by their chunkIDs, provided that they exist in REDS and are available in the network. The data chunks are then put together and decrypted with the node's decryption key.

• Delete: This operation reclaims the storage occupied by the k replicas of the file.

It also sends a delete request to the storing node to inform the node that it can

delete the file.
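The three operations can be mimicked with a small in-memory model. The sketch below is only a toy under explicit assumptions: the class `ToyReds` is hypothetical, encryption is omitted, all nodes live in one process, the k replicas of a chunk are placed on the k nodes whose identities are numerically closest to the chunkID (the text above only specifies the closest node), and distance ignores the wrap-around of the circular namespace.

```python
import hashlib


class ToyReds:
    """In-memory stand-in for a REDS network (illustration only)."""

    def __init__(self, node_ids):
        # nodeId (int) -> {chunkID (hex string): chunk bytes}
        self.nodes = {nid: {} for nid in node_ids}

    def _closest(self, chunk_id, k):
        """The k node identities numerically closest to the chunkID."""
        target = int(chunk_id, 16)
        return sorted(self.nodes, key=lambda nid: abs(nid - target))[:k]

    def insert(self, data, k, chunk_size=4):
        """Split data into chunks, store k replicas of each, return the index."""
        index = []
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            cid = hashlib.md5(chunk).hexdigest()
            for nid in self._closest(cid, k):
                self.nodes[nid][cid] = chunk
            index.append(cid)
        return index

    def retrieve(self, index):
        """Fetch every chunk from any node holding it and reassemble the file."""
        parts = []
        for cid in index:
            holder = next((s for s in self.nodes.values() if cid in s), None)
            if holder is None:
                raise KeyError(cid)
            parts.append(holder[cid])
        return b"".join(parts)

    def delete(self, index):
        """Reclaim the storage held by all replicas of the listed chunks."""
        for cid in index:
            for store in self.nodes.values():
                store.pop(cid, None)
```

The index returned by `insert` plays the role of the index file: it is the only information needed later to retrieve or delete the file in this model.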


5.2.3 Ranking System

In a P2P storage system, the ability to consume storage space can be seen as a kind of currency. In such a system, it is quite conceivable that remote storage is more valuable to a node than its local storage. When a node exchanges its local storage for another node's remote storage, both parties benefit from the trade, giving them an incentive to cooperate. However, one of the main problems in a P2P storage system is guaranteeing fairness, that is, ensuring that a node only uses as much storage space as it provides to the system. A P2P storage system needs a decentralized approach to this problem, one which ensures that all nodes are equal and no peer has higher authority than other nodes.

A P2P storage system needs to assure its users that their data has not been modified in the time between the initial store and later retrieval. REDS generates an MD5 checksum of the data before distribution. Upon retrieval, this checksum is compared to the checksum of the retrieved data. Besides ensuring file integrity at the time of retrieval, there must exist some way for the owner of a data chunk to confirm that the storage node continues to store an accurate copy of the originally uploaded data.

One of the first quota approaches in P2P storage systems was proposed in PAST,
suggesting the use of smart cards that produce a signed endorsement of a node’s request
to consume remote storage. The consumed space is charged to an internal counter. When
storage is reclaimed, the counter is credited. However, the smart card approach has
disadvantages. A trusted organization is needed to issue the cards. After a period
of time the smart cards have to be re-issued to invalidate compromised cards, which
increases the cost for the users.

In REDS, a different approach is used to ensure fairness in the system. Unlike the
smart card design, nodes are required to maintain their own usage records and send them
upon request, so that other nodes can inspect them. This approach has to consider the
fact that nodes have no natural incentive to tell the truth about their records. It
therefore needs disincentives for nodes that lie about their records.

Every node in REDS will maintain a usage file, available for other nodes to verify. The

usage file contains information about:

• The amount of disk space the node is providing to the system.

• Local storage list, a list consisting of (nodeId, chunkId) pairs, containing the identifiers and sizes of all files that the node is storing on its local disk on behalf of others.

• Remote storage list, a list consisting of (chunkId, nodeId) pairs, containing the identifiers and sizes of all files that the node is storing in the system on behalf of itself.
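
As an illustration, the usage file could be represented in memory as in the following Java sketch. The class name, the field names and the use of strings for identifiers are assumptions made for this example; the thesis does not specify the actual layout. The `withinQuota` check expresses the fairness goal stated above, namely that a node uses no more remote storage than it provides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory form of a node's usage file.
final class UsageFileSketch {

    // One list entry: the counterpart node, the chunk and its size.
    static final class Entry {
        final String nodeId;
        final String chunkId;
        final long sizeBytes;

        Entry(String nodeId, String chunkId, long sizeBytes) {
            this.nodeId = nodeId;
            this.chunkId = chunkId;
            this.sizeBytes = sizeBytes;
        }
    }

    final long providedBytes;                                  // disk space offered to the system
    final List<Entry> localStorageList = new ArrayList<>();    // stored here on behalf of others
    final List<Entry> remoteStorageList = new ArrayList<>();   // stored elsewhere on our behalf

    UsageFileSketch(long providedBytes) {
        this.providedBytes = providedBytes;
    }

    private static long total(List<Entry> entries) {
        long sum = 0;
        for (Entry e : entries) {
            sum += e.sizeBytes;
        }
        return sum;
    }

    long localBytes() { return total(localStorageList); }

    long remoteBytes() { return total(remoteStorageList); }

    // Fairness goal: consume at most as much space as the node provides.
    boolean withinQuota() { return remoteBytes() <= providedBytes; }
}
```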

The two lists together describe all credits and debits to the node’s quota. The aggregate
file size, which includes the original file size and the sum of all its replicas, is debited
against the node’s quota when a file is saved into the system. The delete operation credits
the aggregate file size back to the node’s quota. The number of replicas may be changed
autonomously by each client node, according to file availability considerations and
the available network storage. Every node in REDS also maintains a ranking file,
containing all node identities from the remote storage list and the ranking status of these
nodes.
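
The quota bookkeeping can be sketched as follows. This sketch assumes that the aggregate file size equals the file size multiplied by one plus the number of additional replicas; this is one reading of the description above, not a formula given in the thesis. All names are hypothetical.

```java
// Hypothetical quota counter for one node.
final class QuotaSketch {
    private long remainingBytes;

    QuotaSketch(long providedBytes) {
        this.remainingBytes = providedBytes;
    }

    // Assumed reading: the original counts as one copy, plus the extra replicas.
    static long aggregateSize(long fileSize, int extraReplicas) {
        return fileSize * (1L + extraReplicas);
    }

    // Insert: debit the aggregate size; refuse if the quota would be exceeded.
    boolean debitInsert(long fileSize, int extraReplicas) {
        long cost = aggregateSize(fileSize, extraReplicas);
        if (cost > remainingBytes) {
            return false;
        }
        remainingBytes -= cost;
        return true;
    }

    // Delete: credit the aggregate size back to the quota.
    void creditDelete(long fileSize, int extraReplicas) {
        remainingBytes += aggregateSize(fileSize, extraReplicas);
    }

    long remaining() {
        return remainingBytes;
    }
}
```

Under this assumption, a node with a quota of 1000 bytes that inserts a 100-byte file with 3 additional replicas is debited 400 bytes, and deleting the file restores the quota to 1000 bytes.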

There are two possible ways for a node to cheat others. The first is to inflate
its advertised capacity beyond the resources of the disk, which might attract storage
requests that the node cannot satisfy. The node may try to compensate for this by creating
fraudulent entries in its local storage list, to claim that the storage is being used. The second
possibility is to deflate the remote storage list by deleting entries without informing the
storage node that it should delete the file.

To prevent nodes from cheating, a storage list control is performed. For every file in its
local storage list, a node checks whether there is a corresponding entry in the owning node’s
remote storage list. If the entry is missing, a negative ranking point is debited to that node’s
entry in the ranking file. Furthermore, a message with the negative ranking point is sent
to the nodes in the cheating node’s remote storage list, informing them to update
their ranking status for the cheating node. In addition, a node randomly sends
requests to the nodes in its ranking file. Whenever the response is positive, the responding
node’s ranking status is increased. The three types of requests are:

• Alive request: A randomly performed alive request that checks whether the node is alive. A node’s reply to the alive request increases its ranking points.

• Checksum request: To ensure a stored file’s existence in the network, a checksum request is periodically sent, at random times, to the node holding the file. The checksum reply is then verified at the node that owns the file. The storing node’s ranking points increase if the checksum is correct and decrease otherwise.

• Data control request: A second control of the stored file’s existence is made, less often than the checksum request, in which a copy of the actual file is requested from the storing peer. When the file arrives at its owner as a reply to the file request, it is checksummed to ensure that it is the correct file and has not been modified from its original form.
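
The storage list control and the ranking updates described above can be sketched in Java as follows. The map-based representation of the lists and the step size of one ranking point per event are assumptions of this example; the thesis does not state how many points each event is worth.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical ranking bookkeeping for one node.
final class StorageListControlSketch {

    // Ranking points per node ID, as kept in the ranking file.
    final Map<String, Integer> rankingPoints = new HashMap<>();

    // localStorageList: owner node ID -> chunk IDs stored locally for that owner.
    // advertisedRemoteLists: owner node ID -> chunk IDs the owner claims to store here.
    // Returns the number of local entries that the owner does not list.
    int control(Map<String, Set<String>> localStorageList,
                Map<String, Set<String>> advertisedRemoteLists) {
        int missing = 0;
        for (Map.Entry<String, Set<String>> e : localStorageList.entrySet()) {
            String owner = e.getKey();
            Set<String> advertised = advertisedRemoteLists
                    .getOrDefault(owner, Collections.<String>emptySet());
            for (String chunkId : e.getValue()) {
                if (!advertised.contains(chunkId)) {
                    // Owner has deflated its remote storage list: negative point.
                    rankingPoints.merge(owner, -1, Integer::sum);
                    missing++;
                }
            }
        }
        return missing;
    }

    // Alive, checksum and data control requests: +1 for a positive response,
    // -1 otherwise (assumed step size).
    void recordReply(String nodeId, boolean positive) {
        rankingPoints.merge(nodeId, positive ? 1 : -1, Integer::sum);
    }
}
```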

5.2.4 Protocols

In order to facilitate the logical operations described in section 5.2.2, the following pro-

tocols have been defined. The file storage protocol is used for the insert operation, when

a node wants to store a file into the network. To perform a data control of a node’s

stored files or in case of file retrieval, the storage control protocol is used. The file delete

protocol is used during the delete operation.

The following notations are used to explain the protocols used in REDS.

Symbol                 Description
S(A)                   A’s storage request
Sr(B)                  B’s storage reply
reply(B)               B’s reply, either Yes or No
leafset(B)             A list of nodes in B’s leafsetList
localStorageList(B)    A list of nodes B stores data for
remoteStorageList(B)   A list of nodes storing B’s data
F(A)                   A’s file control request
Fr(B)                  B’s file control reply
CH(A)                  A’s checksum control request
CHr(B)                 B’s checksum control reply
D(A)                   A’s delete request
F                      File F
name(F)                File name of F
Size(F)                Size of file F
ch(F)                  Checksum of file F
Info(M)                Message M
ID_A                   ID of node A
t                      Timestamp

File Storage Protocol

Node A contacts node B with a request to store file F. Algorithm 1 details the procedure.

Algorithm 1 File Storage Protocol

1. Storage Request
A → B : S(A) = Info(storage, ID_A, ID_B, Size(F))
A sends a storage request to B, which contains the identities of both nodes and the file size.

2. Storage Reply
B → A : Sr(B) = Info(storagereply, reply(B), localStorageList(B), remoteStorageList(B), leafSetList(B))
B sends a storage reply to A containing the reply (Yes or No), its localStorageList containing a list of nodes B is storing data for, B’s remoteStorageList, and its leafSetList containing node identities that A can send storage requests to instead of B.

2.a In case of storage reply No, A will send a request to the nodes from B’s localStorageList, asking these nodes if they are storing data at node B. If a node answers No, A will send out a ranking message saying that node B is lying, to nodes holding B’s data and to those nodes storing data at B. If node B’s localStorageList is confirmed, A will send a storage request to the nodes in B’s leafset.

2.b If the storage reply from B is Yes, a file transfer between A and B is performed.

3. File Transfer
A → B : F
A sends file F to B for storage.
A adds B to its remote storage list.
B adds A to its local storage list.
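
The decision that node A takes after receiving the storage reply (steps 2.a and 2.b) can be sketched in Java as below. The types, the `Action` names and the predicate standing in for the network round trip to each listed node are all hypothetical.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of A's handling of B's storage reply.
final class StorageProtocolSketch {

    enum Action { TRANSFER_FILE, REPORT_LIAR, TRY_LEAFSET }

    static final class StorageReply {
        final boolean accepted;               // reply(B): Yes or No
        final List<String> localStorageList;  // nodes B claims to store data for
        final List<String> leafSetList;       // alternative nodes to ask

        StorageReply(boolean accepted, List<String> localStorageList,
                     List<String> leafSetList) {
            this.accepted = accepted;
            this.localStorageList = localStorageList;
            this.leafSetList = leafSetList;
        }
    }

    // storesDataAtB stands in for asking a listed node whether it really
    // stores data at B.
    static Action next(StorageReply reply, Predicate<String> storesDataAtB) {
        if (reply.accepted) {
            return Action.TRANSFER_FILE;      // step 2.b, followed by step 3
        }
        for (String node : reply.localStorageList) {
            if (!storesDataAtB.test(node)) {
                return Action.REPORT_LIAR;    // step 2.a: B's list is not confirmed
            }
        }
        return Action.TRY_LEAFSET;            // step 2.a: list confirmed, ask B's leafset
    }
}
```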

Storage Control Protocol

The storage control protocol consists of two parts, the checksum control request and
the file control request. Both requests are used to verify a node’s files stored in the
network, and they are shown in Algorithm 2.

Algorithm 2 Storage Control Protocol

1A. Checksum Control Request
A → B : CH(A) = Info(checksum, name(F))
A sends a checksum control request to B, containing the filename.

1B. Checksum Control Reply
B → A : CHr(B) = Info(checksumReply, name(F), ch(F), t)
B sends a checksum control reply containing the filename, the checksum and a timestamp for the checksum.

1C. Ranking List Update
If A receives a checksum control reply from B, A checks whether the checksum is correct and then updates its ranking list, either with a positive credit for a correct checksum or a negative credit for a false checksum reply. If A does not receive a reply from B, A updates the ranking list and gives B negative credit for not replying.

2A. File Control Request
A → B : F(A) = Info(fileControl, name(F))
A sends a file control request containing the filename of the requested file.

2B. File Control Reply
B → A : Fr(B) = Info(fileControlReply, name(F), F)
B sends a file control reply to A containing file F.

2C. Ranking List Update
If A receives a file control reply from B, A computes the checksum of the received file and then updates its ranking list, either with a positive credit for a correct checksum or a negative credit for a false checksum. If A does not receive a reply from B, A updates the ranking list and gives B negative credit for not replying.
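
Steps 1C and 2C both reduce to comparing an MD5 checksum with the value the owner recorded before distribution. A minimal Java sketch, using the standard `java.security.MessageDigest` class, is given below. The class name, the hexadecimal encoding and the credit of plus or minus one are assumptions of this example.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical checksum verification at the owning node A.
final class ChecksumControlSketch {

    // MD5 checksum of the data as a lowercase hexadecimal string.
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Ranking credit for B: +1 for a correct checksum, -1 for a wrong
    // checksum or a missing reply (replyChecksum == null).
    static int rankingDelta(String expectedChecksum, String replyChecksum) {
        return expectedChecksum.equals(replyChecksum) ? 1 : -1;
    }
}
```

For the file control request, A would apply `md5Hex` to the received file content and pass the result to `rankingDelta` in the same way.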

File Delete Protocol

The protocol for deleting files in REDS is described in Algorithm 3.

Algorithm 3 File Delete Protocol

1. Storage Reclaim

Node A reclaims the storage occupied in the network by the file F and all the replicas of the file. The aggregate storage is credited to A’s usage quota.

2. File Deletion Information

A → B : D(A) = Info(fileDelete, name(F), size(F))

A sends a file delete request to B containing the file name and the size of the file, informing B that the file can be deleted. B deletes the file, removes the file’s entry from its local storage list and updates its usage quota.

5.3 Implementation

This section describes important aspects of the implementation. REDS has been
implemented in Java on top of the FreePastry implementation of the Pastry routing layer.
There are several reasons why Java was used as the programming language. One of the
main reasons was the speed of development. Unlike C and C++, Java is type-safe
and uses garbage collection. These two features greatly reduce debugging time, especially
for a large project with a rapid development pace. Another reason for choosing Java was
to make REDS platform independent.

Each node in REDS has a request system and a ranking system. The request system

consists of two modules: the consumer module that generates storage, verification, rank-

ing and deletion requests; the provider module that handles storage, verification, ranking

and deletion requests issued by other nodes. Each module is implemented as a thread

with a message queue for incoming messages. The provider module is event driven and

is activated when a message is put in the queue. The Pastry routing layer is used to

route messages between the nodes in the REDS network.
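
The event-driven provider module can be illustrated with a thread that blocks on a queue, as in the following Java sketch. The class name, the use of plain strings as messages and the `STOP` marker for shutting the loop down are assumptions of this example, not details of the REDS implementation.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical provider module: one thread, one queue of incoming messages.
final class ProviderModuleSketch implements Runnable {

    static final String STOP = "STOP";   // hypothetical shutdown marker

    final BlockingQueue<String> inbox = new LinkedBlockingQueue<>();
    final List<String> handled = new CopyOnWriteArrayList<>();

    @Override
    public void run() {
        try {
            while (true) {
                // take() blocks, so the module is only active when a message arrives.
                String message = inbox.take();
                if (STOP.equals(message)) {
                    return;
                }
                // A real module would dispatch on the request type
                // (storage, verification, ranking, deletion).
                handled.add(message);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

A node would start the module with `new Thread(provider).start()` and hand over incoming requests with `provider.inbox.add(...)`.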

5.3.1 Security

The very nature of P2P indicates that nodes are organized in a flat structure where

no node is more important than another. In other words, users are in control of their

content. The REDS security model is based on the assumption that it is computationally

infeasible to break the symmetric key cryptosystem and the cryptographic hash function

used in REDS.

5.3.2 Initiation

A node identity is required to join REDS. This identity is automatically
generated at the first startup of REDS. REDS then joins the new node to the network
by using Pastry’s built-in mechanism.

A 128-bit encryption/decryption key is generated for the Advanced Encryption Standard
(AES) [32]. AES is a symmetric-key block cipher, which means that it uses the
same key to encrypt and decrypt data. The generated key is then used by REDS to
encrypt all data before distribution, in order to ensure file content anonymity (i.e., only the
owner of a file can read the original content of a file).
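
Key generation and encryption with a 128-bit AES key can be sketched with the standard `javax.crypto` classes as shown below. The thesis does not state which mode of operation REDS uses; the sketch assumes AES in GCM mode with a random 12-byte initialization vector stored in front of the ciphertext. The class and method names are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical AES helper; the GCM mode is an assumption of this sketch.
final class AesSketch {
    private static final int IV_BYTES = 12;
    private static final int TAG_BITS = 128;

    // Generate a fresh 128-bit AES key.
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypt and return IV || ciphertext (including the authentication tag).
    static byte[] encrypt(SecretKey key, byte[] plain) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = cipher.doFinal(plain);
            byte[] out = new byte[iv.length + ct.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(ct, 0, out, iv.length, ct.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Decrypt data produced by encrypt() with the same key.
    static byte[] decrypt(SecretKey key, byte[] ivAndCiphertext) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, ivAndCiphertext, 0, IV_BYTES));
            return cipher.doFinal(ivAndCiphertext, IV_BYTES,
                    ivAndCiphertext.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the same key is used in both directions, only a node holding the key can recover the original content of a stored chunk.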

5.3.3 File Storage

When the initiation process is completed, the node can start storing its data in the
network. After the files to store have been selected, they are encrypted with the
encryption key generated in the initiation step. After encryption, the encrypted files are
split into chunks of a specified size and then distributed into the network. The node’s
usage record and ranking system are updated with the necessary information. In this phase it
is also recommended to use the built-in restore file generator of REDS, which produces a file
containing all the information needed for a node restore. The restore file needs to be stored
somewhere safe outside the node.

5.3.4 Data Verification

Because nodes have no natural incentive to behave fairly, REDS has a verification
system (described in section 5.2.3). Verification requests are randomly sent to nodes
in the remote storage list. Based on the replies or lack of replies, the ranking status of
the nodes is updated. The ranking status of each node is shown in the graphical user
interface (Figure 5.2) by one of three indicators: OK, WATCH OUT or BAD. If the
indicator is BAD for a certain node, the node is deleted from the remote/local storage list
and a new node takes its place. The GUI also indicates the status of the files. A file
has the status OK if at least one replica of the file is stored at a node with status OK,
otherwise it has the status NOT OK.
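
The mapping from ranking points to the three indicators, and the rule for the file status, can be sketched as follows. The thresholds (0 and -5) are placeholders chosen for this example; the thesis does not give the values used by REDS.

```java
import java.util.List;

// Hypothetical status computation for the GUI.
final class StatusSketch {

    enum NodeStatus { OK, WATCH_OUT, BAD }

    // Placeholder thresholds, not taken from the thesis.
    static NodeStatus nodeStatus(int rankingPoints) {
        if (rankingPoints >= 0) {
            return NodeStatus.OK;
        }
        if (rankingPoints > -5) {
            return NodeStatus.WATCH_OUT;
        }
        return NodeStatus.BAD;
    }

    // A file is OK if at least one replica is held by a node with status OK.
    static boolean fileOk(List<NodeStatus> replicaHolders) {
        return replicaHolders.contains(NodeStatus.OK);
    }
}
```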

5.3.5 Node Restorage

If a node crashes and a new node is set up, it is possible to restore the
crashed node’s old data by using the restore file. REDS first checks whether the old node
identity is available and, if so, assigns it to the new node. If it is occupied, a new node identity
is generated. The necessary update information is routed to the nodes that need to know
about the node restore, and the remote/local storage lists are updated.

[Figure 5.2 shows the REDS graphical user interface: a file tree (FILE1 with its chunks and replicas, FILE2) with FILE STATUS and NODE STATUS columns (OK, WATCH OUT, BAD), INSERT, RETRIEVE and DELETE buttons, a STORAGE panel (allocated local space, local storage, remote storage, storage to use) and a MESSAGES log of verification events.]

Figure 5.2: Graphical user interface

5.3.6 Simulations

This section describes the two simulations that have been performed. The simulations
analyze the file availability threshold and the bandwidth overhead in the REDS system.
The file availability simulation shows the number of replicas needed to achieve a desirable
availability of the stored files in REDS. The overhead simulation shows the overhead
of the verification and ranking system. If the overhead becomes very high, the REDS
system will not be efficient.

Figure 5.3: File availability

Figure 5.4: Bandwidth overhead with ranking system

Figure 5.3 presents a simulation of 10000 nodes with unlimited storage space. The
simulation computes the minimum number of replicas necessary to achieve a certain
availability threshold in the presence of node failures. The goal is to use the minimum number
of replicas of a file to provide a desirable level of availability. Nodes join and leave the
network with a specified probability, and the assumption is made that a failing or leaving node
loses all the replicas it stores. The system needs at least one replica to be available; the file
availability is therefore defined as the probability that one or more of the nodes holding a
replica are up. During the simulation, a certain number of nodes go down and a percentage
of the nodes that are up check for replicas of their files. A replica location accuracy of
0.8 and a probability of 0.5 that a node is up are assumed for the simulation. The result of
the file availability simulation shows that only 4 replicas are needed to achieve an availability of 0.8.
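
The reported figure can be cross-checked with a simple closed-form model. The model is an assumption of this sketch, since the thesis does not spell out the simulator's internals: each replica is treated as independently usable with probability 0.5 × 0.8 = 0.4 (the node is up and the replica is located), so the availability with k replicas is 1 − (1 − 0.4)^k. Three replicas give 0.784 and four give 0.8704, which agrees with the result of 4 replicas for an availability of 0.8.

```java
// Hypothetical analytical cross-check, not the FreePastry simulation itself.
final class AvailabilitySketch {

    // Probability that at least one of the replicas is usable,
    // assuming independent replicas.
    static double availability(double pUsable, int replicas) {
        return 1.0 - Math.pow(1.0 - pUsable, replicas);
    }

    // Smallest number of replicas whose availability reaches the target.
    static int minReplicas(double pUsable, double target) {
        if (!(pUsable > 0.0 && pUsable <= 1.0) || !(target >= 0.0 && target < 1.0)) {
            throw new IllegalArgumentException("need 0 < pUsable <= 1 and 0 <= target < 1");
        }
        int k = 1;
        while (availability(pUsable, k) < target) {
            k++;
        }
        return k;
    }
}
```

Calling `AvailabilitySketch.minReplicas(0.5 * 0.8, 0.8)` returns 4 under this model.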

A second simulation, concerning the bandwidth overhead of the data verification and
ranking system, is presented in Figure 5.4. The storage space of each node is randomly chosen
from 1 Gigabyte up to 100 Gigabyte. On each day of the simulated time, each node
performs one checksum request and one data control request to random nodes in its remote
storage list. Furthermore, each node also verifies its local storage list once a day. The
simulation is done with 100, 1000 and 10000 nodes with 300 files per node, simulated
over 7 days. The result of the simulation shows a low overhead, even for a large set
of nodes, which indicates that REDS scales well.

CHAPTER 6

Conclusions and Future Work

This thesis has introduced anonymous networks and described some well-known
anonymous systems. The P2P concept has been described, and a more detailed description
of Napster, among others, has been presented. Some proposed distributed P2P storage
systems were identified, as well as their lack of defense against cheating nodes.

This thesis addressed this deficiency and proposed a new P2P file storage system,
“REDS - Redundant and Expandable Distributed file Storage system for a serverless
network”, a large-scale and fully decentralized file storage system with the aim of supporting a
large number of collaborating nodes. An innovative ranking system has been presented,
a novel approach to overcome the problem of cheating nodes that try to falsify
information in a P2P storage system. Using this ranking system, REDS is designed to detect
and suspend malicious nodes within the network. The simulation results show that the
ranking system has low bandwidth overhead. REDS has been implemented on top of
FreePastry, a Java implementation of Pastry, a P2P object location and routing
substrate overlaid on the Internet. REDS leverages the scalability, locality, fault-resilience
and self-organization properties of Pastry. The simulation results, based on FreePastry’s
own simulator, indicate that REDS scales well and is able to efficiently support a large
number of nodes. However, the system still has to prove its usability in practice. REDS
has been described and characterized, and a prototype has been implemented. While many
important challenges remain, this prototype is a working subset of the vision presented
in this thesis. An important future direction would be to investigate the
optimization of the proposed ranking system used by REDS. The ideal goal would be to
present a fully optimized ranking system, such that cheating nodes are always detected
and suspended by REDS.


REFERENCES

[1] “IBM homepage,” August 2008. http://www-03.ibm.com.

[2] “RAID,” August 2008. http://en.wikipedia.org/wiki/RAID.

[3] “Napster inc.,” August 2008. http://www.napster.com.

[4] “Gnutella,” August 2008. http://en.wikipedia.org/wiki/Gnutella.

[5] J. A. Pouwelse, P. Garbacki, D. H. J. Epema, and H. J. Sips, “The bittorrent p2p

file-sharing system: Measurements and analysis,” in IPTPS, pp. 205–216, 2005.

[6] D. Chaum, “Untraceable electronic mail, return addresses, and digital pseudonyms,”

Communications of the ACM, vol. 24, pp. 84–90, February 1981.

[7] P. F. Syverson, D. M. Goldschlag, and M. G. Reed, “Anonymous connections and

onion routing,” in SP 1997: Proceedings of the 1997 IEEE Symposium on Security

and Privacy, (Washington, DC, USA), p. 44, IEEE Computer Society, 1997.

[8] R. Dingledine, N. Mathewson, and P. Syverson, “Tor: The second-generation onion

router,” in Proceedings of the 13th USENIX Security Symposium, pp. 303–320,

2004.

[9] T. Dierks and E. Rescorla, “The transport layer security (TLS) protocol version

1.1,” RFC 4346, Internet Engineering Task Force, April 2006.

[10] K. Bauer, D. McCoy, D. Grunwald, T. Kohno, and D. Sicker, “Low-resource routing

attacks against anonymous systems,” tech. rep., University of Colorado at Boulder,

Boulder, CO, USA, February 2007.

[11] S. J. Murdoch and G. Danezis, “Low-cost traffic analysis of tor,” in SP ’05: Pro-

ceedings of the 2005 IEEE Symposium on Security and Privacy, (Washington, DC,

USA), pp. 183–195, IEEE Computer Society, 2005.

[12] I. Goldberg, “On the security of the tor authentication protocol,” in Proceedings of

the Sixth Workshop on Privacy Enhancing Technologies (PET 2006), (Cambridge,

UK), pp. 316–331, Springer, June 2006.

[13] L. Zhuang, F. Zhou, B. Y. Zhao, and A. Rowstron, “Cashmere: Resilient anonymous

routing,” in Proc. of NSDI, (Boston, MA), ACM/USENIX, May 2005.

[14] M. K. Reiter and A. D. Rubin, “Anonymous web transactions with crowds,” Com-

munications of the ACM, vol. 42, no. 2, pp. 32–48, 1999.

[15] O. Berthold, A. Pfitzmann, and R. Standtke, “The disadvantages of free mix routes

and how to overcome them,” in International workshop on Designing privacy en-

hancing technologies, (New York, NY, USA), pp. 30–45, Springer-Verlag New York,

Inc., 2001.

[16] C. Diaz, S. Seys, J. Claesson, and B. Preneel, “Towards measuring anonymity,”

in Proceedings of the Privacy Enhancing Technologies Workshop (PET 2002),

Springer, vol. 2482, pp. 54–68, April 2002.

[17] L. Overlier and P. Syverson, “Locating hidden servers,” in SP ’06: Proceedings

of the 2006 IEEE Symposium on Security and Privacy, (Washington, DC, USA),

pp. 100–114, IEEE Computer Society, 2006.

[18] M. Wright, M. Adler, B. N. Levine, and C. Shields, “An analysis of the degradation

of anonymous protocols,” in Proceedings of the Network and Distributed Security

Symposium - NDSS ’02. IEEE, February 2002.

[19] J. R. Douceur, “The sybil attack,” in IPTPS, pp. 251–260, 2002.

[20] D. Agrawal and C. C. Aggarwal, “On the design and quantification of privacy pre-

serving data mining algorithms,” in PODS ’01: Proceedings of the twentieth ACM

SIGMOD-SIGACT-SIGART symposium on Principles of database systems, (New

York, NY, USA), pp. 247–255, ACM, 2001.

[21] M. Roussopoulos, M. Baker, D. S. H. Rosenthal, T. J. Giuli, P. Maniatis, and J. C.

Mogul, “2 p2p or not 2 p2p?,” in IPTPS, pp. 33–43, 2004.

[22] M. Rennhard and B. Plattner, “Introducing morphmix: peer-to-peer based anony-

mous internet usage with collusion detection,” in WPES ’02: Proceedings of the

2002 ACM workshop on Privacy in the Electronic Society, (New York, NY, USA),

pp. 91–102, ACM, 2002.

[23] M. J. Freedman and R. Morris, “Tarzan: A peer-to-peer anonymizing network layer,”

in Proceedings of the 9th ACM Conference on Computer and Communications

Security (CCS 2002), 2002.

[24] I. Clarke, O. Sandberg, B. Wiley, and T. Hong, “Freenet: A distributed anonymous

information storage and retrieval system,” in Proceedings of Designing Privacy

Enhancing Technologies: Workshop on Design Issues in Anonymity and Unobserv-

ability, pp. 46–66, July 2000.

[25] J. Kubiatowicz, D. Bindel, Y. Chen, S. E. Czerwinski, P. R. Eaton, D. Geels,

R. Gummadi, S. C. Rhea, H. Weatherspoon, W. Weimer, C. Wells, and B. Y.

Zhao, “Oceanstore: An architecture for global-scale persistent storage,” in ASP-

LOS, pp. 190–201, 2000.

[26] R. J. Anderson, “The eternity service,” in Proceedings of Pragocrypt, pp. 242–252,

1996.

[27] P. Druschel and A. Rowstron, “Past: A large-scale, persistent peer-to-peer stor-

age utility,” in HOTOS ’01: Proceedings of the Eighth Workshop on Hot Topics in

Operating Systems, (Washington, DC, USA), p. 75, IEEE Computer Society, 2001.

[28] A. Rowstron and P. Druschel, “Pastry: Scalable, decentralized object location and

routing for large-scale peer-to-peer systems,” in Proceedings of IFIP/ACM International

Conference on Distributed Systems Platforms, pp. 329–350, November 2001.

[29] A. I. T. Rowstron, A.-M. Kermarrec, M. Castro, and P. Druschel, “Scribe: The

design of a large-scale event notification infrastructure,” in NGC ’01: Proceedings of

the Third International COST264 Workshop on Networked Group Communication,

(London, UK), pp. 30–43, Springer-Verlag, 2001.

[30] “MD5,” August 2008. http://en.wikipedia.org/wiki/MD5.

[31] S. Floyd, V. Jacobson, C.-G. Liu, S. McCanne, and L. Zhang, “A reliable multi-

cast framework for light-weight sessions and application level framing,” IEEE/ACM

Trans. Netw., vol. 5, no. 6, pp. 784–803, 1997.

[32] J. Daemen and V. Rijmen, The Design of Rijndael. Secaucus, NJ, USA: Springer-

Verlag New York, Inc., 2002.