SCAN: a Scalable, Adaptive, Secure and Network-aware Content Distribution Network
Yan Chen
CS Department, Northwestern University




Motivation
• The Internet has evolved to become a commercial infrastructure for service delivery
  – Web delivery, VoIP, streaming media …
• Challenges for Internet-scale services
  – Scalability: 600M users, 35M Web sites, 2.1 Tb/s
  – Efficiency: bandwidth, storage, management
  – Agility: dynamic clients/network/servers
  – Security: proliferating attacks/viruses/worms
• E.g., content delivery: Content Distribution Network (CDN)
  – Web delivery
  – Grid computing


How CDN Works


Challenges for CDN
• Content Location
  – Find nearby replicas with good DoS attack resilience
  – Dynamic, scalable semantic search
• Replica Deployment
  – Dynamics, efficiency
  – Client QoS (latency, coherence) and server capacity constraints
• Replica Management
  – Replica index state maintenance scalability
• Adaptation to Network Congestion/Failures
  – Overlay monitoring scalability and accuracy
• Security
  – Proactive anomaly/intrusion detection on high-speed network


SCAN: Scalable Content Access Network
• Provision: Dynamic Replication + Update Multicast Tree Building
• Replica Management: (Incremental) Content Clustering
• Network End-to-End Distance Monitoring (latency & loss rate)
• DHT-based Replica Location: Network DoS Attack Resilient & Semantic Search Support
• Proactive Anomaly/Intrusion Detection on High-speed Network


Replica Location (security)

• Existing Work and Problems
  – Centralized, Replicated and Distributed Directory Services
  – No security benchmarking: which one has the best DoS attack resilience?
• Solution
  – Proposed the first simulation-based network DoS resilience benchmark
  – Applied it to compare three directory services
  – DHT-based Distributed Directory Services have the best resilience in practice
• Publication
  – 3rd Int. Conf. on Info. and Comm. Security (ICICS), 2001


Replica Location (semantic search)

• Existing Work and Problems
  – Mostly keyword/title based search
  – Emerging semantic search systems, but static, unscalable
• Solution
  – Apply DHT to distribute the indices
  – Use “concept indexing” to incrementally grow the semantic space => incrementally add new concepts & documents
  – Group the indices based on semantic locality => semantic routing, better query accuracy and efficiency
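
The idea of distributing indices over a DHT and grouping them by concept can be illustrated with a minimal sketch. The slide does not name a specific DHT or indexing scheme, so everything here is an assumption made for illustration: the class name `ConceptIndexDHT`, the consistent-hashing ring, and the use of a plain concept string as the key. Documents published under the same concept land on the same node, which is only a crude stand-in for the semantic locality the slide describes.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a point on a 2**32 identifier ring."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16) % (2 ** 32)


class ConceptIndexDHT:
    """Hypothetical index: each concept is owned by one node on a hash ring."""

    def __init__(self, node_names):
        # Sort nodes by their position on the ring.
        self.ring = sorted((ring_hash(n), n) for n in node_names)
        self.points = [p for p, _ in self.ring]
        # Per-node storage: concept -> set of document ids.
        self.store = {n: {} for n in node_names}

    def node_for(self, concept: str) -> str:
        """Return the first node clockwise from the concept's ring position."""
        i = bisect.bisect_left(self.points, ring_hash(concept)) % len(self.ring)
        return self.ring[i][1]

    def publish(self, concept: str, doc_id: str) -> None:
        """Add a document to the index entry held by the responsible node."""
        node = self.node_for(concept)
        self.store[node].setdefault(concept, set()).add(doc_id)

    def lookup(self, concept: str):
        """Ask only the responsible node; return matching document ids."""
        node = self.node_for(concept)
        return sorted(self.store[node].get(concept, set()))


dht = ConceptIndexDHT(["node-1", "node-2", "node-3"])
dht.publish("streaming", "doc1")
dht.publish("streaming", "doc2")
dht.publish("caching", "doc3")
```

New concepts and documents can be added at any time with `publish`, without rebuilding the entries already stored on other nodes.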


Replica Placement & Coherence Support

• Existing Work and Problems
  – Static placement
  – Dynamic but inefficient placement
  – No coherence support
• Solution
  – Dynamically place a close-to-optimal number of replicas subject to client QoS (latency) and server capacity constraints
  – Self-organize replicas into a scalable application-level multicast tree for disseminating updates
  – Uses overlay network topology only
• Publication
  – IPTPS 2002, Pervasive Computing 2002
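
To make the placement constraints concrete, here is a minimal sketch of a greedy placement that respects a client latency bound and a per-server capacity. It is not the algorithm from the cited papers; the function name `place_replicas`, the latency table, and the set-cover style greedy rule are all assumptions chosen for illustration.

```python
def place_replicas(latency, max_latency, capacity):
    """Greedy placement sketch (hypothetical, not the SCAN algorithm).

    latency[server][client] is a latency in ms. Each replica may serve at
    most `capacity` clients, all within `max_latency`.
    Returns {server: [clients assigned to the replica on that server]}.
    """
    unserved = set()
    for row in latency.values():
        unserved.update(row)

    placement = {}
    while unserved:
        best_server, best_clients = None, []
        for server, row in latency.items():
            if server in placement:
                continue  # at most one replica per server
            # Clients this server could serve within the latency bound,
            # closest first, truncated to the capacity limit.
            eligible = sorted(
                (c for c in unserved if row.get(c, float("inf")) <= max_latency),
                key=lambda c: (row[c], c),
            )[:capacity]
            if len(eligible) > len(best_clients):
                best_server, best_clients = server, eligible
        if best_server is None:
            raise ValueError("constraints cannot be met for: %s" % sorted(unserved))
        placement[best_server] = best_clients
        unserved -= set(best_clients)
    return placement


latency_ms = {
    "s1": {"a": 10, "b": 20, "c": 90},
    "s2": {"a": 80, "b": 30, "c": 15},
    "s3": {"a": 12, "b": 25, "c": 95},
}
result = place_replicas(latency_ms, max_latency=50, capacity=2)
```

In this example two replicas suffice: one on `s1` for clients `a` and `b`, and one on `s2` for client `c`. A greedy rule like this gives no optimality guarantee; it only shows how the two constraints interact.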


Replica Management

• Existing Work and Problems
  – Cooperative access for good efficiency requires maintaining replica indices
  – Per-Website replication: scalable, but poor performance
  – Per-URL replication: good performance, but unscalable
• Solution
  – Clustering-based replication reduces the overhead significantly without sacrificing much performance
  – Proposed a unique online Web object popularity prediction scheme based on hyperlink structures
  – Online incremental clustering and replication to push replicas before they are accessed
• Publication
  – ICNP 2002, IEEE J-SAC 2003


Adaptation to Network Congestion/Failures

• Existing Work and Problems
  – Latency estimation systems are scalable, but cannot monitor congestion/failures, which requires n^2 measurements for n end hosts
• Solution
  – Tomography-based Overlay Monitoring (TOM): selectively monitor a basis set of O(n log n) paths to infer the loss rates of other paths
  – Works in real-time, adapts to topology changes, has good load balancing and tolerates topology errors
  – Built an adaptive overlay streaming media system on top of TOM
  – Root-cause diagnosis in progress
• Publication
  – Modeling: SIGCOMM IMC 2003 (extended abstract)
  – Full version under submission
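
The basis-set idea can be sketched with plain linear algebra. The sketch below assumes that link losses are independent, so the log of a path's success rate is the sum of the logs of its links' success rates, and that the path-to-link incidence matrix is known. Under those assumptions, only a set of linearly independent paths needs to be measured; every other path is a linear combination of them. The function names `select_basis` and `infer_loss_rates` are hypothetical, and the details of the actual TOM system are not given on the slide.

```python
import math
from fractions import Fraction


def select_basis(paths):
    """Pick linearly independent paths by incremental Gaussian elimination.

    paths: list of 0/1 rows, paths[i][k] == 1 if path i uses link k.
    Returns (basis, combos): basis is a list of path indices to measure;
    combos[i] maps basis index -> coefficient for each non-basis path i.
    """
    reduced = []  # entries (pivot, vec, expr) with vec == sum expr[j] * paths[j]
    basis, combos = [], {}
    for i, row in enumerate(paths):
        vec = [Fraction(x) for x in row]
        expr = {}  # invariant: row == vec + sum expr[j] * paths[j]
        for pivot, rvec, rexpr in reduced:
            if vec[pivot] != 0:
                f = vec[pivot] / rvec[pivot]
                vec = [a - f * b for a, b in zip(vec, rvec)]
                for j, c in rexpr.items():
                    expr[j] = expr.get(j, Fraction(0)) + f * c
        if any(vec):
            # Independent of the paths seen so far: it joins the basis.
            pivot = next(k for k, v in enumerate(vec) if v != 0)
            new_expr = {j: -c for j, c in expr.items()}
            new_expr[i] = Fraction(1)
            reduced.append((pivot, vec, new_expr))
            basis.append(i)
        else:
            combos[i] = expr  # row is a combination of basis paths
    return basis, combos


def infer_loss_rates(combos, measured_loss):
    """Infer loss rates of unmeasured paths from measured basis paths."""
    log_ok = {j: math.log(1.0 - p) for j, p in measured_loss.items()}
    inferred = {}
    for i, expr in combos.items():
        log_success = sum(float(c) * log_ok[j] for j, c in expr.items())
        inferred[i] = 1.0 - math.exp(log_success)
    return inferred


# Four paths over three links; path 2 is the union of paths 0 and 1.
paths = [[1, 1, 0], [0, 0, 1], [1, 1, 1], [1, 0, 0]]
basis, combos = select_basis(paths)
inferred = infer_loss_rates(combos, {0: 0.28, 1: 0.05, 3: 0.10})
```

Here paths 0, 1 and 3 form the basis, and the loss rate of path 2 is inferred as 1 - (1 - 0.28)(1 - 0.05) = 0.316 without measuring it.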


Proactive Anomaly/Intrusion Detection on High-speed Network

• Existing Work and Problems
  – Anomaly/intrusion (A/I) detection requires flow-level traffic monitoring, which is unscalable for high-speed networks
  – Most IDS are signature-based and only detect known attacks
• Solution
  – Leverage “K-ary sketch”, a compact probabilistic summary of flow-level traffic with constant update/query cost and linearity
  – Use statistical methods, such as Hidden Markov Models (HMM) and time series analysis, for proactive detection
  – Profile characteristics of new apps to reduce false positives
• Publication
  – K-ary sketch: SIGCOMM IMC 2003
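
The three properties named for the k-ary sketch (compact summary, constant update/query cost, linearity) can be illustrated with a small sketch. This is a simplified version written under assumptions: the class name `KArySketch` and its parameters are hypothetical, and a SHA-1 based hash stands in for the hash family a real implementation would use. Each of the `rows` hash tables has `buckets` counters; an update touches one counter per row, and a query takes the median of per-row estimates.

```python
import hashlib
from statistics import median


class KArySketch:
    """Simplified k-ary sketch: `rows` hash tables of `buckets` counters."""

    def __init__(self, rows=5, buckets=1024):
        self.rows, self.buckets = rows, buckets
        self.table = [[0.0] * buckets for _ in range(rows)]

    def _bucket(self, row, key):
        # Assumption: a salted SHA-1 replaces a proper universal hash family.
        digest = hashlib.sha1(("%d:%s" % (row, key)).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "little") % self.buckets

    def update(self, key, value):
        """Add `value` to the key's counter in every row (constant cost)."""
        for r in range(self.rows):
            self.table[r][self._bucket(r, key)] += value

    def estimate(self, key):
        """Median over rows of the collision-corrected per-row estimate."""
        k = self.buckets
        total = sum(self.table[0])  # every row sums to the same total
        per_row = [
            (self.table[r][self._bucket(r, key)] - total / k) / (1 - 1 / k)
            for r in range(self.rows)
        ]
        return median(per_row)

    def combine(self, other, a=1.0, b=1.0):
        """Linearity: return the sketch of a * self + b * other."""
        out = KArySketch(self.rows, self.buckets)
        for r in range(self.rows):
            out.table[r] = [
                a * x + b * y for x, y in zip(self.table[r], other.table[r])
            ]
        return out


previous, current = KArySketch(), KArySketch()
previous.update("10.0.0.1", 100)   # bytes seen for this flow key last interval
current.update("10.0.0.1", 130)    # bytes seen this interval
change = current.combine(previous, 1.0, -1.0)
```

Because sketches combine linearly, the difference between an observed sketch and a forecast sketch is itself a sketch, so large per-key changes can be estimated without keeping per-flow state. Here `change.estimate("10.0.0.1")` is approximately 30.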