Khoj: A Highly Scalable and Available Search
Harneet Singh, Avinaash Gupta and Krishna Gayatri Kuchimanchi
Poster panels: System Overview, Load Balancing, Failure Detection, Architecture, Data Partitioning and Replication, Fault Tolerance, Evaluation
Data partitioning and replication:
• Each backend server maps to multiple virtual nodes.
• Data is partitioned evenly among the servers.
• Load is redistributed when a backend server is added or removed.
• Data is replicated at N backend servers (N = 3) for high availability.
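The partitioning scheme in the bullets above can be illustrated with a minimal sketch of a consistent hash ring that uses virtual nodes and N = 3 replication. The poster does not give implementation details, so the class name `HashRing`, the MD5 hash, the 64 virtual nodes per server, and the rule of walking clockwise to pick replicas are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a ring position (MD5 is an arbitrary, assumed choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent hash ring with virtual nodes and N-way replication."""

    def __init__(self, vnodes_per_server: int = 64, replicas: int = 3):
        self.vnodes_per_server = vnodes_per_server
        self.replicas = replicas
        self._points = []      # sorted ring positions of all virtual nodes
        self._owner = {}       # ring position -> backend server name
        self._servers = set()

    def add_server(self, server: str) -> None:
        """Place one backend on the ring as many virtual nodes."""
        self._servers.add(server)
        for i in range(self.vnodes_per_server):
            point = _hash(f"{server}#{i}")
            self._owner[point] = server
            bisect.insort(self._points, point)

    def remove_server(self, server: str) -> None:
        """Drop a backend; only keys it held move to other servers."""
        self._servers.discard(server)
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def servers_for(self, key: str) -> list:
        """Walk clockwise from the key's position and collect distinct servers."""
        if not self._points:
            return []
        want = min(self.replicas, len(self._servers))
        start = bisect.bisect_right(self._points, _hash(key))
        found = []
        for step in range(len(self._points)):
            point = self._points[(start + step) % len(self._points)]
            server = self._owner[point]
            if server not in found:
                found.append(server)
                if len(found) == want:
                    break
        return found


ring = HashRing()
for name in ["backend-1", "backend-2", "backend-3", "backend-4"]:
    ring.add_server(name)

replica_set = ring.servers_for("search-term")  # three distinct backends
```

Because each server owns many small arcs of the ring, removing one server spreads its keys over the remaining servers instead of shifting them all to a single neighbor. Keys whose replica set did not include the removed server keep exactly the same replicas.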
Khoj is a distributed search engine that combines well-known techniques to achieve high scalability and availability.
• Built on a locality-aware request distribution infrastructure with multiple front-end servers.
• The front-end server that handles each request is selected using round-robin scheduling.
• The front-end server uses a two-level consistent hash ring to determine the backend server that serves the request.
• A coordinator server manages the addition and removal of nodes.
• Inverted indices are sharded across the backend servers.
• The indices are replicated across backend servers for fault tolerance and high availability.
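The request path described above can be sketched roughly as follows: a front-end is chosen by round-robin, and it looks up each query term on the backend that holds that term's shard of the inverted index. This is not the authors' implementation. The names `Backend`, `FrontEnd`, and `shard_of` are hypothetical, and a simple hash-modulo placement stands in for the two-level consistent hash ring, whose details the poster does not give. Replication and the coordinator are omitted.

```python
import hashlib
import itertools
from collections import defaultdict


def shard_of(term: str, num_shards: int) -> int:
    """Stable term -> shard mapping (modulo placement, a stand-in for the ring)."""
    digest = hashlib.md5(term.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class Backend:
    """Holds the posting lists for the terms assigned to one shard."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids

    def index(self, term: str, doc_id: int) -> None:
        self.postings[term].add(doc_id)

    def lookup(self, term: str) -> set:
        return set(self.postings.get(term, ()))


class FrontEnd:
    """Routes each term to the backend that owns its shard."""

    def __init__(self, name: str, backends: list):
        self.name = name
        self.backends = backends

    def _backend_for(self, term: str) -> Backend:
        return self.backends[shard_of(term, len(self.backends))]

    def add_document(self, doc_id: int, text: str) -> None:
        for term in set(text.lower().split()):
            self._backend_for(term).index(term, doc_id)

    def search(self, query: str) -> set:
        """Return ids of documents that contain every query term."""
        terms = query.lower().split()
        if not terms:
            return set()
        results = [self._backend_for(t).lookup(t) for t in terms]
        return set.intersection(*results)


backends = [Backend() for _ in range(3)]
front_ends = [FrontEnd(f"fe{i}", backends) for i in range(2)]
next_front_end = itertools.cycle(front_ends)  # round-robin over front-ends

front_ends[0].add_document(1, "consistent hashing for web caching")
front_ends[0].add_document(2, "scaling a web search engine")

served_by, answers = [], []
for query in ["web", "web caching", "search engine"]:
    front_end = next(next_front_end)
    served_by.append(front_end.name)
    answers.append(front_end.search(query))
# served_by == ["fe0", "fe1", "fe0"]; answers == [{1, 2}, {1}, {2}]
```

Since every front-end applies the same term-to-shard mapping, any front-end can serve any query, which is what makes plain round-robin selection sufficient at the first hop.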