NetSolve / GridSolve. By Milan Novakovic , Steven Morgan. What is NetSolve?. A Distributed system... Duh Aimed at helping scientists find a wide range of helpful tools. Historically. Optimized for specific platforms No convenient interface - PowerPoint PPT Presentation
NetSolve
By Shu-Ming (Kevin) Lin
Overview
• Background of NetSolve
• Characteristics and mechanisms of NetSolve
• Design and architecture
• Specific implementation concepts
• Performance, contributions, drawbacks, and improvements
Usage
• What are some applications?
• Distributed supercomputing
  • Many resources are available, and the problem cannot be solved on a single system.
• High-throughput computing
  • Schedules a large number of small, independent tasks on idle workstations.
• On-demand computing
  • Remote resources are made available as needed, in a cost-effective way.
  • NetSolve falls under this category.
• Data-intensive computing
  • Geographically distributed data must be synthesized to produce new, interesting information.
• Collaborative computing
  • Enables collaboration between multiple experts in a virtual shared space.
Motivation for NetSolve
• Numerical libraries are highly complex and offer a convenient interface on only a few computer systems.
• Some libraries demand a lot of programming effort from the user.
NetSolve?
• Developed by researchers at universities all over the world.
  • California, Wisconsin, Texas, Tennessee, India, Virginia, Ireland…
• Other projects in this area include Condor, Globus, and Ninf, to name a few.
• Creates a bridge between a simple programming interface, the scientific computing environments (SCEs), and grid services.
  • For users of SCEs who need lots of resources
  • Provides access to software and hardware resources
  • Also manages computational resources in a grid environment
• GridSolve was developed after NetSolve to improve on it.
NetSolve’s Architecture/Design
Basic flow:
1. The client calls an API through RPC.
2. The agent receives the request.
3. The agent looks for the most appropriate server to execute the request.
4. The server runs the computation and sends the result back to the client.
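The four steps can be sketched in a few lines of Python. This is a minimal in-process model, not NetSolve code: the `Server`, `Agent`, and `client_call` names are hypothetical, the "load" number stands in for whatever metric a real agent would use, and a real client would reach the agent and server over RPC rather than by direct function calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Server:
    """A computational server exposing named services (hypothetical model)."""
    name: str
    load: float  # lower is better; stands in for a real workload metric
    services: Dict[str, Callable] = field(default_factory=dict)

    def run(self, service: str, *args):
        # Step 4: the server runs the computation and returns the result.
        return self.services[service](*args)


class Agent:
    """Matches a request to a server; it does not run the computation itself."""

    def __init__(self, servers: List[Server]):
        self.servers = servers

    def choose(self, service: str) -> Server:
        # Steps 2-3: receive the request and pick the most appropriate server.
        candidates = [s for s in self.servers if service in s.services]
        if not candidates:
            raise LookupError(f"no server offers {service!r}")
        return min(candidates, key=lambda s: s.load)


def client_call(agent: Agent, service: str, *args):
    # Step 1: the client-side call (done over RPC in a real system).
    server = agent.choose(service)
    return server.name, server.run(service, *args)


def matmul(a, b):
    """Plain-Python matrix product used as the example service."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


agent = Agent([
    Server("busy", load=0.9, services={"matmul": matmul}),
    Server("idle", load=0.1, services={"matmul": matmul}),
])
chosen, result = client_call(agent, "matmul", [[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Here the agent only brokers the request; the data and the result travel between client and server, which matches the flow listed above.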
GridSolve’s Architecture/Design
• GridSolve does not have:
  • A sequencing API
  • Some grid services, such as Globus and Condor
  • Some interfaces, such as those for Octave and Mathematica
• GridSolve improvements:
  • Disconnect: for computations that take a long time to run, the client can disconnect and come back later to pick up the results.
  • Server: services are statically linked executables that are not linked into the server binary, so a new service can be added easily, without stopping and recompiling the server.
NetSolve’s Structure
• A set of client APIs for accessing the numerical libraries
• A set of servers that encapsulate the numerical libraries for remote access
• One or more agents that match requests to the services provided by the servers
NetSolve Language
• A set of client APIs to functions in various numerical libraries.
• It works with languages used to solve numerical problems, such as MATLAB, C, Fortran, Mathematica, Octave, or Java.
• It uses RPC as the underlying mechanism for these calls.
• In MATLAB:
  • c = a * b
• In NetSolve:
  • c = netsolve('matmul', a, b)
• Asynchronous version:
  • request = netsolve_nb('matmul', a, b)
  • ... (do other computation here)
  • c = netsolve('wait', request)
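The blocking and non-blocking call pattern can be imitated locally with Python's standard `concurrent.futures` module. This is only an analogy under stated assumptions: `call`, `call_nb`, and `wait` are hypothetical stand-ins for the calls shown above, and a thread pool takes the place of a remote server.

```python
from concurrent.futures import Future, ThreadPoolExecutor

# A local thread pool stands in for remote servers (assumption for illustration).
_pool = ThreadPoolExecutor(max_workers=4)


def matmul(a, b):
    """Plain-Python matrix product used as the example service."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


SERVICES = {"matmul": matmul}


def call(service, *args):
    """Blocking call: returns only when the result is ready."""
    return SERVICES[service](*args)


def call_nb(service, *args) -> Future:
    """Non-blocking call: returns a request handle immediately."""
    return _pool.submit(SERVICES[service], *args)


def wait(request: Future):
    """Block until the request identified by the handle has finished."""
    return request.result()


a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]

request = call_nb("matmul", a, b)   # start the computation
other = sum(range(10))              # the client does other work meanwhile
c = wait(request)                   # pick up the result when it is needed
_pool.shutdown()
```

The handle returned by the non-blocking call plays the role of `request` in the slide: the client keeps it, continues working, and redeems it later.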
NetSolve Agent
• Resource discovery
  • How does the agent know what resources are available?
• Resource allocation
  • It accepts requests for computational services from the client API.
  • It dispatches those requests to the most appropriate server.
• Load balancing
  • The agent continuously monitors the status of resources.
• Fault tolerance
Resource Discovery
• Resource discovery is supported by catalogues, registries, or directories of all available resources
• Computational resources register with the agent when they start up
• NetSolve keeps a list of resources and the applications each one is used for
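A registry of this kind can be sketched as a mapping from problem name to the servers that offer it. The `Registry` class and its method names are hypothetical and only illustrate the register-on-startup idea; they are not NetSolve's data structures.

```python
from collections import defaultdict


class Registry:
    """Agent-side directory: which servers can solve which problems."""

    def __init__(self):
        self._by_problem = defaultdict(set)

    def register(self, server, problems):
        # Called by a server when it starts up.
        for problem in problems:
            self._by_problem[problem].add(server)

    def unregister(self, server):
        # Called when a server shuts down or is found to be unreachable.
        for servers in self._by_problem.values():
            servers.discard(server)

    def lookup(self, problem):
        # Sorted only to make the answer deterministic.
        return sorted(self._by_problem.get(problem, ()))


registry = Registry()
registry.register("server-1", ["matmul", "eig"])
registry.register("server-2", ["matmul"])

matmul_servers = registry.lookup("matmul")
registry.unregister("server-1")
eig_servers_after = registry.lookup("eig")
```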
Resource Allocation/Selection
• The primary goal is to optimize overall resource utilization rather than the performance of any single application
• Significant information about an application is required
• The decision is based on the following:
  • Computation information available from the request
    • Size of the input data
    • Size of the problem
  • Static and dynamic information about the available resources
• NetSolve takes into account the CPU load and the network load
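One way to combine these inputs is to estimate a completion time per server and pick the smallest. The formula below is an illustrative assumption, not the agent's actual model: transfer time is latency plus data size over bandwidth, and compute time is the operation count over the server's speed scaled by its free CPU fraction. All names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ServerInfo:
    name: str
    peak_flops: float  # static: raw speed in operations per second
    cpu_load: float    # dynamic: fraction of CPU already in use, 0.0 <= load < 1.0
    bandwidth: float   # dynamic: bytes per second between client and server
    latency: float     # dynamic: seconds


def estimated_time(server, data_bytes, operations):
    """Estimated seconds to ship the data and run the computation."""
    transfer = server.latency + data_bytes / server.bandwidth
    compute = operations / (server.peak_flops * (1.0 - server.cpu_load))
    return transfer + compute


def select(servers, data_bytes, operations):
    """Pick the server with the smallest estimated completion time."""
    return min(servers, key=lambda s: estimated_time(s, data_bytes, operations))


servers = [
    ServerInfo("fast_far", peak_flops=1e10, cpu_load=0.5, bandwidth=1e6, latency=0.05),
    ServerInfo("slow_near", peak_flops=1e9, cpu_load=0.0, bandwidth=1e8, latency=0.001),
]

# Large input, moderate work: the network dominates, so the nearby server wins.
data_heavy = select(servers, data_bytes=16e6, operations=2e9)

# Small input, heavy work: the CPU dominates, so the faster server wins.
compute_heavy = select(servers, data_bytes=8e3, operations=2e11)
```

The two calls show why both the size of the input data and the size of the problem matter: the same pair of servers ranks differently depending on which cost dominates.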
Fault Tolerance
• Fault recovery depends on a fault detection algorithm plus a fault recovery technique.
  • NetSolve uses Globus’s fault detection service to detect which servers are up.
• Fault recovery techniques:
  • Simple restart
  • Checkpointing
• In NetSolve, the nature of the servers and agents promotes resistance to failures.
  • They are designed to be started or stopped arbitrarily without affecting the whole system.
  • It uses the retry technique to resubmit a request to another server if the original one malfunctions.
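The retry technique can be sketched as a loop over a ranked list of servers. This is a simplified illustration: servers are modeled as plain Python callables, and `ServerDown` and `submit_with_retry` are hypothetical names rather than part of NetSolve.

```python
class ServerDown(Exception):
    """Raised by a (simulated) server that cannot complete a request."""


def submit_with_retry(ranked_servers, request):
    """Try servers in ranked order; return (result, list of failure messages)."""
    failures = []
    for server in ranked_servers:
        try:
            return server(request), failures
        except ServerDown as exc:
            # Record the fault, then resubmit to the next server in the list.
            failures.append(str(exc))
    raise RuntimeError(f"all servers failed: {failures}")


def broken(request):
    raise ServerDown("server-a is down")


def healthy(request):
    return request * 2


result, failures = submit_with_retry([broken, healthy], 21)
```

From the client's point of view the call simply succeeds; the failed attempt is visible only in the recorded failure list.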
NetSolve Server
• Represents the computational resources
• It has three goals:
  • Uniform access to software
  • Configurability
  • Preinstallation
• Uses a machine-independent description language
  • Description files are easily exchangeable
Code Management
• Remote computing
  • Code and computational resource are at the server.
  • The client sends the data to the server for computation.
  • Results are sent back to the client.
• Code shipping (applet)
  • Code is located at the server; the computational resource is at the client.
  • Code is shipped to the client that requests it.
  • Execution happens at the client with local data.
• Proxy computing (remote execution)
  • Code and data are at the client or a third party, but the resource is at the server.
  • Code and data are transferred to the server.
  • Execution is done at the server, and results are returned to the client.
Performance
• NetSolve request farming
  • Handles multiple requests for a given problem by executing them in parallel with non-blocking calls
• NetSolve request sequencing
  • Decreases network traffic by analyzing the inputs and outputs of every request in depth to build a directed acyclic graph
• Use of the agent as a communication method
• Distributed Storage Infrastructure (DSI)
• In GridSolve:
  • Performance is improved by adopting Receiver Makes Right and by forking a process to return results from a non-blocking call
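The input/output analysis behind request sequencing can be illustrated with a small sketch. It assumes each request is described as a (name, inputs, outputs) tuple, which is a hypothetical representation, and it shows only the dependency analysis: which request feeds which, and which objects are intermediate results that a later request consumes and that therefore would not need a round trip through the client.

```python
def build_dag(requests):
    """Return edges (producer, consumer) between requests that share data."""
    producer = {}  # object name -> request that most recently produced it
    edges = set()
    for name, inputs, outputs in requests:
        for obj in inputs:
            if obj in producer:
                edges.add((producer[obj], name))
        for obj in outputs:
            producer[obj] = name
    return edges


def intermediates(requests):
    """Objects produced by one request and consumed by a later one."""
    produced, consumed = set(), set()
    for _, inputs, outputs in requests:
        consumed.update(obj for obj in inputs if obj in produced)
        produced.update(outputs)
    return consumed


# C = A * B, then E = C + D, then F = E * A (illustrative sequence)
requests = [
    ("r1", ["A", "B"], ["C"]),
    ("r2", ["C", "D"], ["E"]),
    ("r3", ["E", "A"], ["F"]),
]

edges = build_dag(requests)
kept_remote = intermediates(requests)
```

In this example only `F` is a final result; `C` and `E` exist solely to feed later requests, which is where the reduction in network traffic would come from.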
Advantages/Contributions
• Good usability and versatility.
• It uses an agent-based method, so it inherits the advantages of that approach.
• Good availability.
• Transparency of resources to the user.
• Users can use computational resources to run numerical applications without worrying about the cost.
• Researchers can collaborate on projects that involve numerical libraries.
Applications
• DIPS
• FES
• Genetic Algorithms
• Genetic Crossover
• Grid Application Deployment kit
• HPC Grids
• IBP
• IPARS
• Mcell
• .NET
• POV-Ray
• http://icl.cs.utk.edu/netsolve/custom/index.html?lid=55&slid=82
Drawbacks
• Fault tolerance is slightly downgraded.
• Performance is not the main focus.
Improvements
• Place less responsibility on the NetSolve agent to provide better performance.
Questions?
References
• Henri Casanova, Jack Dongarra, Chris Johnson, and Michelle Miller, "Section 7.3: Case Study: NetSolve," in Ian Foster and Carl Kesselman, editors, The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann Publishers, July 1998, pages 171-175.
• http://icl.cs.utk.edu/netsolve/