NetSolve By Shu-Ming (Kevin) Lin

NetSolve / GridSolve

Page 1: NetSolve / GridSolve

NetSolve
By Shu-Ming (Kevin) Lin

Page 2: NetSolve / GridSolve

Overview

• Background of NetSolve
• Characteristics/Mechanisms for NetSolve
• Design/Architecture
• Specific Implementation Concept
• Performance, Contributions, Drawbacks, and Improvements

Page 3: NetSolve / GridSolve

Usage
• What are some applications?
• Distributed supercomputing
  • Many resources are available for problems that cannot be solved on a single system.
• High-throughput computing
  • Schedules a large number of independent, small tasks onto idle workstations.
• On-demand computing
  • Remotely located resources are available for access as needed in a cost-effective way.
  • NetSolve falls under this category.
• Data-intensive computing
  • Geographically distributed data needs to be synthesized to produce new, interesting information.
• Collaborative computing
  • Enables collaboration between multiple experts in a virtual shared space.

Page 4: NetSolve / GridSolve

Motivation for NetSolve

• Numerical libraries are highly complex and offer a convenient interface on only a few computer systems.
• Some libraries demand a lot of programming effort from the user.

Page 5: NetSolve / GridSolve

NetSolve?
• Developed by researchers at universities all over the world.
  • California, Wisconsin, Texas, Tennessee, India, Virginia, Ireland…
  • Related projects include Condor, Globus, and Ninf, to name a few.
• Creates a bridge between simple programming interfaces, scientific computing environments (SCEs), and grid services.
  • For users of SCEs who need lots of resources
  • Provides access to software and hardware resources
  • In addition, it manages computational resources in a grid environment
• GridSolve was developed after NetSolve to improve on it.

Page 6: NetSolve / GridSolve

NetSolve’s Architecture/Design
Basic Flow:
1. The client calls an API through RPC.
2. The agent receives the request.
3. The agent looks for the most appropriate server to execute the request.
4. The server runs the computation and sends the result back to the client.
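
The four steps above can be sketched in a few lines of Python. This is only an illustrative model, not NetSolve code: the `Server`, `Agent`, and `client_call` names are hypothetical, a local function call stands in for the RPC, and the agent simply picks the first server that offers the service.

```python
from typing import Callable, Dict, List


class Server:
    """Hypothetical compute server hosting named numerical routines."""

    def __init__(self, name: str, services: Dict[str, Callable]):
        self.name = name
        self.services = services

    def run(self, problem: str, *args):
        # Step 4: the server runs the computation and returns the result.
        return self.services[problem](*args)


class Agent:
    """Hypothetical agent that matches a request to a server."""

    def __init__(self, servers: List[Server]):
        self.servers = servers

    def find_server(self, problem: str) -> Server:
        # Steps 2-3: receive the request and pick a server offering the service.
        for server in self.servers:
            if problem in server.services:
                return server
        raise LookupError(f"no server offers {problem!r}")


def client_call(agent: Agent, problem: str, *args):
    # Step 1: the client issues the call (a local call stands in for RPC).
    server = agent.find_server(problem)
    return server.run(problem, *args)


def matmul(a, b):
    # Plain-Python matrix product used as a stand-in numerical service.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


agent = Agent([Server("s1", {"matmul": matmul})])
result = client_call(agent, "matmul", [[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

In a real deployment the agent would rank servers rather than take the first match; the sketch keeps that step trivial to show only the message flow.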

Page 7: NetSolve / GridSolve

GridSolve’s Architecture/Design
• GridSolve does not have:
  • Sequencing API
  • Some grid services, such as Globus and Condor
  • Some interfaces, such as those for Octave and Mathematica
• GridSolve improvements in:
  • Disconnect: for computations that take a long time to run, the client can disconnect and come back later to pick up the results.
  • Server: services are statically linked executables that are not linked into the server binary, so adding a new service is easy and does not require stopping or recompiling the server.

Page 8: NetSolve / GridSolve

NetSolve’s Structure

• A set of client APIs to access the numerical libraries
• A set of servers that encapsulate numerical libraries for remote access
• One or more agents that match requests to the services provided by the servers

Page 9: NetSolve / GridSolve

NetSolve Language
• A set of client APIs to functions in various numerical libraries.
• It works with languages used to solve numerical problems, such as MATLAB, C, Fortran, Mathematica, Octave, or Java.
• It uses RPC as the underlying mechanism for these calls.
• In MATLAB:
  • c = a * b
• In NetSolve:
  • c = netsolve('matmul', a, b)
• Asynchronous version:
  • request = netsolve_nb('matmul', a, b)
  • . // doing other computation
  • .
  • c = netsolve('wait', request)
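
The asynchronous pattern above (submit, keep working, then wait) can be imitated locally with Python's standard `concurrent.futures` module. This is an analogy under stated assumptions, not the NetSolve API: `call_nonblocking` and `wait` are hypothetical helper names, and a worker thread stands in for the remote server.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def matmul(a, b):
    # Plain-Python matrix product standing in for the remote 'matmul' service.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


executor = ThreadPoolExecutor(max_workers=1)


def call_nonblocking(func, *args) -> Future:
    # Returns immediately with a handle, like the request in the slide.
    return executor.submit(func, *args)


def wait(request: Future):
    # Blocks until the result is ready, like the 'wait' call in the slide.
    return request.result()


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]

request = call_nonblocking(matmul, a, b)
other = sum(range(10))  # doing other computation while the request runs
c = wait(request)
executor.shutdown()
```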

Page 10: NetSolve / GridSolve

NetSolve Agent
• Resource Discovery
  • How does the agent know what resources are available?
• Resource Allocation
  • It accepts requests for computational services from the client API.
  • It dispatches those requests to the most appropriate server.
• Load Balancing
  • The agent continuously monitors the status of resources.
• Fault Tolerance

Page 11: NetSolve / GridSolve

Resource Discovery

• Supports resource discovery through catalogues, registries, or directories of all available resources
• Computational resources register with the agent when they start up
• In NetSolve, a list of resources and the applications each is used for is kept
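
A minimal sketch of such a registry, assuming the agent keeps a map from problem name to the servers that offer it. The `AgentRegistry` class and its method names are hypothetical and only illustrate the register-on-startup idea described above.

```python
from collections import defaultdict
from typing import Dict, List, Set


class AgentRegistry:
    """Hypothetical in-memory list of resources and the problems they serve."""

    def __init__(self) -> None:
        self._by_problem: Dict[str, Set[str]] = defaultdict(set)

    def register(self, server: str, problems: List[str]) -> None:
        # Called by a server when it starts up.
        for problem in problems:
            self._by_problem[problem].add(server)

    def unregister(self, server: str) -> None:
        # Called when a server shuts down or is found to be unreachable.
        for servers in self._by_problem.values():
            servers.discard(server)

    def lookup(self, problem: str) -> List[str]:
        # Sorted so the answer is deterministic.
        return sorted(self._by_problem.get(problem, set()))


registry = AgentRegistry()
registry.register("hostA", ["matmul", "fft"])
registry.register("hostB", ["matmul"])
matmul_hosts = registry.lookup("matmul")
registry.unregister("hostA")
fft_hosts_after = registry.lookup("fft")
```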

Page 12: NetSolve / GridSolve

Resource Allocation/Selection
• The primary goal is to optimize resource utilization over the performance of any single application
• Significant information about an application is required
• It makes decisions based on the following:
  • Computation information available from requests
    • Size of input data
    • Size of the problem
  • Static and dynamic information on the available resources
• In NetSolve, it takes into account the CPU load and network load
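
One way to combine these inputs is to estimate a completion time per server and pick the smallest. The formula below is an assumption made for illustration, not NetSolve's actual scoring rule: transfer time is input size divided by bandwidth, and compute time is the operation count divided by a peak rate that is scaled down by the current CPU load. All names (`ServerInfo`, `estimated_time`, `select_server`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServerInfo:
    name: str
    peak_flops: float  # static: floating-point operations per second
    cpu_load: float    # dynamic: load average (0.0 means idle)
    bandwidth: float   # dynamic: bytes per second between client and server


def estimated_time(server: ServerInfo, input_bytes: float, flop_count: float) -> float:
    # Network cost: time to ship the input data to the server.
    transfer = input_bytes / server.bandwidth
    # Compute cost: a loaded CPU delivers only a share of its peak rate.
    effective_flops = server.peak_flops / (1.0 + server.cpu_load)
    return transfer + flop_count / effective_flops


def select_server(servers: List[ServerInfo], input_bytes: float, flop_count: float) -> ServerInfo:
    return min(servers, key=lambda s: estimated_time(s, input_bytes, flop_count))


servers = [
    ServerInfo("fast-busy", peak_flops=4e9, cpu_load=3.0, bandwidth=1e8),
    ServerInfo("slow-idle", peak_flops=2e9, cpu_load=0.0, bandwidth=1e8),
]
best = select_server(servers, input_bytes=1e8, flop_count=2e9)
```

With these made-up numbers the idle machine wins even though its peak rate is lower, which is the kind of outcome that load-aware selection is meant to produce.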

Page 13: NetSolve / GridSolve

Fault Tolerance
• Fault recovery depends on a fault detection algorithm plus a fault recovery technique.
  • NetSolve uses Globus’s fault detection service to detect which servers are up
• Fault recovery techniques:
  • Simple restart
  • Checkpointing
• In NetSolve, the nature of the servers and agents promotes resistance to failures
  • They are designed to be started or stopped arbitrarily without affecting the whole system
  • It uses a retry technique to resubmit a request to another server if the original one malfunctions.
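
The retry technique can be sketched as a loop over candidate servers that moves on when one fails. This is a simplified illustration: `ServerFailure` and `submit_with_retry` are hypothetical names, and each server is modelled as a plain callable.

```python
from typing import Callable, Sequence


class ServerFailure(Exception):
    """Raised when a (simulated) server cannot complete a request."""


def submit_with_retry(servers: Sequence[Callable], *args):
    # Try each candidate in order; resubmit to the next one on failure.
    for server in servers:
        try:
            return server(*args)
        except ServerFailure:
            continue
    raise ServerFailure(f"all {len(servers)} servers failed")


def broken_server(x):
    # Simulates a server that malfunctions on every request.
    raise ServerFailure("server down")


def healthy_server(x):
    return x * x


result = submit_with_retry([broken_server, healthy_server], 7)
```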

Page 14: NetSolve / GridSolve

NetSolve Server

• Represents the computational resources
• It has three goals:
  • Uniform access to software
  • Configurability
  • Preinstallation
• Uses a machine-independent description language
  • Description files are easily exchangeable

Page 15: NetSolve / GridSolve

Code Management
• Remote computing
  • Code and computational resource are at the server.
  • The client sends the data to the server for computation.
  • Results are sent back to the client.
• Code shipping (Applet)
  • Code is located at the server, computational resource at the client.
  • Code is shipped to the client that requests it.
  • Execution happens at the client with local data.
• Proxy computing (Remote Execution)
  • Code and data are at the client or a third party, but the resource is at the server.
  • Code and data are transferred to the server.
  • Execution is done at the server and results are returned to the client.

Page 16: NetSolve / GridSolve

Performance
• NetSolve Request Farming
  • Handles multiple requests for a given problem by executing them in parallel with non-blocking calls
• NetSolve Request Sequencing
  • Decreases network traffic by analyzing the inputs and outputs of every request to build a directed acyclic graph.
• Use of agent as a communication method.
• Distributed Storage Infrastructure (DSI)
• In GridSolve,
  • Performance is improved by adopting Receiver Makes Right and forking a process to return results from a non-blocking call.
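
The idea behind request sequencing can be illustrated with a small dependency analysis: if one request's output is another request's input, the two are linked in a directed acyclic graph, and that intermediate data does not have to travel back to the client in between. The sketch below is an assumption-laden illustration, not NetSolve's implementation; the tuple layout and the `build_dag` and `intermediate_data` names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A request is (name, input data names, output data names), in submission order.
Request = Tuple[str, List[str], List[str]]


def build_dag(requests: List[Request]) -> Dict[str, Set[str]]:
    """Map each request to the later requests that consume its outputs."""
    producer: Dict[str, str] = {}
    edges: Dict[str, Set[str]] = {name: set() for name, _, _ in requests}
    for name, inputs, outputs in requests:
        for item in inputs:
            if item in producer:
                edges[producer[item]].add(name)
        for item in outputs:
            producer[item] = name
    return edges


def intermediate_data(requests: List[Request]) -> Set[str]:
    """Data produced by one request and consumed by a later one."""
    produced: Set[str] = set()
    consumed: Set[str] = set()
    for _, inputs, outputs in requests:
        consumed.update(item for item in inputs if item in produced)
        produced.update(outputs)
    return consumed


requests = [
    ("r1", ["a", "b"], ["c"]),  # c = f(a, b)
    ("r2", ["c", "d"], ["e"]),  # e = g(c, d)
]
dag = build_dag(requests)
kept_remote = intermediate_data(requests)
```

Here `c` is the only intermediate result, so it is the one transfer that sequencing could avoid.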

Page 17: NetSolve / GridSolve

Advantages/Contributions
• Good usability and versatility.
• It uses an agent-based method, so it has all the advantages of that approach.
• Good availability.
• Transparency of resources to the user.
• Users can use computational resources to run numerical applications without worrying about the cost
• Researchers can collaborate on projects that involve numerical libraries

Page 18: NetSolve / GridSolve

Applications

Page 19: NetSolve / GridSolve

Continue
• DIPS
• FES
• Genetic Algorithms
• Genetic Crossover
• Grid Application Deployment kit
• HPC Grids
• IBP
• IPARS
• Mcell
• .NET
• POV-Ray
• http://icl.cs.utk.edu/netsolve/custom/index.html?lid=55&slid=82

Page 20: NetSolve / GridSolve

Drawbacks

• Fault tolerance is slightly downgraded.
• Not mainly performance focused.

Page 21: NetSolve / GridSolve

Improvements
• Place less responsibility on the NetSolve agent to provide better performance.

Page 22: NetSolve / GridSolve

Questions?

Page 23: NetSolve / GridSolve

References

• Henri Casanova, Jack Dongarra, Chris Johnson, and Michelle Miller, "Section 7.3: Case Study: NetSolve", in Ian Foster and Carl Kesselman, editors, The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann Publishers, July 1998, pages 171-175
• http://icl.cs.utk.edu/netsolve/