
Grids and Software Engineering Test Platforms


Page 1: Grids and Software Engineering Test Platforms

www.eu-etics.org

INFSOM-RI-026753

Grids and Software Engineering Test Platforms

Alberto Di Meglio
CERN


Grid School of Computing - 13 July 2006 - Ischia

Contents

• Setting the Context
• A “Typical” Grid Environment
• Challenges
• Test Requirements
• Methodologies
• The Grid as a Test Tool
• Conclusions
• Panel discussion on Grid QA and industrial applications


Setting the Context

• What is a distributed environment?
• The main characteristic of a distributed environment that affects how tests are performed is:
– Many things happen at all times in the same or different places and can have direct or indirect and often unpredictable effects on each other
• The main goal of this discussion is to show what the consequences of this are for testing the grid, and how the grid can (must) be used as a tool to test itself and the software running on it


A “Typical” Grid Environment

[Figure: architecture diagram of a typical grid deployment. Labels recoverable from the diagram: User Interface (WMS client, LB client, LTS client, Catalogs client, R-GMA client); Workload Management System (WMS); Logging & Bookkeeping System (LB); Computing Element (CE); Worker Node (WN), 1..n; Input-Output (IO) server, 1..n; Local Transfer Service (LTS); Catalog (MySQL); R-GMA server, registry, site publisher, browser and flexible archiver; an R-GMA servicetool attached to each service; DGAS; GIN; batch systems PBS, LSF and Condor; storage systems DPM, dCache and Castor; SRM 2.0 and SRM 2.1; UNICORE; JSDL.]


Challenges

• Non-determinism
• Infrastructure dependencies
• Distributed and partial failures
• Time-outs
• Dynamic nature of the structure
• Lack of mature standards (interoperability)
• Multiple heterogeneous platforms
• Security


Non-determinism

• Distributed systems like the grid are inherently non-deterministic
• Noise is introduced in many places (OS schedulers, network time-outs, process synchronization, race conditions, etc)
• Changes in the infrastructure not controlled by a test have an effect on the test and on the sequence of tests
• It is difficult to reproduce a test run exactly
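
One way to make non-determinism visible is to repeat a test many times and report a pass rate instead of a single verdict. The sketch below is illustrative only: `flaky_operation` is a hypothetical stand-in for a call into a grid service, and the seeded random source merely simulates infrastructure noise.

```python
import random
from typing import Callable


def flaky_operation(rng: random.Random, failure_rate: float = 0.2) -> bool:
    """Hypothetical stand-in for a grid call that sometimes fails due to noise."""
    return rng.random() >= failure_rate


def pass_rate(test: Callable[[], bool], runs: int) -> float:
    """Run `test` repeatedly and return the fraction of runs that passed."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    passed = sum(1 for _ in range(runs) if test())
    return passed / runs


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the demo itself is repeatable
    rate = pass_rate(lambda: flaky_operation(rng), runs=200)
    print(f"pass rate over 200 runs: {rate:.2%}")
```

A pass rate well below 100% on an unchanged code base is a hint that the failure comes from the environment or from timing rather than from a deterministic bug.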


Infrastructure dependencies

• Operating systems and third-party applications interact with the objects to be tested
• Different versions of OSs and applications may behave differently
• Software updates (especially security patches) cannot be avoided
• Network topologies and boundaries may be under someone else's control (routers, firewalls, proxies)


Distributed and Partial Failures

• In a distributed system, failures are also distributed
• A test or sequence of tests may fail because part of the system (a node, a service) fails or is unavailable
• The nature of the problem can be anything: hardware, software, local network policy changes, power failures
• In addition, since this is expected, middleware and applications should cope with it, and their behaviour under such failures should be tested
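
Testing how software copes with a partial failure usually means injecting the failure on purpose. The following is a minimal sketch, not taken from any grid middleware: `ServiceUnavailable`, `healthy` and `broken` are hypothetical names, and the endpoints are plain callables standing in for remote services.

```python
from typing import Callable, Iterable, List, Tuple


class ServiceUnavailable(Exception):
    """Raised by a (hypothetical) service endpoint that cannot be reached."""


def query_with_failover(
    endpoints: Iterable[Callable[[], str]],
) -> Tuple[str, List[str]]:
    """Try each endpoint in order; return the first answer and the errors seen."""
    errors: List[str] = []
    for endpoint in endpoints:
        try:
            return endpoint(), errors
        except ServiceUnavailable as exc:
            errors.append(str(exc))  # record the partial failure and move on
    raise ServiceUnavailable(f"all endpoints failed: {errors}")


def healthy() -> str:
    """Endpoint that answers normally."""
    return "ok"


def broken() -> str:
    """Endpoint that simulates an unavailable node."""
    raise ServiceUnavailable("node down")


if __name__ == "__main__":
    answer, seen = query_with_failover([broken, healthy])
    print(answer, seen)
```

The test of interest is the one with `broken` in the list: it checks that the client still returns an answer and that the failure was recorded rather than silently swallowed.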


Time-outs

• Time-outs are not necessarily due to a failure; they can also be caused by excessive load
• They may be infrastructure-related (network), system-related (OS, service containers) or application-related
• Services may react differently when time-outs occur: they may plainly fail, raise exceptions, or have retry strategies
• Time-outs have consequences for the test sequence (non-determinism again)
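
A retry strategy of the kind mentioned above can be sketched as follows. This is an assumption-laden illustration, not the behaviour of any particular grid service: the function name, the default values and the injectable `sleep` parameter (which lets a test run without really waiting) are all choices made for this example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    attempts: int = 3,
    initial_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying on TimeoutError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TimeoutError:
            if attempt == attempts:
                raise  # out of retries: surface the time-out to the caller
            sleep(delay)
            delay *= 2  # back off so an overloaded service can recover
    raise AssertionError("unreachable")


if __name__ == "__main__":
    calls = {"n": 0}

    def slow_then_ok() -> str:
        # Simulated service: times out twice, then answers.
        calls["n"] += 1
        if calls["n"] < 3:
            raise TimeoutError("service overloaded")
        return "done"

    print(call_with_retries(slow_then_ok, sleep=lambda seconds: None))
```

Note that a retry changes how long a test takes and how many requests the service sees, which is one way time-outs feed back into the test sequence.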


Dynamic nature of the structure

• The type and number of actors and objects participating in the workflow change with time and location (concurrent users, different processes on the same machine, different machines across the infrastructure)
• Middleware and applications may dynamically (re)configure themselves depending on local or remote conditions (for example load balancing or service fail-over)
• Actual execution paths may change with load conditions
• How to reproduce and track such configurations?


Moving Standards

• Missing or rapidly changing standards make it difficult for grid services to interoperate
• Service-oriented architectures should make life easier, but which standard should be adopted?
• Failures may be due to incorrect/incomplete/incompatible implementations
• Ex 1: plain web services, WSRF, WS-*?
• Ex 2: axis (j/c), gsoap, gridsite, zsi?
• Ex 3: SRM, JSDL
• How to test the potential combinations?
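
A first step towards testing the combinations is simply to enumerate them, so that the size of the problem is known and each configuration can be scheduled and tracked. The sketch below uses names from the examples above as sample values; the two dimension names and the pairing of every service model with every toolkit are assumptions made for illustration, and not every combination is necessarily meaningful in practice.

```python
from itertools import product
from typing import Dict, Iterator, List


def enumerate_configurations(
    dimensions: Dict[str, List[str]],
) -> Iterator[Dict[str, str]]:
    """Yield one configuration per combination of the given dimensions."""
    names = list(dimensions)
    for values in product(*(dimensions[name] for name in names)):
        yield dict(zip(names, values))


if __name__ == "__main__":
    dims = {
        "service_model": ["plain web services", "WSRF"],
        "client_toolkit": ["axis", "gsoap", "gridsite", "zsi"],
    }
    configs = list(enumerate_configurations(dims))
    print(len(configs), "combinations to cover")
    for cfg in configs[:3]:
        print(cfg)
```

The count grows multiplicatively with every new dimension (platform, security mode, version), which is why automation and pruning of the matrix matter.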


Multiple Heterogeneous Platforms

• Distributed software, especially grid software, runs on a variety of platforms (combinations of OS, architecture and compilers)
• Software is often written on a specific platform and only later ported to other platforms
• OS and third-party dependencies may change across platforms in version and type
• Different compilers usually do not compile the same code in the same way (if at all)


Security

• Security and security testing are huge issues
• Sometimes there is a tendency to consider security an add-on of the middleware or applications
• Software behaves in completely different ways with and without security for the same functionality
• Ex: consider the simple example of a web service running on http or https, with or without client certificates
• Sometimes software is developed on individual machines without taking into account the constraints imposed by running secure network infrastructures


Test Requirements

• Where to start from?
• Test Plans
• Life-cycle testing
• Reproducibility
• Archival and analysis
• Interactive vs. automated testing


Test Plans

• Test plans should be the mandatory starting point of all test activities. This point is often neglected
• Writing a test plan is a difficult task
• You need to understand thoroughly your system and the environment where it must be deployed
• You need to spell out clearly what you want to test, how, and what the expected results are
• Write it together with domain experts to make sure as many system components and interactions as possible are taken into account
• Revise it often


Life-cycle Testing

• When designing the test plan, don’t think only about functionality, but also about how the system will have to be deployed and maintained
• Start with explicit design of installation, configuration and upgrade tests: it is easy to see that a large part of the bugs of a system fall in the installation and configuration category

[Figure: gLite bug categories]


Reproducibility

• This requirement addresses the issue of non-determinism
• Invest in tools and processes that make your tests and your test environment reproducible
• Install your machines using scripts or system management tools, but disable automated APT/YUM/up2date updates
• Store the tests together with all information needed to run them (environment variables, properties, support files, etc) and use version control tools to keep the tests in sync with software releases
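
Recording the environment alongside each test run can be as simple as writing a small snapshot file next to the results. The sketch below is one possible shape for such a snapshot; the function name, the chosen fields and the example variable names are assumptions, and a real setup would likely also record installed package versions.

```python
import json
import os
import platform
import sys
from typing import Dict, Iterable


def environment_snapshot(variables: Iterable[str]) -> Dict[str, object]:
    """Collect facts needed to re-create the conditions of a test run."""
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
        # Record only the named variables; unset ones are stored as None.
        "env": {name: os.environ.get(name) for name in sorted(variables)},
    }


if __name__ == "__main__":
    snapshot = environment_snapshot(["PATH", "LD_LIBRARY_PATH"])
    print(json.dumps(snapshot, indent=2))
```

Committing the snapshot format and the list of recorded variables to version control together with the tests keeps the two in step with software releases.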


Reproducibility (2)

• Resist the temptation to do too much debugging on your test machines (are testers supposed to do that?)
• If you can afford it, think of using parallel testbeds for test runs and debugging
• Try to write a regression test immediately after the problem is found, record it in the test or bug tracking system and feed it back to the developers
• Then scratch the machine and restart


Archival and Analysis

• Archive as much information as possible about your tests (output, errors, logs, files, build artifacts, even an image of the machine itself if necessary)
• If possible use a standard test output schema (the xunit schema is quite standard and can be used for many languages and for unit, functional and regression tests)
• Using a common schema helps in correlating results, creating test hierarchies, and performing trend analysis (performance and stress tests)
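
As an illustration of a common output schema, the sketch below serialises test results in the widely used JUnit-style layout (a `testsuite` element containing `testcase` elements, with a `failure` child for failed cases). The `Result` type and the `to_xunit` function are names invented for this example, and only a small subset of the attributes that tools commonly emit is shown.

```python
import xml.etree.ElementTree as ET
from typing import List, NamedTuple, Optional


class Result(NamedTuple):
    name: str
    seconds: float
    failure: Optional[str] = None  # None means the test passed


def to_xunit(suite: str, results: List[Result]) -> str:
    """Serialise results as a JUnit-style testsuite/testcase document."""
    failures = sum(1 for r in results if r.failure is not None)
    root = ET.Element(
        "testsuite",
        name=suite,
        tests=str(len(results)),
        failures=str(failures),
    )
    for r in results:
        case = ET.SubElement(root, "testcase", name=r.name, time=f"{r.seconds:.3f}")
        if r.failure is not None:
            ET.SubElement(case, "failure", message=r.failure)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_xunit("hello-grid", [Result("submit_job", 1.25),
                                  Result("fetch_file", 0.4, "time-out")]))
```

Because unit, functional and regression results then share one structure, they can be archived together and compared across runs.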


Interactive Vs. Automated Tests

• This is a debated issue (related to the reproducibility and debugging issues)
• Some people say that the more complex a system, the fewer meaningful automated tests you can do
• Other people say that the more complex a system, the more necessary it is to do automated tests
• The truth is probably in between: you need both, and whatever test tools you use should allow you to do both
• A sensible approach is to run distributed automated tests using a test framework and freeze the machines where problems occur in order to do more interactive tests if the available output is not enough


Methodologies

• Unit testing
• Metrics
• Installation and configuration
• ‘Hello, grid world’ tests and ‘Grid Exercisers’
• Functional and non-functional tests


Unit Testing

• Unit tests are tests performed on the code during or immediately after a build
• They should be independent of the environment and the test sequence
• They are not used to test functionality, but the nominal behaviour of functions and methods
• Unit tests are a responsibility of the developers and in some models (test-driven development) they should be written before the code
• It is proven that up to 75% of the bugs of a system can in principle be stopped by doing proper unit tests
• It is also proven that they are the first thing that is skipped as soon as a project is late (which normally happens within the initial 20% of its life)
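
A unit test in this sense exercises one function with no network, no services and no dependence on other tests. The sketch below uses Python's standard `unittest` module; `parse_job_id` is a hypothetical helper invented for the example, not part of any grid middleware.

```python
import unittest


def parse_job_id(url: str) -> str:
    """Hypothetical helper: return the last path segment of a job URL."""
    if not url or "/" not in url:
        raise ValueError(f"not a job URL: {url!r}")
    job_id = url.rstrip("/").rsplit("/", 1)[-1]
    if not job_id:
        raise ValueError(f"not a job URL: {url!r}")
    return job_id


class ParseJobIdTest(unittest.TestCase):
    """Each test is self-contained and can run in any order."""

    def test_returns_last_segment(self) -> None:
        self.assertEqual(parse_job_id("https://host:9000/abc123"), "abc123")

    def test_ignores_trailing_slash(self) -> None:
        self.assertEqual(parse_job_id("https://host:9000/abc123/"), "abc123")

    def test_rejects_non_url(self) -> None:
        with self.assertRaises(ValueError):
            parse_job_id("abc123")


if __name__ == "__main__":
    unittest.main()
```

Tests like these can run as part of every build because they need nothing beyond the code itself.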


Metrics

• Another controversial point
• Metrics by themselves are not extremely useful
• However, used together with the other test methodologies they can provide some interesting information about the system

[Figure: examples of gLite bug trends]


Installation and Configuration

• As mentioned, dedicate some time to testing the installation and configuration of the services
• Use automated systems for installing and configuring the services (system management tools, APT, YUM, quattor, SMS, etc). No manual installations!
• Test upgrade scenarios from one version of a service to another
• Many interoperability and compatibility issues are immediately discovered when restarting a service after an upgrade


‘Hello, grid world’ tests and ‘Grid Exercisers’

• Now you have an installed and configured service. So what?
• A good way of starting the tests is to have a set of nominal ‘Hello, grid world’ tests and ‘Grid Exercisers’
• Such tests should perform a number of basic, black-box tests, like submitting a simple job through the chain, retrieving a file from storage, etc
• The tests should be designed to exercise the system from end to end, but without focusing too much on the internals of the system
• No other tests should start until the full set of exercisers runs consistently and reproducibly in the testbed
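
The gating rule above can be sketched as a small runner that repeats every exerciser a few times and reports the ones that ever failed. Everything here is hypothetical: `submit_simple_job` and `retrieve_file` are placeholders that always succeed, standing in for real end-to-end checks against a testbed, and the choice of three repeats is arbitrary.

```python
from typing import Callable, Dict, List


def run_exercisers(
    exercisers: Dict[str, Callable[[], bool]],
    repeats: int = 3,
) -> List[str]:
    """Run every exerciser `repeats` times; return the names that ever failed."""
    unstable: List[str] = []
    for name, check in exercisers.items():
        for _ in range(repeats):
            try:
                ok = check()
            except Exception:
                ok = False  # an exception counts as a failed run
            if not ok:
                unstable.append(name)
                break
    return unstable


def submit_simple_job() -> bool:
    """Hypothetical placeholder for 'submit a simple job through the chain'."""
    return True


def retrieve_file() -> bool:
    """Hypothetical placeholder for 'retrieve a file from storage'."""
    return True


if __name__ == "__main__":
    failed = run_exercisers(
        {"submit job": submit_simple_job, "retrieve file": retrieve_file}
    )
    if failed:
        print("testbed not ready, fix first:", failed)
    else:
        print("exercisers stable, safe to start the full test suite")
```

Running each exerciser more than once is what checks "consistently and reproducibly" rather than a single lucky pass.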


Functional and Non-Functional Tests

• At this point you can fire the full complement of:
– Regression tests (verify that old bugs have not reappeared)
– Functional tests (black and white box)
– Performance tests
– Stress tests
– End-to-end tests (response times, auditing, accounting)
• Of course this should be done:
– for all services and their combinations
– on as many platforms as possible
– with full security in place
– using meaningful test configurations and topologies


The Grid as a Test Tool

• Intragrids
• Certification and Pre-Production environments
• Virtualization and the Virtual Test Lab
• Grid Test Frameworks
• State of the Art


Intragrids

• Intragrids are becoming more common, especially in commercial companies
• An intragrid is a grid of computing resources entirely owned by a single company/institute, not necessarily in the same geographical location
• Often they use very specific (enhanced) security protocols
• They are often used as tools to increase the efficiency of a company's internal processes
• But there are also cases of intragrids used as test tools
• A typical example is the intragrid used by CPU manufacturers like Intel to simulate their hardware or test the compilers on multiple platforms


Certification and Pre-Production

• In order to test grid middleware and applications in meaningful contexts, the testbeds should be as close a reproduction as possible of real grid environments
• A typical approach is to have Certification and Pre-Production environments designed as smaller-scale, but full-featured grids with multiple participating sites
• A certification testbed is typically composed of a complete, but limited set of services, usually within the same network. It is used to test nominal functionality
• A pre-production environment is a full-fledged grid, with multiple sites and services, used by grid middleware and application providers to test their software
• A typical example is the EGEE pre-production environment where gLite releases and HEP or biomed grid applications are tested before they are released to production


Virtualization

• As we have seen, the Grid must embrace diversity in terms of platforms, development languages, deployment methods, etc
• However, testing all resulting combinations is very difficult and time consuming, not to mention the manpower required
• Automation tools can help, but providing and especially maintaining the required hardware and software resources is not trivial
• In addition, running tests on clean resources is essential for enforcing reproducibility
• A possible solution is the use of virtualization


The Standard Test Lab

Test Framework

Each test platform has to be preinstalled and maintained.

Elevated-privileges tests cannot be easily done (security risks).

Required for performance and stress tests


The Virtual Test Lab

Test Framework

Virtualization Software (Xen, MS Virtual Server, VMware)

Images can contain preinstalled OSs in fixed, reproducible configurations

It allows performing elevated-privileges tests. Security risks are minimized, the image is destroyed when the test is over. But it can also be archived for later offline analysis of the tests

The testbed is only composed of a limited number of officially supported platforms


Grid Test Frameworks

• A test framework is a program or a suite of programs that helps manage and execute tests and collect the results
• They range from low-level frameworks like xunit (junit, pyunit, cppunit, etc) to full-fledged grid-based tools like NMI, Inca and ETICS (more on this later)
• It is recommended to use such tools to make test execution reproducible, to automate or replicate tasks across different platforms, and to collect and analyse results over time
• But remember one of the previous tenets: make sure your tests can be run manually and that the test framework doesn’t prevent that


State of the Art

• NMI
• Inca
• ETICS
• OMII-Europe


NMI

• NMI is a multi-platform facility designed to provide (automated) software building and testing services for a variety of (grid) computing projects
• NMI is a layer on top of Condor that abstracts the typical complexity of the build and test process
• Condor offers mechanisms and policies that support High Throughput Computing (HTC) on large collections of distributed computing resources


NMI (2)


NMI (3)


NMI (4)

• Currently used by:
– Condor
– Globus
– VDT


INCA

• Inca is a flexible framework for the automated testing, benchmarking and monitoring of Grid systems. It includes mechanisms to schedule the execution of information gathering scripts and to collect, archive, publish, and display data
• Originally developed for the TeraGrid project
• It is part of NMI


INCA (2)


ETICS

[Figure: ETICS infrastructure diagram. Labels recoverable from the diagram: Clients (via browser, via command-line tools); Web Application; Web Service; Project DB; Report DB; Build/Test Artefacts; NMI Scheduler; NMI Client; WNs.]


ETICS (2)

• Web Application layout (project structure)


ETICS (3)


ETICS (4)

• Currently used or being evaluated by:
– EGEE for the gLite middleware
– DILIGENT (digital libraries on the grid)
– CERN IT FIO Team (quattor, castor)
• Open discussion ongoing with HP, Intel, Siemens to identify potential commercial applications


Conclusions

• Testing for the grid and with the grid is a difficult task
• Overall quality (ease-of-use, reliable installation and configuration, end-to-end security) is not always at the level that industry would find viable or cost-effective for commercial applications
• It is essential to dedicate effort to testing and improving the quality of grid software by using dedicated methodologies and facilities and sharing resources
• It is also important to educate developers to appreciate the importance of thinking in terms of QA
• However, the prize for this effort would be a software engineering platform of unprecedented potential and flexibility


Panel discussion

?