CERN IT Department CH-1211 Geneva 23 Switzerland www.cern.ch/ Open projects in Grid Monitoring IT-GS-MDS Section Meeting 25th January 2008

CERN IT Department
CH-1211 Geneva 23
Switzerland
www.cern.ch/it

Open projects in Grid Monitoring

IT-GS-MDS Section Meeting

25th January 2008

WLCG Monitoring Working Groups

• 3 groups proposed by Ian Bird to the LCG Management Board, Oct 06.
  – Goal to improve the reliability of the WLCG grid

• Grid Services: Grid sensors, Transport, Repositories, Views, …
• System Management: Fabric management, Best Practices, Security, …
• System Analysis: Application monitoring, …

Reliability is our reason

• Our goal is to improve the reliability of the Grid
• WLCG availability level for a Tier-2 is 95%
  – Greater for Tier-0 & Tier-1s
• What do we need to do?
  – Detect problems before users do!
  – Reduce time to respond to problems
• Approach is to put the monitoring and alarms close to the site administrators
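
To make the 95% Tier-2 figure concrete, here is a minimal sketch of the arithmetic. It assumes availability is simply uptime divided by total time over a reporting period. The function names and the 30-day period are illustrative choices, not taken from the slides or from the actual WLCG availability calculation.

```python
def availability(up_hours: float, period_hours: float) -> float:
    """Fraction of the period during which the site was up (simple ratio, an assumption)."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    return up_hours / period_hours


def allowed_downtime_hours(target: float, period_hours: float) -> float:
    """Hours a site may be down in the period while still meeting the target."""
    if not 0.0 <= target <= 1.0:
        raise ValueError("target must be between 0 and 1")
    return (1.0 - target) * period_hours


TIER2_TARGET = 0.95        # availability level quoted on the slide
PERIOD_HOURS = 30 * 24     # hypothetical 30-day reporting period

budget = allowed_downtime_hours(TIER2_TARGET, PERIOD_HOURS)
print(f"Downtime budget over 30 days: about {budget:.0f} hours")

# A site that was down for 40 hours in the period misses the target.
observed = availability(PERIOD_HOURS - 40, PERIOD_HOURS)
print(f"Observed availability: {observed:.3f}, meets target: {observed >= TIER2_TARGET}")
```

Under these assumptions a Tier-2 has roughly a day and a half of downtime per month to spend, which is why shortening detection and response time matters.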

High-level Model

See https://twiki.cern.ch/twiki/pub/LCG/GridServiceMonitoringInfo/0702-WLCG_Monitoring_for_Managers.pdf for details

[Diagram: components shown include LEMON, Nagios, SAM, R-GMA, SAME-WS, GridView, Experiment Dashboard, GridIce, HTTP, LDAP, GOCDB, Dashboard]

Projects

• Nagios Site Monitoring (Emir)
  – NCG rewrite, local tests on service (Emir)
  – Improved Publishers (Pranabesh)
  – Yaim for Nagios
• Messaging Infrastructure (James, Daniel)
• OSG-SAM Integration using Messaging (Piotr, Arvind Gopu, Rob Quick)
• GridMap (Max)
  – Dashboard Integration
• GridView (4xBARC)
  – Including quattorization
  – Gridview using Messaging for producers (GridView)

Projects

• SLS/Nagios Integration (Joanna ASGC, Sebastien Lopienski)
• APEL using Messaging (STFC, Piotr)
• RDF Schema for monitoring (Piotr, … )
• New SAM Portal (IT-GS, …)
  – (using CMS SAM Work?)
• Management Dashboard (John Shade, …)
• LEMON site monitoring (James)
• GOCDB as Topology Database (STFC effort, 1 BARC from Feb'08)
• "Service Cards" (Oliver Keeble, 1 BARC from Feb'08)

CCRC Reporting requirements

Measuring according to MoU

• WLCG MoU is what sites have agreed to
  – But we don’t measure it right now!

CCRC’08 GridMap

• Combines Production Status of service with availability
  – And dashboard metrics (in a 3rd map)

Summary

• We’re involved in many projects
  – Most of the effort is external
  – CERN does architecture, project management, coordination
• Main areas
  – Nagios site monitoring
  – Messaging for monitoring
  – SAM/GridView futures
  – CCRC’08 and WLCG operational monitoring