27
SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Embed Size (px)

Citation preview

Page 1: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

SLAC

ILC High Availability Electronics R&D

LCWS 2006 031206IISc Bangalore IndiaRay Larsen, SLAC

Presented by S. Dhawan, Yale University

Page 2: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 2

Outline I. Why High Availability & Why Now?

II. HA ILC Controls, Instrumentation Standards

III. HA ILC Power Systems Architectures

IV. Detector Systems

V. Summary Conclusion

Page 3: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 3

I. Why HA and Why Now? Computer Modeling demonstrates that ILC will only

run ~20% uptime unless all systems design for Availability. Strong investment and hardware and software R&D over

several years planned to learn how to design, implement effective HA at modest cost. Waiting is not an option.

Techniques include multi-level 1/n redundancy, modular design and hot-swap capability.

New Telecom Open Standard modular computing platform designed for “five nine’s” or 0.99999 Availability.

ILC has adopted HA design principles to evaluate all systems, subsystems & components.

Availability goal for complete C&I system is >0.99

Page 4: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 4

I. HA Standard Features Control System, Front End Systems need common

card, backplane architecture; timing, communication interface standards to meet Availability goals. Newest logic engine systems use fast serial communication

from chip-to-chip, module-to-module and backplane-to-backplane.

Core processor chips with imbedded matrix switching, software, provide auto-failover potential for failed channel, module or crate.

Single supply voltage, redundant source, on-board DC-DC converters to manage rapid evolution in V, I requirements.

HA cooling design provides auto-failover fan system.

Page 5: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 5

I. HA Standard Features 2 Redundant timing, communication paths

eliminate points of failure that would interrupt machine (fail-soft) Hot Swap capability to avoid shutting down

crates when swapping modules Software architecture to support HA hardware

without compromising up-time Improved diagnostics at system, crate and module

levels to manage hot-swap, predict and evade problems

Page 6: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 6

II. HA ILC C&I Systems “Controls & Instrumentation will be designed

for High Availability.” Dual Star network topology from Main Control to

machine sectors, fiber plant. Separate Data and Fast Timing, RF networks. Dual Star Fanout from Sector nodes to Front

Ends, fiber or copper. Redundant backbone control data and timing for

auto-failover down to front-end level, as needed. Intelligent Platform, Shelf Manager, supports hot-

swap options. Improved diagnostics minimizes MTTR.

Page 7: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 7

II. Controls Main Processor Farm

FEATURES

◊ Dual Star 1/NRedundant Backplanes

◊ Redundant Fabric Switches

◊ Dual Star/ Loop/ Mesh Serial Links

◊ Dual Star Serial Links To/From

Level 2 Sector Nodes

Page 8: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 8

II. Linac Sector Control Node

Page 9: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 9

II. ATCA Evaluation Plans Advanced Telecom Computing Architecture

New Industry Consortium open standard announced 2004 designed ground-up for HA.

Crate, Card and Sub-Module (Mezzanine Card) comprise fully-integrated HA system

Potentially huge industry support for ~$40B/yr market.

Many modular processors, switches, serial networks available from competitive industry.

Adaptation to custom modules enables controls, instrumentation with unprecedented HA performance.

Page 10: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 10

Fan Rear Exhaust Area

Shelf Manager Card Connection

Fabric Cabling Area

Fabric Interface Card Slots

Power Converters

II. Full ATCA Shelf (Crate) Features

Dual Network Switch Module Locations

Dual Star Fabric Connectors

48V DC Power Plugs

Redundant Shelf Manager Cards

Fan area

Page 11: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 11

II. ATCA Card Options

USER DEFINEDCONNECTOR AREA

REAR TRANSTION CARD

BACKPLANE

SERIAL, PARALLELCONNECTIONS

POWER CONNECTOR DETAIL

MEZZANNINE CARDS

Page 12: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 12

II. C&I R&D Summary Status Baseline Conceptual Design for all Controls &

Instrumentation subsystems require HA Architecture.

Controls core system hardware, software evaluations underway at DESY, Argonne, FNAL and SLAC. Evaluation of platform vs. instrumentation requirements

for LLRF, BPM’s Investigation of software 3-tiered HA architectures Industry tutorials, study of ATCA standards Lab Associate Memberships in ATCA consortium

(PICMG) to participate, track developments FY07 plans for strong ramp-up in R&D

Page 13: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 13

III. HA ILC Power Systems R&D Pulsed & DC Power Systems largest single limit to

Availability in current machines. HA principles being applied to all power systems. Goal is >0.99 for RF Power System, DC Magnet

Power System 1/n Redundancy at 3 level in DCPS, HVPS,

Modulators & Kickers: High power switches Modular power cells System level unit hot spares

Goal: Keep operating with 1-2 failed cells; change failed cells while machine keeps running

Page 14: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 14

III. HA Power System Examples Modulators DR Kickers Power Supplies Diagnostic Interlock Layer

Page 15: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 15

III. Marx Prototype Modulator 3-Level n/N

Redundancy 1/5 IGBT switch

subassemblies 1/8 Mother-

boards +2% Units in

overall system Intelligent

Diagnostics Imbedded

diagnostics in every MBrd

Networked by wireless & dual fiber to Main Control

Page 16: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 16

III. DR Kicker Prototype Concept

LLNL Kicker Tested at KEK ATF System Concept

Page 17: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 17

5 Parallel 1.25 KW Modules

Backplane

DC Out

AC In/ Out

Serial I/O (5x2)

EmbeddedPSDC’s

III. HA DC KW Power

1/N parallel modules for DC magnet supply (1/5 shown)

Overall load current feedback

Independent dual control/diagnostics each module

Page 18: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 18

III. Diagnostic Interlock Layer All power systems will have a “DIL” card or

hybrid circuit in each power cell. Gathers diagnostic information to view in control

room: Interlock set points vs. actual levels; fast and slow waveforms

Takes evasive action to avoid trips: Reduces load, adjusts set points if permitted; “safes” system in case of failure

First example under development for Marx Goal: General purpose solution for variety of

power systems.

Page 19: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 19

III. DIL Prototype Functional Design

8CH ±10Vdc

Gate Trigger

PLC/ IOCLike Station Controller

Main Control System

Diagnostic/Interlock

8CH Fast ADC

8 CH 16-bit DAC

Local Gate Trigger

Tandem Int. Out (OK/Fail)

8 CH Fast Signal±1V, 50Ω, 20 MHz

8 CH Slow Signal±10V, ~1kΩ

16 CH Interlock TTL

Tandem Int. In

Serial Communication

(Ethernet)DSP/CPU/FPGA

48 VDC(1kV Isolated)

Remote Trigger Sync.(Optical Fiber)

STAGE 1 Controller

STAGE 2 Controller

STAGE N Controller

Clocks(~100 MHz)

8 CH ±10V

6 CH - 3CH Relay Out (24VDC/5mA)- 3 CH TTL

AnalogMux

8CH Slow ADC

DIL SLIDES COURTESYDR. S. NAM, POHANG

Page 20: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 20

III. Test Prototype Floor Plan

DAC_out Analog_in

FAST_ADC DAC_1 Phototransistor*8 UTPFPGA EEPROM Reset

140mm

RS232FiberReceiverPhotoCoupler*3

PowerJack

SLOW_ADC

DAC_OUT SLOW_ADC_IN

Digital_out Digital_in

FPGA

RAM*5

+15V converter

-15V converter

+5V converter

DSP Module ( include DSP JTAG)

Power Regulator section(±5V, 3.3V, 2.5V, 1.2V)

Mount hole * 8

Diode

MUXOp ampOp amp

TR

FPGA JTAG

+15V 1.2A

+15V 1.2A

+5V 3.5A

SMA Connector

Page 21: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 21

III. Achieving Near-Zero MTTR Hot Swap:

In critical systems, short Mean Time To Replace (MTTR) becomes critical, e.g. in 1/n parallel modular power supplies, Controls and Timing backbones

Hot swap not practical for high voltage, current modules with large stored energy (unless by remote robotics).

Hot Swap implemented in standard package becomes affordable weapon to help achieve HA

Access to point of failure critical to use of hot-swap.

Page 22: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 22

IV. Detector Systems Detectors benefit from inherent use of highly

redundant n/N architectures. Detectors pay penalty for non-standard

architectures, some unavoidable, others not: Very high engineering costs Unique designs where “wheel is re-invented” for

each new design Lack of communication between engineers

working on subsystems for same detector

Page 23: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 23

IV. Detectors 2 Potential Benefits of Standardization Efforts

Common, robust solutions for HA power, grounding, shielding

Single supply voltage DC-DC converters to operate in magnetic fields

Drop-in standard high speed serial link chips or small modules to access sections buried in detectors.

Standard interface rules to simplify system integration across diverse subsystems.

Provide complete power, I/O design framework for custom boards

Page 24: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 24

IV. Detector Standards Collaboration Broad collaboration on upgrading physics

instrumentation standards that are 20-40 years old can benefit community New basic platform w/ HA features Adaptations to Controls, Machine instrumentation, Detectors Hardware and software evaluations Large machines & detectors, light sources, fusion, small controls,

DAQ systems for experiments

Encourage spin-offs in other projects such as ITER, Light Sources, Astrophysics Two exploratory meetings held: IEEE 2005 Real Time, 2005 NSS Propose continue supporting exploratory collaboration meetings Share ILC evaluations with broader community if collaboration group

materializes

Page 25: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 25

V. Summary Conclusions HA design efforts are well underway in the

ILC Cannot meet up-time goals without it. Full machine goal of A>0.85 requires all

subsystems to strive for >0.99 Opportunity Cost of idle ILC ~ 135K$US/hour!

ATCA platform offers ready solution to many controls, instrument applications. Need R&D to decide feature set, evaluate

suitability of ATCA as instrument platform, prototype all critical hardware-software applications

Page 26: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 26

V. Summary Conclusions 2 HA design principles valuable for all subsystems,

including power systems, vacuum, RF, cryogenics etc.

Detector applications explored through broad engineering collaborations can yield invaluable future cost, operational benefits.

The ILC Global Collaboration should seize this opportunity to advance electronics standard systems design and cost-benefits for the next generation of physics machines and detectors.

Page 27: SLAC ILC High Availability Electronics R&D LCWS 2006 031206 IISc Bangalore India Ray Larsen, SLAC Presented by S. Dhawan, Yale University

Dec 2, 2005 SLAC HA Electronics Ray Larsen 27