Machine Availability and System Reliability at RHIC
WAO-07 Trieste, September 24-28 2007
Fulvia Pilat
WAO-07 Fulvia Pilat
RHIC performance
Delivered per run to PHENIX.
Delivered luminosity increased by >2 orders of magnitude in 6 years.
FOM = L·P^4
Enhanced Design Parameters
Calendar time in store affects ability to project performance.
Enhanced Design Parameters (~2009)
Parameter          Unit                Achieved   Enhanced design
Au-Au operations
Energy             GeV/n               100        100
No of bunches      …                   103        111
Bunch intensity    10^9                1.1        1.0
Average L          10^26 cm^-2 s^-1    12         8
p-p operations
Energy             GeV                 100        100 (250)
No of bunches      …                   111        111
Bunch intensity    10^11               1.4        2.0
Average L          10^30 cm^-2 s^-1    20         60 (150)
Polarization P     %                   60         70
Slide annotations: 3x; +10%; goal exceeded
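The slide annotations (3x, +10%, goal exceeded) appear to summarize the gap between achieved and enhanced-design values. A minimal Python sketch of that arithmetic follows, reading the average-luminosity rows as 12 vs 8 (Au-Au) and 20 vs 60 (p-p). The dictionary keys and the helper name are illustrative, and the mapping of each annotation to a table row is an assumption.

```python
# Achieved vs. enhanced-design values as read from the slide table.
# Keys are illustrative names; units follow the slide
# (1e26 or 1e30 cm^-2 s^-1 for luminosity, percent for polarization).
achieved = {"auau_avg_lumi": 12.0, "pp_avg_lumi": 20.0, "pp_polarization": 60.0}
enhanced = {"auau_avg_lumi": 8.0, "pp_avg_lumi": 60.0, "pp_polarization": 70.0}


def design_over_achieved(name: str) -> float:
    """Enhanced-design value divided by the achieved value (>1 means a gap remains)."""
    return enhanced[name] / achieved[name]


# Factor still to gain in p-p average luminosity.
pp_lumi_factor = design_over_achieved("pp_avg_lumi")
# True when the Au-Au achieved value already meets the enhanced-design value.
auau_goal_exceeded = achieved["auau_avg_lumi"] >= enhanced["auau_avg_lumi"]
# Polarization gap in percentage points.
polarization_gain = enhanced["pp_polarization"] - achieved["pp_polarization"]

print(pp_lumi_factor, auau_goal_exceeded, polarization_gain)
```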
Enhanced and RHIC-II luminosity
Electron or stochastic cooling
Time at store: trend and goal
Trend
Goal: back to mid-50% in Run-8; 60% time at store in Run-9
Outline
Operation stats, performance
Factors determining time at store
  Machine development (short term investment)
  APEX: Accelerator Physics EXperiments program (longer term investment)
  Scheduled Maintenance (talk by Sampson today)
  Machine set-up
  Systems downtime and failure
  Mode of operation: “pushing the envelope”
RHIC Retreat 2007, July 16-17
Session on Availability and Reliability
11:00 (15) Pilat Introduction
11:15 (25) Ingrassia Operations and Uptime
11:40 (20) Kling Turn-around time
12:00 (20) Sampson Maintenance models, organization
12:20 (10) Discussion
2:00 (15) Ahrens RHIC abort system
2:15 (15) Zhang, Wu Pulsed power systems
2:30 (30) Bruno Power supplies
3:00 (30) Sandberg Electrical systems
3:30 (30) Zaltsman RF: RHIC and injectors
4:30 (15) Oerter Controls, hardware
4:45 (15) Morris Controls, software
5:00 (15) Reich Access controls
5:15 (15) Russo BPM, IPM, BBQ in operations
5:30 (15) Tuozzolo Cryogenic system
5:45 (15) Mapes Vacuum systems
[Figure: time at store versus the 60% goal; “M” markers on the plot]
Failure Flavors
Charged – threshold for log is 6 minutes or more
  Failure hours that impact the program, charged to one OR MORE systems during a failure period.
  Simultaneous failures result in charged hours less than actual hours.
Actual – Severe
  Duration of a failure that impacts the program, often LONGER than the hours charged.
Actual – Mild
  Failure that does not impact the program, e.g. 1 of 10 AGS RF stations trips. Hours recorded but not “charged”.
Resets – threshold for log is less than 6 minutes
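One possible reading of the statement that simultaneous failures give charged hours below actual hours is that overlapping outages are counted only once against the program. A minimal Python sketch under that assumption follows. The failure records and the function name `merged_hours` are hypothetical and are not taken from the RHIC logging system.

```python
def merged_hours(intervals):
    """Total length, in hours, of the union of (start, end) intervals."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # A gap: close the running block and open a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlap: extend the running block.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


# Hypothetical failures: (system, start hour, end hour), overlapping from hour 1 to 3.
failures = [("PS", 0.0, 3.0), ("RF", 1.0, 4.0)]

# Sum of each system's own outage duration.
actual_hours = sum(end - start for _, start, end in failures)
# Program time lost, counting the overlap once.
program_hours = merged_hours([(start, end) for _, start, end in failures])

print(actual_hours, program_hours)
```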
“Top 10” Failures by Group & by Run (rank / hours)
System                    FY07         FY06        FY05        FY04
PS_RHIC                   1 / 186.8    1 / 94.6    1 / 78.15   2 / 85.5
Rf                        2 / 106.9    6 / 39.9    3 / 67.8    3 / 79.6
PulsedPower               4 / 58.8     8 / 33      5 / 43.3    8 / 32.1
Controls                  6 / 39.1     2 / 67.2    2 / 69.2    1 / 134.9
ES&FD_AtR & Experiment    7 / 38.1     4 / 49      4 / 46.5    9 / 30
AccessControls            8 / 36.1     5 / 43.9    9 / 32      10 / 21.7
Rows with fewer than four entries (the first entry is FY07; the fiscal year of the remaining entries is not recoverable from the extracted text):
Cryogenic                 3 / 92.6;  7 / 41;  5 / 66
ElectricalService         5 / 58.7;  7 / 34.2;  6 / 34.8
QuenchProtection          9 / 31.6;  6 / 42.6;  7 / 34
Services Water            10 / 23.1;  11 / 22.8;  11 / 20.4
HumanError                11 / 23;  9 / 29.9;  8 / 32.5
System                    Charged (h)   Actual Severe (h)   Actual Mild (h)   Resets (#)   Resets (h) @ 3 min per   Ratio Actual/Charged
PS_RHIC                   187           236                 5                 15           1                        1.26
Rf                        107           216                 272               44           2                        2.02
PulsedPower               59            80                  15                70           4                        1.36
Controls                  39            70                  39                303          15                       1.79
ES&FD_AtR & Experiment    38            53                  51                33           2                        1.39
AccessControls            36            40                  25                ~0           0                        1.11
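The last two columns of this table can be recomputed from the others: the ratio is actual severe hours over charged hours, and reset hours assume 3 minutes per reset. A short Python sketch of that check follows; the function names are illustrative, and AccessControls is left out because its reset count is given only as ~0.

```python
RESET_MINUTES = 3  # per-reset time assumed on the slide ("@ 3 min per")

# system: (charged hours, actual severe hours, number of resets), from the table
rows = {
    "PS_RHIC": (187, 236, 15),
    "Rf": (107, 216, 44),
    "PulsedPower": (59, 80, 70),
    "Controls": (39, 70, 303),
    "ES&FD_AtR & Experiment": (38, 53, 33),
}


def actual_to_charged(charged_h, severe_h):
    """Ratio of actual (severe) failure hours to charged hours, to 2 decimals."""
    return round(severe_h / charged_h, 2)


def reset_hours(n_resets):
    """Hours spent on resets at RESET_MINUTES each."""
    return n_resets * RESET_MINUTES / 60


for system, (charged, severe, resets) in rows.items():
    print(system, actual_to_charged(charged, severe), reset_hours(resets))
```

Controls stands out here: its 303 resets add about 15 hours on top of 39 charged hours.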
Operations Planned Improvements
Multiple Failure, often simultaneous
CAS (tech support on shift – 2 now) needs help
Train Siemens Watch for LOTO
  Together with MCR Operators they can perform LOTO when CAS is busy
Get Operators into the field
  Train Operators to (only) reset “accelerator” power supplies
OC instructed to call in help for CAS when CAS is making a repair AND another system goes down.
OC instructed to call in help from two groups with knowledge of the equipment when the cause of a problem is not clear.
Turn around time
Input from systems
Maintenance, set-up and turn-around time, and modes of operation all affect availability, but the main factor is system failure. In Retreat presentations, please focus on the reliability of your system and think critically about ways to improve it. I would ask each of you to discuss a plan, including timelines and necessary funding, to increase your system reliability. This is an important input towards an integrated plan to improve time at store, to be discussed at the Retreat and implemented thereafter.
After the Retreat reliability
Review Retreat information on operations, maintenance and systems
Prioritize actions – especially systems improvements for reliability
Analyze aging infrastructure, systems
Use the recently revisited “Trouble Report Committee” as input and advice on system reliability
RHIC PS Performance Stats
Average RHIC PS Failure Hours/Week
MTBF of RHIC due to any PS Failure
MTBF of an individual PS Failure
Leading Causes of PS Down Time in Hours
IR - Dynapowers 42.4
Main p.s.’s 36
IR p.s.’s – SCE 150’s 26.7
6000A Quench Switches 20.6
IR p.s.’s – SCE 300’s 16.2
Quench Detectors 14.6
Node Cards 6
Correctors 5.8 (success story)
Ground Fault 5.1
QPA’s 4.55
New Sextupole p.s.’s 4.5
Bypass chassis 0.3
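To see how concentrated this downtime is, the list can be sorted and accumulated into a Pareto-style ranking. The Python sketch below uses only the hours listed above; the variable names are illustrative.

```python
# Power-supply downtime by cause, in hours, as listed on the slide.
downtime_hours = {
    "IR - Dynapowers": 42.4,
    "Main p.s.'s": 36.0,
    "IR p.s.'s - SCE 150's": 26.7,
    "6000A Quench Switches": 20.6,
    "IR p.s.'s - SCE 300's": 16.2,
    "Quench Detectors": 14.6,
    "Node Cards": 6.0,
    "Correctors": 5.8,
    "Ground Fault": 5.1,
    "QPA's": 4.55,
    "New Sextupole p.s.'s": 4.5,
    "Bypass chassis": 0.3,
}

total_hours = sum(downtime_hours.values())

# Rank causes by hours and accumulate each one's share of the total.
pareto = []
running = 0.0
for cause, hours in sorted(downtime_hours.items(), key=lambda kv: kv[1], reverse=True):
    running += hours
    pareto.append((cause, hours, running / total_hours))

for cause, hours, cumulative_share in pareto:
    print(f"{cause:24s} {hours:6.2f} h  cumulative {cumulative_share:5.1%}")
```

On these numbers the three largest causes account for a bit more than half of the listed hours.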
Power Supply System Priorities
Bipolar 150A, 300A p.s.’s
Phase 1 QPA’s (Quench protection assemblies)
Main dipole and quadrupole PS
Investigate yellow quad bus ground fault
Improving Dynapower PS cooling
Quench detector cleaning and fan replacements
Air Conditioning (for air quality and temperature)
Expected MTBF in Run 8?
Run 5 = 30.79 hours
Run 7 = 14.75 hours
Remove 3 major problems from Run 7 = 40 hours
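Under the usual definition, MTBF is operating time divided by the number of failures. A small Python sketch follows of what the Run 7 figures imply, assuming the same operating hours before and after the three problems are removed. The slide does not say how the 40-hour projection was computed, so this is only an illustration, and the function names are hypothetical.

```python
def mtbf(operating_hours, n_failures):
    """Mean time between failures: operating time divided by failure count."""
    return operating_hours / n_failures


def failures_fraction_remaining(mtbf_before, mtbf_after):
    """For fixed operating hours, failure count scales as 1/MTBF."""
    return mtbf_before / mtbf_after


# Run 7 MTBF and the projected MTBF with the three major problems removed.
remaining = failures_fraction_remaining(14.75, 40.0)
removed = 1.0 - remaining

print(f"failures remaining: {remaining:.1%}, attributed to the 3 problems: {removed:.1%}")
```

Read this way, the three major problems would account for roughly 63% of the Run 7 power-supply failures.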
Electrical Systems
Run   Power System Failure Hrs   Total Failure Hrs
5     15                         694
6     26*                        700
7     45                         881
* excluding arc flash event
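The trend in the electrical-systems table is easier to judge as a share of total failure hours per run. A minimal Python sketch using only the table values follows; the variable names are illustrative, and the Run 6 figure excludes the arc flash event, as the table notes.

```python
# run: (power system failure hours, total failure hours), from the table
failure_hours = {5: (15, 694), 6: (26, 700), 7: (45, 881)}

# Electrical share of all failure hours in each run.
electrical_share = {
    run: power / total for run, (power, total) in failure_hours.items()
}

for run, share in electrical_share.items():
    print(f"Run {run}: {share:.1%} of total failure hours")
```

The share grows from about 2% in Run 5 to about 5% in Run 7.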
Most Significant Causes of ES Downtime-Run # 7
Location Hours Events Equipment
1004 A 18.3 2 Switch & 208 Volt CB
1000 P 15.3 multiple Switch & Circuit Breaker
914 13.8 1 Switch
929 9.5 1 Cooling Tower Fan Motor
4 areas responsible for 90% of downtime in Run-7
Electrical systems: Steps being taken
• 18 Electricians Assigned to C-AD this Summer vs. 6 last year
• Ongoing Thermal Inspection of Switches
• Use of Torque Wrenches Instituted
• Better understanding of Thermal Effects
• Replace 1000 P 13.8 kV Switches
• Replace Trip units 1000 P Substation
• Replace Switchgear in 914
• Maintenance BMMPS CB’s
Electrical systems: Steps being taken (cont’d)
• Continuation of Arc Flash Calculations
• Connecting RHIC Bard A/C Units through Isolation Transformers
• 21 New Alcove UPS’s
• 8 year Program to improve Electrical Infrastructure ($9 million)
• Open Slot for New Power Engineer
ES: Top Concerns from last Year’s Retreat
1. Power Dips: 8 in Run-6, 6 in Run-7
2. Response to 1006 Arc Flash (almost done)
3. 1004 B CB Problem
4. AMMPS Transformer Replacement
Additional Steps to Improve Availability
• Increase the number of assigned electricians
• Centralize Spare Parts Location
• Increase Spares Inventory
Slide annotation: this shutdown
RF system: Performance
Number of systems: Booster: 2, AGS: 11, RHIC: 16
Charged failure hours: Booster: 7, AGS: 39, RHIC: 65
Actual failure hours: Severe: 216, Mild: 272
Factor affecting the system performance in RHIC RF: beam loading (more than double the total intensity of Run-4).
(Example: large debunching at rebucketing time, losses and beam dumps.) It took time to understand and mitigate the beam loading effects.
07 Gold Bunch Merge
RF - IMPROVEMENTS
Complete system upgrade of low level RF in AGS and RHIC (unified hardware and software, modern system, better ring-2-ring synchro)
Window comparators to provide fast shutdown for storage systems
New beam permit chassis to speed up the response
Low power circulators
New tubes
Ongoing work on window for storage system
Continue development of ferrite tuner for acceleration system
Abort kickers - Failure Modes
Prefires
  One module discharges unilaterally
  The other four fire in response ASAP
  Not synchronized with abort gap
Unconditioned Triggers
  All five modules discharge together
  Not synchronized with the abort gap
Spontaneous Capacitor Discharges
  As if a “stop charge” occurred with no associated trigger – stop charge turns off the charging mechanism
  Damaging if not noticed
Run 7 Prefires: 12 yellow, 18 blue, broken down by PFN module involved
[Figure: RHIC abort kickers pre-fires in Run-7 broken out by ring and by module. x-axis: time (calendar), 4-Mar to 22-Jun; y-axis: module involved (1 to 5); series: blue, yellow]
Abort kickers: observations, improvements
• B2 and B4 use thyratron CX1575C. They will be replaced by CX3575C.
• Y5 had 7 pre-fires at the beginning, but stayed clean after 4/4.
• Y1 stayed clean during the entire run.
• Y5, B2, and B4 had 7 pre-fires each, together contributing 70% of total pre-fires.
What may help?
• Condition high voltage system at higher voltage than operation level (Engineering control? Routine procedure?)
• Keep modulators on
• Pre-conditioning before beam operation
• Keep operating voltage as low as possible
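The 70% figure follows directly from the Run-7 counts quoted in this talk (12 yellow and 18 blue pre-fires, with 7 each from Y5, B2 and B4). A quick Python check follows; the variable names are illustrative.

```python
# Run-7 pre-fire counts quoted in the talk.
prefires_by_ring = {"yellow": 12, "blue": 18}
worst_modules = {"Y5": 7, "B2": 7, "B4": 7}

total_prefires = sum(prefires_by_ring.values())
worst_share = sum(worst_modules.values()) / total_prefires

# Pre-fires left over for the remaining modules of each ring.
other_yellow = prefires_by_ring["yellow"] - worst_modules["Y5"]
other_blue = prefires_by_ring["blue"] - worst_modules["B2"] - worst_modules["B4"]

print(f"{worst_share:.0%} of {total_prefires} pre-fires; "
      f"others: yellow {other_yellow}, blue {other_blue}")
```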
RHIC abort kickers: R&D
• Charge up high voltage modulators on command 4 ms before beam abort to avoid pre-fire during long DC hold up
• A preliminary study was performed in 2003
• Project cost over $2 million based on 2003 budget estimate.
Cryo system: Phase III Upgrade
New gas bearing turbine for energy removal at the cold end of the refrigerator (Run-7).
New high efficiency vertical heat exchanger system at the cold end of refrigerator (Run-7).
Re-configured the cold helium supply to the accelerator rings to eliminate the use of the cold circulators (Run-6).
Modified Cold Box 5 to reduce Helium inventory, improve insulation, and reduce flow restrictions (Run-6).
Results:
Saved an additional 1.0 MW of compressor power in Run-6.
Reduced the liquid inventory in the refrigerator.
Additional 1.0 MW achieved during Run-7.
Reduced number of running compressors by 4 FS and 1 SS.
RHIC POWER HISTORY
Cryo Stumbling at the Start of Run-7: HX OBSTRUCTION
Oil contamination in HX-20 from Rotoflow oil bearing expanders
• Oil Crossover Happens During Start-up (Warm)
+ LN2 contamination on HX-20
• Extended 80K operations contaminated GHe in RHIC
• During cool-down 80K GHe returned to the refrigerator
• Poorly seated crossover valve (H409M) between CR line and Expander 6 outlet allowed LN2 to collect on HX-20
= High Recooler Return Pressure resulting in (too) high magnet temperatures.
[Figure: HX20 DP and He flow rate, annotated: Blue 4.5K wave starts; Blue recooler wave starts; Blue ready; Yellow 45K wave starts; Yellow 4.5K wave starts; warm-up attempts to clear blockage]
Running for high availability
Example: Low energy copper run (Run-5)
2 weeks of physics: choice to limit set-up time and downtime
Machine parameters (almost the same #bunches 37-41, transmission HE ~95%, LE ~85-92%, same transition set-up)
  bunch intensity: HE 41 x 4.5e9, LE 37 x 3.8e9
  beta*: HE 0.85m, LE 3m
  energy: HE 100 GeV/u, LE 31.2 GeV/u
Reproducibility: minimized tuning time
Minimized time between stores
Longer lumi-lifetime
Cu Run-5 high-energy run
[Figure annotations: beta*=0.85m (0.89m); beta*=2.6m; beta*=3.0m; access + snowstorm; power dip + access; access + equipment failures]
time at store: 52%
Cu Run-5 low energy run
time at store: 74%
Cu Run-5 LE (week 2 – stores)
[Figure labels: injection; access; Phobos 0 & polarity; beam experiments]
Optimization of performance and availability
Projected performance and run plans must include optimization of the time at store if we want to achieve the 60% goal
Limit the number of new developments during the run preparation
Stop or reduce machine developments during physics running once potential for returns is low
Optimal choice of lattice, beta*, bunch intensity and number of bunches (with parameters evolution during the run, more conservative or aggressive, based on optimization of delivered luminosity and time at store)
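The trade-off between delivered luminosity and time at store can be put in numbers with a simple model: integrated luminosity is the average luminosity while at store, times calendar time, times the fraction of that time spent at store. The Python sketch below assumes the Au-Au "Average L" of 12 x 10^26 cm^-2 s^-1 from the parameter table is a store average; the function name is hypothetical, and the 50% case is an illustrative stand-in for the "mid 50%" level mentioned earlier in the talk.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600
CM2_PER_INVERSE_MICROBARN = 1e30  # 1 ub^-1 corresponds to 1e30 cm^-2


def weekly_integrated_lumi_ub(avg_store_lumi, time_at_store):
    """Integrated luminosity per calendar week in ub^-1.

    avg_store_lumi: average luminosity while at store, in cm^-2 s^-1
    time_at_store:  fraction of calendar time spent at store (0 to 1)
    """
    return avg_store_lumi * SECONDS_PER_WEEK * time_at_store / CM2_PER_INVERSE_MICROBARN


au_au_avg_lumi = 12e26  # cm^-2 s^-1, achieved value from the parameter table

at_50 = weekly_integrated_lumi_ub(au_au_avg_lumi, 0.50)
at_60 = weekly_integrated_lumi_ub(au_au_avg_lumi, 0.60)

print(f"50% at store: {at_50:.0f} ub^-1/week; 60% at store: {at_60:.0f} ub^-1/week")
```

In this model, going from 50% to 60% time at store raises delivered luminosity by 20% with no change to the machine parameters. That holds only if the more conservative running needed for availability does not lower the store-average luminosity.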
Conclusions
Analyzed machine availability at RHIC
Identified the main factors determining the time at store
Have a plan to increase availability to 60% in ~2 RHIC runs
... will report at the next WAO!