Information Technology and Facilities Report
Jerry Dreyer, Vice President & Chief Information Officer
Board of Directors Meeting
October 18, 2011
Highlights
Service Availability:
• Market Operations IT systems met all SLA targets
• Market Data Transparency IT systems met all SLA targets
• Retail Market IT systems missed one SLA target (Retail Processing Business Hours)
• Grid Operations IT systems met all SLA targets

Retail Market IT system outage with impact to SLA:
• Data Center migration to the new T3 site resulted in an incorrect Internet Protocol (IP) address configuration (9/26)
  • Improperly routed traffic caused messages to queue and slowed processing, which resulted in 60 minutes of delayed transactions and slow MarkeTrak performance
  • Correct IP configuration change was applied to resolve the issue

Retail Market unplanned outages:
• Two parallel processes encountered contention that affected Electronic Data Interchange (EDI) (9/6)
  • Retail transaction processing was unavailable for 28 minutes (outside of business hours)
  • Short-term: Restarted one of the processes in contention
  • Long-term: Configure processes to run on different systems and eliminate contention. The fix will be implemented with the move to the new data center by 10/30
• Hardware failure caused outage of Enterprise Data Warehouse (EDW) (9/26)
  • Caused Get Reports and other components of TML and MIS to be unavailable for 75 minutes
  • Short-term: Manually took component out of service and used redundant component (NIC)
  • Long-term: Replaced hardware to resolve issue

ERCOT Public, October 18, 2011
Highlights Cont’d
Core unplanned outages:
• Grid Operations: There was an automatic LFC local failover caused by an EMS software defect (9/26)
  • The EMS internal communication application locked up and caused a local failover
  • Vendor has a lead on root cause but is still investigating

Planned Outages:
• Weekend Retail and Market Operations maintenance activity (9/11 and 9/25)
  • Outages lasted 1,471 minutes, which is within the 1,800 minutes allowed via SLA

October Planned Outage Updates:
• For the first time, the EMS and MMS production ran in the new Bastrop Data Center. The team successfully completed the failover with only one missed SCED interval (Target is three or less) (10/4)
• For the first time, the non-core systems (MIS, CDR, CMM,…) ran in the new Bastrop Data Center. The failover was completed with no issues to report. (10/5)
• An extended retail outage was requested through RMS and COPS to execute retail system moves to the new T3 Data Center in Taylor (10/22, 10/23, 10/29, 10/30)
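
The planned-outage figures above reduce to a simple budget check. The sketch below is illustrative only: the function name is hypothetical, and it assumes the 1,800-minute allowance and the 1,471 minutes used refer to the same reporting period, which the slide does not state explicitly.

```python
# Figures taken from the slide; the helper name is hypothetical.
PLANNED_OUTAGE_MINUTES = 1_471   # weekend maintenance on 9/11 and 9/25
SLA_ALLOWED_MINUTES = 1_800      # planned-outage allowance stated in the SLA


def planned_outage_headroom(used_minutes: int, allowed_minutes: int) -> tuple[int, float]:
    """Return (minutes remaining, fraction of the allowance consumed)."""
    if allowed_minutes <= 0:
        raise ValueError("allowed_minutes must be positive")
    return allowed_minutes - used_minutes, used_minutes / allowed_minutes


remaining, used_fraction = planned_outage_headroom(
    PLANNED_OUTAGE_MINUTES, SLA_ALLOWED_MINUTES
)
print(f"Remaining allowance: {remaining} minutes")
print(f"Allowance consumed: {used_fraction:.1%}")
```

Under these assumptions, 329 minutes of the allowance remain unused.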
2011 Net Service Availability (Retail and Market Ops)
[Chart: 2011 Net Service Availability, Year to Date, Retail & Market Operations. Series: Transaction Processing (BH), Transaction Processing (Off BH), TML, MarkeTrak, Retail API, TML Report Explorer, CRR, MIS. Data labels (order as shown in source text): 99.95%, 99.88%, 99.92%, 99.85%, 99.80%, 99.96%, 99.91%, 99.87%. Y-axis: 92% and below to 100%.]
SLA targets:
• Transaction Processing: Business Hours (BH) 99.9%; Off Business Hours (Off BH) 99%
• Texas Market Link (TML): 99%
• MarkeTrak: 98%
• Retail API: 99%
• TML Report Explorer: 99%
• Congestion Revenue Rights (CRR): 98%
• Market Information System (MIS): 99%
2011 Net Service Availability (Grid Ops)
[Chart: 2011 Net Service Availability, Year to Date, Grid Operations. Series: MMS Aggregate, MMS SCED, EMS Aggregate, EMS LFC, Outage Scheduler (OS), NMMS. Data labels (order as shown in source text): 99.85%, 99.98%, 99.99%, 99.99%, 99.98%, 99.85%. Y-axis: 92% and below to 100%.]
SLA targets:
• MMS Aggregate: 99%
• MMS SCED: 99.93%
• EMS Aggregate: 99%
• EMS LFC: 99.93%
• Outage Scheduler: 99%
• NMMS: 97%
2011 Data Center Availability
[Chart: 2011 Data Center Availability, Year to Date. Sites: Taylor 1, Taylor 2, Austin, Bastrop. Data labels: 100%, 100%, 100%, 100%. Target: 99.982%. Y-axis: 99.92% to 100.0%.]
Outage prevented due to Tier 3 redundancy
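
To put the 99.982% data center target in concrete terms, an availability percentage can be converted into a yearly downtime budget. This is a minimal sketch with a hypothetical function name. It assumes a 365-day year measured around the clock. The slides do not say how ERCOT defines the measurement window.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; assumes a non-leap year, 24x7 measurement


def allowed_downtime_minutes(target_pct: float, period_minutes: int = MINUTES_PER_YEAR) -> float:
    """Downtime (in minutes) that still meets an availability target given in percent."""
    if not 0 <= target_pct <= 100:
        raise ValueError("target_pct must be between 0 and 100")
    return (1 - target_pct / 100) * period_minutes


budget = allowed_downtime_minutes(99.982)
print(f"{budget:.1f} minutes per year (about {budget / 60:.1f} hours)")
```

Under these assumptions the target leaves roughly 94.6 minutes, or about 1.6 hours, of downtime per site per year.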
September 2011 Net Service Availability
[Chart: September 2011 Net Service Availability, Retail & Market Operations. Series: Transaction Processing (BH), Transaction Processing (Off BH), TML, MarkeTrak, Retail API, TML Report Explorer, CRR, MIS. Data labels (order as shown in source text): 99.90%, 100%, 100%, 100%, 99.82%, 100%, 99.61%, 99.91%. Y-axis: 92% and below to 100%.]
SLA targets:
• Transaction Processing: Business Hours (BH) 99.9%; Off Business Hours (Off BH) 99%
• Texas Market Link (TML): 99%
• MarkeTrak: 98%
• Retail API: 99%
• TML Report Explorer: 99%
• Congestion Revenue Rights (CRR): 98%
• Market Information System (MIS): 99%
September 2011 Net Service Availability
[Chart: September 2011 Net Service Availability, Grid Operations. Series: MMS Aggregate, MMS SCED, EMS Aggregate, EMS LFC, Outage Scheduler (OS), NMMS. Data labels (order as shown in source text): 100%, 99.99%, 100%, 99.99%, 99.99%, 100%. Y-axis: 92% and below to 100%.]
SLA targets:
• MMS Aggregate: 99%
• MMS SCED: 99.93%
• EMS Aggregate: 99%
• EMS LFC: 99.93%
• Outage Scheduler: 99%
• NMMS: 97%
September 2011 Data Center Power Availability
[Chart: September 2011 Data Center Availability. Sites: Taylor 1, Taylor 2, Austin, Bastrop. Data labels: 100%, 100%, 100%, 100%. Target: 99.982%. Y-axis: 99.92% to 100.0%.]
YTD Availability – Retail Market IT Services
[Chart: Retail Transaction Processing (Off Business Hours), monthly availability, Jan through Sep. Y-axis: 92.00% to 100.0%.]
YTD Availability – Market Operations
[Chart: Retail API, monthly availability, Jan through Sep. Target: 99%. YTD: 99.85%. Y-axis: 92.00% to 100.0%.]
YTD Availability – Grid Operations IT Services
Load Frequency Control Availability
Retail Transaction Processing Availability
Retail Transaction Processing Availability
NMMS Availability
September 2011 Network Model Management System (NMMS) Availability Summary
• 9/22 (3 minutes): Manual NMMS restart due to database lock
• September 2011 NMMS Availability: 99.99%
• Target: 97%
[Chart: NMMS availability. Y-axis: 92.00% to 100.0%.]
[Chart: NMMS unscheduled restarts. Y-axis: Restart Count, 0 to 12.]
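
The reported 99.99% is consistent with a single 3-minute outage in a 30-day month, as the sketch below shows. It assumes availability is measured around the clock and that the 9/22 restart was the only downtime counted. Neither assumption is confirmed by the slides, and the function name is hypothetical.

```python
SEPTEMBER_MINUTES = 30 * 24 * 60  # 43,200; assumes 24x7 measurement over 30 days


def availability_pct(outage_minutes: float, period_minutes: float) -> float:
    """Availability in percent: share of the period not lost to outages."""
    if period_minutes <= 0:
        raise ValueError("period_minutes must be positive")
    return 100 * (1 - outage_minutes / period_minutes)


sept_nmms = availability_pct(3, SEPTEMBER_MINUTES)  # the 9/22 manual restart
print(f"September NMMS availability: {sept_nmms:.2f}%")
print("Meets 97% target:", sept_nmms >= 97)
```

Under these assumptions the result is about 99.993%, which matches the slide's 99.99% at two decimal places.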
TML Report Explorer Availability
Release Management Metrics (Releases)
Awaiting slide
Release Management Metrics (Changes)
Awaiting slide
ERCOT Public Website Metrics
Awaiting slide
ERCOT Public Website Metrics
Awaiting slide