Upload
harvey-roger-johnson
View
221
Download
1
Tags:
Embed Size (px)
Citation preview
AWIPS Product Improvement Architecture Review24 June 2004
2
Overview
• Joint discussions led to an analysis of technology and development of a “to-be” architecture which was briefed to the SET on 2/3/04.
• To-be architecture provides the framework to complete individual product improvement tasks while maintaining a common goal.
X-terminals Redundant LDAD firewallsRouter Upgrade Redundant LDAD serversDS replacement Full DVB deploymentSerial mux upgrade
• To-be architecture facilitates development of a roadmap for schedule, budget and deployment planning.
• Roadmap assists in issue and dependency identification and resource planning for risk reduction.
3
AWIPS To-Be Architecture
• Hardware–Utilize Network Attached Storage (NAS) technology–Deploy commodity servers on GbE LAN– Incrementally deployed and activated–Promote reuse of select hardware–Remove limitations of direct attached storage
• Software–For availability, move from COTS solution to use of
public domain utilities–Some experience with NCF and REP–Can be decoupled from operating system
upgrades–Supports NAS environments–Can be augmented for load balancing if required
–Deploy low cost Linux database engine
4
AWIPS To-Be Architecture
• Methodology–Stage hardware, move software when ready–Reuse hardware to ease transition and allow planned
decommissioning–Deploy flexible availability framework–Deploy Linux database engine
• Goals–Decommission AS, AdvancedServer, DS and FDDI–Support universal deployment of single Linux
distribution (currently RHE3.0 with OB6)–Provide dedicated resource for local applications–Expandable within framework (easy to add servers)
5
Step 1 – Initial Hardware Deployment
• Release OB4 is a prerequisite• Initial staging of hardware
–New rack–NAS w/LTO-2 tape (~400GB storage and backup)–2 commodity servers – DX1/DX2–2 GbE switches and associated cables, etc–2 8-port serial mux replacements (installed in PX1/2)
• NAS key to incremental deployment and activation• Serial mux replacements installed but not activated• LTO-2 drive for site backup• New hardware on GbE LAN• LDAD firewall upgrade deployed independently
6
Firewall
NAS
DS1
CP
DX2
PV
DX1 PX1 PX2
LS1AX
DS2
Mass Storage *
AS1 AS2
Step 1 and OB4
• Data Storage• Site Backup
Serial Mux
FDDI
• Decoders• Radar processing • DialRadar • wfoAPI • RMRserver • Radar Server • Radar Text Decoder• WWA Processes• Database Engine• TextDB read/write• OH cron/apps • Shef decoder• MTA/MHS processes• LDAD server • LDAD routers • listener• Local ldad applications• NWWSProduct• LAMP
• IFP• GFE• Decoders • Satellite • Grib • BinLightning • Maritime • Profiler• Web Server• SafeSeas• CommsRouter• FFMP/SCAN• MSAS
• asyncScheduler• Decoders • Metar • Synoptic • RAMOS• Redbook• RadarStorage• DNS• NIS• SNMP Agent• NWWS Scheduler• NotificationServer
• LAPS• Decoders • BufMosDecoder • WarnDB • StdDB • Collective • Raob • aircraft • ACARS• Bufr Drivers• textNotificationserver
Issues• Will plaintree switch handle nfs traffic from DS/AS to NAS?• Does site have footprint for another rack before any old ones are decommissioned?• What all needs to change to repoint everyone to the NAS?• What is the file structure of the NAS?• Do most decoders use internal storage for temp files?• What kind of NCF monitoring will be available for the NAS?
LX
LX
Benefits• Solves Active/Active PX problem* DS Mass Storage and AutoLoader can be decommisioned
• Netmetrix• Trap Interceptor (ITO)
06/21/2004
Firewall
Step 1 and Release OB4
Red text indicates newly ported or moved processes
7
Step 1 - Incremental Activation
• Activate NAS– Point DS to NAS
– Allows DS1/DS2 to be used as active/active pair using existing MC/ServiceGuard infrastructure
– Decommission DS mass storage and autoloader– Point PX1/PX2 to NAS
– Maintain active/active pair– Deliver new availability mechanism– De-activate cluster management portion of AS2.1– PowerVault (PV) deactivated for near-term
• Cooked data to the NAS, temporary files continue to be written to local disk
• Short (2 week OAT) to verify DS1/DS2 and PX1/PX2 failover and NAS data
• OAT in early October to allow full deployment decision and complete deployment to initial 41 sites by 1 Feb 05
8
Step 1 – Current Action Items
• Coordinate serial mux upgrade hardware delivery to site with DS replacement. This reduces number of hardware installs to be monitored by NCF, but as APS software will not deploy the final cable connections will be delayed until Step 2. (NGIT - L. Dominy)
• Required infrastructure changes include consolidate data from DS mass storage and PX power vault, verify MC/SG on DS1/2, deactivate cluster manager portion of AS2.1 on PX1/2, install new availability software on PX1/2 and deactivate the power vault. (NGIT - L.Dominy)
• Verify only cooked data on NAS (currently all /data/fxa, /data/fxa_local, and /home). Review /data/fxa structure. (SEC - E. Mandel/FSL - D. Davis)
• Review IFP file system to determine which directories go to NAS for data storage (all binaries should be on a server(s) and temporary file directories to stay on PX1/2. (FSL - D. Davis)
9
Step 2- Decommission AS1/AS2
• Release OB4 maintenance release of stable OB5 software required for AS decommissioning
• Activate DX1/DX2– Infrastructure/decoders move to DX1– IFP/GFE to DX2–Newly ported functionality to DX1
• Reuse PX1 and PX2 as PX1 and SX1–PX1 for applications/processes using “cooked data”–SX1 for Web Server, local applications and eventually
LDAD–Activate serial mux replacement and ported APS
• Non-ported software from AS1 to DS2• Hosts remain on RH 7.2 as risk reduction
10
Step 2 and OB4.x/OB5 (w/RH7.2)
NAS
DS1
CP
DX2
PV
DX1 PX1 SX1
LS1AX
DS2
Step 2 and OB4.X/5 (w/RH7.2)
Serial Mux
FDDI
• Radar processing • DialRadar • wfoAPI • RMRserver • RadarServer • RadarText Decoder• WWA Processes• Database Engine• TextDB read/write• OH cron/apps • Shef decoder• MTA/MHS processes• LDAD server • LDAD routers • listener• Local ldad applications• NWWSProduct• LAMP
• IFP• GFE
• Decoders • Metar • Synoptic • RAMOS
• Decoders • BinLightning • Satellite • Grib • Maritime • WarnDB • StdDB • Collective • BufMosDecoder • Raob • aircraft • ACARS • profiler • Redbook• RadarStorage• RadarMsgHandler• HandleGeneric• Bufr Drivers• SafeSeas• CommsRouter• DNS• NIS• NTP• SNMP Agent• SMTP MTA/MHS
Issues:• Will asyncScheduler and NWWS Scheduler be done?• What about LAMP (will it be centralized?)• Start running X.400 and SMTP in parallel.• Should PX and SX be retrofitted with extra disks?
* Serial MUX is an AS Decommissioning dependency** Requires prototype testing
• Web Server• routerStoreNetcdf **
• FFMP/SCAN • LAPS• MSAS• text notficationserver• asyncScheduler• NWWS Scheduler• notification server
Benefits• Could stop maintenance on ASs• Could eliminate 1 rack of equipment• Could decommission FDDI at this point if 100MB boards were cheaper than maintenance.• Start migtration of MHS
LX
LX
• Netmetrix• Trap Interceptor (ITO)
06/21/2004
FirewallFirewall
• Data Storage• Site Backup
*
Red text indicates newly ported processes, green text for moved processes.
11
Step 2- Incremental Activation
• Failover scheme–DX1 to DX2–DX2 to DX1–PX1 applications (less LAPS which does not failover)
and servers to DX2–PX1 processes APS and NWWSProduct to SX1
(require serial mux access)–SX1 baseline software to PX1
• Decommission AS1/AS2 and excess rack• Linux SMTP MTA deployed as start of migration
from x.400 (required for DS decommission)
12
Step 2 - Current Action Items
• Ensure OB5 tasking with priority check-in for Linux AsyncProductScheduler and NWWS Scheduler. Does NWWS Product need to ported at same time? (SEC - E. Mandel)
• Ensure OB5 tasking with priority check-in for Linux LAPS, notificationServer, Redbook and radarStorage. If prototype testing is successful check-in Linux routerStoreNetcdf. (FSL - D. Davis)
• Activate serial mux upgrade and move/connect cables. (NGIT - L. Dominy)
• Requires movement of processes between servers, review architecture roadmap and verify no dependencies have been overlooked. (SEC, FSL, MDL, OH and NGIT)
13
Step 3 – Deploy Data Base and OS
• Deploy Linux POSTGES data base engine to DX1–Move existing PV to new rack and connect to DX1/2–Reconfigure PV (possibly into 2 separate direct
attached disk farms, one for each DX)–Database availability via mirrored or replicated
databases is TBD at this time• Migrate ported databases• Upgrade operating system (currently RHE3.0) on all
applicable hosts LX/XT CP (if full DVB deployment complete)DX AX (Linux Informix migration?)PX/SX RP?
14
Step 3 and OB6 (RHE3.0)
NAS
DS1
CP
DX2
PV
DX1 PX1 SX1
LS1AX
DS2
Step 3 and OB6 (w/RHE3.0)
Serial Mux
FDDI
• IFP• GFE
Issues:• When will apps be ready to use POSTGRES?• Data reliability with POSTGRES• Will there be issues with running Informix and POSTGRES in parallel?• What is the performance of POSTGRES?
PV
• FFMP/SCAN • LAPS• MSAS• text notfications• asyncScheduler• NWWS Scheduler• notification server
• Web Server• RouterStoreNetcdf• ldadMon
Benefits• Start migtration of Databases
• Database
LX
LX
FirewallFirewall
6/21/2004
• Database Engine• OH cron/apps• Shef decoder• TextDB read/write• Decoders • BinLightning • Satellite • Grib • Maritime • WarnDB • StdDB • Collective • BufMosDecoder • Raob • aircraft • ACARS • profiler • Redbook• RadarStorage• RadarMsgHandler• HandleGeneric• Bufr Drivers• SafeSeas• CommsRouter• DNS• NIS• NTP• SNMP Agent• SMTP MTA/MHS• MHS processes
• Radar processing • DialRadar • wfoAPI • RMRserver • RadarServer • RadarText Decoder• WWA Processes• Database Engine• LDAD server • LDAD routers • listener• Local ldad applications• NWWSProduct• LAMP
• Decoders • Metar • Synoptic • RAMOS• Netmetrix• Trap Interceptor (ITO)
• Data Storage• Site Backup
Red text indicates newly ported processes.
15
Step 3 – Incremental Activation
• Decommission AS 2.1 and HP Informix• HP Informix engine can remain on DS for site use
(though if all databases are ported not sure why?)• Transition to SMTP and decommission x.400
16
Step 3 - Action Items/Questions
• Verify if RHE3.0 to be procured for all hosts in OB6 timeframe. (OST – Ed Mandel)
• Can/should POSTGRES be delivered early (OB5.x) to RH7.2 DXs? –Most sites run software against GFS databases, not
their own databases.–Database and software will be tested with RHE3 only
as part of OB6.– If databases delivered early how/when does parallel
ingest get developed and tested?–Should RFCs be handled differently?
• Determine plans for RFC archive server’s use of Informix.
• Do the RP servers get RHE3.0 with OB6?
17
Step 4 – Deploy LDAD Upgrade
• Deploy LS1 and LS2–Deploy redundant server pair (requirements still tbd)–Could reuse PX1/SX1 as LS1/LS2 and use new
generation hardware for PX1/SX1• Activate LS1/LS2
–Migrate internal LDAD processing to SX1–Some internal and external LDAD processing must
transition at same time• Reuse existing HP LS on internal LAN
–Existing 10/100 MB LAN card
18
Step 4 and OBx
NAS
DS1
CP
DX2
PV
DX1 PX1 SX1
LSx1
• LDAD Server applications
AX
DS2
Step 4 and OBx
Serial Mux
FDDI
• IFP• GFE
Issues:• Transition from LS to LSx needs to consider how/when local ldad apps will be done.• Do we run old LS and new LSx in parallel while sites convert their local software?• Int/Ext Ldad software must be ready to transition at the same time.• Could reuse PX1 and SX1 for LSx1/2 and buy higher end machines for PX1 and SX1.
PV
• FFMP/SCAN • LAPS• MSAS• text notfications• asyncScheduler• NWWS Scheduler• notification server
• Web Server• ldadMon• LDAD server • LDAD routers • listener• Local ldad applications
LSx1
Benefits• Move old LS to internal LAN
• Database
LX
LX
LS1
FirewallFirewall
6/21/2004
• Decoders • Metar • Synoptic • RAMOS• Netmetrix• Trap Interceptor (ITO)
• Radar processing • DialRadar • wfoAPI • RMRserver • RadarServer • RadarText Decoder• WWA Processes• NWWSProduct• LAMP
• Data Storage• Site Backup
• Database Engine• OH cron/apps• Shef decoder• TextDB read/write• Decoders • BinLightning • Satellite • Grib • Maritime • WarnDB • StdDB • Collective • BufMosDecoder • Raob • aircraft • ACARS • profiler • Redbook• RadarStorage• RadarMsgHandler• HandleGeneric• Bufr Drivers• SafeSeas• CommsRouter• DNS• NIS• NTP• SNMP Agent• SMTP MTA/MHS• MHS processesRed text indicates newly ported processes.
19
Step 5 – Decommission DS1/DS2
• Continue to move ported software/databases to Linux servers–Combine with step 6 if additional DXs are required–DX1 for database server and infrastructure–DX2 for IFP/GFE–DXn for decoders
• All Linux devices on GbE LAN• Remaining functionality on LS on 100MB LAN
–DialRadar/wfoAPI (tied to FAA/DoD requirements) may be OBE at this point. If so, Simpacts and LS can be decommissioned at all non-hub sites.
–Netmetrix (required at hub sites only)
20
Step 5 and OBx
NAS CP
DX2
PV
DX1PX1 SX1
LSx1
• LDAD Server applications
AX
Step 5 and OBx
• Data Storage
Serial Mux
• Radar processing • wfoAPI • RMRserver • RadarText • RadarServer • RadarStorage • RadarMsgHandler• WWA Processes• NWWSProduct
• IFP• GFE• Trap Interceptor (ITO)
Issues:• Dialup and wfoAPI may still be needed on an HP-UX machine• Netmetrix at HUB sites needs to be on HP-UX
PV• FFMP/SCAN • LAPS• MSAS• text notfications• asyncScheduler• NWWS Scheduler• notification server• LAMP
• Web Server• ldadMon• LDAD server • LDAD routers • listener• Local ldad applications
LSx1
• Database• DialRadar• Netmetrix
LX
LX
LS1
Site Backup Device
Benefits• FDDI and DS decommissioned.• Reuse of LS for SW that isn't ported yet. (non-redundant)• Rack consolidation
DX3(optional)
Not upgraded• Printers• Xyplex• Modems• VIRs• Simpacts
FirewallFirewall
6/21/2004
• Database Engine• OH cron/apps• Shef decoder• TextDB read/write• Decoders • BinLightning • Satellite • Grib • Maritime • WarnDB • StdDB • Collective • BufMosDecoder • Raob • aircraft • ACARS • profiler • Redbook • Metar • Synoptic • Redbook • RAMOS• HandleGeneric• Bufr Drivers• SafeSeas• CommsRouter• DNS• NIS• NTP• SNMP Agent• SMTP MTA/MHS• MHS processes
Red text indicates newly ported processes.
21
Step 6 – Process Loading
• Incrementally add DX hosts for load balancing for new functionality and data sets.–DX1 for database server and infrastructure–DX2 for IFP/GFE–DXn for decoders
22
Step 6 and OBx
NAS CP
DX2
PV
DX1PX1 SX1
LSx1
• LDAD Server applications
AX
Step 6 and OBx
• Data Storage
Serial Mux
• Radar processing • wfoAPI • RMRserver • RadarText • RadarServer • RadarStorage• WWA Processes• TextDB read/write
• IFP• GFE• Trap Interceptor (ITO)
PV• FFMP/SCAN • LAPS• MSAS• text notfications• asyncScheduler• NWWS Scheduler• notification server• LAMP
• Web Server• ldadMon• LDAD server • LDAD routers • listener• Local ldad applications
LSx1
• Database• DialRadar• Netmetrix
LX
LX
LS1
Site Backup Device
DX3(optional)
Not upgraded• Printers• Xyplex• Modems• VIRs• Simpacts
FirewallFirewall
6/21/2004
• Database Engine• OH cron/apps• Shef decoder• DNS• NIS• NTP• SNMP Agent• SMTP MTA/MHS• MHS processes
• Decoders • BinLightning • Satellite • Grib • Maritime • WarnDB • StdDB • Collective • BufMosDecoder • Raob • aircraft • ACARS • profiler • Metar • Synoptic • Redbook • RAMOS• Bufr Drivers• SafeSeas• CommsRouter
Green text indicates moved processes.
23
High Level Schedule
• Key dates –notificationServer port and enhancements–Step 1 OAT (must complete before initial deployment)–Step 2 OB4.x OAT and Full Deployment–OB6 check-in
Task
OB5 Check-in
notif icationServer
Step 1-OAT
Step 1-Initial Deploy
OB5 Beta
Step 2-OB4.x OAT **Accelerate if possible
Step 2-OB4.x-Deploy
Step 1- Full Deploy
OB5 Full Deployment
Step 3-DB Availability
Step 3- RHE3 procure
OB6 code check-in
OB6 Beta
Step 3-Early OB5.x
OB6 Full Deploy
Jul-04 Aug-04 Sep-04 Oct-04 Nov-04 Dec-04 Jan-05 Feb-05 Jul-05 Aug-05 Sep-05Mar-05 Apr-05 May-05 Jun-05