
ARMADA (Army RDT&E Meteorological Architecture for Data Archival)

4DWX Forecasters Training
27 February 2008

Outline

– ARMADA at the Range
– Break
– ARMADA ETL Demonstration
– ARMADA Feedback Session

The significant problems we have cannot be solved at the same level of thinking with which we created them. - Albert Einstein

http://www.quotationspage.com/quote/23588.html

Data Warehouse1

A data warehouse is an application with a computer database that collects, integrates and stores an organization's data with the aim of producing accurate and timely management of information and supporting data analysis. The practice of data warehousing includes the storage of virtually all transactional data, master data (customer, material), and metadata2 at a very detailed level.

1 From Wikipedia http://en.wikipedia.org/wiki/Data_warehouse
2 Metadata: data about the data

Data Archival Systems: Paradigms

[Figure: data volume versus time for successive archival paradigms: Human (1800), Strip Charts (1950), Digital (ASCII) (1987), Database (MySQL) (2002), ARMADA1 (2007)]

1 Data Warehouse

Meteorological Observing Platforms at Dugway Proving Ground

                         Spatial     Spatial                Observing    Observation
Platform                 Horizontal  Vertical   Frequency   Parameters   Records
(Observing Unit)         (Units)     (Levels)   (Seconds)   (Count)      Per Day
Rawinsonde               5           >1000      N/A         8            N/A
Towers (32M)             6           5          10          7            259,200
Wind Profiler            2           30         1800        3            2,880
Present Weather          8           1          10          8            23,040
SAMS1 (Mesonet)          26          1          300         30           7,488
Tethersonde              2           5          10          6            86,400
Sonics (3D)              50          1          0.1         4            43,200,000
SODAR                    3           40         900         2            11,520
PWIDS2                   113         1          10          7            976,320
Electric Field Meters    28          1          1           4            2,419,200
Ceilometers              3           20         15          1            345,600

Row colors on the slide:
Yellow – Continuous permanent
Orange – Continuous mobile
Light Blue – Manual mobile

1 SAMS – Surface Atmospheric Measuring System
2 PWIDS – Portable Weather Instrumentation Data System
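The daily record counts in the platform table appear to follow a simple product: horizontal units times vertical levels times the number of reporting intervals in a day. The sketch below is a worked check of that arithmetic, not part of ARMADA; the function name `records_per_day` is hypothetical, and the relation itself is inferred from the tabulated numbers rather than stated on the slide.

```python
SECONDS_PER_DAY = 86_400

def records_per_day(units: int, levels: int, frequency_s: float) -> int:
    """Records per day = units x levels x (seconds per day / reporting frequency)."""
    return round(units * levels * SECONDS_PER_DAY / frequency_s)

# (horizontal units, vertical levels, frequency in seconds) taken from the table
platforms = {
    "Towers (32M)": (6, 5, 10),
    "Wind Profiler": (2, 30, 1800),
    "SAMS (Mesonet)": (26, 1, 300),
    "Sonics (3D)": (50, 1, 0.1),
    "PWIDS": (113, 1, 10),
}

for name, (units, levels, freq) in platforms.items():
    print(f"{name}: {records_per_day(units, levels, freq):,}")
```

The number of observing parameters does not enter the count, because each record carries all parameters for one unit, level, and time.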

Pre-ARMADA Data Collection

Data collection was developed around the observing platform
– Databases: Microsoft Access, Microsoft SQL Server, and MySQL AB
– Text files: delimited, fixed column, and terse ASCII
– Other media: magnetic tape, CDs, and paper
– Examples of method and media used for archiving
  • SAMS – MySQL database on RAID1
  • Sonics – delimited or terse ASCII on CDs
  • PWIDS – delimited ASCII on CDs
  • Rawinsonde – text files on personal desktop

Metadata
– Typically archived in spreadsheets
– Metadata often archived on different computers

1 RAID – Redundant Arrays of Independent Drives

Current SAMS Range Database

Problem Areas        Problem                    Solution
Data Ingesters       SAMS only!                 Create an application to extract, transform, and load multiple data sets
Metadata             Not archived               Archive ALL metadata
Range Data           Not included in SAMS DB    Archive ALL range data
Quality Assurance    None or minimal!           Create a quality assurance program

[Figure labels. Range data: Rawinsonde, SODARs, PWIDS, Tower. Metadata: Sensors, Calibration, Site History, Location, Sensor Location]

Current Requirements

Any meteorological data and/or metadata collected must be:
– Accessible
– Available
– Usable

Data integration (modeling)

Standardization
– Labels, units, and time

Centralizing the Data

Goal of ARMADA is to centrally archive ALL data and metadata
– Relational database management system (RDBMS)
– Data warehouse

Controlled by a MySQL AB database server
– Single point input and output access
– Applying proper information assurance procedures (Army Regulation 25-2) can reduce single point failures

Standardization of labels and units is applied to all components in the archive
– Database and table names follow a predefined naming convention
– Column names follow Climate Forecast (CF) standard names and associated units (typically SI)

ARMADA System Concept

[Figure: data sources (Sounding, SODAR, SAMS, PWIDS, Tower, Other/Historical Data) and the four numbered system components: (1) Archive (Databases), (2) Data Ingest, (3) Data Output, (4) Quality Assurance. 4DWX also appears in the diagram.]

ARMADA (Army RDT&E Meteorological Architecture for Data Archival)

[Figure labels: Field Data; Serial Ports, TCP/IP, Text Files; ETL1 Applications; ARCHIVE; Quality Assurance; Web Portal (Metadata); End User Applications: Web, GUIs, CLI2, 4DWX3]

1 Extract, Transform, Load
2 Command Line Interface
3 Army RDT&E Four-Dimensional Weather

Standardization of Archive (1)

Self Describing
– Standard naming conventions
  • Database/Table/Column
  • Column/Variable (Climate Forecast Compliant)
    http://cf-pcmdi.llnl.gov/documents/cf-standard-names/2/cf-standard-name-table.html
– International System of Units (SI)
  http://physics.nist.gov/cuu/Units/units.html

Applies to all archive components

Archive Databases (1)

[Figure: archive databases. Platforms: SAMS1, PWIDS1, Tower1. META: Sensor (Fielded, Non-Fielded), Calibration]

1 Note: the site metadata history table resides in the platform database

Example of a PWIDS Database

Callouts on the slide: Archive Table, Real Time Archive Table, Site History Table

+-------------------------+
| Tables_in_pwids         |
+-------------------------+
| pwids_2007_02           |
| pwids_2007_03           |
| pwids_2007_04           |
| pwids_2007_05           |
| pwids_month             |
| pwids_site-metadata     |
+-------------------------+
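The table listing suggests that archive tables are named by platform, year, and month (for example `pwids_2007_02`). A minimal sketch of deriving such a name from an observation time is shown here; the helper `monthly_table` is hypothetical, and the pattern is an assumption read off the listing, not a documented ARMADA rule.

```python
from datetime import datetime

def monthly_table(platform: str, when: datetime) -> str:
    """Build an archive table name of the assumed form <platform>_<YYYY>_<MM>."""
    return f"{platform}_{when:%Y_%m}"

print(monthly_table("pwids", datetime(2007, 2, 14, 12, 0)))
```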

Example of a PWIDS Table

Field                      Type        Purpose
date_time                  datetime    Date time (UTC) of observation
unit_id                    char(10)    Is a unique unit id of a platform (i.e., SAMS)
site_id                    char(10)    The id of the site location and is subject to change
wind_from_direction        double      Raw data fields that utilize SI units
wind_speed                 double        (Note: data types are not restricted to doubles)
eastward_wind              double
northward_wind             double
wind_from_direction_qc     tinyint     Raw data quality control flag
wind_speed_qc              tinyint       (Note: flag notation has yet to be determined)
eastward_wind_qc           tinyint
northward_wind_qc          tinyint
last_update_date_time      timestamp   A time stamp (UTC) record of insertion or modification

Note: temperature, moisture, and pressure have been left off
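The table stores both the polar form of the wind (`wind_speed`, `wind_from_direction`) and its components (`eastward_wind`, `northward_wind`). The sketch below shows the usual meteorological conversion between the two, assuming the direction is given in degrees clockwise from north and is the direction the wind blows from. Whether ARMADA computes the components this way or archives them as reported is not stated in the slides; the function name is hypothetical.

```python
import math

def wind_components(wind_speed: float, wind_from_direction: float) -> tuple[float, float]:
    """Return (eastward_wind, northward_wind) in the same units as wind_speed.

    The minus signs appear because the direction is where the wind comes FROM,
    while the components describe where the air is moving TO.
    """
    theta = math.radians(wind_from_direction)
    eastward = -wind_speed * math.sin(theta)
    northward = -wind_speed * math.cos(theta)
    return eastward, northward

# A 10 m/s wind from the west (270 degrees) blows toward the east.
u, v = wind_components(10.0, 270.0)
print(round(u, 6), round(v, 6))
```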

[Figure: map of the WSMR boundaries showing Unit ID = 12 at three different sites (Site ID = X, Site ID = Y, Site ID = Z), with the dates 2/12/2006, 4/15/2007, and 2/2/2008]

Example of a PWIDS Site Meta Table

Field                 Type        Purpose
unique_site_id        int         A unique number that guarantees no duplicate records (auto increment)
unit_id               char(10)    Is a unique unit id of a platform (i.e., SAMS)
site_id               char(10)    The id of the site location and is subject to change
date_time_start       datetime    Date time start for validity of record
date_time_stop        datetime    Date time end for validity of record
longitude             double      Geodetic longitude (Note: western hemisphere is negative)
latitude              double      Geodetic latitude (Note: northern hemisphere is positive)
surface_altitude      double      Altitude at the surface above mean sea level
above_ground_level    double      Arbitrary altitude of sensors above ground level
notes                 tinytext    Optional notes about the site location
site_name             char(50)    Optional name for the site location
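Because a unit keeps its `unit_id` while its `site_id` changes, the site for any observation has to be resolved through the validity window (`date_time_start`, `date_time_stop`) of the site metadata records. The following sketch illustrates that lookup in plain Python. The record class, the helper `site_for`, the treatment of a missing stop time as "still valid", and the assignment of sites X, Y, Z to particular date ranges are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class SiteRecord:
    unit_id: str
    site_id: str
    date_time_start: datetime
    date_time_stop: Optional[datetime]  # None means the record is still valid (assumption)

def site_for(records: list[SiteRecord], unit_id: str, when: datetime) -> Optional[str]:
    """Return the site_id whose validity window contains `when`, or None."""
    for rec in records:
        if rec.unit_id != unit_id:
            continue
        started = rec.date_time_start <= when
        not_ended = rec.date_time_stop is None or when < rec.date_time_stop
        if started and not_ended:
            return rec.site_id
    return None

# Illustrative history for unit 12; the date-to-site pairing is hypothetical.
HISTORY = [
    SiteRecord("12", "X", datetime(2006, 2, 12), datetime(2007, 4, 15)),
    SiteRecord("12", "Y", datetime(2007, 4, 15), datetime(2008, 2, 2)),
    SiteRecord("12", "Z", datetime(2008, 2, 2), None),
]

print(site_for(HISTORY, "12", datetime(2007, 6, 1)))
```

In the database itself the same idea would be a join between an archive table and the site metadata table on `unit_id` with a range condition on `date_time`.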

Metadata

[Figure: metadata (Sensor, Sensor Location, Calibration, Site Location) reached by Users through a Web Interface]

Options
1. Web portal and archive at DPG
2. Web portal and archive at each range

MEMORIZE

Some table names contain a "-" (dash), which MySQL treats as a special character. To escape it, enclose the whole name in back ticks "`" (the key to the left of the number "1" key). Note that back ticks can be used around any name that contains special characters.

Example:
select * from `pwids_site-metadata`;
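When queries are built in a script rather than typed by hand, the same back tick rule can be applied by a small helper. This is a minimal sketch; `quote_identifier` is a hypothetical name, not an ARMADA or MySQL function. It relies on the MySQL rule that a back tick inside a quoted identifier is written as two back ticks.

```python
def quote_identifier(name: str) -> str:
    """Wrap a MySQL identifier in back ticks, doubling any embedded back tick."""
    return "`" + name.replace("`", "``") + "`"

query = f"SELECT * FROM {quote_identifier('pwids_site-metadata')};"
print(query)
```

Quoting every table name this way is harmless for names without special characters, so a script does not need to decide case by case.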

Ingest Applications (2)

[Screenshot of the ingest application with the following callouts]
– Configurable title
– Turn on/off process buttons
– Enlarged visual indicator of the on/off process buttons
– Last data record ingested: site id and date + time
– Button to manually force an ingest process (runs only once)
– Turn off/on auto update (check box is on)
– Current system clock date/time
– Background colors in last data updated
  • Yellow – idle
  • Green – processing
  • Red – auto update off
  • Orange – an output process is off
  • Magenta – no new data

Configuration File

Provides instructions to File Ingester to:
1. Extract: location and architecture of data file
2. Transform: converts raw data to ARMADA architecture
3. Load: writes or populates the data into ARMADA

• Coded in XML (eXtensible Markup Language)
  – Readable and/or self describing
  – Rigid

Configuration editor is in the plans for FY08

Configuration file

Broken down into two groups
– Global parameters
  • Affect all data
  • Apply to:
    – Database connections
    – Base directory
    – Time of data processing
– Site or Unit parameters
  • Affect individual observations
  • Apply to:
    – Interpolation
    – Calculations
    – Upload location within ARMADA

Example

<?xml version="1.0" ?>
<FILEINGESTER>
  <BASEDIRECTORY Dir="c:/fm" />
  <DISPLAYTITLES Title="FIELD MILL" SiteTitle="FM ID" DateTitle="COOL" TimeTitle="TIME UTC"/>
  <TIMERGROUP TimeDataIO="60" TimeProcess="5" TimeAlarm="60" TimeStationUpdate="45"/>
  <DATABASECONNECT Source="140.196.88.15" Username="remote" Password="remote" Database="field-mills" Port="3307"/>
  <STARTPROCESS Auto="true" Database="true" Output="false" />
  <GLOBALSITEINFO SiteTable="`field-mills_site-metadata`" SingleOBFileCount="28" MultiOBFileCount="0">
    <SINGLEOBFILE FileName="c:/fm_data/EFM001_Tab1sec" StringSplitter="," EndOfLine="\\n"
                  CharsAllowed="-0123456789.:," AppendFilePrefix="DCP0Z_" AppendFileDate="day"
                  OverWrite="True" RemoveFile="True" SkipHeader="4">
      <SITEID SiteID="1" Unit_ID="1" Site_ID="1" ColumnCount="13" VariableCount="6" EquationCount="0"
              SpecialEquations="0" TableName="`field-mills_month_test`" AppendFilePrefix="EFM001_Tab1sec"
              AppendFileDate="none" OverWrite="True" RemoveFile="True">
        <DATETIME MySQLDT="-1" Year4="0" Year2="-1" Month="1" DayOfYear="-1" DayOfMonth="2"
                  HourMinute="-1" Hours="3" Minutes="4" Seconds="5" UTCTimeOffSet="0" SystemTime="false" />
        <INSTANCE>
          <COLUMNVARIABLE ColumnNumber="7" ColumnName="surface_electric_field" Type="Constant" Unit="NONE" />
          <COLUMNVARIABLE ColumnNumber="8" ColumnName="status" Type="Constant" Unit="NONE" />
          <COLUMNVARIABLE ColumnNumber="9" ColumnName="leakage_current" Type="Constant" Unit="NONE" />
          <COLUMNVARIABLE ColumnNumber="10" ColumnName="panel_temperature" Type="Temperature" Unit="C" />
          <COLUMNVARIABLE ColumnNumber="11" ColumnName="battery_voltage" Type="Constant" Unit="NONE" />
          <COLUMNVARIABLE ColumnNumber="12" ColumnName="internal_relative_humidity" Type="Moisture" Unit="%" />
        </INSTANCE>
      </SITEID>
    </SINGLEOBFILE>
    <!-- excerpt: the remaining SINGLEOBFILE entries are not shown on the slide -->
  </GLOBALSITEINFO>
</FILEINGESTER>
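To make the extract and transform steps concrete, the sketch below reads `COLUMNVARIABLE` entries like those in the example and applies them to one delimited data line. It is not the File Ingester's code: the function names, the sample data line, and the assumption that `ColumnNumber` is a zero-based index (suggested by `Year4="0"` and `ColumnCount="13"` with columns up to 12) are all illustrative.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the INSTANCE block from the configuration example.
SNIPPET = """
<INSTANCE>
  <COLUMNVARIABLE ColumnNumber="7" ColumnName="surface_electric_field" Type="Constant" Unit="NONE" />
  <COLUMNVARIABLE ColumnNumber="10" ColumnName="panel_temperature" Type="Temperature" Unit="C" />
  <COLUMNVARIABLE ColumnNumber="12" ColumnName="internal_relative_humidity" Type="Moisture" Unit="%" />
</INSTANCE>
"""

def column_map(xml_text: str) -> dict[int, tuple[str, str]]:
    """Map column index -> (column name, unit) from COLUMNVARIABLE elements."""
    root = ET.fromstring(xml_text)
    return {
        int(el.get("ColumnNumber")): (el.get("ColumnName"), el.get("Unit"))
        for el in root.iter("COLUMNVARIABLE")
    }

def transform(line: str, mapping: dict[int, tuple[str, str]], splitter: str = ",") -> dict[str, float]:
    """Split one raw record and pick out the configured columns by index."""
    fields = line.rstrip("\n").split(splitter)
    return {name: float(fields[col]) for col, (name, _unit) in mapping.items()}

# Hypothetical 13-column record: year, month, day, hour, minute, second, then data.
sample = "2008,2,27,12,0,0,0,1.5,0,0.1,21.3,12.6,35\n"
print(transform(sample, column_map(SNIPPET)))
```

The load step would then insert the resulting name/value pairs into the table named by `TableName`.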

Output (3)

Commercial or open source applications
– MySQL Query Browser
– Microsoft Excel

Custom applications
– Excel (Macros)
  • Time Series
  • Climatologies
– SAMS Report
– HPAC GUI Data Getter
– PWIDS Display
– Field Meter Display

PWIDS Display (Portable Weather Instrumentation Data System)

2D Electric Field (V/m) Contour Plot

Quality Assurance (4)

[Figure labels: Installation, Measurements, Communication, Data Collection, ARMADA, Quality Control]

ARMADA QC/QA

• Quality Control (Current)
  – Some data tests
    • PWIDS
    • Sonics
    • Wind Profiler (NIMA)
  – Data are only QC'ed for customer requested data
  – Not archived in QC format

• Quality Assurance (Goal)
  1. QA applied to all data
  2. Create an archive environment to support QA (ARMADA)
  3. Develop a QA program to
     a. Automate process
     b. Manual inspection
     c. Process in near real time
     d. Rigorous follow-on tests

Comparison

Current                            Goal
Single Application Per Data Set    Single Application For Multiple Data Sets
Post Analysis                      Real Time
Higher Labor Costs1                Lower Labor Costs1
Manual                             Automated
Lower Confidence1,2                Higher Confidence1,2
Range Test + Visual                Multiple Test + Visual

1 Per datum
2 End user

ARMADA QA Flow

[Figure labels: Raw; Temporary Data Storage; Quality Control Server; QA Flags; Pass (Pass All Tests); Fail (Fail ≥ 1 Tests); GOLD Standard; Do Nothing]

QCS: Developed in Python

QA Flag

The net results of QA tests!
• Every variable in ARMADA is assigned a QA flag
• Results of every test are archived
  1. All tests are packed into a type
  2. A type can be archived as a single element in the database
  3. Initially only pass/fail results will be archived
  4. Possible to archive more test result information
• QA Flag is composed of 0's and 1's that describe the results of the tests

MySQL Types

Type          Bit Representation                       Bytes   Bits     Possible Integers
Bit           1                                        1/8     0 or 1   2
Nibble1       0101                                     1/2     4        16
Byte          01010101                                 1       8        256
Tiny INT      01010101                                 1       8        256
Small INT     01010101 01010101                        2       16       65,536
Medium INT    01010101 01010101 01010101               3       24       1.68*10^7
INT           01010101 01010101 01010101 01010101      4       32       4.29*10^9
Big INT       01010101 01010101 01010101 01010101      8       64       1.84*10^19
              01010101 01010101 01010101 01010101

1 Not a MySQL type
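The "possible integers" for each type are simply 2 raised to the number of bits, which also bounds how many pass/fail results one flag column can hold (one bit per test). A short check of the arithmetic follows; the dictionary and function names are illustrative only. Note that 2^64 is about 1.84 x 10^19.

```python
# Storage width in bits of the MySQL integer types listed above.
MYSQL_INT_BITS = {"TINYINT": 8, "SMALLINT": 16, "MEDIUMINT": 24, "INT": 32, "BIGINT": 64}

def distinct_values(bits: int) -> int:
    """Number of distinct bit patterns that fit in `bits` bits."""
    return 2 ** bits

for type_name, bits in MYSQL_INT_BITS.items():
    print(f"{type_name}: {bits} pass/fail tests, {distinct_values(bits):.3g} patterns")
```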

Packing Results via Bits

Increasing info per test: 2-Base (Bit), 4-Base, 8-Base (Oct), 16-Base (Hex), … N-Base

2-Base (Bit)    4-Base    8-Base (Oct)    16-Base (Hex)
P 0             N 00      0 000           0 0000
F 1             P 01      1 001           1 0001
                A 10      2 010           2 0010
                B 11      3 011           3 0011
                          4 100           4 0100
                          5 101           5 0101
                          6 110           6 0110
                          7 111           7 0111
                                          8 1000
                                          9 1001
                                          A 1010
                                          B 1011
                                          C 1100
                                          D 1101
                                          E 1110
                                          F 1111

Validity Test: P = Pass, F = Fail (2-Base); N = Not Tested, P = Pass, A = Above, B = Below (4-Base)

Human/DB/Computer (RELATIONSHIP): Validity Check

Human         MySQL (Integer)    Computer
Not Tested    0                  00
Pass          1                  01
Above         2                  10
Below         3                  11
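The validity check codes above use two bits per test, so several test results can share one integer column. The sketch below packs and unpacks such two-bit results. The code values come from the slide; the class and function names, and the choice to store the first test in the lowest two bits, are assumptions for illustration.

```python
from enum import IntEnum

class Validity(IntEnum):
    NOT_TESTED = 0  # 00
    PASS = 1        # 01
    ABOVE = 2       # 10
    BELOW = 3       # 11

def pack(results: list[Validity]) -> int:
    """Pack two-bit results into one integer, first result in the lowest bits."""
    flag = 0
    for i, result in enumerate(results):
        flag |= int(result) << (2 * i)
    return flag

def unpack(flag: int, count: int) -> list[Validity]:
    """Recover `count` two-bit results from a packed integer."""
    return [Validity((flag >> (2 * i)) & 0b11) for i in range(count)]

results = [Validity.PASS, Validity.ABOVE, Validity.BELOW, Validity.NOT_TESTED]
flag = pack(results)
print(flag, format(flag, "08b"), unpack(flag, 4))
```

Four two-bit results fit in a single tinyint, which is the trade-off the slide describes: more information per test means fewer tests per flag.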

QC Approach

[Figure labels: Raw; Quality Control Server; Config File (XML); QA Flags; Tests1: Manual, Validity, Persistence, Buddy]

Flag bits: 0 1 1 1 0 0 1 1
Temperature_QC = int (Flag)

1 All tests are performed
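The slide's `Temperature_QC = int (Flag)` step can be illustrated in a few lines: the string of test bits is read as a binary number and stored as an integer, and individual results are recovered by testing bits. The slide does not say which bit belongs to which test, so the ordering in `TESTS` and the convention that a set bit means "fail" (following the "F 1" coding) are assumptions.

```python
# Hypothetical bit assignment: bit 0 (least significant) is the first test listed.
TESTS = ("manual", "validity", "persistence", "buddy")

def flag_from_bits(bits: str) -> int:
    """Convert a string of 0/1 characters (spaces allowed) to the stored integer."""
    return int(bits.replace(" ", ""), 2)

def failed_tests(flag: int, tests: tuple[str, ...] = TESTS) -> list[str]:
    """List the tests whose bit is set, assuming 1 = fail."""
    return [name for i, name in enumerate(tests) if (flag >> i) & 1]

temperature_qc = flag_from_bits("0 1 1 1 0 0 1 1")
print(temperature_qc, failed_tests(temperature_qc))
```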

ARMADA Implementation

• Pass/Fail
• A few tests on SAMS
  – Manual
  – Validity
  – Persistence
  – Step test
  – Buddy Check

Will take years to implement throughout ARMADA!

ARMADA SUMMARY

• All data and metadata will be archived in a central location
• Repository will be standardized
  – Naming conventions
  – SI Units
• Quality Assure all data
• 4DWX access

Questions?

ARMADA at the RANGE

SAMS DB Migration to ARMADA

• SAMS Ingest will be replaced by File Ingester
  – Allow non-fixed number of elements in SAMS data stream
  – Interpret numbers and strings
  – Capable of archiving additional derived variables (see hand out)
• SAMS DB will conform to ARMADA standardization
• SAMS DB will be reconfigured around range needs
• SAMS DB will be rebuilt during range installation
• SAMS DB and ARMADA will run simultaneously indefinitely (suggested 3-6 months)

ARMADA Installation

Software
1. Fileingester
   • SAMS
   • Rawinsonde
2. MySQL 5.x
3. Range QC
4. Configurator?

Setup
1. Software install + database setup
2. SAMS + possible other data sets
3. Training
4. Upload old SAMS DB to ARMADA

• One week installation (4-5 days)
• Starting FY08, except CRTC August 07
• Run old SAMS DB simultaneously for 3-6 months

Hardware

[Figure: hardware configurations over time (FY 08, FY 09-11, FY 11-13) in three tiers: Acceptable, Recommended, Optimal. Labels: Current Hardware; Current Hardware (Data Ingest); Data Ingest; Quality Assurance; Database + Output (Web); Database; Output (Web); Storage (4TB); Storage (8TB); Desktop; Server; RAID]

Backups/Redundancy

• All raw data will be archived1
• Ingest software will be able to
  – Write raw data to local files
  – Read raw data files and populate into ARMADA
• Database Tables
  – Updating tables daily or weekly
  – Archive tables once they have been updated
  – Will be automatic
• Master archive for all ranges located at DPG
  – Need to get Port 3306 open through the firewall!

1 At a minimum raw data should be archived on long term transferable media such as CD or DVD

Information Assurance

• Adhere to Army Regulation 25-2 "Information Assurance" and comply with local DIACAP (DoD Information Assurance Certification and Accreditation Process) requirements
• Tighten up access to ARMADA
  – No more global IPs
  – No remote root access1
• Metadata web portal will be password protected
• Sensitive data can be archived in its own database, and is not limited to the master archive

1 Not possible on PC based systems

Maintenance/Administrative Responsibilities 0-2 Years1

Range
1. Manage raw data archives
2. Load archived raw data
3. Assist in backup and recovery
4. Maintain metadata
5. Monitor data flow

DPG
1. Primary MySQL admin
2. Backup database tables
3. Master repository
4. ARMADA development
5. Customer support

1 Negotiable

Out Years Objectives

[Timeline by fiscal year (FY); the slide arranges the items across Fall, Winter, Spring, and Summer columns]

FY 07
• Install ARMADA at CRTC

FY 08
• Quality Assurance Program Version 1
• Upgrading File Ingester for SODAR/Profiler
• Metadata Web Server operational
• Decommission SAMS DB
• Install ARMADA at ATC, EPG, NVL, RTTC, WSMR, and YPG
• Master repository at DPG
• Replace Legacy Programs (SAMS Report)

FY 09
1. Update Hardware (rack mounted system)
2. Web interface or GUI tools to ARMADA, climatology, time series, etc.
3. Expand ingest capabilities beyond File Ingester
4. Primary range data source to 4DWX

FY 10
1. QA Program Version 2
2. All range data + historical data in ARMADA
3. Develop data mining tools

FY 11-13
1. Migrate all capabilities to Linux?
2. Include non-meteorological data (Test reports, HPAC output, etc.)

Summary

• ARMADA is coming!
  – CRTC August
  – All other Ranges Fall 07 to Winter 08
  – Plan on a full week
• ARMADA will replace SAMS DB
• New hardware is not required for initial install
• ARMADA will comply with the local DIACAP

Questions?
