1 High Performance Data Streaming in a Service Architecture Jackson State University Internet Seminar November 18 2004 Geoffrey Fox Computer Science, Informatics, Physics Pervasive Technology Laboratories Indiana University Bloomington IN 47401 [email protected] http://www.infomall.org http://www.grid2002.org


Page 1: High Performance Data Streaming in a Service Architecture

High Performance Data Streaming in a Service Architecture

Jackson State University Internet Seminar
November 18 2004

Geoffrey Fox
Computer Science, Informatics, Physics
Pervasive Technology Laboratories
Indiana University Bloomington IN 47401

[email protected]
http://www.infomall.org http://www.grid2002.org

Page 2: High Performance Data Streaming in a Service Architecture

Abstract

We discuss a class of HPC applications characterized by large scale simulations linked to large data streams coming from sensors, data repositories and other simulations.

Such applications will increase in importance to support “data-deluged science”.

We show how Web service and Grid technologies offer significant advantages over traditional approaches from the HPC community.

We cover Grid workflow (contrasting it with dataflow) and how Web Service (SOAP) protocols can achieve high performance

Page 3: High Performance Data Streaming in a Service Architecture

Parallel Computing

Parallel processing is built on breaking problems up into parts and simulating each part on a separate computer node

There are several ways of expressing this breakup into parts with Software:
• Message Passing as in MPI or
• OpenMP model for annotating traditional languages
• Explicitly parallel languages like High Performance Fortran

And several computer architectures designed to support this breakup
• Distributed Memory with or without custom interconnect
• Shared Memory with or without good cache
• Vectors with usually good memory bandwidth

Page 4: High Performance Data Streaming in a Service Architecture

The Six Fundamental MPI routines

MPI_Init (argc, argv) -- initialize
MPI_Comm_rank (comm, rank) -- find process label (rank) in group
MPI_Comm_size (comm, size) -- find total number of processes
MPI_Send (sndbuf,count,datatype,dest,tag,comm) -- send a message
MPI_Recv (recvbuf,count,datatype,source,tag,comm,status) -- receive a message
MPI_Finalize ( ) -- End Up

Page 5: High Performance Data Streaming in a Service Architecture

Whatever the Software/Parallel Architecture …..

The software is a set of linked parts
• Threads, Processes sharing the same memory or independent programs on different computers

And the parts must pass information between them to synchronize themselves and ensure they really are working on the same problem

The same of course is true in any system
• Neurons pass electrical signals in the brain
• Humans use a variety of information passing schemes to build communities: voice, book, phone
• Ants and Bees use chemical messages

Systems are built of parts, and in interesting systems the parts communicate with each other; this communication expresses “why it is a system” and not a bunch of independent bits

Page 6: High Performance Data Streaming in a Service Architecture


A Picture from 20 years ago

Page 7: High Performance Data Streaming in a Service Architecture

Passing Information

Information passing between parts covers a wide range in size (number of bits electronically) and “urgency”

Communication Time = Latency + (Information Size)/Bandwidth

From Society we know that we choose multiple mechanisms with different tradeoffs
• Planes have high latency and high bandwidth
• Walking is low latency but low bandwidth
• Cars are somewhat in between these cases

We can always think of information being transferred as a message
• Whether an airplane passenger, sound waves or a posted letter
• Or an MPI message, a UNIX Pipe between processes or a method call between threads
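The communication time formula on this slide can be sketched in a few lines of Python. The function name and the two example links are illustrative assumptions, not measurements from the talk; the sketch only shows how latency dominates small messages and bandwidth dominates large ones.

```python
def communication_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Communication Time = Latency + (Information Size) / Bandwidth, in seconds."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return latency_s + size_bytes / bandwidth_bytes_per_s


# Hypothetical links (assumed values, chosen only to show the tradeoff):
# name -> (latency in seconds, bandwidth in bytes per second)
LINKS = {
    "low-latency link": (5e-6, 1e9),
    "wide-area link": (50e-3, 1e9),
}


def compare(size_bytes):
    """Return the communication time of one message on each hypothetical link."""
    return {
        name: communication_time(size_bytes, latency, bandwidth)
        for name, (latency, bandwidth) in LINKS.items()
    }


if __name__ == "__main__":
    for size in (100, 1_000_000, 1_000_000_000):
        print(size, compare(size))
```

With these assumed numbers the two links differ by orders of magnitude for a 100 byte message but are nearly identical for a gigabyte, because the size/bandwidth term swamps the latency term.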

Page 8: High Performance Data Streaming in a Service Architecture

Parallel Computing and Message Passing

We worked very hard to get a better programming model for parallel computing that removed the need for the user to
• Explicitly decompose problem and derive parallel algorithm for decomposed parts
• Write MPI programs expressing explicit decomposition

This effort wasn’t so successful, and on distributed memory machines (including BlueGene/L) at least, message passing of MPI style is the execution model even if one uses a higher level language

So for parallelism, we are forced to use message passing and this is efficient but intellectually hard

Page 9: High Performance Data Streaming in a Service Architecture


The Latest Top 5 in Top500

Page 10: High Performance Data Streaming in a Service Architecture

What about Web Services?

• Web Services are distributed computer programs that can be in any language (Fortran .. Java .. Perl .. Python)
• The simplest implementations involve XML messages (SOAP) and programs written in net friendly languages like Java and Python
• Here is a typical e-commerce use?

[Figure: e-commerce services (Security, Catalog, Payment, Credit Card, Warehouse, shipping) linked through WSDL interfaces]

Page 11: High Performance Data Streaming in a Service Architecture

Internet Programming Model

Web Services are designed as the latest distributed computing programming paradigm motivated by the Internet and the expectation that enterprise software will be built on the same software base

Parallel Computing is centered on DECOMPOSITION

Internet Programming is centered on COMPOSITION

The components of e-commerce (catalog, shipping, search, payment) are NATURALLY separated (although they are often mistakenly integrated in older implementations)

These same components are naturally linked by Messages

MPI is replaced by SOAP and the COMPOSITION model is called Workflow

Parallel Computing and the Internet have the same execution model (processes exchanging messages) but very different REQUIREMENTS

Page 12: High Performance Data Streaming in a Service Architecture

Requirements for MPI Messaging

MPI and SOAP Messaging both send data from a source to a destination
• MPI supports multicast (broadcast) communication;
• MPI specifies destination and a context (in comm parameter)
• MPI specifies data to send
• MPI has a tag to allow flexibility in processing in source processor
• MPI has calls to understand context (number of processors etc.)

MPI requires very low latency and high bandwidth so that tcomm/tcalc is at most 10
• BlueGene/L has bandwidth between 0.25 and 3 Gigabytes/sec/node and latency of about 5 microseconds
• Latency determined so Message Size/Bandwidth > Latency

[Figure: timeline of tcalc and tcomm intervals]
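The condition Message Size/Bandwidth > Latency can be turned into a break-even message size, Size = Latency × Bandwidth. The sketch below applies it to the BlueGene/L figures quoted on this slide; it assumes 1 Gigabyte = 10^9 bytes, and the function name is illustrative.

```python
def break_even_message_size(latency_s, bandwidth_bytes_per_s):
    """Message size (bytes) at which Size/Bandwidth equals Latency.

    Messages larger than this are dominated by the bandwidth term,
    smaller ones by the latency term.
    """
    return latency_s * bandwidth_bytes_per_s


# Figures quoted on the slide; 1 Gigabyte is assumed to mean 1e9 bytes.
LATENCY_S = 5e-6
BANDWIDTH_LOW = 0.25e9
BANDWIDTH_HIGH = 3e9

if __name__ == "__main__":
    low = break_even_message_size(LATENCY_S, BANDWIDTH_LOW)
    high = break_even_message_size(LATENCY_S, BANDWIDTH_HIGH)
    print(f"break-even size: {low:.0f} to {high:.0f} bytes")
```

Under these assumptions the break-even size falls between roughly 1,250 and 15,000 bytes, so messages of a few kilobytes or more are needed before the quoted bandwidth, rather than the latency, sets the cost.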

Page 13: High Performance Data Streaming in a Service Architecture


BlueGene/L MPI I

http://www.llnl.gov/asci/platforms/bluegene/papers/6almasi.pdf

Page 14: High Performance Data Streaming in a Service Architecture


BlueGene/L MPI II

http://www.llnl.gov/asci/platforms/bluegene/papers/6almasi.pdf

Page 15: High Performance Data Streaming in a Service Architecture

BlueGene/L MPI III

http://www.llnl.gov/asci/platforms/bluegene/papers/6almasi.pdf

500 Megabytes/sec

Page 16: High Performance Data Streaming in a Service Architecture

Requirements for SOAP Messaging

Web Services have many of the same requirements as MPI, with two differences where MPI is more stringent than SOAP
• Latencies are inevitably 1 (local) to 100 milliseconds, which is 200 to 20,000 times that of BlueGene/L
  1) 0.000001 ms – CPU does a calculation
  2) 0.001 to 0.01 ms – MPI latency
  3) 1 to 10 ms – wake-up a thread or process
  4) 10 to 1000 ms – Internet delay
• Bandwidths for many business applications are low, as one just needs to send enough information for ATM and Bank to define transactions

SOAP has MUCH greater flexibility in areas like security, fault-tolerance, “virtualizing addressing” because one can run a lot of software in 100 milliseconds
• Typically takes 1-3 milliseconds to gobble up a modest message in Java and “add value”

Page 17: High Performance Data Streaming in a Service Architecture

Ways of Linking Software Modules

METHOD CALL BASED
• Module A calls Module B: .001 to 1 millisecond
• Closely coupled Java/Python …

MESSAGE BASED
• Service A and Service B exchange Messages: 0.1 to 1000 millisecond latency
• Coarse Grain Service Model

EVENT BASED with brokered messages
• Publisher: Post Events
• “Listener”: Subscribe to Events
• Service B and Service A linked through a Message Queue in the Sky
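The event based style with brokered messages can be illustrated with a tiny in-process publish/subscribe broker. This is only a sketch of the pattern: the `Broker` class and its `subscribe` and `publish` methods are hypothetical names invented here, and they are not the NaradaBrokering API or any Web Service specification.

```python
from collections import defaultdict


class Broker:
    """Hypothetical in-memory stand-in for the "Message Queue in the Sky"."""

    def __init__(self):
        # topic name -> list of listener callbacks
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """A "Listener" registers interest in a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        """A Publisher posts an event; return how many listeners received it."""
        delivered = 0
        for callback in list(self._subscribers.get(topic, [])):
            callback(event)
            delivered += 1
        return delivered


if __name__ == "__main__":
    broker = Broker()
    received = []
    broker.subscribe("sensor/readings", received.append)
    broker.publish("sensor/readings", {"value": 3.2})
    print(received)
```

The publisher and the listener never reference each other, only the broker and a topic name, which is the decoupling that distinguishes this style from a method call or a direct message.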

Page 18: High Performance Data Streaming in a Service Architecture

MPI and SOAP Integration

Note SOAP specifies format and, through WSDL, interfaces

MPI only specifies interface and so interoperability between different MPIs requires additional work
• IMPI http://impi.nist.gov/IMPI/

Pervasive networks can support high bandwidth (Terabits/sec soon) but the latency issue is not resolvable in a general way

Can combine MPI interfaces with SOAP messaging but I don’t think this has been done

Just as walking, cars, planes, phones coexist with different properties; so SOAP and MPI are both good and should be used where appropriate

Page 19: High Performance Data Streaming in a Service Architecture

NaradaBrokering http://www.naradabrokering.org

We have built a messaging system that is designed to support traditional Web Services but has an architecture that allows it to support high performance data transport as required for Scientific applications
• We suggest using this system whenever your application can tolerate 1-10 millisecond latency in linking components
• Use MPI when you need much lower latency

Use SOAP approach when MPI interfaces required but latency high
• As in linking two parallel applications at remote sites

Technically it forms an overlay network supporting in software features often done at IP Level

Page 20: High Performance Data Streaming in a Service Architecture

[Figure: Mean transit delay for message samples in NaradaBrokering: Different communication hops. x-axis: Message Payload Size (Bytes), 100 to 1000; y-axis: Transit Delay (Milliseconds), 0 to 9; curves: hop-2, hop-3, hop-5, hop-7. Test setup: Pentium-3, 1GHz, 256 MB RAM; 100 Mbps LAN; JRE 1.3; Linux]

Page 21: High Performance Data Streaming in a Service Architecture

[Figure: Standard Deviation for message samples in NaradaBrokering: Different communication hops - Internal Machines. x-axis: Message Payload Size (Bytes), 1000 to 5000; y-axis: Standard Deviation (Milliseconds), 0 to 0.8; curves: hop-2, hop-3, hop-5, hop-7]

Page 22: High Performance Data Streaming in a Service Architecture


Page 23: High Performance Data Streaming in a Service Architecture

Average Video Delays for one broker – divide by N for N load balanced brokers

[Figure: Latency (ms) versus # Receivers at 30 frames/sec; curves: One session, Multiple sessions]

Page 24: High Performance Data Streaming in a Service Architecture

NB-enhanced GridFTP

Adds Reliability and Web Service Interfaces to GridFTP

Preserves parallel TCP performance and offers choice of transport and Firewall penetration

Page 25: High Performance Data Streaming in a Service Architecture

Role of Workflow

Programming SOAP and Web Services (the Grid): Workflow describes linkage between services

As distributed, linkage must be by messages

Linkage is two-way and has both control and data

Apply to multi-disciplinary, multi-scale linkage, multi-program linkage, link visualization to simulation, GIS to simulations and visualization filters to each other

Microsoft-IBM specification BPEL is current preferred Web Service XML specification of workflow

[Figure: Service-1, Service-2 and Service-3 linked together]

Page 26: High Performance Data Streaming in a Service Architecture

Example workflow

Here a sensor feeds a data-mining application (We are extending data-mining in DoD applications with Grossman from UIC)

The data-mining application drives a visualization

Page 27: High Performance Data Streaming in a Service Architecture

Example Flood Simulation workflow

[Figure: Data Archives, Runoff Models and Flow Models linked by SOAP Messages and Events]

GIS Grid Services Link Distributed Data and Applications

Page 28: High Performance Data Streaming in a Service Architecture

SERVOGrid Codes, Relationships

[Figure: linked codes: Elastic Dislocation, Pattern Recognizers, Fault Model BEM, Viscoelastic Layered BEM, Viscoelastic FEM, Elastic Dislocation Inversion]

This linkage called Workflow in Grid/Web Service parlance

Page 29: High Performance Data Streaming in a Service Architecture

Two-level Programming I

• The Web Service (Grid) paradigm implicitly assumes a two-level Programming Model
• We make a Service (same as a “distributed object” or “computer program” running on a remote computer) using conventional technologies
  – C++ Java or Fortran Monte Carlo module
  – Data streaming from a sensor or Satellite
  – Specialized (JDBC) database access
• Such services accept and produce data from users files and databases
• The Grid is built by coordinating such services assuming we have solved problem of programming the service

[Figure: a Service and its Data]

Page 30: High Performance Data Streaming in a Service Architecture

Two-level Programming II

The Grid is discussing the composition of distributed services with the runtime interfaces to Grid as opposed to UNIX pipes/data streams

Familiar from use of UNIX Shell, PERL or Python scripts to produce real applications from core programs

Such interpretative environments are the single processor analog of Grid Programming

Some projects like GrADS from Rice University are looking at integration between service and composition levels but dominant effort looks at each level separately

[Figure: Service1, Service2, Service3 and Service4 composed together]

Page 31: High Performance Data Streaming in a Service Architecture

3 Layer Programming Model

• Application (level 1 Programming): MPI Fortran C++ etc.
• Application Semantics (Metadata, Ontology), Level 2 “Programming”: Semantic Web
• Workflow (level 3) Programming: BPEL

[Figure: Web Service 1, WS 2, WS 3 and WS 4 on the Basic Web Service Infrastructure]

Workflow will be built on top of NaradaBrokering as messaging layer

Page 32: High Performance Data Streaming in a Service Architecture

Structure of SOAP

• SOAP defines a very obvious message structure with a header and a body just like email
• The header contains information used by the “Internet operating system”
  – Destination, Source, Routing, Context, Sequence Number …
• The message body is partly further information used by the operating system and partly information for the application; the latter is not looked at by the “operating system” except to encrypt it, compress it etc.
  – Note WS-Security supports separate encryption for different parts of a document
• Much discussion in field revolves around what is referenced in header
• This structure makes it possible to define VERY Sophisticated messaging
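The header/body structure described above can be made concrete with a small envelope built from the Python standard library. The SOAP 1.1 envelope namespace is the standard one; the `urn:example:routing` namespace and the element names `Destination`, `Source`, `SequenceNumber` and `Payload` are assumptions made up for this sketch and do not come from any WS-* specification.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
EXAMPLE_NS = "urn:example:routing"  # hypothetical namespace for illustration only

ET.register_namespace("soap", SOAP_NS)
ET.register_namespace("ex", EXAMPLE_NS)


def build_envelope(destination, source, sequence_number, payload_text):
    """Build an Envelope whose Header carries routing data and whose Body carries the payload."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")

    # Header: information for the "Internet operating system"
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    ET.SubElement(header, f"{{{EXAMPLE_NS}}}Destination").text = destination
    ET.SubElement(header, f"{{{EXAMPLE_NS}}}Source").text = source
    ET.SubElement(header, f"{{{EXAMPLE_NS}}}SequenceNumber").text = str(sequence_number)

    # Body: information for the application
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    ET.SubElement(body, f"{{{EXAMPLE_NS}}}Payload").text = payload_text
    return envelope


if __name__ == "__main__":
    message = build_envelope("service-b", "service-a", 7, "sensor reading 3.2")
    print(ET.tostring(message, encoding="unicode"))
```

An intermediary can route on the header elements without parsing, or even being able to decrypt, the body.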

Page 33: High Performance Data Streaming in a Service Architecture

Deployment Issues for “System Services”

“System Services” (handlers/filters) are ones that act before the real application logic of a service

They gobble up part of the SOAP header identified by the namespace they care about and possibly part or all of the SOAP body
• e.g. the XML elements in header from the WS-RM namespace

They return a modified SOAP header and body to next handler in chain

[Figure: a message (Header, Body) passing through a chain of handlers: WS-RM Handler, WS-…….. Handler; e.g. ……. Could be WS-Eventing WS-Transfer ….]
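The handler chain on this slide can be sketched as a list of functions, each of which consumes the header entries in its own namespace and hands the remainder to the next handler. The message representation (a list of `(namespace, name, value)` tuples) and the two namespace strings are hypothetical simplifications chosen for this sketch; real handlers operate on XML and use the namespaces defined by the relevant specifications.

```python
RM_NS = "urn:example:ws-rm"              # hypothetical stand-in for the WS-RM namespace
EVENTING_NS = "urn:example:ws-eventing"  # hypothetical stand-in for another WS-* namespace


def make_handler(namespace, consumed):
    """Create a handler that gobbles up header entries from one namespace."""

    def handler(header, body):
        mine = [entry for entry in header if entry[0] == namespace]
        rest = [entry for entry in header if entry[0] != namespace]
        consumed.extend(mine)  # a real handler would act on these entries
        return rest, body      # modified header and body go to the next handler

    return handler


def run_chain(handlers, header, body):
    """Pass the message through each handler in order."""
    for handler in handlers:
        header, body = handler(header, body)
    return header, body


if __name__ == "__main__":
    seen_by_rm, seen_by_eventing = [], []
    chain = [make_handler(RM_NS, seen_by_rm), make_handler(EVENTING_NS, seen_by_eventing)]
    header = [
        (RM_NS, "Sequence", "42"),
        (EVENTING_NS, "Topic", "flood/alerts"),
        ("urn:example:app", "Trace", "on"),
    ]
    remaining, body = run_chain(chain, header, "application payload")
    print(remaining, body)
```

Whatever is left in the header after the chain runs is what the application logic of the service sees.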

Page 34: High Performance Data Streaming in a Service Architecture

Fast Web Service Communication I

• Internet Messaging systems allow one to optimize message streams at the cost of “startup time”,
• Web Services can deliver the fastest possible interconnections with or without reliable messaging
• Typical results from Grossman (UIC) comparing Slow SOAP over TCP with binary and UDP transport (latter gains a factor of 1000)

               SOAP/XML                WS-DMX/ASCII            WS-DMX/Binary
               (Pure SOAP)             (SOAP over UDP)         (Binary over UDP)
Record Count   MB     µ      σ/µ       MB     µ      σ/µ       MB     µ      σ/µ
10000          0.93   2.04   6.45%     0.5    1.47   0.61%     0.28   1.45   0.38%
50000          4.65   8.21   1.57%     2.4    1.79   0.50%     1.4    1.63   0.27%
150000         13.9   26.4   0.30%     7.2    2.09   0.62%     4.2    1.94   0.85%
375000         34.9   75.4   0.25%     18     3.08   0.29%     10.5   2.11   1.11%
1000000        93     278    0.11%     48     3.88   1.73%     28     3.32   0.25%
5000000        465    7020   2.23%     242    8.45   6.92%     140    5.60   8.12%

Highlighted values: 7020 (SOAP/XML) and 5.60 (WS-DMX/Binary) for 5000000 records
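The "factor of 1000" can be checked directly from the µ columns of the table. The sketch below copies those values and computes the ratio of SOAP/XML to WS-DMX/Binary for each record count; the variable and function names are illustrative, and since the slide does not state the unit of µ, only the unitless ratio is computed.

```python
# (record count, µ for SOAP/XML, µ for WS-DMX/Binary), copied from the table
MEAN_TIMES = [
    (10000, 2.04, 1.45),
    (50000, 8.21, 1.63),
    (150000, 26.4, 1.94),
    (375000, 75.4, 2.11),
    (1000000, 278, 3.32),
    (5000000, 7020, 5.60),
]


def speedups(rows):
    """Return {record count: µ(SOAP/XML) / µ(WS-DMX/Binary)}."""
    return {count: soap / binary for count, soap, binary in rows}


if __name__ == "__main__":
    for count, ratio in speedups(MEAN_TIMES).items():
        print(f"{count:>8} records: {ratio:8.1f}x")
```

The ratio grows with stream size, from under 2 for 10000 records to about 1250 for 5000000 records, which is where the factor of 1000 comes from.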

Page 35: High Performance Data Streaming in a Service Architecture

Fast Web Service Communication II

• Mechanism only works for streams – sets of related messages
• SOAP header in streams is constant except for sequence number (Message ID), time-stamp ..
• One needs two types of new Web Service Specification
  – “WS-StreamNegotiation” to define how one can use WS-Policy to send messages at start of a stream to define the methodology for treating remaining messages in stream
  – “WS-FlexibleRepresentation” to define new encodings of messages

Page 36: High Performance Data Streaming in a Service Architecture

Fast Web Service Communication III

• Then use “WS-StreamNegotiation” to negotiate stream in Tortoise SOAP – ASCII XML over HTTP and TCP –
  – Deposit basic SOAP header through connection – it is part of context for stream (linking of 2 services)
  – Agree on firewall penetration, reliability mechanism, binary representation and fast transport protocol
  – Naturally transport UDP plus WS-RM
• Use “WS-FlexibleRepresentation” to define encoding of a Fast transport (On a different port) with messages just having “FlexibleRepresentationContextToken”, Sequence Number, Time stamp if needed
  – RTP packets have essentially this structure
  – Could add stream termination status
• Can monitor and control with original negotiation stream
• Can generate different streams optimized for different end-points
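Once the constant SOAP header has been deposited during negotiation, each fast-transport message only needs a context token, a sequence number and a time stamp in front of its payload. The sketch below shows one possible binary layout using Python's `struct` module. The field widths and ordering are assumptions made for illustration; the slides do not define a wire format.

```python
import struct

# Assumed layout (not from any specification), network byte order, no padding:
#   uint32 context token | uint32 sequence number | float64 time stamp
HEADER = struct.Struct("!IId")  # 4 + 4 + 8 = 16 bytes


def encode_packet(context_token, sequence_number, timestamp, payload):
    """Prefix the payload bytes with the compact per-message header."""
    return HEADER.pack(context_token, sequence_number, timestamp) + payload


def decode_packet(packet):
    """Split a packet back into (context_token, sequence_number, timestamp, payload)."""
    context_token, sequence_number, timestamp = HEADER.unpack_from(packet)
    return context_token, sequence_number, timestamp, packet[HEADER.size:]


if __name__ == "__main__":
    packet = encode_packet(0xCAFE, 1, 1100736000.5, b"\x01\x02\x03")
    print(len(packet), decode_packet(packet))
```

In this sketch the per-message overhead is 16 bytes regardless of how large the negotiated SOAP header was, and the receiver uses the context token to look up that header when it needs the full message semantics.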

Page 37: High Performance Data Streaming in a Service Architecture

Data Deluged Science

In the past, we worried about data in the form of parallel I/O or MPI-IO, but we didn’t consider it as an enabler of new algorithms and new ways of computing

Data assimilation was not central to HPCC

DoE ASC set up because didn’t want test data!

Now particle physics will get 100 petabytes from CERN
• Nuclear physics (Jefferson Lab) in same situation
• Use around 30,000 CPU’s simultaneously 24X7

Weather, climate, solid earth (EarthScope)

Bioinformatics curated databases (Biocomplexity only 1000’s of data points at present)

Virtual Observatory and SkyServer in Astronomy

Environmental Sensor nets

Page 38: High Performance Data Streaming in a Service Architecture


Weather Requirements

Page 39: High Performance Data Streaming in a Service Architecture

Data Deluged Science Computing Paradigm

[Figure: Data, Information and Ideas related to Simulation and Model through Assimilation, Datamining and Reasoning; regions labelled Computational Science and Informatics]

Page 40: High Performance Data Streaming in a Service Architecture

Virtual Observatory Astronomy Grid: Integrate Experiments

[Figure: sky maps labelled Radio, Far-Infrared, Visible, Visible + X-ray, Dust Map, Galaxy Density Map]

Page 41: High Performance Data Streaming in a Service Architecture

DAME Data Deluged Engineering

Rolls Royce and UK e-Science Program: Distributed Aircraft Maintenance Environment

[Figure: components: In flight data, Ground Station, Global Network such as SITA, Airline, Maintenance Centre, Engine Health (Data) Center; Internet, e-mail, pager]

~ Gigabyte per aircraft per Engine per transatlantic flight

~5000 engines

Page 42: High Performance Data Streaming in a Service Architecture

USArray Seismic Sensors

Page 43: High Performance Data Streaming in a Service Architecture

[Figure: panels labelled Topography 1 km, Stress Change, Earthquakes, PBO, Ice Sheets, Volcanoes; sites: Long Valley, CA; Northridge, CA; Hector Mine, CA; Greenland]

Site-specific Irregular Scalar Measurements

Constellations for Plate Boundary-Scale Vector Measurements

Page 44: High Performance Data Streaming in a Service Architecture

Data Deluged Science Computing Architecture

Distributed Filters massage data For simulation

[Figure: an HPC Simulation fed by several Data Filters; connected components: Other Grid and Web Services, Analysis, Control, Visualize, Grid, OGSA-DAI Grid Services, Grid Data Assimilation]

Page 45: High Performance Data Streaming in a Service Architecture

Data Assimilation

Data assimilation implies one is solving some optimization problem which might have Kalman Filter like structure

$$\min_{\text{Theoretical Unknowns}} \; \sum_{i=1}^{N_{obs}} \frac{\left( \text{Data}_i(\text{position}, \text{time}) - \text{Simulated\_Value} \right)^2}{\text{Error}_i^2}$$

Due to data deluge, one will become more and more dominated by the data (Nobs much larger than number of simulation points).

Natural approach is to form for each local (position, time) patch the “important” data combinations so that optimization doesn’t waste time on large error or insensitive data.

Data reduction done in natural distributed fashion NOT on HPC machine as distributed computing most cost effective if calculations essentially independent
• Filter functions must be transmitted from HPC machine

Page 46: High Performance Data Streaming in a Service Architecture

Distributed Filtering

[Figure: an HPC Machine and Distributed Machines; the HPC Machine sends the needed Filter and receives filtered data; each Data Filter takes Nobs (local patch 1, local patch 2) to Nfiltered (local patch 1, local patch 2) for Geographically Distributed Sensor patches]

Nobs(local patch) >> Nfiltered(local patch) ≈ Number_of_Unknowns(local patch)

In simplest approach, filtered data is obtained by linear transformations on original data based on Singular Value Decomposition of Least squares matrix

Factorize Matrix to product of local patches
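The idea that a linear filter can shrink Nobs observations to about Number_of_Unknowns values can be illustrated on a toy patch. The sketch below deliberately swaps in the weighted normal equations instead of a Singular Value Decomposition, and assumes a two-unknown straight-line model (data ≈ a + b·t) for the patch; both choices, and all function names, are assumptions made to keep the example dependency-free, not the method used in the talk.

```python
def make_filter(times, errors):
    """HPC side: build the filter rows (A^T W) for the assumed model d = a + b*t."""
    weights = [1.0 / (e * e) for e in errors]
    row_a = weights
    row_b = [w * t for w, t in zip(weights, times)]
    return row_a, row_b


def apply_filter(filter_rows, data):
    """Sensor side: a linear transformation taking N_obs values to 2 filtered values."""
    return tuple(sum(f * d for f, d in zip(row, data)) for row in filter_rows)


def solve(times, errors, filtered):
    """HPC side: solve the 2x2 weighted normal equations using only the filtered data."""
    weights = [1.0 / (e * e) for e in errors]
    s_w = sum(weights)
    s_t = sum(w * t for w, t in zip(weights, times))
    s_tt = sum(w * t * t for w, t in zip(weights, times))
    s_d, s_td = filtered
    det = s_w * s_tt - s_t * s_t
    if det == 0:
        raise ValueError("patch does not constrain both unknowns")
    a = (s_tt * s_d - s_t * s_td) / det
    b = (s_w * s_td - s_t * s_d) / det
    return a, b


if __name__ == "__main__":
    times = [float(t) for t in range(1000)]  # N_obs = 1000 in this patch
    errors = [1.0] * len(times)
    data = [2.0 + 3.0 * t for t in times]    # synthetic, noise-free observations
    filt = make_filter(times, errors)        # transmitted from the HPC machine
    filtered = apply_filter(filt, data)      # only 2 numbers travel back
    print(solve(times, errors, filtered))
```

Only the filter rows travel out to the patch and only two numbers come back, yet the least squares solution is the same as if all 1000 observations had been shipped to the HPC machine. An SVD-based filter would additionally let one drop combinations with large error or low sensitivity.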