
Prov-Vis: Large-Scale Scientific Data Visualization using ...sc13.supercomputing.org/sites/default/files/PostersArchive/tech... · Alvaro Coutinho and Marta Mattoso COPPE, Federal


Visualization and User-Steering

•  Go fast to the results you need to analyze
   - Navigation over (partial) data results aided by provenance (log)
   - Filter results and stage out only the data you need to analyze
   - Interact with a web-based user interface

•  Export and analyze partial results to make decisions
   - Efficient data consolidation and data staging features to aggregate and stage out only relevant information
   - Use statistical and visualization tools empowered by provenance data
   - Draw preliminary conclusions

Introduction

•  Large-Scale Experiments Scenario
   - Black-box execution
      - Use of different computer programs and scripts
      - Data heterogeneity and granularity
      - Hard to follow the execution
   - Hard to visualize partial results
      - A lot of information to be filtered
      - Traceability of which input produces each output
      - Some results may not be useful
      - Data and metadata association
      - Fragmented data
      - Relevance of data
      - Each analysis may focus on a subset of the obtained outputs
      - Laborious data transfers

•  Scientific Workflows
   - Improve experiment management
   - May provide data uniformity
   - Provenance data

•  Provenance is a key feature
   - Keeps track of everything that happens during experiment execution
   - A log that can be queried
   - Allows for high-level and domain-specific queries
      - What are the maximum values for velocity and pressure in a given CFD simulation?
   - A powerful association: experiment metadata with strategic experimental results
      - Runtime analysis
      - Good for time-consuming experiments
   - User steering for convergence analysis
      - User interacts with data of interest at runtime
      - Interfere in the workflow execution to adjust
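The domain-specific query mentioned above (maximum velocity and pressure for a given CFD simulation) can be sketched against a small relational provenance log. This is only an illustration: the table `solver_output`, its columns, the helper names, and the sample rows are hypothetical and do not reflect the actual Chiron or SciCumulus provenance schema.

```python
import sqlite3


def build_demo_log():
    """Create an in-memory provenance log with a few made-up solver records."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per solver output, linked to its simulation.
    conn.execute(
        "CREATE TABLE solver_output ("
        "simulation_id INTEGER, time_step INTEGER, "
        "velocity REAL, pressure REAL, output_file TEXT)"
    )
    conn.executemany(
        "INSERT INTO solver_output VALUES (?, ?, ?, ?, ?)",
        [
            (1, 0, 0.5, 101.0, "case1_step0.dat"),
            (1, 1, 1.5, 99.0, "case1_step1.dat"),
            (2, 0, 3.0, 120.0, "case2_step0.dat"),
        ],
    )
    return conn


def max_velocity_pressure(conn, simulation_id):
    """Return (max velocity, max pressure) logged for one simulation.

    Both values are None when the simulation has no rows yet.
    """
    return conn.execute(
        "SELECT MAX(velocity), MAX(pressure) "
        "FROM solver_output WHERE simulation_id = ?",
        (simulation_id,),
    ).fetchone()


if __name__ == "__main__":
    log = build_demo_log()
    print(max_velocity_pressure(log, 1))
```

Because the log is an ordinary queryable store, the same question can be asked while the experiment is still running; the answer simply reflects the rows gathered so far.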

Large-Scale Scientific Data Visualization using Provenance

•  Visualizes experiment data through the workflow
   - Displays provenance data for information browsing
   - Provenance data collected by the Chiron [8] or SciCumulus [9] workflow engines

•  Allows for selections and filtering over provenance data
   - Lets scientists select just the results that interest them

•  Nests output files to visualize fragmented data formats
   - Putting tags on result files enables selective data staging for data formats that store information across separate files but need them together during visualization.
   - Scientists can specify actions to visualize data with specific tags
      - The actions can be programs or scripts defined by the scientists
      - Scientists can choose to run the action on the server side or the client side

•  Stages out only the data the scientists selected

•  Displays the results locally or on tiled wall display technologies
   - Integration with tiled wall display technologies
   - Currently implemented to use TACC DisplayCluster
   - Can be extended to other platforms such as CGLX or SAGE
   - Integration with ParaView
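The tagging idea above can be sketched as follows: files that must be viewed together share a tag, selecting a tag yields exactly the files to stage out, and an action is looked up by tag. The names `TaggedFile`, `files_to_stage`, and `staged_fraction`, the sample files, and the action entry are all hypothetical; they are not taken from the Prov-Vis implementation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedFile:
    path: str
    size_bytes: int
    tag: str


def group_by_tag(files):
    """Nest result files under their tag so fragmented outputs stay together."""
    groups = defaultdict(list)
    for f in files:
        groups[f.tag].append(f)
    return dict(groups)


def files_to_stage(files, selected_tags):
    """Keep only the files whose tag the scientist selected."""
    return [f for f in files if f.tag in selected_tags]


def staged_fraction(files, selected_tags):
    """Fraction of the total output volume that actually gets transferred."""
    total = sum(f.size_bytes for f in files)
    staged = sum(f.size_bytes for f in files_to_stage(files, selected_tags))
    return staged / total if total else 0.0


# Made-up results: each case stores its mesh and its fields in separate files.
results = [
    TaggedFile("case1/mesh.geo", 400, "case1"),
    TaggedFile("case1/fields.dat", 600, "case1"),
    TaggedFile("case2/mesh.geo", 400, "case2"),
    TaggedFile("case2/fields.dat", 600, "case2"),
]

# Hypothetical action table: a user-defined command per tag and where to run it.
actions = {"case1": {"command": ["pvpython", "render_case.py"], "run_on": "server"}}

if __name__ == "__main__":
    for f in files_to_stage(results, {"case1"}):
        print(f.path)
    print(staged_fraction(results, {"case1"}))
```

Selecting by tag rather than by file name keeps the pieces of a multi-file format together, and the staged fraction makes the transfer saving explicit.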

•  Monitor and analyze workflow execution
   - Be aware of the current workflow status
   - Identify and solve problems during execution
   - Recover from failures and mistakes
   - Reduce time spent processing low-quality data
   - Cut execution time and reduce financial costs
      - Staging data out can be costly!
      - Remote clusters
      - Cloud environments

•  Use high-resolution display environments
   - Visualize multiple results to establish comparisons
      - In parameter exploration scenarios
   - Analyze highly detailed images and simulations
   - Can take advantage of available tiled display technologies such as TACC DisplayCluster [1], SAGE [2], CGLX [3], and ParaView [4]

Prov-Vis: Large-Scale Scientific Data Visualization using Provenance

Felipe Horta, Jonas Dias, Renato Elias, Daniel de Oliveira, Alvaro Coutinho and Marta Mattoso

COPPE, Federal University of Rio de Janeiro, Brazil
Fluminense Federal University, Brazil

Related Work

•  VisTrails
   - Strong support for visualization and provenance
   - Visualization on tiled wall displays
   - No integrated HPC support to visualize data directly from clusters or clouds

•  Swift/Turbine [10] or Pegasus [11]
   - HPC at very large scale with provenance
   - No runtime provenance support
   - Hard to enrich visual results with provenance

•  ParaView Co-Processing Libraries [12]
   - Not related to SWfMS

•  No solution integrates HPC execution with visualization while enriching analysis with runtime provenance data

•  We are currently exploring experiments from CFD [5] with Uncertainty Quantification and from Bioinformatics [6][7].

•  Scientific Workflow Management Systems (SWfMS)
   - Different flavors of SWfMS
   - With visualization support
      - VisTrails [2]
      - Integration with visualization libraries
      - Allows visualization on tiled wall displays
      - No integrated HPC support to visualize data directly from clusters or clouds
   - With High Performance Computing support

Features and Architecture

•  Web interface connected to the provenance database and the visualization environment
   - Rich and interactive interface
   - Navigate through its activities and I/O data
   - Filtering options to select the desired results
   - Stage data out of the execution environment
   - Text, images, PDF, video and ParaView visualization

•  Web service in the visualization cluster to display staged-out data on the tiled wall display
   - Implemented as a web service
   - Interface accessed by the web module at runtime
   - Can also stage data out of the execution environment
   - Pluggable to support different tiled wall display middlewares
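The "pluggable" design of the display web service can be illustrated with a small backend interface. Everything here is hypothetical: the class names, the `show` method, and the recording backend used for the demo are assumptions made for illustration, and no real DisplayCluster, SAGE, or CGLX API is called.

```python
from abc import ABC, abstractmethod


class TiledDisplayBackend(ABC):
    """Hypothetical contract each tiled wall middleware adapter would fulfil."""

    @abstractmethod
    def show(self, file_path: str) -> str:
        """Put one staged-out file on the wall and return a status message."""


class RecordingBackend(TiledDisplayBackend):
    """Stand-in backend that only records requests; it drives no real display."""

    def __init__(self, name):
        self.name = name
        self.shown = []

    def show(self, file_path):
        self.shown.append(file_path)
        return f"{self.name}: displaying {file_path}"


class DisplayService:
    """Routes display requests from the web module to a registered backend."""

    def __init__(self):
        self._backends = {}

    def register(self, key, backend):
        self._backends[key] = backend

    def display(self, key, file_paths):
        backend = self._backends[key]  # raises KeyError for an unknown backend
        return [backend.show(path) for path in file_paths]


if __name__ == "__main__":
    service = DisplayService()
    service.register("demo", RecordingBackend("demo-wall"))
    for message in service.display("demo", ["result_a.png", "result_b.png"]):
        print(message)
```

With this shape, supporting another middleware would mean writing one more adapter class and registering it, while the web module keeps calling the same `display` entry point.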

Conclusion

•  Web application to visualize experiment results enriched with provenance data
   - Navigate through data, select desired results, and visualize them on your workstation or on your tiled wall display environment.

•  Keep track of which data produced each result
   - Query provenance data and visualize related results

•  Organize and associate data using tags
   - Provide a personalized view of the results
   - Easier to navigate large-scale datasets
   - Good for fragmented outputs
   - Personalized actions to visualize data in specific formats
   - An action attached to a tag can be one of ParaView's recorded scripts that reads the experiment results and produces a video, for example.

•  Provenance-enriched visualization may support runtime and systematic analysis of results from large-scale experiments.

[Figure: CFD Workflow. Job i runs on nodes with 8 cores in three activities: Mesh Processing (./edgeCFDMesh) takes input mesh i; Domain Partitioning (mpirun -n 8 edgeCFDPre) partitions mesh i into M parts; Parallel CFD Solver (mpirun -n M edgeCFD) is shown with 16 mesh i partitions and the solver executed with 16 cores for case i. Chiron is running in each core of each node: managing scheduling, fault tolerance, provenance data gathering, … runtime provenance queries.]
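The three activities in the CFD workflow figure can be written down as the command lines that would be launched for one case. This sketch only builds the argument lists; it runs nothing and passes no input-file arguments, because the poster does not show any. The function name `cfd_commands` and its parameter names are hypothetical.

```python
def cfd_commands(partitions, pre_procs=8):
    """Build the command lines for one CFD case, in execution order.

    partitions -- M, the number of mesh partitions and solver processes
    pre_procs  -- process count for the partitioning step (8 in the figure)
    """
    if partitions < 1 or pre_procs < 1:
        raise ValueError("process counts must be positive")
    return [
        ("Mesh Processing", ["./edgeCFDMesh"]),
        ("Domain Partitioning", ["mpirun", "-n", str(pre_procs), "edgeCFDPre"]),
        ("Parallel CFD Solver", ["mpirun", "-n", str(partitions), "edgeCFD"]),
    ]


if __name__ == "__main__":
    # The figure shows 16 partitions, so the solver step uses 16 processes.
    for activity, argv in cfd_commands(partitions=16):
        print(f"{activity}: {' '.join(argv)}")
```

Keeping the commands as data, one entry per activity, is what allows a workflow engine to record which invocation produced which output.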

[Figure: Prov-Vis architecture. Labels: Workflow Engine, Provenance, Tags, Prov-Vis, Web Client, Remote Application, DisplayCluster, Tiled Wall Display; Exploration 1, Exploration 2, Exploration 3, ..., Exploration N [5].]

References

[5] Guerra, G., Rochinha, F., Elias, R., et al., 2012, "Uncertainty Quantification in Computational Predictive Models for Fluid Dynamics Using Workflow Management Engine", International Journal for Uncertainty Quantification, v. 2, n. 1.

[6] Ocaña, K. A. C. S., Oliveira, D. de, Horta, F., et al., 2012, "Exploring Molecular Evolution Reconstruction Using a Parallel Cloud-based Scientific Workflow", Advances in Bioinformatics and Computational Biology, chapter 7409, Springer.

[7] Ocaña, K. A. C. S., Oliveira, D., Ogasawara, E., et al., 2011, "SciPhy: A Cloud-Based Workflow for Phylogenetic Analysis of Drug Targets in Protozoan Genomes", Advances in Bioinformatics and Computational Biology, chapter 6832, Springer.

[8] Ogasawara, E., Dias, J., Oliveira, D., et al., 2011, "An Algebraic Approach for Data-Centric Scientific Workflows", Proc. of VLDB Endowment, v. 4, n. 12.

[9] Oliveira, D., Ogasawara, E., Baião, F., et al., 2010, "SciCumulus: A Lightweight Cloud Middleware to Explore Many Task Computing Paradigm in Scientific Workflows", IEEE International Conference on Cloud Computing, Washington, DC, USA.

[10] Gadelha, L. M. R., Clifford, B., Mattoso, M., et al., 2011, "Provenance management in Swift", Future Generation Computer Systems, v. 27, n. 6 (Jun.).

[11] Deelman, E., Mehta, G., Singh, G., et al., 2007, "Pegasus: Mapping Large-Scale Workflows to Distributed Resources", Workflows for e-Science, Springer.

[12] Fabian, N., Moreland, K., Thompson, D., et al., 2011, "The ParaView Coprocessing Library: A scalable, general purpose in situ visualization library".

[13] Valli, A. M. P., Elias, R. N., Carey, G. F., et al., 2009, "PID adaptive control of incremental and arclength continuation in nonlinear applications", International Journal for Numerical Methods in Fluids, v. 61, n. 11 (Dec.).