MACCCR 5th Fuels Research Review, September 17, 2012. PrIMe Next Frontier: Large, Multi-dimensional Data Sets. Michael Frenklach. Supported by AFOSR.
MACCCR 5th Fuels Research Review
September 17, 2012
Michael Frenklach
Supported by AFOSR
PrIMe Next Frontier: Large, Multi-dimensional Data Sets
OUTLINE
• PrIMe Cloud Infrastructure:
‒ Data Flow Network
‒ Remote Server: PrIMe-RMG
‒ Interfaces
‒ Big Data
• Other new developments:
‒ Species identification app
‒ UQ: Statistical sampling of the feasible set
. . .
• PrIMe with Humanities
PrIMe: Process Informatics Model
INFRASTRUCTURE FOR UQ-PREDICTIVE MODELING
http://primekinetics.org
Data sharing
App sharing
Automation
PrIMe Portal
• Access to distributed resources
• User authorization
• Social networking
• User forums
• Data evaluation panels
• Help, tutorials, examples
Customized Drupal (PHP); platform independent
Workflow
• “Browser-based” software
• User building projects
• Data/app linking
• Binary XML interfaces
• Remote-server support
• Project sharing
C#, Windows, IE; apps: C#, Matlab
Warehouse
• Data collections
• Models and Experiments
• Controlled by schemas
• Submission forms
• Multiple-mode access
WebDAV, XML
Present-day Science Sharing: via web-page access
[Diagram labels: domain 1 web page, domain 2 web page, databases, apps, Internet]
PrIMe Science Sharing: via web-service data/app access
[Diagram labels: science domain 1, science domain 2, databases, apps, Internet]
PrIMe Science Sharing: via web-service data/app access (continued)
[Diagram labels: science domain 1, science domain 2, databases, apps, Internet, client web service, data flow network, client workflow app]
PRIME DATA MODEL
Initial Model: “Upload your data to PrIMe Warehouse” (“give me your data”)
New, Distributed Model: “You may, if you choose, connect your data to the communal system”
• with a switch in the OFF position: “you can use the communal data and tools but your own data is private to you only”
• “but please flip the switch to the ON position when you are ready to share your own data”
SAME FOR APPS
“Connect your code to the communal system”: you control your own code:
• release version
• user access, licenses
• collect fees, if desired
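The ON/OFF switch described for data (and, by the same logic, for apps) can be illustrated with a small sketch. This is not PrIMe code: the class `ConnectedSource` and its fields `owner`, `records`, and `shared` are hypothetical names chosen only to show the access rule the slides describe.

```python
from dataclasses import dataclass, field


@dataclass
class ConnectedSource:
    """Hypothetical data set or app that stays with its owner but is
    connected to the communal system."""
    owner: str
    records: dict = field(default_factory=dict)
    shared: bool = False  # the "switch": OFF (private) by default

    def read(self, user: str, key: str):
        """Return a record if the caller is the owner or the switch is ON."""
        if user == self.owner or self.shared:
            return self.records[key]
        raise PermissionError(f"{self.owner}'s data is private")
```

With the switch OFF only the owner can read the records; setting `shared = True` opens them to other users without moving the data anywhere.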
TECHNOLOGY: HOW
Remote-server app: PrIMe Web Services (PWS)
• no restrictions on platform
• no restrictions on data formats
• no restrictions on local programming language(s)
PrIMe Workflow Interface (PWI) is the only “standard”
• developed, maintained, and controlled by the community
[Diagram labels: client machine, client data, PrIMe web services, PrIMe Data Flow Network, PrIMe Dispatcher]
BIG DATA
Excessively large data sets:
• do not move the data
• but use “smart agents” (e.g., HTML5 walkers)
Web services with user-reloaded tasks: fetch data features for user-requested analysis
[Diagram label: PrIMe remote-server web services]
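The “do not move the data” idea can be sketched as follows: a task travels to the server, runs next to the data, and only a small set of features comes back. The names `RemoteDataService`, `run_task`, and `temperature_features`, and the record field `"T"`, are assumptions made for this illustration and are not part of PrIMe.

```python
import statistics


class RemoteDataService:
    """Hypothetical stand-in for a server holding a data set too large to move."""

    def __init__(self, rows):
        self._rows = rows  # the raw data never leaves this object

    def run_task(self, task):
        """Run a user-supplied task next to the data; return only its result."""
        return task(self._rows)


def temperature_features(rows):
    """Example task: reduce many records to a few summary features."""
    temps = [row["T"] for row in rows]
    return {
        "n": len(temps),
        "min": min(temps),
        "max": max(temps),
        "mean": statistics.mean(temps),
    }
```

A client would call `service.run_task(temperature_features)` and receive a four-entry dictionary, whatever the size of the underlying data set.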
PRIME REMOTE-SERVER WEBSERVICES
• Created ~2 years ago
‒ installed by professional programmers
‒ implemented on Reaction Design site
• Modified June 2012
‒ can be installed by users
‒ implemented with RMG at MIT site
‒ installed by first-year grad students!
installation manual
PRIME – RMG
• User creates a PrIMe Workflow (PWA) project
• User submits a request: “create a reaction model for …”
• The request activates RMG code at MIT server
• User receives email when the model is generated
• User retrieves the model or it “moves” along the PWA project to the next component
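The request lifecycle listed above (submit, remote execution, email notification, retrieval) can be sketched as a simple queue. Everything here is hypothetical: `ModelRequest`, `RemoteModelServer`, the `generator` callable standing in for the remote code, and the `notify` callable standing in for email are illustration names, not the PrIMe or RMG interfaces.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ModelRequest:
    """Hypothetical request record carried through the workflow."""
    user_email: str
    description: str
    status: str = "submitted"
    model: Optional[str] = None


class RemoteModelServer:
    """Hypothetical remote server that queues requests and notifies users."""

    def __init__(self, generator: Callable[[str], str],
                 notify: Callable[[str, str], None]):
        self._generator = generator  # stands in for the remote model builder
        self._notify = notify        # stands in for the email step
        self._queue: List[ModelRequest] = []

    def submit(self, request: ModelRequest) -> ModelRequest:
        """Accept a request; the user does not wait for the result."""
        request.status = "queued"
        self._queue.append(request)
        return request

    def process_all(self) -> None:
        """Generate a model for each queued request, then notify its user."""
        while self._queue:
            request = self._queue.pop(0)
            request.model = self._generator(request.description)
            request.status = "done"
            self._notify(request.user_email,
                         f"Model ready: {request.description}")
```

After the notification, the finished `request.model` can be retrieved by the user or handed to the next component of the project.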
PRIME INTERFACES
[Diagram labels: client machine, client data, PrIMe web services]
binary XML – HDF5
e.g., reaction model: GRI-Mech 3.0
NEW DEVELOPMENTS
• input data for UQ bypassing Warehouse
• species identification via crowd-sourcing
• UQ: sampling within the feasible region
• comparison between interval-to-interval UQ and rigorous Bayesian
• parallelization of Chemkin II
UPLOAD YOUR OWN DATASET TO RUN UQ
SPECIES IDENTIFICATION BY CROWD-SOURCING
DATACOLLABORATION: BOUNDS-TO-BOUNDS PREDICTIONS CONSTRAINED TO THE FEASIBLE SET
[Figure labels: experimental uncertainty; M(x1,x2); F, feasible set; prior knowledge]
experiment/theory constrain feasible set
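The figure labels suggest a compact way to write the bounds-to-bounds idea. The following is a sketch under stated assumptions: only M, x1, x2, and F appear on the slide; H (the prior-knowledge region), the experiment index e, the reported bounds [L_e, U_e], and the prediction model M_p are notation introduced here for illustration.

```latex
% Feasible set: parameter values consistent with prior knowledge
% and with every experimental uncertainty interval
\[
  F = \{\, x \in H \;:\; L_e \le M_e(x) \le U_e \ \text{for every experiment } e \,\}
\]
% Bounds-to-bounds prediction: the range of a predicted quantity over F
\[
  \Big[\, \min_{x \in F} M_p(x),\ \max_{x \in F} M_p(x) \,\Big]
\]
```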
FEASIBLE SET SAMPLING
PREDICTION ON THE FEASIBLE SET
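A minimal sketch of sampling a feasible set and predicting on it is given below. It uses plain rejection sampling from a uniform prior box, which is a stand-in technique and not necessarily the sampler used in PrIMe; the function names and arguments are hypothetical.

```python
import random


def sample_feasible_set(models, bounds, prior_box, n_draws, seed=0):
    """Rejection sampling: draw x uniformly from the prior box and keep it
    only if every model prediction falls inside its experimental bounds."""
    rng = random.Random(seed)
    feasible = []
    for _ in range(n_draws):
        x = [rng.uniform(lo, hi) for lo, hi in prior_box]
        if all(lo <= m(x) <= hi for m, (lo, hi) in zip(models, bounds)):
            feasible.append(x)
    return feasible


def prediction_interval(predict, feasible):
    """Range of a predicted quantity over the sampled feasible points."""
    values = [predict(x) for x in feasible]
    return min(values), max(values)


# Toy usage with two made-up linear "models" of (x1, x2)
models = [lambda x: x[0] + x[1], lambda x: x[0] - x[1]]
bounds = [(0.8, 1.2), (-0.2, 0.2)]   # illustrative experimental bounds
prior_box = [(0.0, 1.0), (0.0, 1.0)]  # illustrative prior knowledge
points = sample_feasible_set(models, bounds, prior_box, n_draws=5000)
low, high = prediction_interval(lambda x: 2.0 * x[0], points)
```

Because the interval comes from a finite sample, it lies inside the true range over the feasible set; optimization-based bounds would be needed to guarantee an outer estimate.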
COMPARISON BETWEEN BOUNDS-TO-BOUNDS UQ (DATACOLLABORATION) AND RIGOROUS BAYESIAN
An ongoing collaborative study with
• Jerome Sacks, National Institute of Statistical Sciences
• Rui Paulo, ISEG Technical University of Lisbon
• Gonzalo Garcia-Donato, Universidad de Castilla-La Mancha
Bayesian simulations:
• no simplifying assumptions,
• but utilize the Solution Mapping strategy for numerical efficiency
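The numerical-efficiency point can be illustrated with a surrogate fit: an expensive simulation is evaluated at a few design points, and a cheap polynomial fitted to those results is used in its place. The sketch below is a one-variable simplification written for this illustration; `fit_quadratic` and `evaluate` are hypothetical helpers, and the slides do not specify the polynomial form or the design used.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y ~ c0 + c1*x + c2*x**2 via the normal equations."""
    sums = [sum(x ** k for x in xs) for k in range(5)]
    a = [[sums[i + j] for j in range(3)] for i in range(3)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    n = 3
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution
    coef = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(a[row][k] * coef[k] for k in range(row + 1, n))
        coef[row] = (b[row] - tail) / a[row][row]
    return coef


def evaluate(coef, x):
    """Evaluate the fitted surrogate at x."""
    return coef[0] + coef[1] * x + coef[2] * x * x
```

Once fitted, `evaluate` can be called many thousands of times inside a sampling loop at negligible cost compared with rerunning the full simulation.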
PARALLELIZATION: CHEMKIN II
Execution time of flame simulations with a large acetylene model
[Plot: Execution time of Parallel PREMIX for Large Acetylene Model; x-axis: Number of Threads (0 to 12); y-axis: Time (s) (50 to 350)]
PARALLELIZATION: CHEMKIN II
Execution time of flame simulations with a hydrogen model
[Plot: Execution time of Parallel PREMIX for Small Hydrogen Model; x-axis: Number of Threads (0 to 12); y-axis: Time (s) (30 to 100)]
KNOWLEDGE UNIX
A collaborative project of PrIMe with Humanities:
• Berkeley Electronic Cultural Atlas Initiative: “Study of Buddhist Texts”
PrIMe is used to predict the past
The abstracted dots represent 166,000 “panes”
KNOWLEDGE UNIX
A collaborative project of PrIMe with Humanities:
• Berkeley Electronic Cultural Atlas Initiative
• Berkeley Institute of Information: “Editors Notes”
Current and Next
• Remote-server app and new apps
− RMG: interface (with MIT, Bill Green)
− Communal/User tools: Cantera (with NCSU, Phil Westmoreland)
− Big Data: feature collection for UQ (with Utah, Phil Smith)
• Enabling new science infrastructure
− ALS-data analysis (with NCSU, Phil Westmoreland)
− Species IDs (with KAUST, Mani Sarathy)
− H2-O2: automation/addition of flame targets (with Tsinghua, Xiaoqing You)
− Submission of Chemkin mechanisms (with KAUST and Tsinghua)