Victor Eijkhout and Erika Fuentes, ICL, University of Tennessee SuperComputing 2003 A Proposed Standard for Numerical Metadata


Victor Eijkhout and Erika Fuentes, ICL, University of Tennessee, SuperComputing 2003

A Proposed Standard for Numerical Metadata


2003/11 Eijkhout / Metadata / SC2003 2

Introduction

Many numerical routines have parameters whose settings depend on the application context of the routine. At present, computing these settings is either built into the numerical software or done by human intervention. We argue that it should instead be done automatically, by a separate software analysis component. This, however, requires a higher-level description of the application data, which we formalize as our Numerical Metadata. Having analysis modules, and a formal, almost-semantic description of numerical data, makes Component-based Programming Frameworks possible. We also show the feasibility of the automatic analysis approach.


Traditional flow of control

• Physics application produces data

• Numerical application analyzes the data to find relevant characteristics, then uses those characteristics to decide on an algorithm and set its parameters

• => only a 'data' interface is needed


Improved scenario

• Physics as before

• Analysis module finds characteristics

• Numerical algorithm choice and setting of parameters

• => also interface needed for characteristics => metadata
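The separation of analysis module and numerical module can be sketched as follows. This is a minimal illustration, not the library's interface: the function names, the metadata keys, and the solver labels are all assumptions made for the example. The only numerical fact relied on is that a symmetric, strictly diagonally dominant matrix with positive diagonal is positive definite, so conjugate gradients is applicable.

```python
def analyze(matrix):
    """Analysis module (hypothetical): compute characteristics of a dense square matrix."""
    n = len(matrix)
    symmetric = all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(i))
    # Strict row diagonal dominance with a positive diagonal.
    dominant = all(
        matrix[i][i] > sum(abs(matrix[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )
    # The returned dictionary plays the role of the metadata interface.
    return {"symmetric": symmetric, "positive_diagonally_dominant": dominant}


def choose_solver(metadata):
    """Numerical module (hypothetical): reads only the metadata, never the matrix."""
    if metadata["symmetric"] and metadata["positive_diagonally_dominant"]:
        return "cg"      # matrix is symmetric positive definite
    return "gmres"       # general fallback for everything else


spd = [[3.0, -1.0, 0.0], [-1.0, 3.0, -1.0], [0.0, -1.0, 3.0]]
nonsymmetric = [[3.0, -2.0, 0.0], [0.0, 3.0, -2.0], [0.0, 0.0, 3.0]]

print(choose_solver(analyze(spd)))           # cg
print(choose_solver(analyze(nonsymmetric)))  # gmres
```

The point of the split is that `choose_solver` can be developed and tested against metadata alone, independently of how the characteristics were computed.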


Usage scenario 1

Example: GMRES restart length as function of indefiniteness


Usage scenario 2

Example: estimate fill-in, use iterative method if data wouldn’t fit
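A rough sketch of such a decision, under simplifying assumptions that are ours and not the authors': the fill-in estimate is the band-storage upper bound (LU factorization without pivoting creates no fill outside the band), the matrix is given as a list of nonzero positions, and the function names and memory budget are hypothetical.

```python
def half_bandwidth(entries):
    """Largest |i - j| over the nonzero positions (i, j)."""
    return max((abs(i - j) for i, j in entries), default=0)


def estimated_factor_bytes(n, bandwidth, bytes_per_entry=8):
    """Upper bound on LU storage: without pivoting, fill stays inside the band."""
    return n * (2 * bandwidth + 1) * bytes_per_entry


def choose_method(n, entries, memory_budget_bytes):
    """Pick a direct solver only if the estimated factors fit in memory."""
    needed = estimated_factor_bytes(n, half_bandwidth(entries))
    return "direct" if needed <= memory_budget_bytes else "iterative"


n = 1000
tridiagonal = [(i, j) for i in range(n) for j in (i - 1, i, i + 1) if 0 <= j < n]
# Two far-off-diagonal entries blow up the bandwidth, and with it the estimate.
with_corners = tridiagonal + [(0, n - 1), (n - 1, 0)]

budget = 1_000_000  # hypothetical memory budget in bytes
print(choose_method(n, tridiagonal, budget))   # direct
print(choose_method(n, with_corners, budget))  # iterative
```

A real analysis module would use a sharper estimate, for instance after a fill-reducing reordering; the band bound is used here only because it is cheap and easy to verify.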


Usage scenario 3


Numerical experimentation

• Numerical experimentation is held back by lack of available characteristics

• Separately available analysis modules should remedy that

Many relevant matrix quantities are hard to compute and hard to implement: the enclosing ellipse of the spectrum, departure from normality, etc. Availability of independent analysis modules should encourage further experimentation on the part of numerical analysts.


Component-based Programming Frameworks

Applications: large, complex scientific applications (Composite Applications) that couple a variety of single-focus, scientific algorithms (Element Applications) along with other software support (e.g. visualization)

• Using behavioural metadata to assist in integrating single-focus algorithms into complex applications

• Metadata as semantic part of interface spec of numerical components

• (with Thomas Eidson)


Practical access to metadata

• Store in XML format; use Schema for validation; XSL for display

• API for conversion XML <=> internal data structure

• API for retrieval / insertion of metadata

We need two kinds of access to the metadata: from inside a running code, and in a more permanent stored form. This in turn requires conversion between the two forms.
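To illustrate the conversion between the two forms, here is a small round trip between an in-memory structure and XML. It uses Python's standard `xml.etree.ElementTree` as a stand-in for the actual library, and the element and attribute names (`metadata`, `category`, `element`, `name`) are invented for this sketch; they are not the proposed schema.

```python
import xml.etree.ElementTree as ET


def to_xml(metadata):
    """Convert {category: {element: value}} into an XML string."""
    root = ET.Element("metadata")
    for category, elements in metadata.items():
        cat_node = ET.SubElement(root, "category", name=category)
        for key, value in elements.items():
            item = ET.SubElement(cat_node, "element", name=key)
            item.text = str(value)
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Convert the XML string back into nested dictionaries (values stay strings)."""
    root = ET.fromstring(text)
    return {
        cat.get("name"): {el.get("name"): el.text for el in cat.findall("element")}
        for cat in root.findall("category")
    }


# Hypothetical category and element names, stored as strings for a lossless round trip.
metadata = {"simple": {"nrows": "4", "nnz": "12"}}
stored = to_xml(metadata)
print(stored)
print(from_xml(stored) == metadata)
```

Values come back as strings in this sketch; a schema, as proposed on the slide, is what would let a real implementation restore typed values and validate the stored form.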


API: creation routines


API: Access routines


API: Conversion routines


Proposed metadata category 1


Proposed metadata category 2


Proposed metadata category 3


Proposed metadata category 4


Further categories

• Custom categories

• Application properties: discretisation, mesh

Even though we propose a core set of categories, our storage format, and the libraries implementing it, are general and open-ended. Thus we hope that people will propose categories that are inspired by other views of the same kind of data, or by different problem areas altogether.

In particular, categories that describe the application-derived properties of numerical data would be very useful in the analysis modules we proposed.


Matrix metadata, issues

• Duplication of elements (e.g., simple->nnz == matrix_market->nnz)

• Relations between elements (e.g., if M-matrix then definite)

• Inheritance / derivation (e.g., dummy rows from bc, fictitious domain)

It is clear that certain pieces of information will appear in more than one category, especially once third parties start proposing their own categories. We want to introduce mechanisms for resolving or enforcing such implied relations.

Also, if one matrix is derived from another, there should be a linkage mechanism so that categories of metadata can be inherited where this is mathematically justified.
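One possible shape for such a mechanism is a consistency check over the stored categories. The sketch below encodes the two examples from the slide, duplicated `nnz` and the M-matrix implication. The category names `simple` and `matrix_market` come from the slide; the `properties` category, the key names, and the function itself are assumptions made for illustration.

```python
def check_consistency(metadata):
    """Return a list of violated relations; an empty list means consistent."""
    problems = []

    # Duplication: the same quantity stored in two categories must agree.
    simple = metadata.get("simple", {})
    matrix_market = metadata.get("matrix_market", {})
    if "nnz" in simple and "nnz" in matrix_market:
        if simple["nnz"] != matrix_market["nnz"]:
            problems.append("simple.nnz != matrix_market.nnz")

    # Relation: if M-matrix then definite (hypothetical 'properties' category).
    props = metadata.get("properties", {})
    if props.get("m_matrix") is True and props.get("definite") is False:
        problems.append("m_matrix is true but definite is false")

    return problems


consistent = {
    "simple": {"nnz": 12},
    "matrix_market": {"nnz": 12},
    "properties": {"m_matrix": True, "definite": True},
}
inconsistent = {
    "simple": {"nnz": 12},
    "matrix_market": {"nnz": 10},
    "properties": {"m_matrix": True, "definite": False},
}

print(check_consistency(consistent))    # []
print(check_consistency(inconsistent))
```

Missing elements are deliberately not reported as errors here: an absent `definite` flag could instead be filled in from the implication, which would be the "enforcing" half of the mechanism.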


Matrix metadata, more issues

• Extensions beyond matrices and linear systems

• Language interoperability

The current proposal was clearly inspired by linear system solving, and the proposed categories are applicable to matrices, mostly in that context. However, the storage format is general enough to cover other numerical application areas and other types of data.

The library we have written uses and targets C. This obviously needs to be extended to Fortran and Java. We will use Babel for this.


Proof of concept

• Predicting partitioning/distribution of linear solve

• Analysis modules for structural, scalar, spectral categories of metadata


Proof of concept

• Heuristic: choice of permutation & partitioning before preconditioning

• Statistical analysis (parametric model, Bayesian decision rule)

• Analysis modules for features: bandwidth, sparsity, field-of-values

We ran exhaustive tests of a number of iterative methods on a collection of matrices. The results are used in a parametric model to classify the matrices. Dividing the test collection into a training and test set allows us to assess the predictive value of the model.

Three different methods can be predicted with accuracies of 30%, 90%, and 30%.

The average gain is approximately a factor of 5 (correct prediction over worst case).

The misprediction penalty is only 60%, which still leaves a factor of 2 gain over the worst method.
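The general idea of a parametric model combined with a Bayesian decision rule can be sketched in a few lines. This is not the authors' model: it assumes a single scalar feature, a Gaussian distribution per class, and made-up training values and method labels, purely to show the mechanics of fitting on a training set and predicting by maximum posterior.

```python
import math


def fit(samples):
    """samples: {label: [feature values]} -> {label: (prior, mean, variance)}."""
    total = sum(len(values) for values in samples.values())
    model = {}
    for label, values in samples.items():
        mean = sum(values) / len(values)
        var = sum((x - mean) ** 2 for x in values) / len(values)
        # Guard against zero variance so the log-density stays finite.
        model[label] = (len(values) / total, mean, max(var, 1e-12))
    return model


def predict(model, x):
    """Bayesian decision rule: pick the label with the largest posterior."""
    def log_posterior(params):
        prior, mean, var = params
        log_likelihood = -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        return math.log(prior) + log_likelihood

    return max(model, key=lambda label: log_posterior(model[label]))


# Synthetic training set: feature value of each matrix, grouped by the
# method that performed best on it (labels and numbers are invented).
training = {
    "method_a": [0.10, 0.20, 0.15],
    "method_b": [0.80, 0.90, 0.85],
}
model = fit(training)
print(predict(model, 0.12))  # method_a
print(predict(model, 0.88))  # method_b
```

Holding back part of the collection as a test set, as described above, amounts to calling `predict` on matrices that were not passed to `fit` and comparing against the method that actually performed best.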


Software

• Metadata library based on libxml

• Library, XML schema, XSL style sheet

• Currently only C support

• See http://icl.cs.utk.edu/salsa/