University of Massachusetts, Amherst • Department of Computer Science
Chapel: The Cascade High Productivity Language
Ting Yang
University of Massachusetts Amherst
Context
DARPA HPCS Program > Cray’s Cascade Project > Chapel Language
• HPCS = High Productivity Computing Systems
  • Programmability
  • Performance
  • Portability
  • Robustness
• Cascade = Cray’s HPCS Project
  • System-wide consideration of productivity impacts
  • Processors, memory, network, OS
  • Runtime, compilers, languages
• Chapel = Cascade High-Productivity Language
• Other HPCS vendors: IBM, Sun
Introduction – Why Chapel
• Fragmented model: MPI, SHMEM, UPC
  • Code is written on a processor-by-processor basis
    • Breaks up data structures
    • Breaks up control flow
  • Mixes algorithms with per-processor management details in the computation
    • Virtual processor topology
    • Communication details
    • Choice of data structures, memory layout
  • Fails to support composition of parallelism
  • Lacks productivity, flexibility, portability
  • Difficult to understand and maintain
Introduction
• Global-view model: HPF, OpenMP, ZPL, NESL
  • No need to decompose data and control flow by hand
  • Decomposition is done by the compiler and runtime
  • Users provide high-level guidance
  • Natural and intuitive
  • Lacks abstractions: set, hash, graph
  • Performance is not as good as MPI
  • Difficult to compile
Introduction - Chapel
• Chapel: Cascade High-Productivity Language
  • Built from HPF and ZPL
  • Strictly typed
• Overall goal:
  • Simplify the creation of parallel programs
  • Provide high-performance production-grade codes
  • More generality
• Motivating language technologies:
  • Multithreaded parallel programming
  • Locality-aware programming
  • Object-oriented programming
  • Generic programming and type inference
Outline
• Introduction
• Multithreaded Parallel Programming
  • Data Parallel
  • Task Parallel
• Locality-aware Programming
  • Data Distribution
  • Computation Distribution
• Other Features
• Summary
Multithreaded Parallel Programming
• Provide a global view of computation and data structures
• Composition of parallelism
• Abstraction of data and task parallelism
  • Data: domains, arrays, graphs
  • Task: cobegins, atomic, sync variables
• Virtualization of threads locales
Data Parallelism: Domains
• Domain: an index set (first class)
  • Specifies the size and shape of “arrays”
  • Supports sequential and parallel iteration
  • Potentially decomposed across locales
  • Each domain has an index type: index(domain)
  • Fundamental concept of data parallelism
  • Generalization of ZPL’s region
• Important domains
  • Arithmetic: indices are Cartesian tuples
    • Arrays, multidimensional arrays
    • Can be strided and arbitrarily sparse
  • Infinite: indices are hash keys
    • Maps, hash tables, associative arrays
  • Opaque: indices are anonymous
    • Sets, trees, graphs
  • Others: enumerated
Domain Declaration
More domain declarations
Domain Uses
• Declaring arrays
  var A, B: [D] float;
• Sub-array references
  A(DInner) = B(DInner);
• Sequential iteration
  for (i,j) in DInner { … A(i,j) … }
  or: for ij in DInner { … A(ij) … }
• Parallel iteration
  forall (i,j) in DInner { … A(i,j) … }
  or: [ij in DInner] { … A(ij) … }
• Array re-allocation
  D = [1..2*m, 1..2*n];
[Figure: arrays A and B over domain D, with sub-arrays A(DInner) and B(DInner)]
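The domain uses listed on this slide can be modelled outside Chapel. The following is a minimal Python sketch, not Chapel code: `arithmetic_domain` and `DomainArray` are hypothetical helpers invented for illustration, and the sketch assumes that re-assigning a domain keeps the values at indices that survive and fills new indices with a default.

```python
from itertools import product


def arithmetic_domain(*ranges):
    """Return the index set of a rectangular domain as a list of tuples."""
    return list(product(*ranges))


class DomainArray:
    """Hypothetical stand-in for an array declared over a domain."""

    def __init__(self, domain, fill=0.0):
        # One element per index in the domain.
        self.data = {idx: fill for idx in domain}

    def resize(self, new_domain, fill=0.0):
        # Assumption: surviving indices keep their values, new ones get `fill`.
        self.data = {idx: self.data.get(idx, fill) for idx in new_domain}


m, n = 4, 4
D = arithmetic_domain(range(1, m + 1), range(1, n + 1))   # [1..m, 1..n]
DInner = arithmetic_domain(range(2, m), range(2, n))      # [2..m-1, 2..n-1]

A = DomainArray(D)            # var A, B: [D] float;
B = DomainArray(D, fill=1.0)

# Sub-array assignment: A(DInner) = B(DInner);
for ij in DInner:
    A.data[ij] = B.data[ij]

# Array re-allocation: D = [1..2*m, 1..2*n];
D = arithmetic_domain(range(1, 2 * m + 1), range(1, 2 * n + 1))
A.resize(D)
B.resize(D)
```

The point of the model is that A and B never store their own shape: both follow D, so one assignment to D re-allocates every array declared over it.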
Infinite Domains
var People: domain(string);
var Age: [People] integer;
var Birthdate: [People] string;

Age("john") = 60;
Birthdate("john") = "12/11/1946";

forall person in People {
  if (Birthdate(person) == today) {
    Age(person) += 1;
  }
}
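As a rough illustration of the same idea, the Python sketch below models an infinite (associative) domain as a shared key set with two arrays indexed by it. It is not Chapel: `add_person` and `birthday_update` are hypothetical names, and comparing only the month/day prefix of the birthdate string is an assumption made so that the example can match a given day.

```python
people = set()      # models: var People: domain(string);
age = {}            # models: var Age: [People] integer;
birthdate = {}      # models: var Birthdate: [People] string;


def add_person(name, years, born):
    """Add an index to the domain; every array over it gains an element."""
    people.add(name)
    age[name] = years
    birthdate[name] = born


def birthday_update(today_mm_dd):
    """Models the forall loop: iterations are independent of each other."""
    for person in people:
        # Assumption: a birthday matches when the "MM/DD" prefix matches.
        if birthdate[person][:5] == today_mm_dd:
            age[person] += 1


add_person("john", 60, "12/11/1946")
add_person("mary", 35, "03/02/1971")
birthday_update("12/11")
```

Because each iteration touches only the elements for its own key, the loop body is safe to run in parallel, which is what makes `forall` appropriate on the slide.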
Opaque Domains
var Vertices: domain(opaque);

for i in (1..5) {
  Vertices.newIndex();
}

var AV, BV: [Vertices] float;
[Figure: the Vertices domain and the arrays AV and BV declared over it]
Building A Tree
var Vertices: domain(opaque);
var left, right: [Vertices] index(Vertices);
var root: index(Vertices);

root = Vertices.newIndex();
left(root) = Vertices.newIndex();
right(root) = Vertices.newIndex();
left(right(root)) = Vertices.newIndex();
[Figure: the resulting tree, starting at root]
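The tree construction above can be mimicked in Python to show what "anonymous indices" means in practice. This is a sketch under assumptions, not Chapel: `OpaqueDomain`, `new_index`, and `depth` are hypothetical, and fresh `object()` instances stand in for opaque indices because they carry no value other than their identity.

```python
class OpaqueDomain:
    """Hypothetical model of domain(opaque): indices are anonymous tokens."""

    def __init__(self):
        self._indices = []

    def new_index(self):
        idx = object()            # identity is the only property of the index
        self._indices.append(idx)
        return idx

    def __len__(self):
        return len(self._indices)


vertices = OpaqueDomain()
left, right = {}, {}              # models: var left, right: [Vertices] index(Vertices);

root = vertices.new_index()
left[root] = vertices.new_index()
right[root] = vertices.new_index()
left[right[root]] = vertices.new_index()


def depth(v):
    """Height of the subtree at v; a missing child counts as empty."""
    if v is None:
        return 0
    return 1 + max(depth(left.get(v)), depth(right.get(v)))
```

Here `left` and `right` are ordinary arrays over the vertex domain whose elements are themselves vertex indices, so the tree needs no pointer type of its own.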
The Domain/Index Hierarchy
• Every domain has an index type
• Eliminates most runtime bounds checks
Task Parallelism
• cobegin: statements that may run in parallel
  cobegin {
    ComputeTaskA(…);
    ComputeTaskB(…);
  }
• atomic blocks
  atomic {
    newnode.next = insertpt;
    newnode.prev = insertpt.prev;
    insertpt.prev.next = newnode;
    insertpt.prev = newnode;
  }
• sync and single-assignment variables
  • Synchronize tasks
• cobegins compose: a task may itself contain a cobegin
  ComputeTaskA( ) {
    cobegin {
      ComputeTaskC(…);
      ComputeTaskD(…);
    }
    ComputeTaskE(…);
  }
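The nesting shown on this slide can be approximated with ordinary threads. The Python sketch below is an analogy only: `cobegin` and `record` are hypothetical helpers, a thread join stands in for the implicit wait at the end of a cobegin block, and a lock is swapped in for `atomic`, which is a weaker mechanism than an atomic block.

```python
import threading

results = []
lock = threading.Lock()


def cobegin(*tasks):
    """Run each task in its own thread and wait for all of them to finish."""
    threads = [threading.Thread(target=task) for task in tasks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def record(name):
    # The lock stands in for an atomic block around the shared update.
    with lock:
        results.append(name)


def compute_task_a():
    # Nested cobegin: C and D may overlap; E starts only after both finish.
    cobegin(lambda: record("C"), lambda: record("D"))
    record("E")


def compute_task_b():
    record("B")


cobegin(compute_task_a, compute_task_b)
```

The order of B, C, and D in `results` may vary from run to run, but E always follows C and D because the inner cobegin has completed before it runs.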
Locality-aware programming
• locale: machine unit of storage and processing
• Specify the number of locales on the command line:
  ./myProgram -nl 8
• Chapel provides a built-in locale array:
  const Locales: [1..numLocales] locale;
• Users may define their own locale arrays:
  var CompGrid: [1..GridRows, 1..GridCols] locale = …;
  var TaskALocs: [1..numTaskALocs] locale = …;
  var TaskBLocs: [1..numTaskBLocs] locale = …;
Data Distribution
• Domains can be distributed across locales
  var D: domain(2) distributed(block(2) to CompGrid) = …;
• Distributions are specified by
  • Mapping of indices to locales
  • Per-locale storage layout of domain indices and array elements
• Distributions are implemented as a class hierarchy
  • Chapel provides a group of standard distributions
  • Users may also write their own ???
• Support for reduce and scan (parallel prefix)
  • Including user-defined operations
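To make "mapping of indices to locales" concrete, here is a small Python sketch of one possible one-dimensional block mapping, followed by a reduce and a scan over the distributed values. It is illustrative only: `block_owner` and its rounding rule are assumptions and not necessarily the mapping Chapel's block distribution uses, and locales are numbered from 0 here for convenience.

```python
from functools import reduce
from itertools import accumulate
import operator


def block_owner(i, lo, hi, num_locales):
    """Locale (0-based) owning index i of lo..hi under a simple block mapping."""
    n = hi - lo + 1
    return (i - lo) * num_locales // n


lo, hi, num_locales = 1, 10, 4
owners = {i: block_owner(i, lo, hi, num_locales) for i in range(lo, hi + 1)}

# Per-locale storage: each locale holds only the elements it owns.
values = {i: i * i for i in range(lo, hi + 1)}
local = {}
for i, loc in owners.items():
    local.setdefault(loc, []).append(values[i])

# Reduce: combine within each locale first, then across locales.
partials = [reduce(operator.add, chunk) for _, chunk in sorted(local.items())]
total = reduce(operator.add, partials)

# Scan (parallel prefix), computed sequentially here: running totals in index order.
prefix = list(accumulate((values[i] for i in range(lo, hi + 1)), operator.add))
```

Passing a different associative function in place of `operator.add` (for example `max`) corresponds to the user-defined operations mentioned on the slide.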
Computation Distribution
• The “on” keyword associates tasks with locale(s)
• “on” can also be used in a data-driven manner
Other Features
• Object-oriented interface
  • Optional OO style
  • Overloading
  • Advanced language features expressed in classes
• Generics and type inference
  • Type variables and parameters
  • Similar to class templates in C++
• Sequences (“seq”), iterators; the “ordered” keyword suppresses parallelism
• Modules (for name-space management)
• Parallel garbage collection ???
Chapel Status
• First sequential prototype, on one locale
  • Not finished yet
• Currently can run programs with
  • simple domains, up to 2 dimensions
  • partial type inference
• Threads locales processors
• A full prototype in one or two years