UNEDF 2011 ANNUAL/FINAL MEETING Progress report on the BIGSTICK configuration-interaction code Calvin Johnson 1 Erich Ormand 2 Plamen Krastev 1,2,3 1 San Diego State University, 2 Lawrence Livermore Lab, 3 Harvard University Supported by DOE Grants DE-FG02-96ER40985,DE-FC02- 09ER41587, and DE-AC52-07NA27344


UNEDF 2011 ANNUAL/FINAL MEETING

Progress report on the BIGSTICK configuration-interaction code

Calvin Johnson (1), Erich Ormand (2), Plamen Krastev (1,2,3)

(1) San Diego State University, (2) Lawrence Livermore Lab, (3) Harvard University

Supported by DOE Grants DE-FG02-96ER40985, DE-FC02-09ER41587, and DE-AC52-07NA27344


We have good news and bad news...

...both the same thing...

...the postdoc (Plamen Krastev) got a permanent staff position in scientific computing at Harvard.


BIGSTICK:

General purpose M-scheme configuration interaction (CI) code

On-the-fly calculation of the many-body Hamiltonian

Fortran 90, MPI and OpenMP

35,000+ lines in 30+ files and 200+ subroutines

Faster set-up

Faster Hamiltonian application

Rewritten for “easy” parallelization

New parallelization scheme

(Diagram: REDSTICK → BIGSTICK)
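To make the term "M-scheme" concrete, here is a minimal sketch (not taken from BIGSTICK) that counts basis states with fixed total Jz = M for a small valence space. The orbit list `SD_SHELL` and the helper names are illustrative assumptions. The count is done separately for protons and neutrons and then combined under the condition M_p + M_n = M, which is the standard product structure of an M-scheme basis.

```python
from collections import Counter
from itertools import combinations

# Hypothetical orbit table for the sd shell: (label, 2j).
# Each orbit contributes 2j+1 magnetic substates.
SD_SHELL = [("0d5/2", 5), ("1s1/2", 1), ("0d3/2", 3)]

def single_particle_2m(orbits):
    """List 2*m for every single-particle substate in the space."""
    return [two_m
            for _, two_j in orbits
            for two_m in range(-two_j, two_j + 1, 2)]

def m_distribution(two_m_values, n_particles):
    """Count Slater determinants of n identical particles by total 2*M."""
    return Counter(sum(combo)
                   for combo in combinations(two_m_values, n_particles))

def m_scheme_dimension(orbits, n_protons, n_neutrons, two_M=0):
    """Number of proton x neutron product states with total 2*M = two_M."""
    sp = single_particle_2m(orbits)
    protons = m_distribution(sp, n_protons)
    neutrons = m_distribution(sp, n_neutrons)
    return sum(count * neutrons.get(two_M - two_mp, 0)
               for two_mp, count in protons.items())

# 20Ne in the sd shell: 2 valence protons, 2 valence neutrons, M = 0.
print(m_scheme_dimension(SD_SHELL, 2, 2))  # 640
```

Brute-force enumeration like this is only practical for small spaces; it is meant to show what is being counted, not how a production code builds its basis.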


BIGSTICK:

Flexible truncation scheme: handles ‘no core’ ab initio Nħω truncation; valence-shell (sd & pf shell) orbital truncation; np-nh truncations; and more.

Applied to ab initio calculations, valence shell calculations (in particular level densities, random interaction studies, and benchmarking projected HF), cold atoms, and electronic structure of atoms (benchmarking RPA and HF for atoms).


Version 6.5 is available at NERSC: unedf/lcci/BIGSTICK/v650/


BIGSTICK uses a factorization algorithm that reduces storage of Hamiltonian arrays

Nuclide  Space    Basis dim  Matrix store  Factorization
56Fe     pf       501 M      290 GB        0.72 GB
7Li      Nmax=12  252 M      3600 GB       96 GB
7Li      Nmax=14  1200 M     23 TB         624 GB
12C      Nmax=6   32 M       196 GB        3.3 GB
12C      Nmax=8   590 M      5000 GB       65 GB
12C      Nmax=10  7800 M     111 TB        1.4 TB
16O      Nmax=6   26 M       142 GB        3.0 GB
16O      Nmax=8   990 M      9700 GB       130 GB

Comparison of nonzero matrix storage with factorization
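As a quick reading aid for the table, the following sketch computes the storage reduction factor (stored nonzero matrix divided by factorized storage) for each row. The numbers are copied from the table; treating 1 TB as 1000 GB is an assumption made only for this arithmetic.

```python
# Storage figures copied from the table above, in gigabytes.
# Assumption for this sketch: 1 TB is taken as 1000 GB.
# Columns: nuclide, model space, stored nonzero matrix, factorized storage.
ROWS = [
    ("56Fe", "pf",      290.0,    0.72),
    ("7Li",  "Nmax=12", 3600.0,   96.0),
    ("7Li",  "Nmax=14", 23000.0,  624.0),
    ("12C",  "Nmax=6",  196.0,    3.3),
    ("12C",  "Nmax=8",  5000.0,   65.0),
    ("12C",  "Nmax=10", 111000.0, 1400.0),
    ("16O",  "Nmax=6",  142.0,    3.0),
    ("16O",  "Nmax=8",  9700.0,   130.0),
]

def reduction_factor(matrix_gb, factorized_gb):
    """How many times smaller the factorized storage is."""
    return matrix_gb / factorized_gb

ratios = {(nuclide, space): reduction_factor(matrix_gb, factorized_gb)
          for nuclide, space, matrix_gb, factorized_gb in ROWS}

for (nuclide, space), ratio in ratios.items():
    print(f"{nuclide:5s} {space:8s} {ratio:7.1f}x")
```

On these figures the reduction is roughly 35x to 80x for the no-core cases and about 400x for 56Fe in the pf shell.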

TRIUMF – Feb 2011

BIGSTICK:

Micah Schuster, Physics MS project

BIGSTICK:

Joshua Staker, Physics MS project


Major accomplishment as of last year: excellent scaling of mat-vec multiply

This demonstrates that our factorization algorithm, as predicted, facilitates efficient distribution of mat-vec ops


Major accomplishments after last UNEDF meeting:

Rebalanced workload with additional constraint for dimension of local Lanczos vectors (Krastev)

Fully distributed Lanczos vectors with hermiticity on (Krastev)

Major steps towards distributing Lanczos vectors with suppressed hermiticity (Krastev)

OpenMP implementations in matrix-vector multiply (Ormand & Johnson)

Significant progress in 3-body implementation (Johnson & Ormand)

Added restart option (Johnson)

Implemented in-lined 1-body density matrices (Johnson)


Highlighting accomplishments for 2010-2011:

Add OpenMP

Reduce memory load per node
-- Lanczos vectors
-- matrix information (matrix elements/jumps)

Speed up reorthogonalization
-- I/O is bottleneck


Add OpenMP

-- Crude 1st generation by Johnson (about 70-80% efficiency)

-- 2nd generation by Ormand (nearly 100% efficiency)

Hybrid OpenMP+MPI implemented; full testing delayed due to reorthogonalization issues


Reduce memory load per node
-- Lanczos vectors
-- matrix information (matrix elements/jumps)

We break up the Lanczos vectors so that only part of each vector is stored on each node

Future: separate forward/backward multiplication

Lanczos vectors distribution: Hermiticity on

(Diagram: initial vector Vin and final vector Vout, each divided into sectors 1 to 4, with proton-sector and neutron-sector blocks)

Forward and backward application of H

Each compute node needs at a minimum TWO sectors from the initial and TWO sectors from the final Lanczos vector

Lanczos vectors distribution: Hermiticity off

(Diagram: initial vector Vin and final vector Vout, each divided into sectors 1 and 2, with proton-sector and neutron-sector blocks)

Forward application of H on one node and backward application of H on another node

Each compute node needs ONE sector from the initial and ONE sector from the final Lanczos vector
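The difference between the two schemes can be illustrated with a toy example. The sketch below is not BIGSTICK's data structure: it assumes a real symmetric H stored as a list of `(i, j, h)` elements with `i <= j`, and a hypothetical layout in which sector 1 holds indices 0 and 1 and sector 2 holds indices 2 and 3. It records which parts of the initial and final vectors a worker touches when it applies one off-diagonal block forward, backward, or both.

```python
def apply_block(elements, v_in, v_out, forward=True, backward=True):
    """Apply stored elements (i, j, h), i <= j, of a real symmetric H.

    Returns the sets of v_in indices read and v_out indices written.
    """
    read, written = set(), set()
    for i, j, h in elements:
        if forward:                  # H[i][j] * v_in[j] -> v_out[i]
            v_out[i] += h * v_in[j]
            read.add(j)
            written.add(i)
        if backward and i != j:      # H[j][i] * v_in[i] -> v_out[j]
            v_out[j] += h * v_in[i]
            read.add(i)
            written.add(j)
    return read, written

# Toy layout (hypothetical): sector 1 = indices 0,1; sector 2 = indices 2,3.
# The block below connects rows in sector 1 to columns in sector 2.
block = [(0, 2, 1.0), (1, 3, 2.0), (0, 3, 0.5)]
v_in = [1.0, 2.0, 3.0, 4.0]

# Hermiticity on: one worker does forward and backward application.
out_on = [0.0] * 4
read_on, written_on = apply_block(block, v_in, out_on)

# Hermiticity off: forward on one worker, backward on another,
# followed by a sum of the partial results.
out_fwd = [0.0] * 4
read_fwd, written_fwd = apply_block(block, v_in, out_fwd, backward=False)
out_bwd = [0.0] * 4
read_bwd, written_bwd = apply_block(block, v_in, out_bwd, forward=False)
out_off = [a + b for a, b in zip(out_fwd, out_bwd)]

print(out_on, out_off)             # same result either way
print(sorted(read_on), sorted(read_fwd), sorted(read_bwd))
```

With hermiticity on, the single worker reads and writes indices from both sectors. With hermiticity off, the forward worker reads only sector 2 and writes only sector 1, and the backward worker does the reverse, at the cost of treating the two directions as separate work items.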

Comparison of memory requirements for distributing Lanczos vectors:

Nuclide  Space      Basis dim  Store   Hermiticity ON  Hermiticity OFF
12C      Nmax = 10  7800M      117GB   8.44GB          4.39GB
60Zn     pf         2300M      34GB    8.65GB          4.45GB

Memory required to store 2 Lanczos vectors (double precision) on a node

The distribution scheme with suppressed hermiticity is the most memory efficient, so it is our scheme of choice.
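The "Store" column can be checked with simple arithmetic: two double-precision vectors of the quoted basis dimension. The sketch below assumes that "GB" in the table means 2^30 bytes and that the basis dimensions are rounded, so only approximate agreement is expected.

```python
BYTES_PER_DOUBLE = 8
GIB = 2 ** 30  # assumption: the table's "GB" is 2**30 bytes

def two_vector_store_gib(basis_dim):
    """Memory for two full double-precision Lanczos vectors."""
    return 2 * basis_dim * BYTES_PER_DOUBLE / GIB

c12 = two_vector_store_gib(7800e6)   # 12C, Nmax = 10
zn60 = two_vector_store_gib(2300e6)  # 60Zn, pf shell
print(f"12C : {c12:6.1f}  (table: 117GB)")
print(f"60Zn: {zn60:6.1f}  (table: 34GB)")

# Per-node figures from the table: hermiticity ON versus OFF.
ratio_c12 = 8.44 / 4.39
ratio_zn60 = 8.65 / 4.45
print(f"ON/OFF ratio: {ratio_c12:.2f}, {ratio_zn60:.2f}")
```

The ON/OFF ratio comes out close to 2 in both cases, which is what one would expect if a node holds two sectors of each vector with hermiticity on and one sector of each with hermiticity off.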


Speed up reorthogonalization
-- I/O is bottleneck

We (i.e. PK) spent time trying to make MPI-IO efficient for our needs via striping, etc.

Analysis by Rebecca Hartman-Baker (ORNL) suggests our I/O is still running sequentially rather than in parallel.

Now we will store all Lanczos vectors in memory, à la MFDn (this makes restarting an interrupted run difficult)
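For readers unfamiliar with why the stored vectors matter, here is a minimal serial sketch of Lanczos iteration with full reorthogonalization in which every Lanczos vector is kept in memory. It is a toy on a small dense matrix with hypothetical helper names, not the BIGSTICK or MFDn implementation; in a distributed code each vector would be split across nodes and the dot products would become reductions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(H, v):
    return [dot(row, v) for row in H]

def lanczos(H, start, n_iter):
    """Lanczos with full reorthogonalization against in-memory vectors.

    Returns the diagonal (alphas) and off-diagonal (betas) of the
    tridiagonal matrix, plus the list of Lanczos vectors.
    """
    norm = math.sqrt(dot(start, start))
    vectors = [[x / norm for x in start]]  # all vectors stay in memory
    alphas, betas = [], []
    for k in range(n_iter):
        w = matvec(H, vectors[k])
        alphas.append(dot(vectors[k], w))
        # Orthogonalize against every stored vector (no file I/O needed).
        for q_old in vectors:
            overlap = dot(q_old, w)
            w = [wi - overlap * qi for wi, qi in zip(w, q_old)]
        beta = math.sqrt(dot(w, w))
        if k == n_iter - 1 or beta < 1e-12:
            break
        betas.append(beta)
        vectors.append([wi / beta for wi in w])
    return alphas, betas, vectors

# Small symmetric test matrix (illustrative only).
H = [[2.0, 1.0, 0.0, 0.0],
     [1.0, 3.0, 1.0, 0.0],
     [0.0, 1.0, 4.0, 1.0],
     [0.0, 0.0, 1.0, 5.0]]
alphas, betas, vectors = lanczos(H, [1.0, 1.0, 1.0, 1.0], 4)
print(sum(alphas))  # equals the trace of H up to rounding
```

The reorthogonalization loop touches every previous vector at every iteration, which is why reading them back from disk becomes the bottleneck and why holding them in RAM helps. The trade-off noted above also shows up here: if the process stops, the in-memory list is lost.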


Next steps for remainder of project period:

• Store Lanczos vectors in RAM (end of summer)
• Write paper on factorization algorithm (drafted, finish by 9/2011)
• Fully implement MPI/OpenMP hybrid code (11/2011)
• Write up paper for publication of code (early 2012)


UNEDF Deliverables for BIGSTICK

• The LCCI project will complete and release final UNEDF versions of LCCI codes, scripts, and test cases.
Current version (6.5) is at NERSC; we expect the final version by the end of the year and plan to publish in CPC or a similar venue.

• Improve the scalability of the BIGSTICK CI code up to 50,000 cores.
The main barrier was reorthogonalization; we are now putting Lanczos vectors in memory to minimize I/O.

• Use the BIGSTICK code to investigate isospin breaking in the pf shell.
Delayed due to a problem with I/O hardware on Sierra.


SciDAC-3 possible deliverables for BIGSTICK

(End of SciDAC-2: 3-body forces on 100,000 cores)

• Run with 3-body forces on up to 1,000,000 cores on Sequoia, Nmax = 10/12 for 12,14C

• Add in 4-body forces; investigate alpha-clustering with effective 4-body forces (via SRG or Lee-Suzuki)

• Currently interfaces with Navratil’s TRDENS to generate densities, spectroscopic factors, etc., needed for RGM reaction calculations; will improve this: develop fast post-processing with factorization

• Investigate general unitary-transform effective interactions, adding constraint to observables


Sample application: cold atomic gases at unitarity in a harmonic trap

Using only 1 generator (d/dr) (very much like UCOM)

Fit to A = 3: 1-, 0+; A = 4: 0+, 1+, 2+

UNEDF -- MSU June 2010

starting rms = 2.32

final rms = 0.58


Cross-fertilization of LCCI project:

(Diagram linking BIGSTICK, MFDn, and NuShellX, with the labels:)

On-the-fly construction of basis states and matrix elements

Reorthogonalization and Lanczos vector management

J-projected basis