Final Presentation
Project: Dirac Equation in (1+1)-D on a staggered grid with perfectly matched boundary
High Performance Computing 1 Seminar
Winter semester 2016/17
Short Recapitulation
● Solution of the Dirac equation on a staggered grid for the spinors u and v, to avoid the fermion-doubling problem; see e.g. [Gattringer/Lang: Quantum Chromodynamics on the Lattice, Springer 2010, p. 110f]
Perfectly Matched Layer Method
● Wave solutions are unphysically reflected at the boundaries
● The perfectly matched layer (PML) method absorbs wave functions as they approach the boundary
● Method: analytic continuation of the spatial coordinate into the complex plane
● Result: exponential decay of the wave function inside the absorbing layer
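The formulas shown on this slide did not survive the extraction; what follows is a reconstruction from the standard frequency-domain PML ansatz (the damping profile σ and its normalization are assumptions, not taken from the project):

```latex
% Analytic continuation of the spatial coordinate into the complex plane:
\tilde{x} \;=\; x \;+\; \frac{i}{\omega}\int_0^x \sigma(x')\,\mathrm{d}x'
% A right-moving plane wave then decays exponentially inside the layer
% (sigma = 0 in the interior, sigma > 0 in the boundary layer):
e^{ik\tilde{x}} \;=\; e^{ikx}\, e^{-\frac{k}{\omega}\int_0^x \sigma(x')\,\mathrm{d}x'}
```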
Dirac Equation with PML
Discretized Dirac Equation on the staggered grid
Numerical solution for an initial Gaussian wave packet
Algorithm
● 4 equations for the 2 spinors u, v and the 2 auxiliary fields psi_u, psi_v
● Corresponds to 4 for-loops (temporal integration), one per quadrature
● The temporal quadratures for u and psi_u, as well as for v and psi_v, can be combined: loop merging
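The loop merging mentioned above can be illustrated with a minimal Python sketch; the update rules below are placeholders, not the actual discretized Dirac equations:

```python
import numpy as np

def evolve_separate(u, psi_u, n_t, a=0.99, b=0.98):
    """Pre-merge version: one temporal loop per field."""
    u, psi_u = u.copy(), psi_u.copy()
    for _ in range(n_t):
        u *= a          # placeholder update for the spinor u
    for _ in range(n_t):
        psi_u *= b      # placeholder update for the auxiliary field psi_u
    return u, psi_u

def evolve_merged(u, psi_u, n_t, a=0.99, b=0.98):
    """Merged version: a single temporal loop updates both fields,
    halving the loop overhead and improving data locality."""
    u, psi_u = u.copy(), psi_u.copy()
    for _ in range(n_t):
        u *= a
        psi_u *= b
    return u, psi_u

u0 = np.linspace(0.0, 1.0, 8)
p0 = np.linspace(1.0, 2.0, 8)
ref = evolve_separate(u0, p0, 50)
mrg = evolve_merged(u0, p0, 50)
```

Merging is only valid when the interleaved updates reproduce the original order of operations, which holds here because each field's update depends only on itself.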
Resolving Dependencies
[Figure: dependency structure of the first and second for-loops]
Code optimization
● Version 1.0: original pure Python code
● Version 1.1: loop merging
● Version 2.0: transfer of the quadrature to a Fortran subroutine (Python-Fortran wrapper code)
● Version 2.1: implementation of OpenMP
[Figure: git version history]
Loop merging
Interfacing Python with Fortran subroutine
● Program control (parameters such as grid size, initial and boundary conditions) stays in Python
● The quadrature is transferred to a Fortran subroutine called from Python (using f2py, which generates a shared-object file that can be imported like a Python module)
● The result is returned to the Python main script for postprocessing (plotting, movies, ...)
● Large speedup from transferring the quadrature to Fortran
Compilation of Fortran source code with f2py:
#!/usr/bin/env python
import os

# remove the stale shared object before rebuilding
cmd0 = "rm -f solve_DiracEqn_on_the_lattice.so"
# compile the Fortran source into a Python-importable module with f2py
cmd1 = "f2py -c --fcompiler=gnu95 --f90flags='-fopenmp' -lgomp -m " \
     + "solve_DiracEqn_on_the_lattice solve_DiracEqn_on_the_lattice.f95"
# run (and time) the main script that calls the Fortran subroutine
cmd2 = "time ./DiracEqn1p1D_fortran_interface.py"
os.system(cmd0)
os.system(cmd1)
os.system(cmd2)
Calling the Fortran subroutine from Python
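The call itself was shown on the slide; a minimal sketch of the interface conventions follows. The module name is taken from the compile script, but the subroutine name `quadrature` and its argument list are assumptions:

```python
import numpy as np

# The compiled module is imported like any Python module:
#   import solve_DiracEqn_on_the_lattice as dirac
#   u, v, psi_u, psi_v = dirac.quadrature(u, v, psi_u, psi_v)   # hypothetical
#
# f2py exchanges NumPy arrays with Fortran; allocating them in
# Fortran (column-major) order avoids implicit copies at the interface:
N_x, N_t = 512, 1024
u = np.zeros((N_x, N_t), dtype=np.complex128, order='F')
v = np.zeros_like(u)   # order='K' default preserves the Fortran layout
```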
Computation time of pure Python code vs Python-Fortran interface code
Code Parallelization with OpenMP
● Implementation of OpenMP in Fortran
● Only the quadrature (2 temporal do-loops) is parallelized
● Spatial grid size N_x = 512 chosen to be divisible by 1, 2, 4, 8, 16 to allow for equal chunk sizes of the spatial sections (figure on the right)
● Computation time measured only for this Fortran subroutine, not including the serial part
● Code run on the 12-core machine 143.50.47.128, 10 times for each setting of grid size and number of cores; the minimum is taken for determining the speedup
[Figure: spatial grid partitioned among threads no. 1, 2, 3, 4, ...]
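The timing protocol (repeat each run, keep the minimum) can be sketched in Python; the lambda below is a stand-in workload, not the Fortran subroutine:

```python
import time

def min_runtime(func, repeats=10):
    """Run func several times and return the minimum wall-clock time.
    The minimum filters out noise from other processes on the machine."""
    times = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        func()
        times.append(time.perf_counter() - t0)
    return min(times)

# stand-in workload for the Fortran quadrature
t_min = min_runtime(lambda: sum(i * i for i in range(10_000)))
```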
Statically scheduled do loops
● Spinors and auxiliary fields are shared between threads; only the loop index is private
● Threads process chunks of size (N_x - 2)/num_cores
● Forcing an ordered sequence is not strictly needed
● omp do ordered (finally commented out here) does not lead to a speedup, while omp do schedule(static) does
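The static distribution of the (N_x - 2) interior grid points over the threads can be sketched as follows (chunk arithmetic only; the actual scheduling is done by the OpenMP runtime in Fortran):

```python
def static_chunks(n_x, num_threads):
    """Split the interior indices 1 .. n_x-2 into near-equal contiguous
    chunks, mimicking omp do schedule(static) over the spatial loop."""
    n_inner = n_x - 2          # boundary points 0 and n_x-1 are excluded
    base, rem = divmod(n_inner, num_threads)
    chunks, start = [], 1
    for t in range(num_threads):
        size = base + (1 if t < rem else 0)   # spread any remainder
        chunks.append(range(start, start + size))
        start += size
    return chunks

chunks = static_chunks(512, 4)
```

For N_x = 512 the interior has 510 points, so 4 threads get chunks of 128, 128, 127, 127 points; this is why equal chunk sizes only arise for thread counts that divide the interior size exactly.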
Computation time vs. number of threads
Comparison of original pure Python code with OpenMP-Fortran version