ARM and Mellanox Hackathon - GRChombo Kacper Kornet September 25, 2019 DAMTP, University of Cambridge


ARM and Mellanox Hackathon - GRChombo

Kacper Kornet

September 25, 2019

DAMTP, University of Cambridge




GRChombo

GRChombo is an AMR GR code developed by a team of researchers from:

• Department of Applied Mathematics and Theoretical Physics (DAMTP), University of Cambridge

• Argonne Leadership Computing Facility, Argonne National Laboratory

• Department of Physics, King’s College London

• School of Mathematical Sciences, Queen Mary University of London

• Department of Physics, University of Oxford

• Institute of Mathematics and Physics, University of Louvain

Core developers: Josu C. Aurrekoetxea (KCL), Katy Clough (Oxford), Amelia Drew (Cambridge), Pau Figueras (QMUL), Hal Finkel (ANL), Tiago França (QMUL), Chenxia Gu (QMUL), Thomas Helfer (KCL), Cristian Joana (UCLouvain), Kacper Kornet (Cambridge), Markus Kunesch (Cambridge), Eugene Lim (KCL), Miren Radia (Cambridge), James Widdicombe (KCL)


GRChombo


GRChombo: Parallelization levels

• Set of boxes distributed among processes with MPI

• Inside boxes, outer loops parallelized with OpenMP

• Innermost loops vectorized with intrinsics
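The lower two levels of this hierarchy can be sketched as follows. This is only an illustration, not GRChombo code: `Box`, `scale_box` and the fixed `simd_len` are hypothetical names, the MPI level is left out (each rank would simply own a subset of boxes), and the innermost chunk is written as a plain loop where the real code would use intrinsics.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one box of grid data (nx * ny * nz doubles).
struct Box
{
    int nx, ny, nz;
    std::vector<double> data;

    Box(int a_nx, int a_ny, int a_nz)
        : nx(a_nx), ny(a_ny), nz(a_nz),
          data(static_cast<std::size_t>(a_nx) * a_ny * a_nz, 1.0)
    {
    }
};

// Assumed vector width in doubles; purely illustrative.
constexpr int simd_len = 4;

// Level 1 (not shown): each MPI rank would own a subset of the boxes.
void scale_box(Box &box, double factor)
{
    // Level 2: the outer z/y loops are shared among OpenMP threads.
    // Without OpenMP enabled the pragma is ignored and the loop runs serially.
#pragma omp parallel for collapse(2)
    for (int iz = 0; iz < box.nz; ++iz)
        for (int iy = 0; iy < box.ny; ++iy)
        {
            double *row =
                &box.data[(static_cast<std::size_t>(iz) * box.ny + iy) * box.nx];

            // Level 3: the innermost x loop advances in chunks of simd_len.
            int ix = 0;
            for (; ix + simd_len <= box.nx; ix += simd_len)
                for (int lane = 0; lane < simd_len; ++lane)
                    row[ix + lane] *= factor;

            // Scalar remainder when nx is not a multiple of simd_len.
            for (; ix < box.nx; ++ix)
                row[ix] *= factor;
        }
}
```

Calling `scale_box(box, 2.0)` on a `Box box(10, 3, 2)` doubles every cell; with `nx = 10` two full chunks are processed per row and the last two cells go through the remainder loop.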


GRChombo: C++ template library

void BinaryBHLevel::specificEvalRHS(GRLevelData &a_soln, GRLevelData &a_rhs,
                                    const double a_time)
{
    // Enforce positive chi and alpha and trace free A
    BoxLoops::loop(make_compute_pack(TraceARemoval(), PositiveChiAndAlpha()),
                   a_soln, a_soln, INCLUDE_GHOST_CELLS);

    // Calculate CCZ4 right hand side and set constraints
    // to zero to avoid undefined values
    BoxLoops::loop(
        make_compute_pack(CCZ4(m_p.ccz4_params, m_dx, m_p.sigma),
                          SetValue(0, Interval(c_Ham, NUM_VARS - 1))),
        a_soln, a_rhs, EXCLUDE_GHOST_CELLS);
}
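The idea of packing several compute classes into a single pass over the cells can be sketched with a small variadic template. Everything here is an assumption made for illustration: `Cell`, `ClampChi`, `ClampAlpha`, `ComputePack`, `make_pack` and `loop_cells` are hypothetical names and do not reproduce GRChombo's actual `make_compute_pack` or `BoxLoops::loop`.

```cpp
#include <vector>

// Hypothetical per-cell state; the real code works on many more variables.
struct Cell
{
    double chi;
    double alpha;
};

// Two toy compute classes, each exposing compute(Cell &).
struct ClampChi
{
    double min_chi;
    void compute(Cell &cell) const
    {
        if (cell.chi < min_chi)
            cell.chi = min_chi;
    }
};

struct ClampAlpha
{
    double min_alpha;
    void compute(Cell &cell) const
    {
        if (cell.alpha < min_alpha)
            cell.alpha = min_alpha;
    }
};

// A pack stores its compute objects and calls them in order on a cell.
template <class... compute_ts> struct ComputePack;

template <> struct ComputePack<>
{
    void call_compute(Cell &) const {}
};

template <class head_t, class... tail_ts> struct ComputePack<head_t, tail_ts...>
{
    head_t head;
    ComputePack<tail_ts...> tail;

    ComputePack(head_t a_head, tail_ts... a_tail)
        : head(a_head), tail(a_tail...)
    {
    }

    void call_compute(Cell &cell) const
    {
        head.compute(cell);
        tail.call_compute(cell);
    }
};

template <class... compute_ts>
ComputePack<compute_ts...> make_pack(compute_ts... computes)
{
    return ComputePack<compute_ts...>(computes...);
}

// One sweep over the cells applies every compute class in the pack.
template <class pack_t>
void loop_cells(const pack_t &pack, std::vector<Cell> &cells)
{
    for (Cell &cell : cells)
        pack.call_compute(cell);
}
```

A call such as `loop_cells(make_pack(ClampChi{1e-4}, ClampAlpha{1e-4}), cells)` then visits each cell once and applies both operations, rather than sweeping the data twice.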


GRChombo: C++ template library

// Compute the value of phi at the current point
template <class data_t>
data_t ScalarBubble::compute_phi(Coordinates<data_t> coords) const
{
    data_t rr = coords.get_radius();
    data_t rr2 = rr * rr;
    data_t out_phi = m_params.amplitudeSF * rr2 *
                     exp(-sqr(rr - m_params.r_zero / m_params.widthSF));

    return out_phi;
}
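Templating on `data_t` means the same physics code can be instantiated for a scalar `double` or for a vector type. The sketch below shows the mechanism with a toy two-lane type. `Vec2` and `bubble_profile` are hypothetical, and the profile used here (a Gaussian centred on `r_zero`, with explicit parentheses) is this sketch's own choice rather than a transcription of the expression above.

```cpp
#include <cmath>

// Toy two-lane vector type standing in for a real SIMD wrapper.
struct Vec2
{
    double lane[2];
};

inline Vec2 operator-(Vec2 a, double b)
{
    return {{a.lane[0] - b, a.lane[1] - b}};
}
inline Vec2 operator-(Vec2 a) { return {{-a.lane[0], -a.lane[1]}}; }
inline Vec2 operator*(Vec2 a, Vec2 b)
{
    return {{a.lane[0] * b.lane[0], a.lane[1] * b.lane[1]}};
}
inline Vec2 operator*(double a, Vec2 b)
{
    return {{a * b.lane[0], a * b.lane[1]}};
}
inline Vec2 operator/(Vec2 a, double b)
{
    return {{a.lane[0] / b, a.lane[1] / b}};
}
inline Vec2 exp(Vec2 a)
{
    return {{std::exp(a.lane[0]), std::exp(a.lane[1])}};
}

// Written once, instantiated for double and for Vec2.
template <class data_t>
data_t bubble_profile(data_t rr, double amplitude, double r_zero, double width)
{
    using std::exp; // scalar overloads; exp(Vec2) is found by ADL
    data_t rr2 = rr * rr;
    data_t arg = (rr - r_zero) / width;
    return amplitude * rr2 * exp(-(arg * arg));
}
```

`bubble_profile(1.0, 2.0, 1.0, 0.5)` evaluates one point, while `bubble_profile(Vec2{{1.0, 3.0}}, 2.0, 1.0, 0.5)` evaluates two points in one call using the same source code.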


GRChombo: intrinsics classes

template <> struct simd_traits<float>
{
    typedef __m512 data_t;
    typedef __mmask16 mask_t;
    static const int simd_len = 16;
};

template <> struct simd<double> : public simd_base<double>
{
    typedef typename simd_traits<double>::data_t data_t;
    typedef typename simd_traits<double>::mask_t mask_t;

    ALWAYS_INLINE
    simd() : simd_base<double>(_mm512_setzero_pd()) {}

    ALWAYS_INLINE
    simd(double x) : simd_base<double>(_mm512_set1_pd(x)) {}

    // ...
};


Porting GRChombo to ARM

• Finding the best compiler options (-fno-fast-errno)

• Replacing x86-specific bits with general ones

• NEON port

• Rudimentary SVE port (not yet vector-length agnostic)
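One way to keep architecture-specific code behind a common interface is sketched below: a two-lane wrapper that uses NEON intrinsics on AArch64 and falls back to plain doubles elsewhere. This is an assumption-laden illustration, not the actual GRChombo NEON port; `simd2d` and `multiply_add` are hypothetical names.

```cpp
#include <cstddef>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

// Hypothetical wrapper around two doubles with a uniform interface.
struct simd2d
{
    static constexpr std::size_t simd_len = 2;

#if defined(__aarch64__)
    // AArch64: one 128-bit NEON register holding two doubles.
    float64x2_t v;

    static simd2d load(const double *p) { return {vld1q_f64(p)}; }
    void store(double *p) const { vst1q_f64(p, v); }
    friend simd2d operator+(simd2d a, simd2d b)
    {
        return {vaddq_f64(a.v, b.v)};
    }
    friend simd2d operator*(simd2d a, simd2d b)
    {
        return {vmulq_f64(a.v, b.v)};
    }
#else
    // Generic fallback: same interface, plain scalar arithmetic.
    double v[2];

    static simd2d load(const double *p) { return {{p[0], p[1]}}; }
    void store(double *p) const
    {
        p[0] = v[0];
        p[1] = v[1];
    }
    friend simd2d operator+(simd2d a, simd2d b)
    {
        return {{a.v[0] + b.v[0], a.v[1] + b.v[1]}};
    }
    friend simd2d operator*(simd2d a, simd2d b)
    {
        return {{a.v[0] * b.v[0], a.v[1] * b.v[1]}};
    }
#endif
};

// out[i] = x[i] * y[i] + x[i], written against the wrapper only.
void multiply_add(const double *x, const double *y, double *out, std::size_t n)
{
    std::size_t i = 0;
    for (; i + simd2d::simd_len <= n; i += simd2d::simd_len)
    {
        simd2d xv = simd2d::load(x + i);
        simd2d yv = simd2d::load(y + i);
        (xv * yv + xv).store(out + i);
    }
    for (; i < n; ++i) // scalar remainder
        out[i] = x[i] * y[i] + x[i];
}
```

The calling code in `multiply_add` contains no architecture-specific names, so only the wrapper has to change when a new instruction set is targeted.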


GRChombo benchmarks on ARM cluster


GRChombo benchmarks on ARM cluster


GRChombo on BlueField

• Runs without source modifications (although one needs to be careful about architecture options)

• Using the same number of cores, about 3 times slower than ThunderX2
