National Institute of Advanced Industrial Science and Technology
Status report on the large-scale long-run simulation on the grid - Hybrid QM/MD simulation -
Grid Technology Research Center, AIST
Hiroshi Takemiya, Yoshio Tanaka


[Title-slide figure parameters: v = 0.009 Å/fs, 525 fs, 40 Å]


  • Goal of the experiment
    To verify the effectiveness of our programming approach for large-scale, long-run grid applications:
    - Flexibility
    - Robustness
    - Efficiency
    Friction simulation
    - A nano-scale prober moves on the Si substrate
    - Requires hundreds of CPUs
    - Requires a long simulation time, over a few months
    - The number of QM regions and the number of QM atoms change dynamically
    Gridifying the application using GridRPC + MPI
    <Initial condition>
    - 2 QM regions with 72 + 41 QM atoms
    - 28,598 atoms in total
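The gridified control loop the slide names (GridRPC + MPI) can be sketched as a minimal model: at each MD step, the forces for every current QM region are obtained by what would, in the real application, be a remote GridRPC call to a cluster running the QM solver, while the classical MD update stays local. Everything below (function names, the stand-in force routine) is a hypothetical illustration, not the authors' code.

```python
def qm_force(region):
    # Stand-in for a remote GridRPC call to a QM solver; in the real
    # application this would dispatch the region to a remote cluster.
    return [0.0] * len(region)

def md_step(atoms, qm_regions):
    # One remote call per QM region; the set of regions may change
    # between steps, which is why the CPU allocation must be adjustable.
    forces = {rid: qm_force(region) for rid, region in qm_regions.items()}
    # ... classical force evaluation and time integration of `atoms`
    # would follow here, typically parallelized with MPI ...
    return forces

# Initial condition from the slides: 2 QM regions (72 + 41 QM atoms)
# out of 28,598 atoms in total.
qm_regions = {0: list(range(72)), 1: list(range(41))}
forces = md_step(list(range(28598)), qm_regions)
print(sorted(len(f) for f in forces.values()))  # [41, 72]
```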

  • Testbed for the Friction Simulation
    Used 11 clusters with 632 CPUs in total, across 8 organizations
    - PRAGMA clusters: SDSC (32 CPUs), KU (8 CPUs), NCSA (8 CPUs), NCHC (8 CPUs), Titech-1 (8 CPUs), AIST (8 CPUs)
    - AIST Super Cluster: M64 (128 CPUs), F32-1 (128 CPUs + 128 CPUs)
    - Japan clusters: U-Tokyo (128 CPUs), Tokushima-U (32 CPUs), Titech-2 (16 CPUs)

  • Result of the Friction Simulation
    - Experiment time: 52.5 days
    - Longest continuous calculation time: 22 days
    - Manual restarts: 2
    - Execution failures: 165; succeeded in recovering from these failures
    - Changed the number of CPUs in use 18 times; succeeded in adjusting the number of CPUs to the number of QM regions/QM atoms
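The recovery behavior reported above (165 execution failures, all recovered automatically) implies a retry-and-fallback dispatch pattern: when a remote call fails, the task is resubmitted, possibly to a different cluster. A minimal sketch of that pattern, with an invented failure model and hypothetical names:

```python
def call_with_recovery(task, servers, attempt_fn, max_rounds=5):
    """Try `task` on each server in turn until one succeeds,
    counting failures the way the experiment log would."""
    failures = 0
    for _ in range(max_rounds):
        for server in servers:
            try:
                return attempt_fn(server, task), failures
            except RuntimeError:
                failures += 1  # record the failure, fall back to the next cluster
    raise RuntimeError("no server could run the task")

# Toy failure model: the first cluster is down, the second one works.
def attempt(server, task):
    if server == "clusterA":
        raise RuntimeError("node failure")
    return f"{task} done on {server}"

result, failures = call_with_recovery("QM region 0", ["clusterA", "clusterB"], attempt)
print(result, failures)  # QM region 0 done on clusterB 1
```

The same loop structure accommodates the reported CPU-count adjustments: before resubmitting, the dispatcher can pick a server whose size matches the current number of QM atoms.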

  • Summary and future work
    - Our approach is effective for running large-scale grid applications for a long time
    - Need more grid services:
      - Getting information on available resources
      - Resource reservation
      - Coordinating with resource managers/schedulers
    - Need a cleaner MPI:
      - mpich quits leaving processes/IPC resources behind
      - Use GridMPI in place of mpich