Performance and Scaling Effects of MD Simulations using NAMD 2.7 and 2.8
GradOS Course Project Progress Report
Kevin Kastner, Xueheng Hu


Introduction
- Molecular Dynamics (MD)
  - MD is extremely computationally intensive
  - Primarily due to the sheer size of the system
  - Large-system simulations can potentially take thousands of years on a modern desktop
- NAMD
  - Parallelized simulation tool for MD
  - Most recent release is 2.8

Our course project is mainly about investigating the performance attributes of molecular dynamics simulations.

Molecular dynamics is a virtual simulation that depicts the movements of individual atoms and simple molecules in a given system. MD simulations are usually very computationally intensive, primarily due to the sheer size of the systems being simulated. Large-system simulations could potentially take thousands of years to complete on a modern desktop.
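To make the "computationally intensive" point concrete, here is a rough back-of-envelope sketch (our own illustrative arithmetic, not a figure from the slides): a naive all-pairs force evaluation for an N-atom system touches N(N-1)/2 atom pairs on every timestep, and a typical MD timestep is on the order of 2 fs, so even a short simulation requires millions of steps.

```python
# Back-of-envelope cost of a naive all-pairs MD force evaluation.
# The atom count matches the ~57,000-atom GPCR system described later;
# the 2 fs timestep is a typical MD value, assumed for illustration.

N = 57_000                            # atoms in the simulated system
pairs_per_step = N * (N - 1) // 2     # distinct atom pairs per force evaluation

timestep_fs = 2                       # femtoseconds of simulated time per step
sim_ns = 10                           # a 10 ns simulation, as in the GPCR example
steps = sim_ns * 1_000_000 // timestep_fs   # 1 ns = 1,000,000 fs

total_pair_evals = pairs_per_step * steps

print(f"{pairs_per_step:,} pairs per step")   # ~1.6 billion
print(f"{steps:,} steps for {sim_ns} ns")     # 5,000,000 steps
print(f"{total_pair_evals:.3e} pair evaluations in total")
```

In practice NAMD uses distance cutoffs and particle-mesh Ewald rather than naive all-pairs evaluation, but even the reduced per-step work, multiplied by millions of steps, is why parallelization across many cores is essential.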

The simulation tool that we're using is NAMD, and the most recent release is 2.8.

GPCR Simulation Example

Here are some videos of what a molecular dynamics simulation does.

(Start left video) On the left is a G-Protein Coupled Receptor, or GPCR, protein in a 10 ns simulation; that is, it shows the amount of movement the actual protein would undergo in only 10 ns of real time. You can basically think of this video as the protein being slowed down to about one-billionth of its original speed. The reason the protein's movements appear jagged in the simulation is simply that not every step is recorded; in this case, only every 100,000th step is shown.

(Start right video) On the right is the same protein as on the left, except this one also shows all of the surrounding water and lipid atoms that are also being calculated in the MD simulation. It should also be noted that there are more atoms in the protein itself that are not shown. The ribbon form of the protein was chosen for simplicity, so that you can actually see a protein instead of just a mass of stuff; there are actually many more atoms present in the protein than it seems. As all of these atoms are considered in the MD calculations, you can see how it would be quite computationally intensive.

(Original) Proposed Work
- Performance comparison: NAMD 2.7 vs. 2.8
  - Investigate the main cause of the performance decrease in NAMD 2.8
  - Test the same protein system using each version, comparing the efficiency of each
- How different size/complexity of the system affects the performance of NAMD
  - Scalability observation
  - Determine max performance

Based on previous work done by Kevin, it appeared that there was a performance decrease in the 2.8 version of NAMD compared to the previous version. As is shown in the graph, when the number of cores increases beyond 48, NAMD 2.8 actually performs worse than 2.7 does.

Our work contains two main parts: the first is to investigate the main cause of the performance decrease in NAMD 2.8; the second is to explore how different system sizes affect the performance of NAMD.

Molecular simulations involve systems of different sizes, and there will be an upper bound beyond which adding more cores no longer improves performance. One of our goals is to explore how performance is affected by increasing the number of cores as the system becomes larger and more complex. By doing this we could potentially find the optimal number of cores for systems of different sizes, and thus accumulate experimental experience for future research in MD simulation.

Preliminary Testing

- G-Protein Coupled Receptor (GPCR) simulation system contains ~57,000 atoms
- Testing simulation efficiency with varying core amounts
  - 12, 24, 48, 96, 120, 196
  - 300, 504, 1008
- 5 runs done per core set for each version

The protein system used for our preliminary results is the GPCR system mentioned earlier; the entire system contains about 57,000 atoms. It was tested on the Kraken high-performance computing cluster, which, as you may recall, was discussed by Dr. Timothy Stitt in one of our previous guest lectures. We tested the simulation efficiency of the two NAMD versions with varying core amounts, which I will refer to as core sets. In addition to the 12-196 cores that were tested previously, we also included 300, 504, and 1008 cores to determine where each version starts to decrease in performance. We did 5 runs for each core set for each version of NAMD.

Preliminary Results

Shown here are the average performance efficiencies of all 5 runs, with the corresponding standard deviations displayed as error bars. As can be seen, NAMD 2.7 does the same as or better than version 2.8 up to and including 300 cores, supporting our earlier results. However, something unexpected happened: in the 504- and 1008-core tests, NAMD 2.8 did much better than NAMD 2.7. Furthermore, NAMD 2.7's efficiency begins to decline at approximately the same point at which NAMD 2.8's efficiency has its most drastic increase (with the exception of the beginning, of course). We are as yet uncertain why this occurs. This graph also gives us an approximate optimal number of cores for our 57,000-atom system on each version of NAMD, with 2.7's optimum being around 300 cores and 2.8's most likely around 500-600. More core sets would need to be tested to find the true optimal number of cores, which we may try to do if we have time.

Preliminary Results

This chart shows the average estimated efficiency of each core in each core set compared to our baseline metric of 12 cores. Note that 12 cores are used instead of 1, as this is the lowest number of cores that can be used on Kraken. As expected, the efficiency of each core decreases as the number of cores increases, which is of course due to the increasing amount of inter-core communication needed to complete the task. In agreement with the previous chart, NAMD 2.7 makes better use of each core in each core set up to and including 300 cores; however, NAMD 2.8 outperforms 2.7 in the 504 and 1008 core sets, as the previous chart also showed.

Future Work
- The scaling observation
  - How different sizes/complexity of the system affect the performance of NAMD
- Further investigation of the performance difference between NAMD 2.7b1 and NAMD 2.8
  - Capture system and network information
  - strace, NetHogs
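Before turning to the future-work notes, the per-core efficiency metric used in the preliminary results above can be sketched as follows. This is a minimal illustration of the calculation only; the performance numbers in the example are made-up placeholders, not our measured Kraken results.

```python
def per_core_efficiency(perf, cores, base_cores=12):
    """Efficiency of each core relative to the base_cores run.

    perf maps a core count to a performance figure (e.g. ns of simulated
    time per day). An efficiency of 1.0 means perfect linear scaling
    from the 12-core baseline (the smallest allocation on Kraken).
    """
    speedup = perf[cores] / perf[base_cores]
    return speedup / (cores / base_cores)

# Hypothetical performance numbers for illustration only.
perf = {12: 1.0, 24: 1.9, 48: 3.4, 96: 5.5}

for n in sorted(perf):
    print(n, round(per_core_efficiency(perf, n), 3))
```

Dividing the observed speedup by the ideal speedup is what produces the downward trend in the second chart: as communication overhead grows, the measured speedup falls further below the linear ideal.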

The future work will concentrate on two aspects:

The first is the scaling observation: investigating how different system sizes affect performance, and trying to accumulate experience about the optimal number of cores to use for systems of different sizes.

The other aspect is to explore the specific reasons for the performance difference between NAMD 2.7 and NAMD 2.8 when simulating the same system, from the perspective of both system calls and network communication; tools such as strace and NetHogs will be needed for this purpose.
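A minimal sketch of how such a trace might be collected. The NAMD binary name, core count, and file names below are placeholders, and on Kraken the job would actually be launched through the batch system, which may constrain where strace can be attached.

```shell
# Summarize the system calls made by a (placeholder) NAMD run:
# -f follows child processes/threads, -c prints a per-syscall
# count/time summary, -o writes that summary to a file.
strace -f -c -o namd_syscalls.txt namd2 +p12 sim.conf > sim.log

# NetHogs groups live network traffic by process; it needs root (or
# equivalent capabilities), which may not be available on a shared cluster.
sudo nethogs eth0
```

Comparing the strace summaries of a 2.7 run and a 2.8 run on the same core set should show whether one version spends disproportionate time in communication-related calls.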

We had originally planned to use another tool, tcpdump, to help in this investigation, but we are no longer pursuing that course, as tcpdump requires root access on Kraken, which we do not have and would most likely not be given.

Questions?

(Start video) So with that, are there any questions?