Parallelization and Optimization of Numerical Methods in the Polaris(MD) Molecular Dynamics Software

Applicant

Prof. Dr. Martin Zacharias
Lehrstuhl für Theoretische Biophysik (T38)
TU München

Project Overview

Proteins are among the most important building blocks of life. They are the machinery of the cell, controlling, regulating, and signalling almost all of its functions. Over 20,000 different proteins are encoded in the human genome, and an average human cell may contain billions of protein molecules. Understanding their complex behavior and interactions is critical to the development of new medicines and to the prevention of diseases such as Alzheimer’s disease and cancer.

One step towards this understanding is the computer simulation of proteins. Molecular dynamics (MD) simulations follow the individual motions of the atoms of one or more proteins, calculating the forces between the atoms at each time step and advancing their positions accordingly. In this way, the behavior of the proteins can be modeled and predicted. Simulations also provide a controlled environment in which new systems can be explored far more simply than in the laboratory.
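The force-then-advance loop described above can be sketched in a few lines. The following is a minimal illustration using the standard velocity Verlet integrator on a toy one-dimensional system of two particles joined by a harmonic spring. The function names, the spring "force field", and all parameter values are assumptions made for this example; they are not taken from Polaris(MD).

```python
def velocity_verlet(positions, velocities, masses, force_fn, dt, n_steps):
    """Advance a 1D particle system with the velocity Verlet integrator."""
    x = list(positions)
    v = list(velocities)
    f = force_fn(x)
    for _ in range(n_steps):
        # Half-step velocity update, then full-step position update.
        v = [vi + 0.5 * dt * fi / mi for vi, fi, mi in zip(v, f, masses)]
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        # Recompute forces at the new positions, then finish the velocity update.
        f = force_fn(x)
        v = [vi + 0.5 * dt * fi / mi for vi, fi, mi in zip(v, f, masses)]
    return x, v


def harmonic_bond_forces(x, k=1.0, r0=1.0):
    """Toy force field (hypothetical): two particles joined by a spring."""
    stretch = (x[1] - x[0]) - r0
    return [k * stretch, -k * stretch]


def total_energy(x, v, masses, k=1.0, r0=1.0):
    """Kinetic plus potential energy of the two-particle toy system."""
    kinetic = sum(0.5 * m * vi * vi for m, vi in zip(masses, v))
    potential = 0.5 * k * ((x[1] - x[0]) - r0) ** 2
    return kinetic + potential


if __name__ == "__main__":
    masses = [1.0, 1.0]
    x, v = velocity_verlet([0.0, 1.5], [0.0, 0.0], masses,
                           harmonic_bond_forces, dt=0.01, n_steps=1000)
    print("final positions:", x)
    print("total energy:", total_energy(x, v, masses))
```

In a production code the force evaluation dominates the cost of each step, which is why it is the natural target for parallelization and accelerator offloading.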

Significant biological processes occur across many orders of magnitude in time, from nanoseconds to minutes, and molecular systems may range in size from a few tens of angstroms to hundreds of nanometers. At the atomic level, this corresponds to anywhere from thousands to many millions of atoms. The environment also plays an important role: an aqueous solution can easily contribute millions of additional atoms. It is therefore critical that simulation software is fast and can take full advantage of the computing resources available at today’s, and tomorrow’s, supercomputing centers.

The aim of this proposal is to fund at least two students for one year. Paul Westphälinger has just completed his Master’s degree with Prof. Zacharias, and Falko Späth, a computer science/engineering (CSE) student, is currently studying in the group. They will develop important extensions to the molecular dynamics software Polaris(MD) that take advantage of parallel and distributed supercomputing technology as well as accelerator cards such as GPUs and Intel MICs. These additions will allow larger and more complex protein simulations to be performed in more realistic environments, while at the same time maximizing the use of the available computing resources.

Many other molecular dynamics packages exist, of course, such as Amber, Gromacs, and NAMD. The force fields typically used with these programs are based on fixed-charge models that have been parameterized against experimental data. These models have very little flexibility to react to changes in the environment, as may happen, for example, during the folding of a protein. The parameterization may also implicitly absorb unknown effects and properties present in the data. This can make it challenging to interpret simulation results as a consequence of the physical model.

It is widely recognized that a significant energy component missing from these standard force fields is induced polarization. As a result, there is a great deal of research into developing models that capture this effect and into parameterizing the resulting force fields. Polarization is generally captured with one of three methods (fluctuating charges, the Drude model, or induced point dipoles), and variations of these methods are implemented in the above-mentioned packages. Higher-order multipole force fields are also being developed, for example the AMOEBA force field.
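To make the induced point dipole approach concrete, the sketch below shows the generic textbook formulation: each polarizable site i carries a dipole mu_i = alpha_i * E_i, where E_i is the permanent field at the site plus the field of all other induced dipoles, and the dipoles are iterated until self-consistent. This is an illustration only and not the Polaris(MD) model; the function names, the simple fixed-point iteration, the unit convention, and the absence of any short-range damping are all assumptions.

```python
def dipole_field(mu, r_vec):
    """Field of a point dipole mu at displacement r_vec (Gaussian-style units)."""
    r2 = sum(c * c for c in r_vec)
    r = r2 ** 0.5
    mu_dot_r = sum(m * c for m, c in zip(mu, r_vec))
    # E = (3 (mu . r) r / r^2 - mu) / r^3
    return [(3.0 * mu_dot_r * c / r2 - m) / (r2 * r) for m, c in zip(mu, r_vec)]


def solve_induced_dipoles(positions, alphas, permanent_field,
                          tol=1e-10, max_iter=200):
    """Iterate mu_i = alpha_i * (E0_i + sum_j T_ij mu_j) to self-consistency."""
    n = len(positions)
    # Initial guess: response to the permanent field alone.
    mu = [[a * e for e in permanent_field[i]] for i, a in enumerate(alphas)]
    for _ in range(max_iter):
        new_mu = []
        for i in range(n):
            field = list(permanent_field[i])
            for j in range(n):
                if j == i:
                    continue
                r_vec = [pi - pj for pi, pj in zip(positions[i], positions[j])]
                e_j = dipole_field(mu[j], r_vec)
                field = [f + e for f, e in zip(field, e_j)]
            new_mu.append([alphas[i] * f for f in field])
        # Largest change in any dipole component during this sweep.
        change = max(abs(a - b)
                     for row_new, row_old in zip(new_mu, mu)
                     for a, b in zip(row_new, row_old))
        mu = new_mu
        if change < tol:
            return mu
    raise RuntimeError("induced dipoles did not converge")


if __name__ == "__main__":
    # Two identical polarizable sites on the x axis in a uniform field along x.
    dipoles = solve_induced_dipoles(
        positions=[(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
        alphas=[1.0, 1.0],
        permanent_field=[(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    )
    print(dipoles)
```

For this two-site example the dipoles enhance each other, and the converged value follows from mu = alpha * E0 / (1 - 2 * alpha / r^3). Because every site depends on every other site, this self-consistent solve is repeated at each time step and is a major additional cost compared with fixed-charge models. Without short-range damping, the plain iteration can diverge when sites are close together.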

Polaris(MD) has been under development for 20 years by Dr. Michel Masella at the French Alternative Energies and Atomic Energy Commission (CEA) and is used by several groups in France, the United States, and now in Germany at TUM. The key feature of Polaris(MD) is its implementation of an efficient, novel induced-dipole polarization model for atoms together with a polarizable coarse-grain water model. The model also includes dispersion, repulsion, and hydrogen-bonding terms that are critical for realistic water behavior. Moreover, the parameters of the model have been derived exclusively from quantum mechanical calculations rather than fitted to experimental data. This avoids overconstraining the model to a particular environment or structure.

One computation in particular that would benefit from this project is the simulation of the PMV capsid. This bacterial virus capsid contains over 500,000 atoms and is embedded in over 2 million water molecules. The computation will be run for about 10 ns of simulated time. This is a significant challenge as it tests the limits of the computational model in terms of the size of the problem and the length of time to be simulated.
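A rough sense of the scale follows from simple arithmetic. The sketch below takes the lower bounds quoted above (500,000 capsid atoms, 2 million water molecules, 10 ns) and combines them with two values that are assumptions for illustration only: the number of interaction sites per water molecule and an integration time step of 2 fs. Neither value is stated in this proposal.

```python
def simulation_size(n_solute_atoms, n_waters, sites_per_water,
                    simulated_ns, timestep_fs):
    """Return (particle count, number of integration steps) for a run."""
    n_particles = n_solute_atoms + n_waters * sites_per_water
    # 1 ns = 1e6 fs
    n_steps = round(simulated_ns * 1e6 / timestep_fs)
    return n_particles, n_steps


if __name__ == "__main__":
    # Assumed: one site per water (coarse-grain) or three sites (all-atom).
    for sites in (1, 3):
        particles, steps = simulation_size(
            n_solute_atoms=500_000,
            n_waters=2_000_000,
            sites_per_water=sites,
            simulated_ns=10,
            timestep_fs=2.0,  # assumed time step, not from the proposal
        )
        print(f"{sites} site(s) per water: {particles:,} particles, {steps:,} steps")
```

Under these assumptions the run involves at least 2.5 million particles with a single-site water model, or 6.5 million with three sites per water, and about 5 million force evaluations.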

The project will be coordinated by two PIs: Prof. Dr. Martin Zacharias of the Lehrstuhl für Theoretische Biophysik (T38) at TUM, and Dr. Jonathan Coles, a research associate in the same group. The students will be supervised by Jonathan Coles, who has degrees in Computer Science and Theoretical Astrophysics, as well as extensive experience developing high-performance parallel scientific software. He is the second main developer of Polaris(MD) and is responsible for the distributed parallelization of the code and for the fast multipole method (FMM) implementation. The results of this project will therefore be merged directly into the main development branch.