Intended Audience

  • Individuals interested in performing Molecular Dynamics (MD) simulations

  • Those looking to create machine learning potentials

  • Individuals facing challenges with the time-consuming process of generating machine learning potentials

  • Users of machine learning potential databases like CHGNet

  • Those aiming to accelerate ab initio MD simulations

FOR YOU

WHAT IS

What is Molecular Dynamics (MD) Simulation?

Materials are composed of electrons and atomic nuclei. Some material properties are primarily influenced by electrons, while others are dominated by the dynamics of atomic nuclei. When the goal is to observe such atomic dynamics, Molecular Dynamics (MD) simulations are employed. 

Since atomic nuclei are over 1800 times heavier than electrons, their response time is much slower in comparison. Electrons react quickly to external stimuli (with frequency responses in the visible or ultraviolet regions), while atomic nuclei respond more slowly (with frequency responses in the infrared or microwave regions). From the perspective of the fast-moving electrons, the movement of atomic nuclei appears slow. Therefore, in MD simulations focusing on atomic dynamics, it is typically assumed that the electrons remain in their ground state throughout the process.
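To make the idea of following atomic motion concrete, here is a minimal sketch of how an MD code might advance a nucleus in time. It assumes the velocity Verlet integrator, which is a common choice but is not named in this article, and it uses a toy harmonic force (`harmonic_force`, a hypothetical stand-in) in place of the real force that the ground-state electrons would supply.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Propagate one particle in 1D with the velocity Verlet scheme."""
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append((x, v))
    return trajectory


def harmonic_force(x, k=1.0):
    """Toy force F = -k x (assumption: stands in for the ground-state force)."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


traj = velocity_verlet(x=1.0, v=0.0, force=harmonic_force,
                       mass=1.0, dt=0.01, n_steps=1000)
e_start = total_energy(*traj[0])
e_end = total_energy(*traj[-1])
print(f"energy drift over the run: {abs(e_end - e_start):.2e}")
```

In a real simulation, the only part that changes is where the force comes from: DFT in ab initio MD, or an interatomic potential in classical and machine learning MD.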


DATABASE

The Marvel of Machine Learning Potential Databases

Machine learning MD consists of two phases: "creating" the interatomic potential and "using" it. I would like to emphasize that the "creating" phase is quite labor-intensive. To create a good potential, one must first generate hundreds of training data points, which means performing hundreds of DFT calculations. In practice, the time spent running these DFT calculations dominates the "creating" step.

Moreover, the training data must cover the phenomenon of interest. For example, if you wish to run MD calculations to observe diffusion, the training data must include configurations along the diffusion pathway; without them, the interatomic potential will not describe diffusion accurately. Insufficient training data therefore leads directly to poor accuracy. It is common to spend several days creating an interatomic potential on your own, which makes this a laborious step.

However, in recent years, there has been a global movement to share trained interatomic potentials. Governments in various countries have supported initiatives to generate machine learning potentials, and academic groups have begun to publish databases of these potentials for free. Having others take on the labor-intensive "creating" step at no cost is a tremendous benefit. Furthermore, because groups worldwide compete to build these databases, anyone can use high-quality machine learning potentials for free.

If you look at the following site (https://matbench-discovery.materialsproject.org/), you will see how the databases are compared fairly and objectively against the same performance criteria. The data is also updated constantly, making it easy to see the best options available at any given time. The training data covers 146,000 materials, with numerous atomic configurations learned for each material, for an astonishing total of 1.6 million training data points. Given that this is available for free, there is no reason not to use it.

In terms of the number of materials learned, 146,000 corresponds to roughly 97% of the approximately 150,000 materials in the massive inorganic database, Materials Project. In other words, almost all of the inorganic structures registered in that database are covered.

In our product Quloud, these databases are accessible with just the push of a button. One of the best-known databases among them is CHGNet. As of September 2024, it ranks fourth in the scoring. Since it is a representative database, I often refer to CHGNet in the following discussions.

Figure: Excerpt from https://matbench-discovery.materialsproject.org/ as of September 2024.

CHALLENGES

Challenges of Using Machine Learning Potential Databases

Up to this point, I have explained that machine learning molecular dynamics (MD) is a remarkably well-balanced and effective method. Here, I will discuss some challenges associated with machine learning MD.

  1. Lower Accuracy for Unknown Substances: The first challenge is that, because these potentials are built for versatility, accuracy is reduced for unknown substances. As explained in the section on machine learning potential databases, data for 146,000 substances has been used for training in order to cover a wide range of situations. However universal the training, accuracy inevitably drops for structures that are not included in the training data. This is a persistent challenge in machine learning MD, whether or not a database is used. One solution is fine-tuning, where some parameters of the obtained interatomic potential are adjusted for the specific substance of interest. Another is to create a new interatomic potential specifically for that substance, but this once again incurs the effort of "creating" the interatomic potential described above.

  2. Heavy MD Calculations Due to Versatility: The second challenge is that, because of the pursuit of versatility, the MD calculations in the "using" phase become computationally intensive. Achieving versatility requires a machine learning model with a large number of parameters, so evaluating the interatomic potential itself becomes expensive. A common remedy is to run on GPU machines, but this only adds computational power rather than solving the underlying problem, and the recent rise in GPU prices must also be considered. If a model with fewer parameters could be used, the classical MD portion that "uses" the interatomic potential would also become faster.

MECHANISMS

Ab Initio MD and Classical MD Calculations

The substances in our world are composed of electrons and atomic nuclei, both of which behave according to quantum mechanics. Because nuclei are so much heavier than electrons, it is common when simulating materials to treat the electrons quantum mechanically and the nuclei as classical particles. The separation of electronic and nuclear motion that underlies this treatment is known as the Born-Oppenheimer approximation.

Now, focusing on the motion of atomic nuclei in MD calculations, we may wish to momentarily disregard the electrons. However, the interactions that hold atomic nuclei together are facilitated by the electrons, acting like a "glue." Thus, it is impossible to completely ignore electrons while considering only the atomic nuclei.

In ab initio MD, the behavior of electrons is solved using Density Functional Theory (DFT). The electronic states obtained through DFT provide the forces acting between atoms, allowing the atomic nuclei to be described as classical particles. Since the behavior of electrons is derived from a quantum mechanical foundation, ab initio MD can track the motion of atomic nuclei without requiring prior knowledge from experimental data.
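The link between the DFT result and the classical motion of the nuclei can be written compactly. The following is a sketch in notation that is assumed here rather than taken from the article: $E_0$ is the ground-state energy from DFT for fixed nuclear positions (including the nucleus-nucleus repulsion), and $\mathbf{R}_I$ and $M_I$ are the position and mass of nucleus $I$.

```latex
\begin{align}
  \mathbf{F}_I &= -\frac{\partial E_0(\mathbf{R}_1, \dots, \mathbf{R}_N)}{\partial \mathbf{R}_I}, \\
  M_I \frac{d^2 \mathbf{R}_I}{dt^2} &= \mathbf{F}_I .
\end{align}
```

The first line is the force obtained from the electronic state, and the second is Newton's equation of motion for the nuclei. Every MD step requires a new evaluation of $E_0$ and its gradient.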

Ab initio MD is known for its high precision and is a reliable method that agrees well with experimental values. However, it has the drawback of being computationally intensive, resulting in long computation times. While it is rewarding to perform ab initio MD calculations, reaching nanosecond (10^-9 seconds) time scales, which are long from an atomic perspective, is quite challenging.

On the other hand, there is classical MD, which disregards the electrons entirely and represents the forces between atoms with a function known as an interatomic potential. In classical MD, the role of electrons as the "glue" that binds the atoms is incorporated into the interatomic potential. Some readers may be familiar with the Lennard-Jones potential, an interatomic potential that models the van der Waals attraction between molecules.

By making these simplifications, classical MD significantly reduces computational costs, enabling faster simulations. It also allows for the simulation of larger systems. However, the drawback of classical MD is its accuracy, which depends on the interatomic potential used and is generally lower than that of ab initio MD. If interatomic potentials can be developed to better reproduce the results of ab initio MD, accuracy can improve. Many researchers have worked on this, yet creating a versatile and universally applicable potential is still difficult.

In the search for new magnetic materials, both first-principles calculations and empirical simulations have been tried. First-principles studies have examined how the magnetic moment and magnetic anisotropy change when the elemental composition of a magnetic material is changed. Their disadvantage is that they essentially describe the ground state, that is, behavior at absolute zero. Each magnetic material has a characteristic temperature, the Curie temperature or the Néel temperature, above which its magnetic properties disappear, so this temperature determines whether the material still behaves as a magnet at room temperature. Finite-temperature effects therefore must be considered for magnetic materials, and the inability to include them is a major disadvantage. In addition, most magnetic materials are ceramics made of solidified particles, so particle-size dependence is also of great interest. In magnetic memory, for example, a particle about 10 nanometers in size functions as one bit. A system of that size is far beyond what first-principles simulation can handle in a realistic computation time on current computers. Empirical simulations, by contrast, can include finite temperature and can handle systems of 10 nanometers or more. Their disadvantage is that the parameters must be prepared in advance, ultimately by fitting to experimental results, so their practical use has been limited.

                            Classical MD    Ab Initio MD
Computational Cost (Time)   Light           Heavy
Accuracy                    Low             High

Figure: Lennard-Jones Potential
The Lennard-Jones potential expresses the potential acting between atoms as a function of the interatomic distance r. Here, A and B are fitting parameters.
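As a small illustration of such an interatomic potential, the sketch below evaluates a Lennard-Jones energy and force. It assumes the standard 12-6 form, V(r) = A / r^12 - B / r^6, which matches the two fitting parameters A and B mentioned above but is not confirmed by the text itself. The values A = B = 1 are hypothetical, in arbitrary units.

```python
def lennard_jones_energy(r, a, b):
    """Pair energy V(r) = A / r**12 - B / r**6 (assumed 12-6 form)."""
    return a / r**12 - b / r**6


def lennard_jones_force(r, a, b):
    """Radial force F(r) = -dV/dr; positive values push the atoms apart."""
    return 12.0 * a / r**13 - 6.0 * b / r**7


def equilibrium_distance(a, b):
    """Distance at which the force vanishes: r_min = (2A / B)**(1/6)."""
    return (2.0 * a / b) ** (1.0 / 6.0)


A, B = 1.0, 1.0  # hypothetical fitting parameters, arbitrary units
r_min = equilibrium_distance(A, B)
print(f"r_min = {r_min:.4f}")
print(f"V(r_min) = {lennard_jones_energy(r_min, A, B):.4f}")
print(f"F(r_min) = {lennard_jones_force(r_min, A, B):.2e}")
```

The repulsive r^-12 term dominates at short range and the attractive r^-6 term at long range, so the energy has a single minimum of depth B^2 / (4A) at r_min.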

EMERGENCE

Introduction of Machine Learning MD Calculations

As we have seen, classical MD has the extremely attractive feature of low computational cost. Efforts to create better interatomic potentials that can reproduce ab initio MD have recently combined with the machine learning boom, leading to significant advancements. The idea is to use machine learning to build interatomic potentials that replicate ab initio MD. It traces back to the work of Behler and Parrinello (Phys. Rev. Lett. 98, 146401 (2007)), who used neural networks to construct such potentials. With the emergence of machine learning MD, the accuracy attainable at near-classical cost has improved significantly, as shown in the following table. While machine learning MD does not reach the accuracy of ab initio MD, it still offers greater accuracy than classical MD. Consequently, machine learning MD has come to occupy an important position, balancing computational cost and accuracy.

Typical Steps in Machine Learning MD

Step 1: "Creating" the interatomic potential
DFT calculations are performed for various atomic configurations, and the results serve as training data. Machine learning methods, such as neural networks, are then used to create the interatomic potential from these results.

Step 2: "Using" the interatomic potential obtained from the training
In this step, the interatomic potential obtained in Step 1 is "used" to perform full-scale MD simulations, such as long-time simulations or simulations with larger system sizes.
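The two steps above can be sketched with a deliberately tiny example. Everything here is a stand-in chosen for illustration: `reference_energy` is a hypothetical analytic function playing the role of the expensive DFT calculation, and a two-parameter linear least-squares fit replaces the neural network. Real machine learning potentials use far richer models and training sets.

```python
def reference_energy(r):
    """Stand-in for an expensive DFT calculation (hypothetical analytic form)."""
    return 4.0 * (1.0 / r**12 - 1.0 / r**6)


def create_potential(distances, energies):
    """Step 1: fit E(r) ~ A * r**-12 - B * r**-6 by linear least squares."""
    # Accumulate the 2x2 normal equations for features x1 = r**-12, x2 = -r**-6.
    s11 = s12 = s22 = t1 = t2 = 0.0
    for r, e in zip(distances, energies):
        x1, x2 = r**-12, -(r**-6)
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        t1 += x1 * e
        t2 += x2 * e
    det = s11 * s22 - s12 * s12
    a = (t1 * s22 - t2 * s12) / det
    b = (t2 * s11 - t1 * s12) / det
    return (lambda r: a / r**12 - b / r**6), (a, b)


# Step 1: "create" the potential from a small training set.
train_r = [0.95 + 0.05 * i for i in range(20)]
train_e = [reference_energy(r) for r in train_r]
potential, (a_fit, b_fit) = create_potential(train_r, train_e)

# Step 2: "use" the cheap fitted potential at a distance not in the training set.
r_new = 1.23
print(f"fitted A = {a_fit:.6f}, B = {b_fit:.6f}")
print(f"fit: {potential(r_new):.6f}  reference: {reference_energy(r_new):.6f}")
```

The fit recovers the reference almost exactly only because the model has the same functional form as the toy reference. With real DFT data the match depends on how well the training configurations cover the situations met during the MD run.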

                            Classical MD    Machine Learning MD    Ab Initio MD
Computational Cost (Time)   Light           Light                  Heavy
Accuracy                    Low             Medium                 High

On-The-Fly MD Overview (1): Machine Learning MD

We at Quemix are excited to announce the implementation of a new feature in Quloud that enables On-The-Fly MD simulations. With this feature, MD simulations can be performed in a significantly shorter time. Since some readers might not be familiar with On-The-Fly MD, this article will provide an overview starting from "What is MD simulation?" and "What can be done with MD simulations?" We'll then explain ab initio MD, machine learning MD, and On-The-Fly MD simulations. Additionally, we'll discuss the capabilities unlocked by executing On-The-Fly MD simulations on Quloud.

In Part 1, we will explain what Molecular Dynamics (MD) simulations are, and clarify the differences between ab initio MD and machine learning MD. We will also touch on the challenges of using machine learning potential databases, which have gained attention in recent years.

INTRODUCTION

CALCULATION

What Can Be Calculated with MD Simulations?

Molecular Dynamics (MD) simulations enable the calculation of various phenomena, including ionic diffusion, chemical reactions, and infrared responses. In recent years, many researchers have shown interest in the diffusion of lithium ions, particularly in the context of lithium-ion batteries. Additionally, MD simulations can also be used to calculate dielectric properties.
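As an example of turning a trajectory into a physical quantity, the sketch below estimates a diffusion coefficient from the mean squared displacement (MSD) using the Einstein relation MSD(t) = 2 * dim * D * t. This analysis route is a common one but is an assumption here, since the article does not say how diffusion is evaluated. The "trajectory" is a toy set of 1D random walkers rather than real MD output.

```python
import random


def mean_squared_displacement(trajectories):
    """MSD(t) averaged over particles; trajectories[p][t] is a 1D position."""
    n_steps = len(trajectories[0])
    msd = []
    for t in range(n_steps):
        total = sum((traj[t] - traj[0]) ** 2 for traj in trajectories)
        msd.append(total / len(trajectories))
    return msd


def diffusion_coefficient(msd, dt, dim=1):
    """Least-squares slope through the origin of MSD(t) = 2 * dim * D * t."""
    times = [i * dt for i in range(len(msd))]
    slope = sum(t * m for t, m in zip(times, msd)) / sum(t * t for t in times)
    return slope / (2 * dim)


# Toy trajectory: unbiased 1D random walkers with unit step and unit time step,
# for which the exact diffusion coefficient is step**2 / (2 * dt) = 0.5.
rng = random.Random(0)
walkers = []
for _ in range(2000):
    x, path = 0.0, [0.0]
    for _ in range(100):
        x += 1.0 if rng.random() < 0.5 else -1.0
        path.append(x)
    walkers.append(path)

msd = mean_squared_displacement(walkers)
d_est = diffusion_coefficient(msd, dt=1.0)
print(f"estimated D = {d_est:.3f} (exact value for this toy walk: 0.5)")
```

For lithium-ion diffusion in a real material, the positions would be three-dimensional (dim=3) and would come from the MD trajectory of the lithium ions.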
