Batteries are among the most important devices in today’s world. They are used in pretty much any portable electronic device: phones, cars, toys, and even satellites. Because they are so widespread, they present several fundamental challenges that need to be overcome for society to become sustainable.
One important issue slowing down the move toward a less fossil-fuel-dependent society is that most renewable energy sources do not provide a constant flow of electricity: solar is available only on sunny days (sorry U.K.), wind turbines need, well, wind, and so on. Because of this, the electricity produced needs to be stored, which is currently done in very inefficient ways.
For instance, one of the most common energy storage mechanisms is the “pumped-storage” hydroelectric dam. Electricity coming from renewable sources is converted to potential energy by pumping water up the dam; when the network needs it, the water is allowed to flow back down through hydro turbines, which turn that potential energy into electricity again. Needless to say, a lot of energy is lost in this process.
Additionally, current battery technologies rely on rare minerals, mainly cobalt and lithium, whose reserves are obviously finite. Since these batteries are ubiquitous, there is a strong need to either make the current ones more efficient, find a way to cleanly recycle them, or find other compounds in order to diversify the mineral consumption and hopefully be sustainable over the long run.
Furthermore, battery production currently has a large carbon footprint. The production of an electric car contributes about twice as much as that of a conventional car, mostly due to the batteries. Altogether, these problems provide a strong incentive for battery research.
The problem with battery research is the size of the solution space. There are hundreds of billions of potential compounds, and to even start testing them, one needs to find the structures that will make batteries that work, that store energy well, and, very importantly, that are safe.
Therefore, the holy grail here would be a function which, given a structure, quickly outputs its various characteristics, but of course we’re still very far away from that. Instead, the aim of the paper is to see whether it’s possible to predict the atomisation energy of molecules. This property can be used to immediately filter out bad compounds, and therefore allow researchers to focus on the ones with more potential. There are two methods currently used to compute it: one that lacks speed (G4MP2) and one that lacks accuracy (B3LYP).
This is where the ML approach starts. The authors tried several approaches, including kernel-based ridge regression (FCHL) and a continuous-filter convolutional neural network (SchNet), each with various targets and models. The method that worked best was Δ-learning, that is, training on the difference between B3LYP and G4MP2 energies.
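To make Δ-learning concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the two-dimensional "descriptors", the toy energy functions standing in for B3LYP and G4MP2, and a plain Gaussian kernel ridge regression in place of FCHL or SchNet. It only shows the idea of learning the correction rather than the energy itself.

```python
import numpy as np

def gaussian_kernel(a, b, length_scale=1.0):
    """Gaussian kernel between every row of a and every row of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * length_scale ** 2))

def fit_krr(x_train, y_train, reg=1e-3):
    """Solve (K + reg * I) alpha = y for the kernel ridge weights."""
    k = gaussian_kernel(x_train, x_train)
    return np.linalg.solve(k + reg * np.eye(len(x_train)), y_train)

def predict_krr(x_new, x_train, alpha):
    """Weighted sum of kernel similarities to the training points."""
    return gaussian_kernel(x_new, x_train) @ alpha

# Toy data (hypothetical): 300 "molecules" described by 2 numbers each.
rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, size=(300, 2))
e_low = 5.0 * np.sin(x[:, 0]) + x[:, 1] ** 2              # stand-in for the cheap method
e_high = e_low + 0.3 * np.tanh(x[:, 0]) + 0.1 * x[:, 1]   # stand-in for the accurate method

x_train, x_test = x[:200], x[200:]

# Delta-learning: the regression target is the difference between the two methods.
alpha = fit_krr(x_train, (e_high - e_low)[:200])

# Prediction = cheap energy + learned correction.
e_pred = e_low[200:] + predict_krr(x_test, x_train, alpha)

mae_low = np.abs(e_high[200:] - e_low[200:]).mean()    # error of the cheap method alone
mae_delta = np.abs(e_high[200:] - e_pred).mean()       # error after the learned correction
print(f"cheap method MAE: {mae_low:.4f}, delta-learning MAE: {mae_delta:.4f}")
```

The appeal of this setup is that the correction is usually a smaller and smoother quantity than the full energy, so the model has an easier target; the price is that the cheap calculation still has to be run for every new molecule.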
Overall, SchNet-delta and FCHL-delta were the best-performing models, achieving similar results after training. However, FCHL required less training data but more computing power, whereas SchNet had the opposite property. Furthermore, FCHL execution time scales linearly with the size of the training dataset, since it needs to compare the current molecule with known ones, whereas one of the great advantages of neural networks is their execution speed. Once the network is trained, a prediction amounts to a series of matrix multiplications. Even deep convolutional neural networks on an average CPU will still give results on the order of tenths of a second per prediction. Therefore, if the neural network were able to reach good accuracy, it would get the speed for free.
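The scaling argument can be sketched by counting operations instead of timing them. The snippet below is an illustration under stated assumptions, not the paper's models: a plain Gaussian kernel predictor stands in for FCHL, a small fully connected network stands in for SchNet, and the layer sizes and helper names are hypothetical.

```python
import numpy as np

def kernel_predict(x_new, x_train, alpha, length_scale=1.0):
    """Kernel prediction: one similarity per stored training molecule."""
    d2 = ((x_train - x_new) ** 2).sum(axis=1)
    return float(np.exp(-d2 / (2.0 * length_scale ** 2)) @ alpha)

def kernel_cost(n_train, n_features):
    """Rough multiply-add count: grows linearly with the training set size."""
    return n_train * n_features

def mlp_predict(x_new, weights):
    """Network prediction: a fixed chain of matrix multiplications."""
    h = x_new
    for w in weights[:-1]:
        h = np.tanh(h @ w)
    return (h @ weights[-1]).item()

def mlp_cost(weights):
    """Rough multiply-add count: depends only on the network size."""
    return sum(w.size for w in weights)

rng = np.random.default_rng(0)
n_features = 8
weights = [rng.normal(size=s) for s in [(n_features, 64), (64, 64), (64, 1)]]
x_new = rng.normal(size=n_features)

for n_train in (1_000, 10_000):
    x_train = rng.normal(size=(n_train, n_features))
    alpha = rng.normal(size=n_train)  # placeholder weights, not a fitted model
    kernel_predict(x_new, x_train, alpha)
    mlp_predict(x_new, weights)
    print(n_train, kernel_cost(n_train, n_features), mlp_cost(weights))
```

Growing the training set tenfold multiplies the kernel cost by ten, while the network cost stays where it was, because the training data is no longer needed once the weights are fixed.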
Even on heavier molecules, which are much harder to predict and for which there was less or even no training data available, the models still achieved better accuracy than B3LYP.
Our best-performing model, SchNet delta, predicts G4MP2 energies with an MAE of only 4.5 meV (0.1 kcal/mol) after being trained on 117,232 molecules: much less than that between the experiment and G4MP2 (~0.8 kcal/mol)
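As a quick sanity check on the units in that quote, the conversion from meV to kcal/mol can be done in a few lines. The only outside fact used is the standard conversion factor of about 23.06 kcal/mol per eV; the function name is just illustrative.

```python
EV_TO_KCAL_PER_MOL = 23.0605  # 1 eV per molecule is about 23.06 kcal/mol

def mev_to_kcal_per_mol(mev):
    """Convert an energy in meV per molecule to kcal/mol."""
    return mev / 1000.0 * EV_TO_KCAL_PER_MOL

model_mae = mev_to_kcal_per_mol(4.5)  # MAE quoted for SchNet delta
g4mp2_vs_experiment = 0.8             # kcal/mol, as quoted above

print(f"{model_mae:.3f} kcal/mol")
print(f"ratio: {g4mp2_vs_experiment / model_mae:.1f}")
```

So the model's error with respect to G4MP2 is several times smaller than the gap between G4MP2 and experiment, which is the point the authors are making.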
Overall, this is a great example of applying machine learning for research purposes. It is not about solving the problem or writing the equations for us, but instead about helping the researcher focus on the more promising paths. This is the use case in which I personally envisage machine learning will develop and achieve the most: not as a central decision-making computer, but rather as an assistant to human decision making.
The authors did a lot more work than shown in this article, testing more methods in more varied conditions. If interested, the reference is provided below.
Have a good day!
Ward, L., Blaiszik, B., Foster, I., Assary, R., Narayanan, B., & Curtiss, L. (2019). Machine learning prediction of accurate atomization energies of organic molecules from low-fidelity quantum chemical calculations. MRS Communications, 9(3), 891–899. doi:10.1557/mrc.2019.107
“Ricardo study finds electric and hybrid cars have a higher carbon footprint during production than conventional vehicles, but still offer a lower footprint over the full life cycle”. Green Car Congress. 2011-06-08. Retrieved 2011-06-11.