7x70

Neural Message Passing for Quantum Chemistry

Neural Message Passing for Quantum Chemistry Paper

Before Reading:

Honestly, I had no idea what this was going to be

Plan

LLM summary of paper

Ask questions from LLM

Watch YouTube video on it (didn't need to)

Read actual paper

Teach out loud

LLM Summary Notes:

Some kind of graph neural network?

Key Concepts

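Density functional theory (DFT): a quantum mechanical method that computes the electronic structure of atoms/molecules/solids from the electron density, rather than from the full many-electron wave function
Goal of the network is to efficiently predict quantum mechanical properties of molecules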

Paper Notes:

Introduces MPNNs (Message Passing Neural Networks) for predicting chemical properties of molecules, achieving SOTA (in 2017) on the QM9 benchmark

Feature engineering: hand-crafting the input features (descriptors) a model sees, instead of letting the model learn them from raw data. The chem industry has mostly focused on this (yup, it's in the first paragraph of the Wikipedia article on ML in bioinformatics)

Calculating the properties of a molecule takes on the order of 1000 s with DFT and about 0.01 s with the network, so roughly a 100,000x speedup

Tons of chemistry data is becoming available. The authors want to make use of it, and think graph networks are the way to go because their inductive biases (e.g. not caring about the order in which atoms are listed) may suit molecular data.

The authors took the most common existing NNs for chem/graph data and showed they can all be written in one common form, which they call MPNNs

MPNN is a framework, not just a single model
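To make the "framework" point concrete, here are the three pieces as the paper defines them. Each node $v$ has a hidden state $h_v^t$ at step $t$, $N(v)$ is the set of its neighbors, and $e_{vw}$ is the feature of the edge between $v$ and $w$. Picking a specific message function $M_t$, update function $U_t$, and readout $R$ gives you a specific model.

```latex
\begin{align}
m_v^{t+1} &= \sum_{w \in N(v)} M_t\left(h_v^t, h_w^t, e_{vw}\right) \\
h_v^{t+1} &= U_t\left(h_v^t, m_v^{t+1}\right) \\
\hat{y} &= R\left(\{\, h_v^T \mid v \in G \,\}\right)
\end{align}
```

The first two lines run for $T$ steps, then the readout turns the final node states into one prediction for the whole graph.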

Judging from the figure in the paper, MPNNs remind me of diffusion models in one respect: they are iterative. Each iteration builds on the result of the previous one, and the output comes from the final iteration.

Graph Networks:

Each node has a hidden state. It's updated based on the hidden states of its neighbors and the edges connecting it to them. (I wonder if self-attention could replace this, giving transformer-like models for graph data. The 'neighbor gives me information' thing seems inefficient, especially for molecules, where a far-off node in the graph can affect the current node, for example in long polar molecules where one end may affect the other.)

This update happens iteratively

Finally, there is a decoder stage (the authors called this 'readout')
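Here is a minimal sketch of that loop in plain Python, to make the three stages concrete. It is not the paper's model: the `message`, `update`, and `readout` functions are hypothetical stand-ins with no learned weights (the paper uses learned neural networks for these), and the tiny water-like graph and its feature values are made up for illustration.

```python
import math

def message(h_v, h_w, e_vw):
    # Hypothetical message function: scale the neighbor's state by the edge feature.
    return [e_vw * x for x in h_w]

def update(h_v, m_v):
    # Hypothetical update function: elementwise tanh of (own state + summed messages).
    return [math.tanh(a + b) for a, b in zip(h_v, m_v)]

def readout(states):
    # Hypothetical readout: sum everything, so the result ignores node order.
    return sum(sum(h) for h in states.values())

def mpnn_forward(states, edges, steps):
    # states: node name -> hidden state (list of floats)
    # edges: undirected pair (v, w) -> scalar edge feature
    neighbors = {v: [] for v in states}
    for (v, w), e in edges.items():
        neighbors[v].append((w, e))
        neighbors[w].append((v, e))
    for _ in range(steps):
        new_states = {}
        for v, h_v in states.items():
            m_v = [0.0] * len(h_v)
            for w, e in neighbors[v]:
                msg = message(h_v, states[w], e)
                m_v = [a + b for a, b in zip(m_v, msg)]
            new_states[v] = update(h_v, m_v)
        states = new_states  # all nodes update at once, from the previous step's states
    return readout(states), states

if __name__ == "__main__":
    # Made-up example: an O atom bonded to two H atoms.
    states = {"O": [1.0, 0.0], "H1": [0.0, 1.0], "H2": [0.0, 1.0]}
    edges = {("O", "H1"): 1.0, ("O", "H2"): 1.0}
    prediction, final_states = mpnn_forward(states, edges, steps=2)
    print(prediction, final_states)
```

One thing the sketch makes visible: after one step, H1 only knows about O. It takes a second step before anything about H2 reaches H1, so information from a node k hops away needs k iterations to arrive. That is the 'far-off node' concern in concrete form.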

Yeah ok, the whole idea is that there's a class of networks that pass information between neighboring nodes of a graph (an attention-like mechanism), and they get good results at approximating DFT calculations. Ok lol.

The authors are SUPER bullish on this family of models and urge everybody in their field (chemistry or bioinformatics or something, idk) to move forward with it