Nov 4 – 5, 2024
IT4Innovations
Europe/Prague timezone

Bottom-up machine learning potentials for peptides

Nov 4, 2024, 12:40 PM
20m
atrium (IT4Innovations)

atrium

IT4Innovations

Studentská 6231/1B, 708 00 Ostrava-Poruba
Users' Talks I

Speaker

Erik Andris (IOCB Prague)

Description

Can larger peptides be described just by knowing their smaller constituents? Concretely: can we infer the potential energy surface (PES) of a 20-residue peptide from the PESs of single amino acids and dipeptides alone? To answer these long-standing questions,[1] we trained equivariant neural network potentials[2] on oligopeptides of varying sizes (1-3 residues; taken from the PeptideCS[3] and P-CONF_1.6M[4] datasets) and tested the performance of these potentials on larger peptides. Both the training and the evaluation data consisted of structures optimized at the GFN-2+ALPB(water) level of theory, some of them with fixed main-chain and side-chain dihedral angles. The energies were calculated at the BP86/D3Rezac, COSMO-RS level described previously.[3] Because the neural network has no built-in inductive biases besides the locality of interactions (5 Å distance cutoff) and equivariance, we can test whether any new interactions appear in larger peptides that are not present in the dipeptides used for training. Previous research in our group indicated that this should not be possible and that an energy function estimating the energy of longer chains from that of shorter ones could not be constructed.[5] Interestingly, a network trained on dipeptides and amino acids only can already predict the energies of pentapeptides with an RMSE of 1 kcal mol-1, and it can also correctly identify the global minimum of a larger protein out of 1000 candidate structures (Figure 1). We believe that the resulting potentials can be used immediately to significantly accelerate calculations. In addition, the excellent performance of the ML potentials indicates that a bottom-up theoretical approach to predicting protein structures from first principles might be possible.

Figure 1. (a) Description of the training process and the training peptide structures. (b) Example of a larger test peptide. (c,d) Actual (DFT) vs predicted (NN trained on mono- and dipeptides) absolute energies of (c) pentapeptides and (d) conformers of (b) (in eV; atomic energies were subtracted).
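The evaluation protocol described above (comparing DFT and NN energies after subtracting atomic energies, reporting an RMSE, and checking that the NN picks the same global-minimum conformer) can be sketched in a few lines. This is an illustrative toy example with made-up numbers, not the authors' actual pipeline; the function names and baseline values are assumptions.

```python
import numpy as np

def shifted_energies(total_energies, atom_counts, atomic_energies):
    """Subtract a composition-dependent baseline (sum of per-atom energies),
    as done for the absolute energies in Figure 1c,d."""
    baseline = atom_counts @ atomic_energies
    return total_energies - baseline

def rmse(predicted, reference):
    """Root-mean-square error between predicted and reference energies."""
    return float(np.sqrt(np.mean((predicted - reference) ** 2)))

# Toy data: 5 conformers of one composition, 3 element types (energies in eV).
atomic_energies = np.array([-13.6, -1030.0, -2040.0])  # illustrative baselines
atom_counts = np.tile([10, 4, 2], (5, 1))              # same formula, 5 conformers
dft_total = atom_counts @ atomic_energies + np.array([0.10, 0.02, 0.35, 0.50, 0.21])
nn_total = atom_counts @ atomic_energies + np.array([0.12, 0.01, 0.30, 0.55, 0.20])

dft_rel = shifted_energies(dft_total, atom_counts, atomic_energies)
nn_rel = shifted_energies(nn_total, atom_counts, atomic_energies)

print(round(rmse(nn_rel, dft_rel), 4))                     # RMSE on shifted energies
print(int(np.argmin(nn_rel)) == int(np.argmin(dft_rel)))   # same global minimum?
```

Subtracting the atomic baseline removes the large composition-dependent offset, so the RMSE and the argmin comparison reflect conformational energy differences only.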

References
[1] Schweitzer-Stenner, R. The relevance of short peptides for an understanding of unfolded and intrinsically disordered proteins. Phys. Chem. Chem. Phys. 2023, 25, 11908–11933.
[2] Batzner, S.; Musaelian, A.; Sun, L.; Geiger, M.; Mailoa, J. P.; Kornbluth, M.; Molinari, N.; Smidt, T. E.; Kozinsky, B. E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nat. Commun. 2022, 13, 2453.
[3] Kalvoda, T.; Culka, M.; Rulíšek, L.; Andris, E. Exhaustive Mapping of the Conformational Space of Natural Dipeptides by the DFT-D3//COSMO-RS Method. J. Phys. Chem. B 2022, 126, 5949–5958.
[4] Culka, M.; Kalvoda, T.; Gutten, O.; Rulíšek, L. Mapping Conformational Space of All 8000 Tripeptides by Quantum Chemical Methods: What Strain Is Affordable within Folded Protein Chains? J. Phys. Chem. B 2021, 125, 58–69.
[5] Kalvoda, T. Studium konformačního chování krátkých peptidových fragmentů metodami kvantové chemie [Study of the Conformational Behavior of Short Peptide Fragments by Quantum Chemical Methods]. Master Thesis [Online], Charles University, Prague, July 2020. http://hdl.handle.net/20.500.11956/122714 (accessed Sep. 4, 2024).

Primary author

Erik Andris (IOCB Prague)

Co-authors

Lubomír Rulíšek; Tadeáš Kalvoda (Institute of Organic Chemistry and Biochemistry of the CAS); Ján Michael Kormaník (IOCB Prague)

Presentation materials

There are no materials yet.