Small molecule autoencoders: architecture engineering to optimize latent space utility and sustainability

About this item

Full title

Small molecule autoencoders: architecture engineering to optimize latent space utility and sustainability

Author / Creator

Oestreich, Marie; Ewert, Iva; Becker, Matthias

Publisher

Cham: Springer International Publishing

Journal title

Journal of cheminformatics, 2024-03, Vol.16 (1), p.26-14, Article 26

Language

English

Formats

Articles

More information

Scope and Contents

Contents

Autoencoders are frequently used to embed molecules for training downstream deep learning models. However, evaluation of the chemical information quality in the latent spaces is lacking, and the model architectures are often chosen arbitrarily. Unoptimized architectures may not only degrade latent space quality but also increase energy consumption during training, making the models unsustainable. We conducted systematic experiments to better understand how the autoencoder architecture affects reconstruction and latent space quality, and how it can be optimized for the encoding task as well as for energy consumption. We show that an optimized architecture maintains the quality of a generic architecture while using 97% less data and reducing energy consumption by around 36%. We additionally observed that representing the molecules as SELFIES reduced reconstruction performance compared to SMILES, and that training with enumerated SMILES drastically improved latent space quality.
Scientific Contribution:
This work provides the first comprehensive systematic analysis of how the choice of autoencoder architecture affects the reconstruction performance of small molecules, the chemical information content of the latent space, and the energy required for training. Demonstrated on the MOSES benchmarking dataset, it provides first valuable insights into how autoencoders for the embedding of small molecules can be designed to optimize their utility and simultaneously become more sustainable, both in terms of energy consumption and the required amount of training data. All code, data and model checkpoints are made available on Zenodo (Oestreich et al. Small molecule autoencoders: architecture engineering to optimize latent space utility and sustainability. Zenodo, 2024). Furthermore, the top models can be found on GitHub with scripts to encode custom molecules: https://github.com/MarieOestreich/small-molecule-autoencoders
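As background for the SMILES-based representations mentioned in the abstract, the following is a minimal sketch of how SMILES strings might be turned into fixed-length integer sequences before being fed to a sequence autoencoder. It is an illustration only and is not taken from the article or its repository: the tokenizer pattern, the function names (`tokenize`, `build_vocab`, `encode`, `decode`), the padding index and the `max_len` value are all assumptions. The first three example strings are different valid SMILES for the same molecule (ethanol), which is the idea behind training with enumerated SMILES.

```python
import re

# Hypothetical tokenizer (an assumption, not the authors' scheme): bracket atoms,
# two-letter halogens and two-digit ring closures are kept whole; everything
# else becomes a single-character token.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return SMILES_TOKEN.findall(smiles)


def build_vocab(smiles_list):
    """Map each token to an integer index; index 0 is reserved for padding."""
    tokens = sorted({tok for s in smiles_list for tok in tokenize(s)})
    return {tok: i + 1 for i, tok in enumerate(tokens)}


def encode(smiles, vocab, max_len):
    """Convert a SMILES string into a zero-padded list of token indices."""
    ids = [vocab[tok] for tok in tokenize(smiles)]
    if len(ids) > max_len:
        raise ValueError("SMILES is longer than max_len tokens")
    return ids + [0] * (max_len - len(ids))


def decode(ids, vocab):
    """Invert encode(): drop padding and join the tokens back into a string."""
    inverse = {i: tok for tok, i in vocab.items()}
    return "".join(inverse[i] for i in ids if i != 0)


# Ethanol written three equivalent ways, plus bromochloromethane to show
# that two-letter atoms stay intact.
examples = ["CCO", "OCC", "C(O)C", "ClCBr"]
vocab = build_vocab(examples)
encoded = [encode(s, vocab, max_len=6) for s in examples]

for smiles, ids in zip(examples, encoded):
    print(smiles, ids, decode(ids, vocab))
```

An autoencoder trained on such sequences would be judged, among other things, by whether decoding its output reproduces the input string; here the round trip through `encode` and `decode` is lossless by construction.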

Authors, Artists and Contributors

Author / Creator

Oestreich, Marie
Ewert, Iva
Becker, Matthias

Identifiers

Primary Identifiers

Record Identifier

TN_cdi_doaj_primary_oai_doaj_org_article_9564c81fcebf465298f539ee1d29ff16

Permalink

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_doaj_primary_oai_doaj_org_article_9564c81fcebf465298f539ee1d29ff16

Other Identifiers

ISSN

1758-2946

E-ISSN

1758-2946

DOI

10.1186/s13321-024-00817-0
