Chemformer: a pre-trained transformer for computational chemistry
AstraZeneca, Sweden.
Linköping University, Department of Computer and Information Science. Linköping University, Faculty of Science & Engineering. AstraZeneca, Sweden.
AstraZeneca, Sweden.
AstraZeneca, Sweden.
2022 (English). In: Machine Learning: Science and Technology, E-ISSN 2632-2153, Vol. 3, no. 1, article id 015022. Article in journal (Refereed), Published.
Abstract [en]

Transformer models coupled with a simplified molecular line entry system (SMILES) have recently proven to be a powerful combination for solving challenges in cheminformatics. These models, however, are often developed specifically for a single application and can be very resource-intensive to train. In this work we present the Chemformer model, a Transformer-based model which can be quickly applied to both sequence-to-sequence and discriminative cheminformatics tasks. Additionally, we show that self-supervised pre-training can improve performance and significantly speed up convergence on downstream tasks. On direct synthesis and retrosynthesis prediction benchmark datasets we publish state-of-the-art results for top-1 accuracy. We also improve on existing approaches for a molecular optimisation task and show that Chemformer can optimise on multiple discriminative tasks simultaneously. Models, datasets and code will be made available after publication.
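The abstract's pairing of Transformers with SMILES rests on treating a molecule's SMILES string as a token sequence the model can consume. As an illustrative sketch only (a common regex-based tokenization used across molecular-transformer work, not necessarily Chemformer's exact tokenizer, whose details are in its released code), splitting a SMILES string into atom- and symbol-level tokens might look like:

```python
import re

# Regex covering the main SMILES token classes. The pattern and its
# ordering (bracket atoms before two-letter atoms before one-letter
# atoms) are an assumption typical of molecular-transformer pipelines.
SMILES_TOKEN_PATTERN = re.compile(
    r"(\[[^\]]+\]"             # bracketed atoms, e.g. [nH], [C@@H]
    r"|Br|Cl"                  # two-letter organic-subset atoms
    r"|[BCNOFPSI]"             # one-letter organic-subset atoms
    r"|[bcnops]"               # aromatic atoms (lower case)
    r"|[=#\-\+\(\)\\/@%\.]"    # bonds, branches, charges, separators
    r"|[0-9])"                 # ring-closure digits
)

def tokenize_smiles(smiles: str) -> list[str]:
    """Split a SMILES string into model-ready tokens."""
    return SMILES_TOKEN_PATTERN.findall(smiles)

# Aspirin: the benzene ring c1ccccc1 becomes eight separate tokens.
print(tokenize_smiles("CC(=O)Oc1ccccc1C(=O)O"))
```

Sequences produced this way feed the sequence-to-sequence tasks the abstract mentions (e.g. retrosynthesis maps a product's token sequence to reactant token sequences), while discriminative tasks read the same tokens into a property-prediction head.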

Place, publisher, year, edition, pages
IOP Publishing Ltd, 2022. Vol. 3, no. 1, article id 015022
Keywords [en]
transformer; self-supervision; chemistry; reaction prediction; molecular optimization; QSAR
National Category
Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:liu:diva-182947
DOI: 10.1088/2632-2153/ac3ffb
ISI: 000749512700001
OAI: oai:DiVA.org:liu-182947
DiVA, id: diva2:1637980
Available from: 2022-02-15. Created: 2022-02-15. Last updated: 2022-12-08.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text

Search in DiVA

By author/editor
Dimitriadis, Spyridon
By organisation
Department of Computer and Information Science, Faculty of Science & Engineering
In the same journal
Machine Learning: Science and Technology
Bioinformatics (Computational Biology)
