InterPepRank: Assessment of Docked Peptide Conformations by a Deep Graph Network
Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0002-1696-0183
Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0001-7868-034X
Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0002-3772-8279
2021 (English). In: Frontiers in Bioinformatics, E-ISSN 2673-7647, Vol. 1, article id 763102. Article in journal (Refereed). Published.
Abstract [en]

Peptide-protein interactions between a smaller or disordered peptide stretch and a folded receptor make up a large part of all protein-protein interactions. A common approach for modeling such interactions is to exhaustively sample the conformational space by fast-Fourier-transform docking and then refine a top percentage of the decoys. Methods that can rank decoys for selection fast enough for larger-scale studies commonly rely on first-principle energy terms, such as electrostatics and van der Waals forces, or on pre-calculated statistical potentials. We present InterPepRank for peptide-protein complex scoring and ranking. InterPepRank is a machine learning-based method that encodes the structure of the complex as a graph, with physical pairwise interactions as edges and evolutionary and sequence features as nodes. The graph network is trained to predict the LRMSD of decoys by using edge-conditioned graph convolutions on a large set of peptide-protein complex decoys. InterPepRank is tested on a massive independent test set in which no target shares CATH annotation or 30% sequence identity with any target in the training or validation data. On this set, InterPepRank has a median AUC of 0.86 for finding coarse peptide-protein complexes with LRMSD < 4 Å. This is an improvement over other state-of-the-art ranking methods, which have median AUCs between 0.65 and 0.79. When included as the method for selecting decoys for refinement in a previously established peptide docking pipeline, InterPepRank improves the number of medium- and high-quality models produced by 80% and 40%, respectively. The InterPepRank program, as well as all scripts for reproducing and retraining it, is available from: http://wallnerlab.org/InterPepRank.
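
To make the reported ranking metric concrete, the following is a minimal sketch of how an AUC for finding decoys with LRMSD < 4 Å could be computed for a single target. It is not the authors' evaluation script: the function name `ranking_auc` and the example values are hypothetical, and it assumes that a lower predicted LRMSD means a better rank.

```python
def ranking_auc(predicted_lrmsd, true_lrmsd, cutoff=4.0):
    """ROC AUC for separating decoys with true LRMSD < cutoff from the rest.

    A lower predicted LRMSD is treated as a better rank (an assumption).
    Ties in the prediction count as half a correct ordering.
    """
    positives = [p for p, t in zip(predicted_lrmsd, true_lrmsd) if t < cutoff]
    negatives = [p for p, t in zip(predicted_lrmsd, true_lrmsd) if t >= cutoff]
    if not positives or not negatives:
        raise ValueError("need at least one decoy on each side of the cutoff")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p < n:
                wins += 1.0   # near-native decoy ranked ahead of a wrong one
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical decoys for one target (values invented for illustration).
true_lrmsd = [1.5, 3.2, 6.0, 9.8, 12.4]
predicted_lrmsd = [2.0, 7.0, 5.0, 11.0, 10.0]
print(ranking_auc(predicted_lrmsd, true_lrmsd))
```

A median AUC like the one quoted in the abstract would then be the median of this per-target value over all targets in the test set, for example with `statistics.median`.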

Place, publisher, year, edition, pages
Lausanne, Switzerland: Frontiers Media S.A., 2021. Vol. 1, article id 763102
Keywords [en]
protein-protein interaction, machine learning, protein-peptide interaction, graph neural net, quality assessment
National Category
Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:liu:diva-182180
DOI: 10.3389/fbinf.2021.763102
ISI: 001085563700001
OAI: oai:DiVA.org:liu-182180
DiVA, id: diva2:1626048
Funder
Swedish Research Council, 2016-05369, 2020-03352
Carl Tryggers foundation, 20:453
Swedish e-Science Research Center
Available from: 2022-01-10 Created: 2022-01-10 Last updated: 2024-11-07 Bibliographically approved
In thesis
1. Development and Application of Computational Models for Peptide-Protein Complexes
2022 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

Protein-protein interactions between a protein and a smaller protein fragment or a disordered segment of a protein are called peptide-protein interactions. Such interactions are commonplace in nature and vital for normal cell function in humans. For example, the onco-protein Myc contains a large disordered region with several segments involved in peptide-protein interactions as part of transcription regulation, and it is mis-regulated in the vast majority of all human cancers. As such, understanding the structural details of peptide-protein interactions on an atomic level is a necessary endeavor for understanding disease pathways as well as facilitating targeted drug design.

While experimental methods for structure determination such as X-ray crystallography and NMR can determine the structure of many peptide-protein complexes, these methods are time-consuming and costly. Additionally, the disordered nature of peptides and a sometimes lower binding affinity than for protein-protein binding can lead to transient or weak (but still highly specific) interactions that are impossible to fully capture with experimental methods. This leads to the need for computational methods as support and complement. Such methods have classically used statistical potentials or simple template search approaches, but as the number of deposited structures in the Protein Data Bank (PDB) grows, so does the potential for supervised machine learning.

The papers in this thesis present the contributions of the author to the field of peptide-protein interaction complex prediction, mainly through the use of machine learning models. The first papers apply a Random Forest classifier to detect similarities between binding interfaces deposited in the PDB and a peptide-protein pair being investigated, in order to find the optimal templates for structure prediction. In addition to producing predictions with good self-evaluation of performance, the development of the method also confirmed theories on the similarity of protein-protein, domain-domain, and peptide-protein interfaces. Two more methods for peptide-protein docking are presented in later papers. One utilizes graph convolutional neural networks to improve model selection from rigid-body-docking methods by including MSA profile information as a feature. This also led to the discovery that, while profile information such as position conservation does improve predictive performance (something also seen in the first papers), the most important features are the ones describing the structural details of the complex and the bonds between residues. The other uses a graph neural network as an additional scoring term to improve upon the already state-of-the-art local refinement method FlexPepDock, and is capable of refining even models generated by AlphaFold-multimer.
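
As an illustration of how a graph convolution can combine per-residue node features (such as profile information) with features describing the interactions between residues, the following is a simplified sketch of one edge-conditioned convolution step, the layer type named in the article abstract above. It is not the architecture used in the thesis: the function name, the toy filter network, and all values are hypothetical, and the bias term and non-linearity of a full layer are omitted.

```python
def edge_conditioned_conv(node_feats, edges, edge_feats, filter_net, d_out):
    """One simplified edge-conditioned graph convolution step.

    node_feats: list of per-node feature vectors (length d_in each)
    edges:      list of (target, source) node index pairs
    edge_feats: list of per-edge feature vectors, aligned with `edges`
    filter_net: callable mapping an edge feature vector to a
                d_out x d_in weight matrix (nested lists)
    """
    sums = [[0.0] * d_out for _ in node_feats]
    degree = [0] * len(node_feats)
    for (target, source), e in zip(edges, edge_feats):
        theta = filter_net(e)  # weights generated from the edge features
        h = node_feats[source]
        for row in range(d_out):
            sums[target][row] += sum(w * x for w, x in zip(theta[row], h))
        degree[target] += 1
    # Average the messages; nodes with no incoming edge stay at zero.
    return [[v / max(deg, 1) for v in vec] for vec, deg in zip(sums, degree)]


# Hypothetical toy graph: two residues joined by one contact in each direction.
node_feats = [[1.0, 0.0], [0.0, 2.0]]
edges = [(0, 1), (1, 0)]
edge_feats = [[0.5], [0.5]]


def scaled_identity(e):
    # Toy filter network: the edge feature e[0] scales an identity matrix.
    return [[e[0], 0.0], [0.0, e[0]]]


updated = edge_conditioned_conv(node_feats, edges, edge_feats, scaled_identity, d_out=2)
print(updated)
```

The point of the design is that the weights applied to a neighbor's features depend on the edge between the two residues, so the same node features contribute differently depending on the kind of interaction. In a trained network, `filter_net` would be a small learned network rather than a fixed function.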

Finally, two manuscripts focus on the application of computational approaches for research into the interactions of human cMyc with TBP and PPP1R10, respectively. In the first of these papers, the template-based peptide-protein complex prediction methods developed in the earlier papers of the thesis are employed together with prior knowledge of the interaction to model the complex to a high degree of certainty not achievable by NMR alone. In the second of these papers, experimental data is used as a basis for computational modeling of the complex, and the modeled complex could act as a basis for further experiments characterizing the interaction.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2022. p. 225
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 2206
National Category
Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:liu:diva-182181
DOI: 10.3384/9789179291945
ISBN: 9789179291938
ISBN: 9789179291945
Public defence
2022-04-26, Planck, F-Building, Campus Valla, Linköping, 09:00 (English)
Available from: 2022-03-25 Created: 2022-01-10 Last updated: 2023-12-28 Bibliographically approved

Open Access in DiVA

fulltext (2645 kB), 356 downloads
File information
File name: FULLTEXT02.pdf
File size: 2645 kB
Checksum (SHA-512): c6de35f47e992b1bb04c71f311364d4bcc3909931cb68f68de10671aea63da3595ebf8f6b7a0f01b4567e40d2118abb9c5ed885505702c80c028e3b4913ae3d3
Type: fulltext
Mimetype: application/pdf


Authority records

Johansson-Åkhe, Isak; Mirabello, Claudio; Wallner, Björn
