Proteus: A new predictor for protean segments
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
The discovery of intrinsically disordered proteins has led to a paradigm shift in protein science. Many disordered proteins have regions that can transform from a disordered state to an ordered one. Those regions are called protean segments.
Many intrinsically disordered proteins are involved in diseases, including Alzheimer's disease, Parkinson's disease and Down's syndrome, which makes them prime targets for medical research. Because protean segments are often the functional part of these proteins, identifying those regions is of great importance.
This report presents Proteus, a new predictor for protean segments. The predictor uses Random Forest (a decision tree ensemble classifier) and is trained on features derived from amino acid sequence and conservation data.
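To make the idea of sequence-derived features concrete, the following is a minimal sketch of one common way to turn an amino acid sequence into per-residue feature vectors: the amino acid composition of a sliding window centred on each residue. The function names, the window size and the choice of composition as the feature are all assumptions made for illustration. They are not taken from the report, and conservation features are not covered.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector has the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def window_composition(sequence, center, half_width=7):
    """Fraction of each standard amino acid in a window around `center`.

    The window is truncated at the sequence ends (hypothetical choice;
    padding would be another option).
    """
    start = max(0, center - half_width)
    end = min(len(sequence), center + half_width + 1)
    window = sequence[start:end]
    counts = Counter(window)
    return [counts.get(aa, 0) / len(window) for aa in AMINO_ACIDS]


def sequence_features(sequence, half_width=7):
    """One 20-dimensional composition vector per residue."""
    return [window_composition(sequence, i, half_width)
            for i in range(len(sequence))]


# Illustrative sequence only, not from either dataset.
features = sequence_features("MKVLAAGIVG", half_width=2)
print(len(features), len(features[0]))
```

Vectors of this kind, one per residue, could then be passed together with per-residue labels to any Random Forest implementation, for example scikit-learn's `RandomForestClassifier`.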
Proteus compares favourably to state-of-the-art predictors and performs better than the competition on all four metrics: precision, recall, F1 and Matthews correlation coefficient (MCC).
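For readers unfamiliar with the four metrics, the sketch below computes them from per-residue binary labels using their standard definitions. The function names and the example labels are illustrative assumptions and do not reproduce any result from the report.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = protean)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def binary_metrics(tp, fp, fn, tn):
    """Precision, recall, F1 and MCC; each is set to 0.0 when its denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}


# Made-up labels for six residues, purely for illustration.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
print(binary_metrics(*confusion_counts(y_true, y_pred)))
```

Unlike precision, recall and F1, MCC also takes true negatives into account, which makes it informative when protean residues are much rarer than non-protean ones.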
The report also examines the differences between protean and non-protean regions, and how those differences vary between the two datasets used to train the predictor.
Place, publisher, year, edition, pages
2015. 49 p.
bioinformatics, protein, machine learning, predictor, protean segments, molecular recognition feature, intrinsically disordered proteins, proteus
Bioinformatics and Systems Biology Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:liu:diva-121260
ISRN: LITH-IFM-A-EX--15/3118--SE
OAI: oai:DiVA.org:liu-121260
DiVA: diva2:852903
Subject / course
Pilstål, Robert, Ph.D. student
Wallner, Björn, Associate professor