liu.se: Search for publications in DiVA
  • 1.
    Abraham-Nordling, Mirna
    et al.
    Karolinska institutet.
    Persson, Bengt
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Nordling, Erik
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Model of the complex of Parathyroid hormone-2 receptor and Tuberoinfundibular peptide of 39 residues. 2010. In: BMC Research Notes, ISSN 1756-0500, Vol. 3, no 270. Article in journal (Refereed)
    Abstract [en]

    Background

    We aim to propose interactions between the parathyroid hormone-2 receptor (PTH2R) and its ligand the tuberoinfundibular peptide of 39 residues (TIP39) by constructing a homology model of their complex. The two related peptides parathyroid hormone (PTH) and parathyroid hormone related protein (PTHrP) are compared with the complex to examine their interactions.

    Findings

    In the model, the hydrophobic N-terminus of TIP39 is buried in a hydrophobic part of the central cavity between helices 3 and 7. Comparison of the peptide sequences indicates that the main discriminator between the agonistic peptides TIP39 and PTH and the inactive PTHrP is a tryptophan-phenylalanine replacement. The model indicates that the smaller phenylalanine in PTHrP does not completely occupy the binding site of the larger tryptophan residue in the other peptides. Only TIP39 causes internalisation of the receptor, and the primary difference is an aspartic acid in position 7 of TIP39, which interacts with histidine 396 in the receptor, versus isoleucine/histidine residues in the related hormones. This interaction might therefore be a trigger for the events that cause internalisation.

    Conclusions

    A model is constructed for the complex and a trigger interaction for full agonistic activation between aspartic acid 7 of TIP39 and histidine 396 in the receptor is proposed.

  • 2.
    Al-Absi, Thabit
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Efficient Characterization of Short Anelloviruses Fragments Found in Metagenomic Samples. 2012. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Some viral metagenomic serum samples contain a huge amount of Anellovirus, a genetically diverse family with only a few conserved regions, which makes it hard to characterize efficiently. A multiple sequence alignment (MSA) of the Anelloviruses found in the sample must be constructed to get a clear picture of Anellovirus diversity and to identify stable regions. Using available multiple sequence alignment software directly on these fragments results in an MSA of very poor quality due to their diversity, misaligned regions and low-quality regions present in the sequences.

    An efficient MSA must be constructed in order to characterize the Anelloviruses present in the samples. Pairwise alignment is used to align one fragment at a time to the database sequences. The fragments are then aligned to the database sequences using the start and end positions from the pairwise alignment results. The algorithm also excludes non-aligned portions of the fragments, as these are very hard to handle properly and are often products of misassembly or chimeric sequenced fragments. Other tools to aid further analysis were developed, such as finding a non-overlapping window that contains the most fragments, finding the consensus of the alignment and extracting any region from the MSA for further analysis.

    An MSA was constructed with a high percentage of correctly aligned bases compared to an MSA constructed using MSA software. The minimal number of genomes found in the sampled sequences was determined, as well as the distribution of the fragments along the database sequence. Moreover, a highly conserved region and the window containing the most fragments were extracted from the MSA, and phylogenetic trees were constructed for these regions.
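
The window search mentioned in this abstract can be illustrated with a small sketch. It assumes that each fragment has already been placed on a reference by pairwise alignment and is represented only by its inclusive (start, end) coordinates, and it reads "window containing the most fragments" as one fixed-width window that fully contains as many fragments as possible. The function name `best_window` and the example coordinates are hypothetical and not taken from the thesis.

```python
from typing import List, Tuple


def best_window(fragments: List[Tuple[int, int]], width: int) -> Tuple[int, int]:
    """Return (window_start, count) for the fixed-width window that fully
    contains the most fragments.

    Fragments are inclusive (start, end) positions on a reference sequence.
    Returns (0, 0) when there are no fragments.
    """
    best_start, best_count = 0, 0
    # An optimal window can always be slid right until its left edge sits
    # on a fragment start, so only fragment starts need to be tried.
    for start in sorted({s for s, _ in fragments}):
        end = start + width - 1
        count = sum(1 for s, e in fragments if s >= start and e <= end)
        if count > best_count:
            best_start, best_count = start, count
    return best_start, best_count


# Hypothetical fragment placements on a reference genome.
placements = [(10, 50), (20, 60), (55, 90), (300, 340)]
print(best_window(placements, width=100))
```

This brute-force version is quadratic in the number of fragments, which is usually acceptable for a few thousand placements; a sweep-line variant would be the natural next step for larger inputs.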

  • 3.
    Alexsson, Andrei
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics .
    Unsupervised hidden Markov model for automatic analysis of expressed sequence tags. 2011. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    This thesis provides an in-depth analysis of expressed sequence tags (ESTs), which represent pieces of eukaryotic mRNA, using an unsupervised hidden Markov model (HMM). ESTs are short nucleotide sequences that are used primarily for rapid identification of new genes with potential coding regions (CDS). ESTs are made by sequencing double-stranded cDNA, and the synthesized ESTs are stored in digital form, usually in FASTA format. Since sequencing is often randomized and parts of mRNA contain non-coding regions, some ESTs will not represent CDS. It is desirable to remove these unwanted ESTs if the purpose is to identify genes associated with CDS. Application of a stochastic HMM allows identification of the region contents of an EST. Software such as ESTScan uses an HMM that is trained by supervised learning on annotated data. However, because annotated data are not always at hand, this thesis focuses on the ability to train an HMM with unsupervised learning on data containing ESTs both with and without CDS. The data used for training are not annotated, i.e. the regions that an EST consists of are unknown. In this thesis a new HMM is introduced in which the parameters are chosen to be reasonably consistent with biologically important regions of an mRNA, such as the Kozak sequence, poly(A) signals and poly(A) tails, in order to guide training and decoding so that ESTs are assigned to the proper states in the HMM. The transition probabilities in the HMM have been adapted so that they represent the mean length and distribution of the different regions in mRNA. The HMM's specificity and sensitivity have been tested via BLAST, by blasting each EST and comparing the BLAST results with the HMM prediction results. A regression analysis shows that the length of the ESTs used when training the HMM is significant: the longer, the better.
    The final results show that it is possible to train an HMM with unsupervised machine learning, but to be comparable to supervised machine learning such as ESTScan, further expansion of the HMM is necessary, such as frame-shift correction of ESTs by improving the HMM's ability to choose correctly positioned start codons or nucleotides. False positive results are usually caused by incorrectly positioned start codons leading to CDS lengths that are too short. Since no frame-shift correction is implemented, short predicted CDS lengths are not acceptable and are hence not counted as coding regions during prediction. However, when supervised models are lacking, an unsupervised HMM is a potential replacement with stable performance that can be adapted for any eukaryotic organism.
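
As an illustration of the decoding step described in this abstract, the following is a minimal Viterbi sketch for a toy two-state HMM. It is not the thesis model: the states (`noncoding`, `coding`), the emission and transition probabilities and the example sequence are all invented for illustration, and the unsupervised training step is not shown.

```python
import math
from typing import Dict, List


def viterbi(seq: str, states: List[str], start: Dict[str, float],
            trans: Dict[str, Dict[str, float]],
            emit: Dict[str, Dict[str, float]]) -> List[str]:
    """Most probable state path for a non-empty sequence, in log space."""
    # score[s] = log-probability of the best path that ends in state s
    score = {s: math.log(start[s]) + math.log(emit[s][seq[0]]) for s in states}
    back = []  # one back-pointer table per position after the first
    for ch in seq[1:]:
        prev, score, pointers = score, {}, {}
        for s in states:
            best = max(states, key=lambda p: prev[p] + math.log(trans[p][s]))
            score[s] = prev[best] + math.log(trans[best][s]) + math.log(emit[s][ch])
            pointers[s] = best
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical parameters: AT-rich non-coding state, GC-rich coding state.
STATES = ["noncoding", "coding"]
START = {"noncoding": 0.5, "coding": 0.5}
TRANS = {"noncoding": {"noncoding": 0.9, "coding": 0.1},
         "coding": {"noncoding": 0.1, "coding": 0.9}}
EMIT = {"noncoding": {"A": 0.4, "T": 0.4, "G": 0.1, "C": 0.1},
        "coding": {"A": 0.1, "T": 0.1, "G": 0.4, "C": 0.4}}

print(viterbi("ATATGCGC", STATES, START, TRANS, EMIT))
```

The high self-transition probabilities play the role that the abstract assigns to the adapted transition probabilities: they encode an expected region length, so the decoder does not switch state on every nucleotide.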

  • 4.
    Almgren, Malin
    et al.
    Karolinska Institutet.
    Nyengaard, Jens R
    Aarhus University.
    Persson, Bengt
    Linköping University, The Institute of Technology. Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics .
    Lavebratt, Catharina
    Karolinska Institutet.
    Carbamazepine protects against neuronal hyperplasia and abnormal gene expression in the megencephaly mouse. 2008. In: Neurobiology of Disease, ISSN 0969-9961, E-ISSN 1095-953X, Vol. 32, p. 364-376. Article in journal (Refereed)
  • 5.
    Almstedt, Karin
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Biochemistry. Linköping University, The Institute of Technology.
    Lundqvist, Martin
    Linköping University, Department of Physics, Chemistry and Biology, Molecular Biotechnology . Linköping University, The Institute of Technology.
    Carlsson, Jonas
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics . Linköping University, The Institute of Technology.
    Karlsson, Martin
    Linköping University, Department of Physics, Chemistry and Biology, Biochemistry. Linköping University, The Institute of Technology.
    Persson, Bengt
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics . Linköping University, The Institute of Technology.
    Jonsson, Bengt-Harald
    Linköping University, Department of Physics, Chemistry and Biology, Molecular Biotechnology . Linköping University, The Institute of Technology.
    Carlsson, Uno
    Linköping University, Department of Physics, Chemistry and Biology, Biochemistry. Linköping University, The Institute of Technology.
    Hammarström, Per
    Linköping University, Department of Physics, Chemistry and Biology, Biochemistry. Linköping University, The Institute of Technology.
    Unfolding a folding disease: folding, misfolding and aggregation of the marble brain syndrome-associated mutant H107Y of human carbonic anhydrase II. 2004. In: Journal of Molecular Biology, ISSN 0022-2836, E-ISSN 1089-8638, Vol. 342, no 2, p. 619-633. Article in journal (Refereed)
    Abstract [en]

    Most loss-of-function diseases are caused by aberrant folding of important proteins. These proteins often misfold due to mutations. The disease marble brain syndrome (MBS), known also as carbonic anhydrase II deficiency syndrome (CADS), can manifest in carriers of point mutations in the human carbonic anhydrase II (HCA II) gene. One mutation associated with MBS entails the His107Tyr substitution. Here, we demonstrate that this mutation is a remarkably destabilizing folding mutation. The loss-of-function is clearly a folding defect, since the mutant shows 64% of CO2 hydration activity compared to that of the wild-type at low temperature where the mutant is folded. On the contrary, its stability towards thermal and guanidine hydrochloride (GuHCl) denaturation is highly compromised. Using activity assays, CD, fluorescence, NMR, cross-linking, aggregation measurements and molecular modeling, we have mapped the properties of this remarkable mutant. Loss of enzymatic activity had a midpoint temperature of denaturation (Tm) of 16 °C for the mutant compared to 55 °C for the wild-type protein. GuHCl-denaturation (at 4 °C) showed that the native state of the mutant was destabilized by 9.2 kcal/mol. The mutant unfolds through at least two equilibrium intermediates; one novel intermediate that we have termed the molten globule light state and, after further denaturation, the classical molten globule state is populated. Under physiological conditions (neutral pH; 37 °C), the His107Tyr mutant will populate the molten globule light state, likely due to novel interactions between Tyr107 and the surroundings of the critical residue Ser29 that destabilize the native conformation. This intermediate binds the hydrophobic dye 8-anilino-1-naphthalene sulfonic acid (ANS) but not as strong as the molten globule state, and near-UV CD reveals the presence of significant tertiary structure. Notably, this intermediate is not as prone to aggregation as the classical molten globule. 
    As a proof of concept for an intervention strategy with small molecules, we showed that binding of the CA inhibitor acetazolamide increases the stability of the native state of the mutant by 2.9 kcal/mol, in accordance with its strong affinity. Acetazolamide shifts the Tm to 34 °C, which protects against misfolding and will enable a substantial fraction of the enzyme pool to survive physiological conditions.

  • 6.
    Anandapadmanaban, Madhanagopal
    et al.
    Linköping University, Department of Physics, Chemistry and Biology. Linköping University, Faculty of Science & Engineering.
    Pilstål, Robert
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Andrésen, Cecilia
    Linköping University, Department of Physics, Chemistry and Biology, Chemistry. Linköping University, Faculty of Science & Engineering.
    Trewhella, Jill
    Linköping University, Department of Physics, Chemistry and Biology, Chemistry. Linköping University, Faculty of Science & Engineering. University of Sydney, Australia.
    Moche, Martin
    Karolinska Institute, Sweden.
    Wallner, Björn
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Sunnerhagen, Maria
    Linköping University, Department of Physics, Chemistry and Biology, Chemistry. Linköping University, Faculty of Science & Engineering.
    Mutation-Induced Population Shift in the MexR Conformational Ensemble Disengages DNA Binding: A Novel Mechanism for MarR Family Derepression. 2016. In: Structure, ISSN 0969-2126, E-ISSN 1878-4186, Vol. 24, no 8, p. 1311-1321. Article in journal (Refereed)
    Abstract [en]

    MexR is a repressor of the MexAB-OprM multidrug efflux pump operon of Pseudomonas aeruginosa, where DNA-binding impairing mutations lead to multidrug resistance (MDR). Surprisingly, the crystal structure of an MDR-conferring MexR mutant R21W (2.19 angstrom) presented here is closely similar to wildtype MexR. However, our extended analysis, by molecular dynamics and small-angle X-ray scattering, reveals that the mutation stabilizes a ground state that is deficient of DNA binding and is shared by both mutant and wild-type MexR, whereas the DNA-binding state is only transiently reached by the more flexible wild-type MexR. This population shift in the conformational ensemble is effected by mutation-induced allosteric coupling of contact networks that are independent in the wild-type protein. We propose that the MexR-R21W mutant mimics derepression by small-molecule binding to MarR proteins, and that the described allosteric model based on population shifts may also apply to other MarR family members.

  • 7.
    Andrésen, Cecilia
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Molecular Biotechnology. Linköping University, The Institute of Technology.
    Helander, Sara
    Linköping University, Department of Physics, Chemistry and Biology, Chemistry. Linköping University, Faculty of Science & Engineering.
    Lemak, Alexander
    University of Toronto, Canada .
    Fares, Christophe
    University of Toronto, Canada .
    Csizmok, Veronika
    Hospital for Sick Children, Canada .
    Carlsson, Jonas
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Penn, Linda Z
    University of Toronto, Canada .
    Forman-Kay, Julie D
    Hospital for Sick Children, Canada. University of Toronto, Canada.
    Arrowsmith, Cheryl H
    University of Toronto, Canada.
    Lundström, Patrik
    Linköping University, Department of Physics, Chemistry and Biology, Molecular Biotechnology. Linköping University, The Institute of Technology.
    Sunnerhagen, Maria
    Linköping University, Department of Physics, Chemistry and Biology, Molecular Biotechnology. Linköping University, The Institute of Technology.
    Transient structure and dynamics in the disordered c-Myc transactivation domain affect Bin1 binding. 2012. In: Nucleic Acids Research, ISSN 0305-1048, E-ISSN 1362-4962, Vol. 40, no 13, p. 6353-6366. Article in journal (Refereed)
    Abstract [en]

    The crucial role of Myc as an oncoprotein and as a key regulator of cell growth makes it essential to understand the molecular basis of Myc function. The N-terminal region of c-Myc coordinates a wealth of protein interactions involved in transformation, differentiation and apoptosis. We have characterized in detail the intrinsically disordered properties of Myc-1-88, where hierarchical phosphorylation of S62 and T58 regulates activation and destruction of the Myc protein. By nuclear magnetic resonance (NMR) chemical shift analysis, relaxation measurements and NOE analysis, we show that although Myc occupies a very heterogeneous conformational space, we find transiently structured regions in residues 22-33 and in the Myc homology box I (MBI; residues 45-65); both these regions are conserved in other members of the Myc family. Binding of Bin1 to Myc-1-88 as assayed by NMR and surface plasmon resonance (SPR) revealed primary binding to the S62 region in a dynamically disordered and multivalent complex, accompanied by population shifts leading to altered intramolecular conformational dynamics. These findings expand the increasingly recognized concept of intrinsically disordered regions mediating transient interactions to Myc, a key transcriptional regulator of major medical importance, and have important implications for further understanding its multifaceted role in gene regulation.

  • 8.
    Andrésen, Cecilia
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Chemistry. Linköping University, Faculty of Science & Engineering.
    Niklasson, Markus
    Linköping University, Department of Physics, Chemistry and Biology, Chemistry. Linköping University, Faculty of Science & Engineering.
    Cassman Eklöf, Sofie
    Linköping University, Department of Physics, Chemistry and Biology, Chemistry. Linköping University, Faculty of Science & Engineering.
    Wallner, Björn
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Lundström, Patrik
    Linköping University, Department of Physics, Chemistry and Biology, Chemistry. Linköping University, Faculty of Science & Engineering.
    Biophysical characterization of the calmodulin-like domain of Plasmodium falciparum calcium dependent protein kinase 3. 2017. In: PLoS ONE, ISSN 1932-6203, E-ISSN 1932-6203, Vol. 12, no 7, article id e0181721. Article in journal (Refereed)
    Abstract [en]

    Calcium dependent protein kinases are unique to plants and certain parasites and comprise an N-terminal segment and a kinase domain that is regulated by a C-terminal calcium binding domain. Since the proteins are not found in man they are potential drug targets. We have characterized the calcium binding lobes of the regulatory domain of calcium dependent protein kinase 3 from the malaria parasite Plasmodium falciparum. Despite being structurally similar, the two lobes differ in several other regards. While the monomeric N-terminal lobe changes its structure in response to calcium binding and shows global dynamics on the sub-millisecond time-scale both in its apo and calcium bound states, the C-terminal lobe could not be prepared calcium-free and forms dimers in solution. If our results can be generalized to the full-length protein, they suggest that the C-terminal lobe is calcium bound even at basal levels and that activation is caused by the structural reorganization associated with binding of a single calcium ion to the N-terminal lobe.

  • 9.
    Augusto Berrocal, Jose
    et al.
    Eindhoven University of Technology, Netherlands.
    Di Meo, Florent
    Linköping University, Department of Physics, Chemistry and Biology, Theoretical Chemistry. Linköping University, Faculty of Science & Engineering. University of Limoges, France.
    Garcia-Iglesias, Miguel
    Eindhoven University of Technology, Netherlands.
    Gosens, Ronald P. J.
    Eindhoven University of Technology, Netherlands.
    Meijer, E. W.
    Eindhoven University of Technology, Netherlands.
    Linares, Mathieu
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Palmans, Anja R. A.
    Eindhoven University of Technology, Netherlands.
    Consequences of conformational flexibility in hydrogen-bond-driven self-assembly processes. 2016. In: Chemical Communications, ISSN 1359-7345, E-ISSN 1364-548X, Vol. 52, no 72, p. 10870-10873. Article in journal (Refereed)
    Abstract [en]

    We report the synthesis and self-assembly of chiral, conformationally flexible C3-symmetrical trisamides. A strong Cotton effect is observed for the supramolecular polymers in linear alkanes but not in cyclic alkanes. MD simulations suggest 2:1 conformations of the amides within the aggregates in both types of solvents, but a chiral bias in only linear alkanes.

  • 10.
    Bano-Polo, Manuel
    et al.
    University of Valencia, Spain .
    Martinez-Gill, Luis
    University of Valencia, Spain .
    Wallner, Björn
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Nieva, Jose L.
    University of Pais Vasco UPV EHU, Spain .
    Elofsson, Arne
    Stockholm University, Sweden .
    Mingarro, Ismael
    University of Valencia, Spain .
    Charge Pair Interactions in Transmembrane Helices and Turn Propensity of the Connecting Sequence Promote Helical Hairpin Insertion. 2013. In: Journal of Molecular Biology, ISSN 0022-2836, E-ISSN 1089-8638, Vol. 425, no 4, p. 830-840. Article in journal (Refereed)
    Abstract [en]

    alpha-Helical hairpins, consisting of a pair of closely spaced transmembrane (TM) helices that are connected by a short interfacial turn, are the simplest structural motifs found in multi-spanning membrane proteins. In naturally occurring hairpins, the presence of polar residues is common and predicted to complicate membrane insertion. We postulate that the pre-packing process offsets any energetic cost of allocating polar and charged residues within the hydrophobic environment of biological membranes. Consistent with this idea, we provide here experimental evidence demonstrating that helical hairpin insertion into biological membranes can be driven by electrostatic interactions between closely separated, poorly hydrophobic sequences. Additionally, we observe that the integral hairpin can be stabilized by a short loop heavily populated by turn-promoting residues. We conclude that the combined effect of TM-TM electrostatic interactions and tight turns plays an important role in generating the functional architecture of membrane proteins and propose that helical hairpin motifs can be acquired within the context of the Sec61 translocon at the early stages of membrane protein biosynthesis. Taken together, these data further underline the potential complexities involved in accurately predicting TM domains from primary structures.

  • 11.
    Barrientos-Somarribas, Mauricio
    et al.
    Karolinska Inst, Sweden.
    Messina, David N.
    Stockholm Univ, Sweden.
    Pou, Christian
    Karolinska Inst, Sweden.
    Lysholm, Fredrik
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Bjerkner, Annelie
    Karolinska Univ Hosp, Sweden.
    Allander, Tobias
    Karolinska Univ Hosp, Sweden.
    Andersson, Björn
    Karolinska Inst, Sweden.
    Sonnhammer, Erik L. L.
    Stockholm Univ, Sweden.
    Discovering viral genomes in human metagenomic data by predicting unknown protein families. 2018. In: Scientific Reports, ISSN 2045-2322, E-ISSN 2045-2322, Vol. 8, article id 28. Article in journal (Refereed)
    Abstract [en]

    Massive amounts of metagenomics data are currently being produced, and in all such projects a sizeable fraction of the resulting data shows no or little homology to known sequences. It is likely that this fraction contains novel viruses, but identification is challenging since they frequently lack homology to known viruses. To overcome this problem, we developed a strategy to detect ORFan protein families in shotgun metagenomics data, using similarity-based clustering and a set of filters to extract bona fide protein families. We applied this method to 17 virus-enriched libraries originating from human nasopharyngeal aspirates, serum, feces, and cerebrospinal fluid samples. This resulted in 32 predicted putative novel gene families. Some families showed detectable homology to sequences in metagenomics datasets and protein databases after reannotation. Notably, one predicted family matches an ORF from the highly variable Torque Teno virus (TTV). Furthermore, follow-up from a predicted ORFan resulted in the complete reconstruction of a novel circular genome. Its organisation suggests that it most likely corresponds to a novel bacteriophage in the microviridae family, hence it was named bacteriophage HFM.

  • 12.
    Basu, Sankar Chandra
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering. University of Calcutta, India.
    Söderquist, Fredrik
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Wallner, Björn
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Proteus: a random forest classifier to predict disorder-to-order transitioning binding regions in intrinsically disordered proteins. 2017. In: Journal of Computer-Aided Molecular Design, ISSN 0920-654X, E-ISSN 1573-4951, Vol. 31, no 5, p. 453-466. Article in journal (Refereed)
    Abstract [en]

    The focus of the computational structural biology community has taken a dramatic shift over the past one-and-a-half decades from the classical protein structure prediction problem to the possible understanding of intrinsically disordered proteins (IDP) or proteins containing regions of disorder (IDPR). The current interest lies in the unraveling of a disorder-to-order transitioning code embedded in the amino acid sequences of IDPs/IDPRs. Disordered proteins are characterized by an enormous amount of structural plasticity which makes them promiscuous in binding to different partners, multi-functional in cellular activity and atypical in folding energy landscapes resembling partially folded molten globules. Also, their involvement in several deadly human diseases (e.g. cancer, cardiovascular and neurodegenerative diseases) makes them attractive drug targets, and important for a biochemical understanding of the disease(s). The study of the structural ensemble of IDPs is rather difficult, in particular for transient interactions. When bound to a structured partner, an IDPR adopts an ordered conformation in the complex. The residues that undergo this disorder-to-order transition are called protean residues and are generally found in short contiguous stretches; the first step in understanding the modus operandi of an IDP/IDPR would be to predict these residues. There are a few available methods which predict these protean segments from their amino acid sequences; however, their performance reported in the literature leaves clear room for improvement. With this background, the current study presents Proteus, a random forest classifier that predicts the likelihood of a residue undergoing a disorder-to-order transition upon binding to a potential partner protein. The prediction is based on features that can be calculated using the amino acid sequence alone.
    Proteus compares favorably with existing methods, predicting twice as many true positives as the second best method (55 vs. 27%) with a much higher precision on an independent data set. The current study also sheds some light on a possible disorder-to-order transitioning consensus, untangled, yet embedded in the amino acid sequence of IDPs. Some guidelines have also been suggested for proceeding with a real-life structural modeling involving an IDPR using Proteus.

  • 13.
    Basu, Sankar Chandra
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Wallner, Björn
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    DockQ: A Quality Measure for Protein-Protein Docking Models. 2016. In: PLoS ONE, ISSN 1932-6203, E-ISSN 1932-6203, Vol. 11, no 8, p. e0161879. Article in journal (Refereed)
    Abstract [en]

    The state-of-the-art to assess the structural quality of docking models is currently based on three related yet independent quality measures: F-nat, LRMS, and iRMS as proposed and standardized by CAPRI. These quality measures quantify different aspects of the quality of a particular docking model and need to be viewed together to reveal the true quality, e.g. a model with relatively poor LRMS (> 10 angstrom) might still qualify as acceptable with a decent F-nat (> 0.50) and iRMS (< 3.0 angstrom). This is also the reason why the so-called CAPRI criteria for assessing the quality of docking models are defined by applying various ad-hoc cutoffs on these measures to classify a docking model into the four classes: Incorrect, Acceptable, Medium, or High quality. This classification has been useful in CAPRI, but since models are grouped in only four bins it is also rather limiting, making it difficult to rank models, correlate with scoring functions or use it as target function in machine learning algorithms. Here, we present DockQ, a continuous protein-protein docking model quality measure derived by combining F-nat, LRMS, and iRMS to a single score in the range [0, 1] that can be used to assess the quality of protein docking models. By using DockQ on CAPRI models it is possible to almost completely reproduce the original CAPRI classification into Incorrect, Acceptable, Medium and High quality. An average PPV of 94% at 90% recall demonstrates that there is no need to apply predefined ad-hoc cutoffs to classify docking models. Since DockQ recapitulates the CAPRI classification almost perfectly, it can be viewed as a higher resolution version of the CAPRI classification, making it possible to estimate model quality in a more quantitative way using Z-scores or sum of top ranked models, which has been so valuable for the CASP community.
The possibility to directly correlate a quality measure to a scoring function has been crucial for the development of scoring functions for protein structure prediction, and DockQ should be useful in a similar development in the protein docking field.
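
The abstract states only that F-nat, LRMS and iRMS are combined into one score in [0, 1]. One way such a combination can be sketched is shown below: each RMS value is mapped into (0, 1] by an inverse-square scaling, and the three terms are averaged. The functional form and the scaling constants (1.5 angstrom for iRMS, 8.5 angstrom for LRMS) are assumptions here, given as we recall them from the published DockQ definition; they are not stated in this abstract and should be checked against the paper or the reference implementation.

```python
def scaled_rms(rms: float, d: float) -> float:
    """Map an RMS deviation (angstrom) into (0, 1]: 1 at rms = 0, 0.5 at rms = d."""
    return 1.0 / (1.0 + (rms / d) ** 2)


def dockq_like(fnat: float, irms: float, lrms: float,
               d_irms: float = 1.5, d_lrms: float = 8.5) -> float:
    """Average of F-nat and the two scaled RMS terms; the result lies in [0, 1].

    d_irms and d_lrms are assumed scaling constants, not values from the abstract.
    """
    return (fnat + scaled_rms(irms, d_irms) + scaled_rms(lrms, d_lrms)) / 3.0


# A perfect model (all native contacts, zero deviation) scores 1.0.
print(dockq_like(fnat=1.0, irms=0.0, lrms=0.0))
# A hypothetical mid-range model.
print(dockq_like(fnat=0.5, irms=1.5, lrms=8.5))
```

Because every term is bounded between 0 and 1, the averaged score is continuous and bounded in the same range, which is the property that makes it usable for ranking models or as a regression target.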

  • 14.
    Basu, Sankar Chandra
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Wallner, Björn
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Finding correct protein-protein docking models using ProQDock. 2016. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 32, no 12, p. 262-270. Article in journal (Refereed)
    Abstract [en]

    Motivation: Protein-protein interactions are key in virtually all biological processes. For a detailed understanding of these processes, the structure of the protein complex is essential. Given the current experimental techniques for structure determination, the vast majority of all protein complexes will never be solved experimentally. In the absence of experimental data, computational docking methods can be used to predict the structure of the protein complex. A common strategy is to generate many alternative docking solutions (atomic models) and then use a scoring function to select the best. The success of the computational docking technique is, to a large degree, dependent on the ability of the scoring function to accurately rank and score the many alternative docking models.

    Results: Here, we present ProQDock, a scoring function that predicts the absolute quality of a docking model as measured by a novel protein docking quality score (DockQ). ProQDock uses support vector machines trained to predict the quality of protein docking models using features that can be calculated from the docking model itself. By combining different types of features describing both the protein-protein interface and the overall physical chemistry, it was possible to improve the correlation with DockQ from 0.25 for the best individual feature (electrostatic complementarity) to 0.49 for the final version of ProQDock. ProQDock performed better than the state-of-the-art methods ZRANK and ZRANK2 in terms of correlations, ranking and finding correct models on an independent test set. Finally, we also demonstrate that it is possible to combine ProQDock with ZRANK and ZRANK2 to improve performance even further.

  • 15.
    Beecham, Ashley H.
    et al.
    University of Miami, FL USA .
    Patsopoulos, Nikolaos A.
    Brigham and Womens Hospital, MA USA .
    Xifara, Dionysia K.
    University of Oxford, England .
    Davis, Mary F.
    Vanderbilt University, TN USA .
    Kemppinen, Anu
    University of Cambridge, England .
    Cotsapas, Chris
    Broad Institute Harvard and MIT, MA USA .
    Shah, Tejas S.
    Wellcome Trust Sanger Institute, England .
    Spencer, Chris
    University of Oxford, England .
    Booth, David
    University of Sydney, Australia .
    Goris, An
    Katholieke University of Leuven, Belgium .
    Oturai, Annette
    Copenhagen University Hospital, Denmark .
    Saarela, Janna
    University of Helsinki, Finland .
    Fontaine, Bertrand
    University of Paris 06, France .
    Hemmer, Bernhard
    Technical University of Munich, Germany .
    Martin, Claes
    Danderyd Hospital, Sweden .
    Zipp, Frauke
    Johannes Gutenberg University of Mainz, Germany .
    DAlfonso, Sandra
    University of Piemonte Orientale, Italy .
    Martinelli-Boneschi, Filippo
    Ist Science San Raffaele, Italy .
    Taylor, Bruce
    University of Tasmania, Australia .
    Harbo, Hanne F.
    Oslo University Hospital, Norway .
    Kockum, Ingrid
    Karolinska Institute, Sweden .
    Hillert, Jan
    Karolinska Institute, Sweden .
    Olsson, Tomas
    Karolinska Institute, Sweden .
    Ban, Maria
    University of Cambridge, England .
    Oksenberg, Jorge R.
    University of Calif San Francisco, CA USA .
    Hintzen, Rogier
    Erasmus University, Netherlands .
    F Barcellos, Lisa
    University of Calif Berkeley, CA 94720 USA .
    Agliardi, Cristina
    IRCCS Santa Maria Nascente, Italy .
    Alfredsson, Lars
    Karolinska Institute, Sweden .
    Alizadeh, Mehdi
    University of Rennes 1, France .
    Anderson, Carl
    Wellcome Trust Sanger Institute, England .
    Andrews, Robert
    Wellcome Trust Sanger Institute, England .
    Bach Sondergaard, Helle
    Copenhagen University Hospital, Denmark .
    Baker, Amie
    University of Cambridge, England .
    Band, Gavin
    University of Oxford, England .
    Baranzini, Sergio E.
    University of Calif San Francisco, CA USA .
    Barizzone, Nadia
    University of Piemonte Orientale, Italy .
    Barrett, Jeffrey
    Wellcome Trust Sanger Institute, England .
    Bellenguez, Celine
    University of Oxford, England .
    Bergamaschi, Laura
    University of Piemonte Orientale, Italy .
    Bernardinelli, Luisa
    MRC, England .
    Berthele, Achim
    Technical University of Munich, Germany .
    Biberacher, Viola
    Technical University of Munich, Germany .
    Binder, Thomas M C.
    University of Medical Centre Hamburg Eppendorf, Germany .
    Blackburn, Hannah
    Wellcome Trust Sanger Institute, England .
    Bomfim, Izaura L.
    Karolinska Institute, Sweden .
    Brambilla, Paola
    Ist Science San Raffaele, Italy .
    Broadley, Simon
    Griffith University, Australia .
    Brochet, Bruno
    University of Bordeaux 2, France .
    Brundin, Lou
    Karolinska Institute, Sweden .
    Buck, Dorothea
    Technical University of Munich, Germany .
    Butzkueven, Helmut
    University of Melbourne, Australia .
    Caillier, Stacy J.
    University of Calif San Francisco, CA USA .
    Camu, William
    Centre Hospital University of Regional Montpellier, France .
    Carpentier, Wassila
    University of Paris 06, France .
    Cavalla, Paola
    Azienda Osped Citta Salute and Science Torino, Italy .
    Celius, Elisabeth G.
    Oslo University Hospital, Norway .
    Coman, Irene
    Hop Avicenne, France .
    Comi, Giancarlo
    Ist Science San Raffaele, Italy .
    Corrado, Lucia
    University of Piemonte Orientale, Italy .
    Cosemans, Leentje
    Katholieke University of Leuven, Belgium .
    Cournu-Rebeix, Isabelle
    University of Paris 06, France .
    Cree, Bruce A C.
    University of Calif San Francisco, CA USA .
    Cusi, Daniele
    University of Milan, Italy .
    Damotte, Vincent
    University of Paris 06, France .
    Defer, Gilles
    CHU Caen, France .
    Delgado, Silvia R.
    University of Miami, FL USA .
    Deloukas, Panos
    Wellcome Trust Sanger Institute, England .
    di Sapio, Alessia
    University of San Luigi, Italy .
    Dilthey, Alexander T.
    University of Oxford, England .
    Donnelly, Peter
    University of Oxford, England .
    Dubois, Benedicte
    Katholieke University of Leuven, Belgium .
    Duddy, Martin
    Royal Victoria Infirm, England .
    Edkins, Sarah
    Wellcome Trust Sanger Institute, England .
    Elovaara, Irina
    University of Tampere, Finland .
    Esposito, Federica
    Ist Science San Raffaele, Italy .
    Evangelou, Nikos
    University of Nottingham Hospital, England .
    Fiddes, Barnaby
    University of Cambridge, England .
    Field, Judith
    University of Melbourne, Australia .
    Franke, Andre
    University of Kiel, Germany .
    Freeman, Colin
    University of Oxford, England .
    Frohlich, Irene Y.
    Brigham and Womens Hospital, MA USA .
    Galimberti, Daniela
    University of Milan, Italy .
    Gieger, Christian
    German Research Centre Environm Heatlh, Germany .
    Gourraud, Pierre-Antoine
    University of Calif San Francisco, CA USA .
    Graetz, Christiane
    Johannes Gutenberg University of Mainz, Germany .
    Graham, Andrew
    Ipswich Hospital National Health Serv NHS Trust, England .
    Grummel, Verena
    Technical University of Munich, Germany .
    Guaschino, Clara
    Ist Science San Raffaele, Italy .
    Hadjixenofontos, Athena
    University of Miami, FL USA .
    Hakonarson, Hakon
    Childrens Hospital Philadelphia, PA USA .
    Halfpenny, Christopher
    Southampton Gen Hospital, England .
    Hall, Gillian
    Aberdeen Royal Infirm, Scotland .
    Hall, Per
    Karolinska Institute, Sweden .
    Hamsten, Anders
    Karolinska University Hospital Solna, Sweden .
    Harley, James
    Hull Royal Infirm, England .
    Harrower, Timothy
    Royal Devon and Exeter Fdn Trust Hospital, England .
    Hawkins, Clive
    Keele University, England .
    Hellenthal, Garrett
    UCL, England .
    Hillier, Charles
    Poole Gen Hospital, England .
    Hobart, Jeremy
    University of Plymouth, England .
    Hoshi, Muni
    Technical University of Munich, Germany .
    Hunt, Sarah E.
    Wellcome Trust Sanger Institute, England .
    Jagodic, Maja
    Karolinska Institute, Sweden .
    Jelcic, Ilijas
    University of Medical Centre Hamburg Eppendorf, Germany .
    Jochim, Angela
    Technical University of Munich, Germany .
    Kendall, Brian
    Leicester Royal Infirm, England .
    Kermode, Allan
    University of Western Australia, Australia .
    Kilpatrick, Trevor
    University of Melbourne, Australia .
    Koivisto, Keijo
    Seinajoki Central Hospital, Finland .
    Konidari, Ioanna
    University of Miami, FL USA .
    Korn, Thomas
    Technical University of Munich, Germany .
    Kronsbein, Helena
    Technical University of Munich, Germany .
    Langford, Cordelia
    Wellcome Trust Sanger Institute, England .
    Larsson, Malin
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Lathrop, Mark
    Centre Etud Polymorphisme Humain, France .
    Lebrun-Frenay, Christine
    CHRU Nice, France .
    Lechner-Scott, Jeannette
    University of Newcastle, Australia .
    Lee, Michelle H.
    Brigham and Womens Hospital, MA USA .
    Leone, Maurizio A.
    Osped Maggiore Novara, Italy .
    Leppa, Virpi
    University of Helsinki, Finland .
    Liberatore, Giuseppe
    Ist Science San Raffaele, Italy .
    Lie, Benedicte A.
    University of Oslo, Norway .
    Lill, Christina M.
    Johannes Gutenberg University of Mainz, Germany .
    Linden, Magdalena
    Karolinska Institute, Sweden .
    Link, Jenny
    Karolinska Institute, Sweden .
    Luessi, Felix
    Johannes Gutenberg University of Mainz, Germany .
    Lycke, Jan
    University of Gothenburg, Sweden .
    Macciardi, Fabio
    University of Calif Irvine, CA USA .
    Mannisto, Satu
    National Institute Health and Welf, Finland .
    Manrique, Clara P.
    University of Miami, FL USA .
    Martin, Roland
    University of Medical Centre Hamburg Eppendorf, Germany .
    Martinelli, Vittorio
    Ist Science San Raffaele, Italy .
    Mason, Deborah
    Canterbury Dist Health Board, New Zealand .
    Mazibrada, Gordon
    Queen Elizabeth Medical Centre, England .
    McCabe, Cristin
    Broad Institute Harvard and MIT, MA USA .
    Mero, Inger-Lise
    Oslo University Hospital, Norway .
    Mescheriakova, Julia
    Erasmus University, Netherlands .
    Moutsianas, Loukas
    University of Oxford, England .
    Myhr, Kjell-Morten
    Haukeland Hospital, Norway .
    Nagels, Guy
    National Multiple Sclerosis Centre Melsbroek, Belgium .
    Nicholas, Richard
    Charing Cross Hospital, England .
    Nilsson, Petra
    Lund University, Sweden .
    Piehl, Fredrik
    Karolinska Institute, Sweden .
    Pirinen, Matti
    University of Oxford, England .
    Price, Sian E.
    Royal Hallamshire Hospital, England .
    Quach, Hong
    University of Calif Berkeley, CA USA .
    Reunanen, Mauri
    University of Oulu, Finland .
    Robberecht, Wim
    Vesalius Research Centre, Belgium .
    Robertson, Neil P.
    Cardiff University, Wales .
    Rodegher, Mariaemma
    Ist Science San Raffaele, Italy .
    Rog, David
    Salford Royal NHS Fdn Trust, England .
    Salvetti, Marco
    University of Roma La Sapienza, Italy .
    Schnetz-Boutaud, Nathalie C.
    Vanderbilt University, TN USA .
    Sellebjerg, Finn
    Copenhagen University Hospital, Denmark .
    Selter, Rebecca C.
    Technical University of Munich, Germany .
    Schaefer, Catherine
    Kaiser Permanente Div Research, CA USA .
    Shaunak, Sandip
    Royal Preston Hospital, England .
    Shen, Ling
    Kaiser Permanente Div Research, CA USA .
    Shields, Simon
    Norfolk and Norwich Hospital, England .
    Siffrin, Volker
    Johannes Gutenberg University of Mainz, Germany .
    Slee, Mark
    Flinders University of S Australia, Australia .
    Soelberg Sorensen, Per
    Copenhagen University Hospital, Denmark .
    Sorosina, Melissa
    Ist Science San Raffaele, Italy .
    Sospedra, Mireia
    University of Medical Centre Hamburg Eppendorf, Germany .
    Spurkland, Anne
    University of Oslo, Norway .
    Strange, Amy
    University of Oxford, England .
    Sundqvist, Emilie
    Karolinska Institute, Sweden .
    Thijs, Vincent
    Vesalius Research Centre, Belgium .
    Thorpe, John
    Peterborough City Hospital, England .
    Ticca, Anna
    San Francesco Hospital, Italy .
    Tienari, Pentti
    University of Helsinki, Finland .
    van Duijn, Cornelia
    Erasmus MC, Netherlands .
    Visser, Elizabeth M.
    University of Aberdeen, Scotland .
    Vucic, Steve
    University of Sydney, Australia .
    Westerlind, Helga
    Karolinska Institute, Sweden .
    Wiley, James S.
    University of Melbourne, Australia .
    Wilkins, Alastair
    University of Bristol, England .
    Wilson, James F.
    University of Edinburgh, Scotland .
    Winkelmann, Juliane
    Technical University of Munich, Germany .
    Zajicek, John
    University of Plymouth, England .
    Zindler, Eva
    Johannes Gutenberg University of Mainz, Germany .
    Haines, Jonathan L.
    Vanderbilt University, TN USA .
    Pericak-Vance, Margaret A.
    University of Miami, FL USA .
    Ivinson, Adrian J.
    Harvard University, MA USA .
    Stewart, Graeme
    University of Sydney, Australia .
    Hafler, David
    Broad Institute Harvard and MIT, MA USA .
    Hauser, Stephen L.
    University of Calif San Francisco, CA USA .
    Compston, Alastair
    University of Cambridge, England .
    McVean, Gil
    University of Oxford, England .
    De Jager, Philip
    Brigham and Womens Hospital, MA USA .
    Sawcer, Stephen J.
    University of Cambridge, England .
    McCauley, Jacob L.
    University of Miami, FL USA .
    Analysis of immune-related loci identifies 48 new susceptibility variants for multiple sclerosis. 2013. In: Nature Genetics, ISSN 1061-4036, E-ISSN 1546-1718, Vol. 45, no 11, p. 1353-+. Article in journal (Refereed)
    Abstract [en]

    Using the ImmunoChip custom genotyping array, we analyzed 14,498 subjects with multiple sclerosis and 24,091 healthy controls for 161,311 autosomal variants and identified 135 potentially associated regions (P < 1.0 × 10⁻⁴). In a replication phase, we combined these data with previous genome-wide association study (GWAS) data from an independent 14,802 subjects with multiple sclerosis and 26,703 healthy controls. In these 80,094 individuals of European ancestry, we identified 48 new susceptibility variants (P < 5.0 × 10⁻⁸), 3 of which we found after conditioning on previously identified variants. Thus, there are now 110 established multiple sclerosis risk variants at 103 discrete loci outside of the major histocompatibility complex. With high-resolution Bayesian fine mapping, we identified five regions where one variant accounted for more than 50% of the posterior probability of association. This study enhances the catalog of multiple sclerosis risk variants and illustrates the value of fine mapping in the resolution of GWAS signals.

  • 16.
    Berggren, Karl-Fredrik
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Theoretical Physics . Linköping University, The Institute of Technology.
    Persson, Bengt
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics . Linköping University, The Institute of Technology.
    20 years in HPC 1989-2009. 2009. Other (Other (popular science, discussion, etc.))
  • 17.
    Bergqvist, Jonathan
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics.
    Study of Protein Interfaces with Clustering. 2018. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Protein-protein interactions occur in nature and have different functions. The interacting surface between two interacting proteins contains the respective protein's interface residues.

    In this thesis, a series of Python scripts are presented which can perform interface-interface comparisons with the method InterComp, to obtain a distance matrix of different protein interfaces. The distance matrix can be studied with the use of clustering algorithms such as DBSCAN.

    The result from clustering using DBSCAN shows that for the 77,017 protein interfaces studied, a majority of the protein interfaces are part of a single cluster while most of the remaining interfaces are noise for the tested parameters Eps and MinPts.

    The thesis concludes by describing how the tested parameters Eps and MinPts affect the number of clusters obtained when performing DBSCAN.
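
The clustering step described above, DBSCAN applied to a precomputed distance matrix, can be sketched in plain Python as follows. This is an illustrative implementation, not the scripts from the thesis: the function name `dbscan`, the convention that MinPts counts the point itself, and the toy one-dimensional data standing in for InterComp distances are all assumptions.

```python
NOISE = -1


def dbscan(dist, eps, min_pts):
    """Cluster items given a symmetric distance matrix (list of lists).

    Returns one label per item: 0, 1, ... for clusters, NOISE for noise.
    min_pts counts the point itself (an assumed convention).
    """
    n = len(dist)
    labels = [None] * n
    cluster = -1

    def neighbours(i):
        return [j for j in range(n) if dist[i][j] <= eps]

    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # core point: expand the cluster
                queue.extend(reachable)
    return labels


if __name__ == "__main__":
    # Hypothetical toy data: positions on a line instead of interface distances.
    points = [0, 1, 2, 10, 11, 12, 50]
    dist = [[abs(a - b) for b in points] for a in points]
    print(dbscan(dist, eps=1.5, min_pts=2))
```

Raising Eps merges nearby groups into fewer, larger clusters, while raising MinPts turns sparse groups into noise, which is the parameter dependence the thesis investigates.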

  • 18.
    Bhattacharyya, Dhananjay
    et al.
    Saha Institute Nucl Phys, India.
    Halder, Sukanya
    Saha Institute Nucl Phys, India.
    Basu, Sankar Chandra
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering. University of Calcutta, India.
    Mukherjee, Debasish
    Saha Institute Nucl Phys, India.
    Kumar, Prasun
    Indian Institute Science, India.
    Bansal, Manju
    Indian Institute Science, India.
    RNAHelix: computational modeling of nucleic acid structures with Watson-Crick and non-canonical base pairs. 2017. In: Journal of Computer-Aided Molecular Design, ISSN 0920-654X, E-ISSN 1573-4951, Vol. 31, no 2, p. 219-235. Article in journal (Refereed)
    Abstract [en]

    Comprehensive analyses of structural features of non-canonical base pairs within a nucleic acid double helix are limited by the availability of a small number of three dimensional structures. Therefore, a procedure for model building of double helices containing any given nucleotide sequence and base pairing information, either canonical or non-canonical, is seriously needed. Here we describe a program RNAHelix, which is an updated version of our widely used software, NUCGEN. The program can regenerate duplexes using the dinucleotide step and base pair orientation parameters for a given double helical DNA or RNA sequence with defined Watson-Crick or non-Watson-Crick base pairs. The original structure and the corresponding regenerated structure of double helices were found to be very close, as indicated by the small RMSD values between positions of the corresponding atoms. Structures of several usual and unusual double helices have been regenerated and compared with their original structures in terms of base pair RMSD, torsion angles and electrostatic potentials and very high agreements have been noted. RNAHelix can also be used to generate a structure with a sequence completely different from an experimentally determined one or to introduce single to multiple mutation, but with the same set of parameters and hence can also be an important tool in homology modeling and study of mutation induced structural changes.

  • 19.
    Bresel, Anders
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Biomolecular and Organic Electronics . Linköping University, The Institute of Technology.
    Persson, Bengt
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics . Linköping University, The Institute of Technology.
    GenomeLKPG: A comprehensive proteome sequence database for taxonomy studies. 2008. Article in journal (Refereed)
    Abstract [en]

    Background: In order to perform taxonomically unbiased analyses of protein relationships, there is a need for complete proteomes rather than databases with bias towards well characterized protein families. However, no comprehensive resource of completed proteomes is currently available. Instead, the proteomes need to be downloaded manually from different servers, all using different filename conventions and FASTA header formats.

    Results: We have developed a semi-automatic algorithm that retrieves complete proteomes from multiple FTP servers and maps the species-specific sequence entries to the NCBI taxonomy. The compiled data is provided in a sequence database named genomeLKPG.

    Conclusions: The usefulness of genomeLKPG is proven in several published taxonomical studies.

  • 20.
    Bresell, Anders
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics . Linköping University, The Institute of Technology.
    Characterization of protein families, sequence patterns, and functional annotations in large data sets. 2008. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Bioinformatics involves storing, analyzing and making predictions on massive amounts of protein and nucleotide sequence data. The thesis consists of six papers and is focused on proteins. It describes the utilization of bioinformatics techniques to characterize protein families and to detect patterns in gene expression and in polypeptide occurrences. Two protein families were bioinformatically characterized - the membrane associated proteins in eicosanoid and glutathione metabolism (MAPEG) and the Tripartite motif (TRIM) protein families.

    In the study of the MAPEG super-family, application of different bioinformatic methods made it possible to characterize many new members, leading to a doubling of the family size. Furthermore, the MAPEG members were subdivided into families. Remarkably, in six families with previously predominantly mammalian members, fish representatives were now also detected, which dated the origin of these families back to the Cambrian "species explosion", thus earlier than previously anticipated. Sequence comparisons made it possible to define diagnostic sequence patterns that can be used in genome annotations. Upon publication of several MAPEG structures, these patterns were confirmed to be part of the active sites.

    In the TRIM study, the bioinformatic analyses made it possible to subdivide the proteins into three subtypes and to characterize a large number of members. In addition, the analyses showed crucial structural dependencies between the RING and the B-box domains of the TRIM member Ro52. The linker region between the two domains, denoted RBL, is known to be disease associated. Now, an amphipathic helix was found to be a characteristic feature of the RBL region, which also was used to divide the family into three subtypes.

    The ontology annotation treebrowser (OAT) tool was developed to detect functional similarities or common concepts in long lists of proteins or genes, typically generated from proteomics or microarray experiments. OAT was the first annotation browser to include both Gene Ontology (GO) and Medical Subject Headings (MeSH) into the same framework. The complementarity of these two ontologies was demonstrated. OAT was used in the TRIM study to detect differences in functional annotations between the subtypes.

    In the oligopeptide study, we investigated pentapeptide patterns that were over- or under-represented in the current de facto standard database of protein knowledge and a set of completed genomes, compared to what could be expected from amino acid compositions. We found three predominant categories of patterns: (i) patterns originating from frequently occurring families, e.g. respiratory chain-associated proteins and translation machinery proteins; (ii) proteins with structurally and/or functionally favored patterns; (iii) multicopy species-specific retrotransposons, only found in the genome set. Such patterns may influence amino acid residue based prediction algorithms. These findings in the oligopeptide study were utilized for development of a new method that detects translated introns in unverified protein predictions, which are available in great numbers due to the many completed and ongoing genome projects.
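
The comparison described above, observed pentapeptide counts against what amino acid composition alone would predict, can be illustrated with a short sketch. It assumes the simplest possible background model, in which the expected count of a pattern is the number of windows multiplied by the product of its residue frequencies. The thesis may well use a different model, and the function name `kmer_enrichment` is hypothetical.

```python
from collections import Counter
from math import prod


def kmer_enrichment(sequences, k=5):
    """Return {pattern: observed / expected} for every k-mer seen in the data.

    Expected counts assume independent residues drawn from the overall
    amino acid composition (an assumed null model).
    """
    aa_counts = Counter()
    kmer_counts = Counter()
    for seq in sequences:
        aa_counts.update(seq)
        kmer_counts.update(seq[i:i + k] for i in range(len(seq) - k + 1))

    total_aa = sum(aa_counts.values())
    total_windows = sum(kmer_counts.values())
    freq = {aa: count / total_aa for aa, count in aa_counts.items()}

    ratios = {}
    for kmer, observed in kmer_counts.items():
        expected = total_windows * prod(freq[aa] for aa in kmer)
        ratios[kmer] = observed / expected
    return ratios


if __name__ == "__main__":
    # Toy input with k=2 so the arithmetic is easy to follow by hand.
    for pattern, ratio in sorted(kmer_enrichment(["ACACACACAC"], k=2).items()):
        print(pattern, round(ratio, 3))
```

Ratios above 1 mark over-represented patterns and ratios below 1 under-represented ones. Patterns that never occur are absent from the result, so detecting complete avoidance would require enumerating all possible k-mers separately.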

    A new comprehensive database of protein sequences from completed genomes was developed, denoted genomeLKPG. This database was of central importance in the MAPEG, TRIM and oligopeptide studies. The new sequence database has also been proven useful in several other studies.

    List of papers
    1. Bioinformatic and enzymatic characterization of the MAPEG superfamily
    2005 (English). In: The FEBS Journal, ISSN 1742-464X, E-ISSN 1742-4658, Vol. 272, no 7, p. 1688-1703. Article in journal (Refereed). Published
    Abstract [en]

    The membrane associated proteins in eicosanoid and glutathione metabolism (MAPEG) superfamily includes structurally related membrane proteins with diverse functions of widespread origin. A total of 136 proteins belonging to the MAPEG superfamily were found in database and genome screenings. The members were found in prokaryotes and eukaryotes, but not in any archaeal organism. Multiple sequence alignments and calculations of evolutionary trees revealed a clear subdivision of the eukaryotic MAPEG members, corresponding to the six families of microsomal glutathione transferases (MGST) 1, 2 and 3, leukotriene C4 synthase (LTC4), 5-lipoxygenase activating protein (FLAP), and prostaglandin E synthase. Prokaryotes contain at least two distinct potential ancestral subfamilies, of which one is unique, whereas the other most closely resembles enzymes that belong to the MGST2/FLAP/LTC4 synthase families. The insect members are most similar to MGST1/prostaglandin E synthase. With the new data available, we observe that fish enzymes are present in all six families, showing an early origin for MAPEG family differentiation. Thus, the evolutionary origins and relationships of the MAPEG superfamily can be defined, including distinct sequence patterns characteristic for each of the subfamilies. We have further investigated and functionally characterized representative gene products from Escherichia coli, Synechocystis sp., Arabidopsis thaliana and Drosophila melanogaster, and the fish liver enzyme, purified from pike (Esox lucius). Protein overexpression and enzyme activity analysis demonstrated that all proteins catalyzed the conjugation of 1-chloro-2,4-dinitrobenzene with reduced glutathione. The E. coli protein displayed glutathione transferase activity of 0.11 µmol·min⁻¹·mg⁻¹ in the membrane fraction from bacteria overexpressing the protein. Partial purification of the Synechocystis sp. protein yielded an enzyme of the expected molecular mass and an N-terminal amino acid sequence that was at least 50% pure, with a specific activity towards 1-chloro-2,4-dinitrobenzene of 11 µmol·min⁻¹·mg⁻¹. Yeast microsomes expressing the Arabidopsis enzyme showed an activity of 0.02 µmol·min⁻¹·mg⁻¹, whereas the Drosophila enzyme expressed in E. coli was highly active at 3.6 µmol·min⁻¹·mg⁻¹. The purified pike enzyme is the most active MGST described so far with a specific activity of 285 µmol·min⁻¹·mg⁻¹. Drosophila and pike enzymes also displayed glutathione peroxidase activity towards cumene hydroperoxide (0.4 and 2.2 µmol·min⁻¹·mg⁻¹, respectively). Glutathione transferase activity can thus be regarded as a common denominator for a majority of MAPEG members throughout the kingdoms of life whereas glutathione peroxidase activity occurs in representatives from the MGST1, 2 and 3 and PGES subfamilies.

    Keywords
    MAPEG, microsomal glutathione transferase, prostaglandin, leukotriene
    National Category
    Natural Sciences
    Identifiers
    urn:nbn:se:liu:diva-12886 (URN), 10.1111/j.1742-4658.2005.04596.x (DOI)
    Available from: 2008-01-28 Created: 2008-01-28 Last updated: 2017-12-14. Bibliographically approved
    2. The fellowship of the RING: The RING-B-box linker region (RBL) interacts with the RING in TRIM21/Ro52, contributes to an autoantigenic epitope in Sjögren's syndrome, and is an integral and conserved region in TRIM proteins
    2008 (English). In: Journal of Molecular Biology, ISSN 0022-2836, E-ISSN 1089-8638, Vol. 377, no 2, p. 431-449. Article in journal (Refereed). Published
    Abstract [en]

    Ro52 is a major autoantigen that is targeted in the autoimmune disease Sjögren syndrome and belongs to the tripartite motif (TRIM) protein family. Disease-related antigenic epitopes are mainly found in the coiled-coil domain of Ro52, but one such epitope is located in the Zn2+-binding region, which comprises an N-terminal RING followed by a B-box, separated by a ∼40-residue linker peptide. In the present study, we extend the structural, biophysical, and immunological knowledge of this RING-B-box linker (RBL) by employing an array of methods. Our bioinformatic investigations show that the RBL sequence motif is unique to TRIM proteins and can be classified into three distinct subtypes. The RBL regions of all three subtypes are as conserved as their known flanking domains, and all are predicted to comprise an amphipathic helix. This helix formation is confirmed by circular dichroism spectroscopy and is dependent on the presence of the RING. Immunological studies show that the RBL is part of a conformation-dependent epitope, and its antigenicity is likewise dependent on a structured RING domain. Recombinant Ro52 RING-RBL exists as a monomer in vitro, and binding of two Zn2+ increases its stability. Regions stabilized by Zn2+ binding are identified by limited proteolysis and matrix-assisted laser desorption/ionization mass spectrometry. Furthermore, the residues of the RING and linker that interact with each other are identified by analysis of protection patterns, which, together with bioinformatic and biophysical data, enabled us to propose a structural model of the RING-RBL based on modeling and docking experiments. Sequence similarities and evolutionary sequence patterns suggest that the results obtained from Ro52 are extendable to the entire TRIM protein family.

    Keywords
    Ro52; TRIM21; RING; linker; zinc binding
    National Category
    Natural Sciences
    Identifiers
    urn:nbn:se:liu:diva-12887 (URN), 10.1016/j.jmb.2008.01.005 (DOI)
    Available from: 2008-01-28 Created: 2008-01-28 Last updated: 2017-12-14. Bibliographically approved
    3. Ontology annotation treebrowser: an interactive tool where the complementarity of medical subject headings and gene ontology improves the interpretation of gene lists
    2006 (English). In: Applied Bioinformatics, ISSN 1175-5636, Vol. 5, no 4, p. 225-236. Article in journal (Refereed). Published
    Abstract [en]

    Gene expression and proteomics analysis allow the investigation of thousands of biomolecules in parallel. This results in a long list of interesting genes or proteins and a list of annotation terms in the order of thousands. It is not a trivial task to understand such a gene list and it would require extensive efforts to bring together the overwhelming amounts of associated information from the literature and databases. Thus, it is evident that we need ways of condensing and filtering this information. An excellent way to represent knowledge is to use ontologies, where it is possible to group genes or terms with overlapping context, rather than studying one-dimensional lists of keywords. Therefore, we have built the ontology annotation treebrowser (OAT) to represent, condense, filter and summarise the knowledge associated with a list of genes or proteins.

    The OAT system consists of two separate parts: a MySQL® database named OATdb, and a treebrowser engine that is implemented as a web interface. The OAT system is implemented using Perl scripts on an Apache web server, and the gene, ontology and annotation data are stored in a relational MySQL® database. In OAT, we have harmonized the two ontologies of medical subject headings (MeSH) and gene ontology (GO), to enable us to use knowledge both from the literature and the annotation projects in the same tool. OAT includes multiple gene identifier sets, which are merged internally in the OAT database. We have also generated novel MeSH annotations by mapping accession numbers to MEDLINE entries.

    The ontology browser OAT was created to facilitate the analysis of gene lists. It can be browsed dynamically, so that a scientist can interact with the data and govern the outcome. Test statistics show which branches are enriched. We also show that the two ontologies complement each other, with surprisingly low overlap, by mapping annotations to the Unified Medical Language System®.

    We have developed a novel interactive annotation browser that is the first to incorporate both MeSH and GO for improved interpretation of gene lists. With OAT, we illustrate the benefits of combining MeSH and GO for understanding gene lists. OAT is available as a public web service at: http://www.ifm.liu.se/bioinfo/oat

    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-12888 (URN)
    Available from: 2008-01-28 Created: 2008-01-28 Last updated: 2009-11-07 Bibliographically approved
    4. Characterization of oligopeptide patterns in large protein sets
    2007 (English). In: BMC Genomics, ISSN 1471-2164, E-ISSN 1471-2164, Vol. 8, no 346, p. 1-15. Article in journal (Refereed) Published
    Abstract [en]

    Background: Recent sequencing projects and the growth of sequence data banks enable oligopeptide patterns to be characterized on a genome or kingdom level. Several studies have focused on kingdom or habitat classifications based on the abundance of short peptide patterns. There have also been efforts at local structural prediction based on short sequence motifs. Oligopeptide patterns undoubtedly carry valuable information content. Therefore, it is important to characterize these informational peptide patterns to shed light on possible new applications and the pitfalls implicit in neglecting bias in peptide patterns.

    Results: We have studied four classes of pentapeptide patterns (designated POP, NEP, ORP and URP) in the kingdoms archaea, bacteria and eukaryotes. POP are highly abundant patterns statistically not expected to exist; NEP are patterns that do not exist but are statistically expected to; ORP are patterns unique to a kingdom; and URP are patterns excluded from a kingdom. We used two data sources: the de facto standard of protein knowledge Swiss-Prot, and a set of 386 completely sequenced genomes. For each class of peptides we looked at the 100 most extreme and found both known and unknown sequence features. Most of the known sequence motifs can be explained on the basis of the protein families from which they originate.

    Conclusion: We find an inherent bias of certain oligopeptide patterns in naturally occurring proteins that cannot be explained solely on the basis of residue distribution in single proteins, kingdoms or databases. We see three predominant categories of patterns: (i) patterns widespread in a kingdom such as those originating from respiratory chain-associated proteins and translation machinery; (ii) proteins with structurally and/or functionally favored patterns, which have not yet been ascribed this role; (iii) multicopy species-specific retrotransposons, only found in the genome set. These categories will affect the accuracy of sequence pattern algorithms that rely mainly on amino acid residue usage. Methods presented in this paper may be used to discover targets for antibiotics, as we identify numerous examples of kingdom-specific antigens among our peptide classes. The methods may also be useful for detecting coding regions of genes.
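    The pattern classes described in this abstract can be illustrated with a minimal sketch. It assumes that the expected count of a pentapeptide is derived from independent residue frequencies, which is one possible background model and not necessarily the one used in the paper; the function names `kmer_counts`, `expected_counts` and `kingdom_unique` are hypothetical.

```python
from collections import Counter
from math import prod


def kmer_counts(proteins, k=5):
    """Count every overlapping k-residue pattern in a list of sequences."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def expected_counts(proteins, kmers, k=5):
    """Expected count per pattern if residues occurred independently.

    Assumption: a simple residue-frequency background model.
    """
    residues = Counter("".join(proteins))
    total = sum(residues.values())
    freq = {aa: n / total for aa, n in residues.items()}
    windows = sum(max(len(s) - k + 1, 0) for s in proteins)
    return {
        kmer: windows * prod(freq.get(aa, 0.0) for aa in kmer)
        for kmer in kmers
    }


def kingdom_unique(counts_by_kingdom):
    """Patterns observed in exactly one kingdom (ORP-like patterns)."""
    unique = {}
    for name, counts in counts_by_kingdom.items():
        seen_elsewhere = set()
        for other, other_counts in counts_by_kingdom.items():
            if other != name:
                seen_elsewhere |= set(other_counts)
        unique[name] = set(counts) - seen_elsewhere
    return unique
```

    Comparing observed with expected counts gives a ranking in which strongly over-represented patterns correspond to POP-like candidates, and patterns with a high expected count but zero observations correspond to NEP-like candidates.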

    National Category
    Natural Sciences
    Identifiers
    urn:nbn:se:liu:diva-12889 (URN) 10.1186/1471-2164-8-346 (DOI)
    Available from: 2008-01-28 Created: 2008-01-28 Last updated: 2017-12-14 Bibliographically approved
    5. Using SVM and tripeptide patterns to detect translated introns
    2007 (English). In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105. Article in journal (Refereed) Submitted
    National Category
    Natural Sciences
    Identifiers
    urn:nbn:se:liu:diva-12890 (URN)
    Available from: 2008-01-28 Created: 2008-01-28 Last updated: 2017-12-14
    6. GenomeLKPG: A comprehensive proteome sequence database for taxonomy studies
    2008 (English). Article in journal (Refereed) Submitted
    Abstract [en]

    Background: In order to perform taxonomically unbiased analyses of protein relationships, there is a need of complete proteomes rather than databases with bias towards well characterized protein families. However, no comprehensive resource of completed proteomes is currently available. Instead, the proteomes need to be downloaded manually from different servers, all using different filename conventions and fasta header formats.

    Results: We have developed a semi-automatic algorithm that retrieves complete proteomes from multiple FTP-servers and maps the species-specific sequence entries to the NCBI taxonomy. The compiled data is provided in a sequence database named genomeLKPG.

    Conclusions: The usefulness of genomeLKPG is proven in several published taxonomical studies.

    National Category
    Natural Sciences
    Identifiers
    urn:nbn:se:liu:diva-52933 (URN)
    Available from: 2010-01-13 Created: 2010-01-13 Last updated: 2010-01-13
  • 21.
    Bresell, Anders
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Persson, Bengt
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Characterization of oligopeptide patterns in large protein sets
    2007. In: BMC Genomics, ISSN 1471-2164, E-ISSN 1471-2164, Vol. 8, no 346, p. 1-15. Article in journal (Refereed)
    Abstract [en]

    Background: Recent sequencing projects and the growth of sequence data banks enable oligopeptide patterns to be characterized on a genome or kingdom level. Several studies have focused on kingdom or habitat classifications based on the abundance of short peptide patterns. There have also been efforts at local structural prediction based on short sequence motifs. Oligopeptide patterns undoubtedly carry valuable information content. Therefore, it is important to characterize these informational peptide patterns to shed light on possible new applications and the pitfalls implicit in neglecting bias in peptide patterns.

    Results: We have studied four classes of pentapeptide patterns (designated POP, NEP, ORP and URP) in the kingdoms archaea, bacteria and eukaryotes. POP are highly abundant patterns statistically not expected to exist; NEP are patterns that do not exist but are statistically expected to; ORP are patterns unique to a kingdom; and URP are patterns excluded from a kingdom. We used two data sources: the de facto standard of protein knowledge Swiss-Prot, and a set of 386 completely sequenced genomes. For each class of peptides we looked at the 100 most extreme and found both known and unknown sequence features. Most of the known sequence motifs can be explained on the basis of the protein families from which they originate.

    Conclusion: We find an inherent bias of certain oligopeptide patterns in naturally occurring proteins that cannot be explained solely on the basis of residue distribution in single proteins, kingdoms or databases. We see three predominant categories of patterns: (i) patterns widespread in a kingdom such as those originating from respiratory chain-associated proteins and translation machinery; (ii) proteins with structurally and/or functionally favored patterns, which have not yet been ascribed this role; (iii) multicopy species-specific retrotransposons, only found in the genome set. These categories will affect the accuracy of sequence pattern algorithms that rely mainly on amino acid residue usage. Methods presented in this paper may be used to discover targets for antibiotics, as we identify numerous examples of kingdom-specific antigens among our peptide classes. The methods may also be useful for detecting coding regions of genes.

  • 22.
    Bresell, Anders
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Persson, Bengt
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Using SVM and tripeptide patterns to detect translated introns
    2007. In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105. Article in journal (Refereed)
  • 23.
    Bresell, Anders
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Servenius, Bo
    Biological Sciences, AstraZeneca R&D Lund, Sweden.
    Persson, Bengt
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Ontology annotation treebrowser: an interactive tool where the complementarity of medical subject headings and gene ontology improves the interpretation of gene lists
    2006. In: Applied Bioinformatics, ISSN 1175-5636, Vol. 5, no 4, p. 225-236. Article in journal (Refereed)
    Abstract [en]

    Gene expression and proteomics analysis allow the investigation of thousands of biomolecules in parallel. This results in a long list of interesting genes or proteins and a list of annotation terms in the order of thousands. It is not a trivial task to understand such a gene list and it would require extensive efforts to bring together the overwhelming amounts of associated information from the literature and databases. Thus, it is evident that we need ways of condensing and filtering this information. An excellent way to represent knowledge is to use ontologies, where it is possible to group genes or terms with overlapping context, rather than studying one-dimensional lists of keywords. Therefore, we have built the ontology annotation treebrowser (OAT) to represent, condense, filter and summarise the knowledge associated with a list of genes or proteins.

    The OAT system consists of two separate parts: a MySQL® database named OATdb, and a treebrowser engine that is implemented as a web interface. The OAT system is implemented using Perl scripts on an Apache web server, and the gene, ontology and annotation data are stored in a relational MySQL® database. In OAT, we have harmonized the two ontologies of medical subject headings (MeSH) and gene ontology (GO), to enable us to use knowledge both from the literature and the annotation projects in the same tool. OAT includes multiple gene identifier sets, which are merged internally in the OAT database. We have also generated novel MeSH annotations by mapping accession numbers to MEDLINE entries.

    The ontology browser OAT was created to facilitate the analysis of gene lists. It can be browsed dynamically, so that a scientist can interact with the data and govern the outcome. Test statistics show which branches are enriched. We also show that the two ontologies complement each other, with surprisingly low overlap, by mapping annotations to the Unified Medical Language System®.

    We have developed a novel interactive annotation browser that is the first to incorporate both MeSH and GO for improved interpretation of gene lists. With OAT, we illustrate the benefits of combining MeSH and GO for understanding gene lists. OAT is available as a public web service at: http://www.ifm.liu.se/bioinfo/oat

  • 24.
    Bresell, Anders
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Weinander, Rolf
    Department of Medicine, Division of Rheumatology Unit, Karolinska Institutet, Stockholm.
    Wiklund, Ronney
    Department of Plant Biology & Forestry Genetics, Swedish Agricultural University, Uppsala.
    Eriksson, Jan
    Department of Plant Biology & Forestry Genetics, Swedish Agricultural University, Uppsala.
    Jansson, Christer
    Department of Plant Biology & Forestry Genetics, Swedish Agricultural University, Uppsala.
    Persson, Bengt
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Jakobsson, Per-Johan
    Department of Medicine, Division of Rheumatology Unit, Karolinska Institutet, Stockholm.
    Morgenstern, Ralf
    Institute of Environmental Medicine Karolinska Institutet, Stockholm.
    Lundqvist, Gerd
    Institute of Environmental Medicine Karolinska Institutet, Stockholm.
    Raza, Haider
    Institute of Environmental Medicine Karolinska Institutet, Stockholm.
    Shimoji, Miyuki
    Institute of Environmental Medicine Karolinska Institutet, Stockholm.
    Sun, Tie-Hua
    Institute of Environmental Medicine Karolinska Institutet, Stockholm.
    Balk, Lennart
    Stockholm Marine Research Centre, University of Stockholm.
    Bioinformatic and enzymatic characterization of the MAPEG superfamily
    2005. In: The FEBS Journal, ISSN 1742-464X, E-ISSN 1742-4658, Vol. 272, no 7, p. 1688-1703. Article in journal (Refereed)
    Abstract [en]

    The membrane associated proteins in eicosanoid and glutathione metabolism (MAPEG) superfamily includes structurally related membrane proteins with diverse functions of widespread origin. A total of 136 proteins belonging to the MAPEG superfamily were found in database and genome screenings. The members were found in prokaryotes and eukaryotes, but not in any archaeal organism. Multiple sequence alignments and calculations of evolutionary trees revealed a clear subdivision of the eukaryotic MAPEG members, corresponding to the six families of microsomal glutathione transferases (MGST) 1, 2 and 3, leukotriene C4 synthase (LTC4), 5-lipoxygenase activating protein (FLAP), and prostaglandin E synthase. Prokaryotes contain at least two distinct potential ancestral subfamilies, of which one is unique, whereas the other most closely resembles enzymes that belong to the MGST2/FLAP/LTC4 synthase families. The insect members are most similar to MGST1/prostaglandin E synthase. With the new data available, we observe that fish enzymes are present in all six families, showing an early origin for MAPEG family differentiation. Thus, the evolutionary origins and relationships of the MAPEG superfamily can be defined, including distinct sequence patterns characteristic for each of the subfamilies. We have further investigated and functionally characterized representative gene products from Escherichia coli, Synechocystis sp., Arabidopsis thaliana and Drosophila melanogaster, and the fish liver enzyme, purified from pike (Esox lucius). Protein overexpression and enzyme activity analysis demonstrated that all proteins catalyzed the conjugation of 1-chloro-2,4-dinitrobenzene with reduced glutathione. The E. coli protein displayed glutathione transferase activity of 0.11 µmol·min−1·mg−1 in the membrane fraction from bacteria overexpressing the protein. Partial purification of the Synechocystis sp. protein yielded an enzyme of the expected molecular mass and an N-terminal amino acid sequence that was at least 50% pure, with a specific activity towards 1-chloro-2,4-dinitrobenzene of 11 µmol·min−1·mg−1. Yeast microsomes expressing the Arabidopsis enzyme showed an activity of 0.02 µmol·min−1·mg−1, whereas the Drosophila enzyme expressed in E. coli was highly active at 3.6 µmol·min−1·mg−1. The purified pike enzyme is the most active MGST described so far with a specific activity of 285 µmol·min−1·mg−1. Drosophila and pike enzymes also displayed glutathione peroxidase activity towards cumene hydroperoxide (0.4 and 2.2 µmol·min−1·mg−1, respectively). Glutathione transferase activity can thus be regarded as a common denominator for a majority of MAPEG members throughout the kingdoms of life whereas glutathione peroxidase activity occurs in representatives from the MGST1, 2 and 3 and PGES subfamilies.

  • 25.
    Bunkoczi, Gabor
    et al.
    University of Cambridge, England.
    Wallner, Björn
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Read, Randy J.
    University of Cambridge, England.
    Local Error Estimates Dramatically Improve the Utility of Homology Models for Solving Crystal Structures by Molecular Replacement
    2015. In: Structure, ISSN 0969-2126, E-ISSN 1878-4186, Vol. 23, no 2, p. 397-406. Article in journal (Refereed)
    Abstract [en]

    Predicted structures submitted for CASP10 have been evaluated as molecular replacement models against the corresponding sets of structure factor amplitudes. It has been found that the log-likelihood gain score computed for each prediction correlates well with common structure quality indicators but is more sensitive when the accuracy of the models is high. In addition, it was observed that using coordinate error estimates submitted by predictors to weight the model can improve its utility in molecular replacement dramatically, and several groups have been identified who reliably provide accurate error estimates that could be used to extend the application of molecular replacement for low-homology cases.

  • 26.
    Bzhalava, David
    et al.
    Karolinska Institutet and Karolinska University Hospital, Stockholm.
    Ekström, Johanna
    Lund University, Malmö.
    Lysholm, Fredrik
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Hultin, Emilie
    Karolinska Institutet and Karolinska University Hospital, Stockholm.
    Faust, Helena
    Lund University, Malmö.
    Persson, Bengt
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Lehtinen, Matti
    National Institute for Health and Welfare, Oulu, Finland.
    de Villiers, Ethel-Michele
    Deutsches Krebsforschungszentrum, Heidelberg, Germany.
    Dillner, Joakim
    Karolinska Institutet and Karolinska University Hospital, Stockholm.
    Phylogenetically diverse TT virus viremia among pregnant women
    2012. In: Virology, ISSN 0042-6822, E-ISSN 1096-0341, Vol. 432, no 2, p. 427-434. Article in journal (Refereed)
    Abstract [en]

    Infections during pregnancy have been suggested to be involved in childhood leukemias. We used high-throughput sequencing to describe the viruses most readily detectable in serum samples of pregnant women. Serum DNA of 112 mothers of leukemic children was amplified using whole genome amplification. Sequencing identified one TT virus (TTV) isolate belonging to a known type and two putatively new TTVs. For 22 mothers, we also performed TTV amplification by general primer PCR before sequencing. This detected 39 TTVs, two of which were identical to the TTVs found after whole genome amplification.

    Altogether, we found 40 TTV isolates, 29 of which were putatively new types (similarities ranging from 89% to 69%). In conclusion, high-throughput sequencing is useful to describe the known or unknown viruses that are present in serum samples of pregnant women.

  • 27.
    Bzhalava, Davit
    et al.
    Karolinska Institutet, Stockholm, Sweden.
    Johansson, Hanna
    Lund University, Malmö, Sweden.
    Ekstrom, Johanna
    Karolinska Institutet, Stockholm, Sweden.
    Faust, Helena
    Lund University, Malmö, Sweden.
    Moller, Birgitta
    Karolinska Institutet, Stockholm, Sweden.
    Eklund, Carina
    Karolinska Institutet, Stockholm, Sweden.
    Nordin, Peter
    Läkarhuset, Gothenburg, Sweden.
    Stenquist, Bo
    University of Gothenburg, Sweden.
    Paoli, John
    University of Gothenburg, Sweden.
    Persson, Bengt
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Forslund, Ola
    Lund University, Malmö, Sweden.
    Dillner, Joakim
    Karolinska Institutet, Stockholm, Sweden.
    Unbiased Approach for Virus Detection in Skin Lesions
    2013. In: PLoS ONE, ISSN 1932-6203, E-ISSN 1932-6203, Vol. 8, no 6. Article in journal (Refereed)
    Abstract [en]

    To assess the presence of virus DNA in skin lesions, swab samples from 82 squamous cell carcinomas of the skin (SCCs), 60 actinic keratoses (AKs), paraffin-embedded biopsies from 28 SCCs and 72 keratoacanthomas (KAs) and fresh-frozen biopsies from 92 KAs, 85 SCCs and 92 AKs were analyzed by high-throughput sequencing (HTS) using 454 or Ion Torrent technology. We found a total of 4,284 viral reads, out of which 4,168 were Human Papillomavirus (HPV)-related, belonging to 15 known (HPV8, HPV12, HPV20, HPV36, HPV38, HPV45, HPV57, HPV59, HPV104, HPV105, HPV107, HPV109, HPV124, HPV138, HPV147), four previously described putative (HPV 915 F 06 007 FD1, FA73, FA101, SE42) and two putatively new HPV types (SE46, SE47). SE42 was cloned, sequenced, designated as HPV155 and found to have 76% similarity to the most closely related known HPV type. In conclusion, an unbiased approach for viral DNA detection in skin tumors has found that, although some new putative HPVs were found, known HPV types constituted most of the viral DNA.

  • 28.
    Carlsson, Jonas
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Mutational effects on protein structure and function
    2009. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    In this thesis several important proteins are investigated from a structural perspective. Some of the proteins are disease related while others have important but not completely characterised functions. The techniques used are general as demonstrated by applications on metabolic proteins (CYP21, CYP11B1, IAPP, ADH3), regulatory proteins (p53, GDNF) and a transporter protein (ANTR1).

    When the protein CYP21 (steroid 21-hydroxylase) is deficient it causes CAH (congenital adrenal hyperplasia). For this protein, there are about 60 known mutations with characterised clinical phenotypes. Using manual structural analysis we managed to explain the severity of all but one of the mutations. By observing the properties of these mutations we could make good predictions for mutations that were unclassified at the time.

    For the cancer suppressor protein p53, there are over a thousand mutations with known activity. To be able to analyse such a large number of mutations we developed an automated method for evaluation of the mutation effect called PREDMUT. In this method we include twelve different prediction parameters including two energy parameters calculated using an energy minimization procedure. The method manages to differentiate severe mutations from non-severe mutations with 77% accuracy on all possible single base substitutions and with 88% on mutations found in breast cancer patients.

    The automated prediction was further applied to CYP11B1 (steroid 11-beta-hydroxylase), which in a similar way as CYP21 causes CAH when deficient. A generalized method applicable to any kind of globular protein was developed. The method was subsequently evaluated on nine additional proteins for which mutants were known with annotated disease phenotypes. This prediction achieved 84% accuracy on CYP11B1 and 81% accuracy in total on the evaluation proteins while leaving 8% as unclassified. By increasing the number of unclassified mutations, the accuracy on the remaining mutations of the evaluation proteins could be increased, which substantially improved the classification quality as measured by the Matthews correlation coefficient. Servers with predictions for all possible single base substitutions are provided for p53, CYP21 and CYP11B1.
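    The two quality measures named in this paragraph can be written out as a short sketch. The standard definitions of accuracy and of the Matthews correlation coefficient are used; treating a prediction of `None` as "unclassified" and excluding it from both scores is an assumption made here for illustration, and the function names are hypothetical.

```python
from math import sqrt


def confusion(predicted, actual):
    """Tally outcomes; a prediction of None counts as unclassified."""
    tp = tn = fp = fn = unclassified = 0
    for p, a in zip(predicted, actual):
        if p is None:
            unclassified += 1
        elif p and a:
            tp += 1
        elif not p and not a:
            tn += 1
        elif p and not a:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn, unclassified


def accuracy(tp, tn, fp, fn):
    """Fraction of classified mutations that were predicted correctly."""
    n = tp + tn + fp + fn
    return (tp + tn) / n if n else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, 0.0 when it is undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

    Because unclassified cases drop out of the tallies, a stricter classifier that abstains on borderline mutations can raise both scores on the remainder, at the cost of lower coverage.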

    The amyloid formation of IAPP (islet amyloid polypeptide) is strongly connected to diabetes and has been studied using both molecular dynamics and Monte Carlo energy minimization. The effects of mutations on the amount and speed of amyloid formation were investigated using three approaches. Applying a consensus of the three methods on a number of interesting mutations, 94% of the mutations could be correctly classified as amyloid forming or not, evaluated with in vitro measurements.

    In the brain there are many proteins whose functions and interactions are largely unknown. GDNF (glial cell line-derived neurotrophic factor) and NCAM (neural cell adhesion molecule) are two such neuron-connected proteins that are known to interact. The form of interaction was studied using protein-protein docking, where a docking interface was found, mediated by four oppositely charged residues in each protein. This interface was subsequently confirmed by mutagenesis experiments. The NCAM dimer interface upon binding to the GDNF dimer was also mapped as well as an additional interacting protein, GFRα1, which was successfully added to the protein complex without any clashes.

    A large and well studied protein family is the alcohol dehydrogenase family, ADH. A class of this family is ADH3 (alcohol dehydrogenase class III) that has several known substrates and inhibitors. By using virtual screening we tried to characterize new ligands. As some ligands were already known we could incorporate this knowledge when the compound docking simulations were scored and thereby find two new substrates and two new inhibitors which were subsequently successfully tested in vitro.

    ANTR1 (anion transporter 1) is a membrane bound transporter important in the photosynthesis in plants. To be able to study the amino acid residues involved in inorganic phosphate transportation a homology model of the protein was created. Important residues were then mapped onto the structure using conservation analysis and we were in this way able to propose roles of amino acid residues involved in the transportation of inorganic phosphate. Key residues were subsequently mutated in vitro and a transportation process could be postulated.

    To conclude, we have used several molecular modelling techniques to find functional clues, interaction sites and new ligands. Furthermore, we have investigated the effect of mutations on the function and structure of a multitude of disease related proteins.

     

    List of papers
    1. Molecular Model of Human CYP21 Based on Mammalian CYP2C5: Structural Features Correlate with Clinical Severity of Mutations Causing Congenital Adrenal Hyperplasia
    2006 (English). In: Molecular Endocrinology, ISSN 0888-8809, E-ISSN 1944-9917, Vol. 20, no 11, p. 2946-2964. Article in journal (Refereed) Published
    Abstract [en]

    Enhanced understanding of structure-function relationships of human 21-hydroxylase, CYP21, is required to better understand the molecular causes of congenital adrenal hyperplasia. To this end, a structural model of human CYP21 was calculated based on the crystal structure of rabbit CYP2C5. All but two known allelic variants of missense type, a total of 60 disease-causing mutations and six normal variants, were analyzed using this model. A structural explanation for the corresponding phenotype was found for all but two mutants for which available clinical data are also discrepant with in vitro enzyme activity. Calculations of protein stability of modeled mutants were found to correlate inversely with the corresponding clinical severity. Putative structurally important residues were identified to be involved in heme and substrate binding, redox partner interaction, and enzyme catalysis using docking calculations and analysis of structurally determined homologous cytochrome P450s (CYPs). Functional and structural consequences of seven novel mutations, V139E, C147R, R233G, T295N, L308F, R366C, and M473I, detected in Scandinavian patients with suspected congenital adrenal hyperplasia of different severity, were predicted using molecular modeling. Structural features deduced from the models are in good correlation with clinical severity of CYP21 mutants, which shows the applicability of a modeling approach in assessment of new CYP21 mutations.

    Place, publisher, year, edition, pages
    Stanford: The Endocrine Society, 2006
    Keywords
    Mutations, prediction, CAH, CYP21, homology model
    National Category
    Bioinformatics and Systems Biology
    Identifiers
    urn:nbn:se:liu:diva-21305 (URN) 10.1210/me.2006-0172 (DOI)
    Available from: 2009-09-30 Created: 2009-09-30 Last updated: 2017-12-13 Bibliographically approved
    2. Investigation and prediction of the severity of p53 mutants using parameters from structural calculations
    2009 (English). In: The FEBS Journal, ISSN 1742-464X, E-ISSN 1742-4658, Vol. 276, no 15, p. 4142-4155. Article in journal (Refereed) Published
    Abstract [en]

    A method has been developed to predict the effects of mutations in the p53 cancer suppressor gene. The new method uses novel parameters combined with previously established parameters. The most important parameter is the stability measure of the mutated structure calculated using molecular modelling. For each mutant, a severity score is reported, which can be used for classification into deleterious and nondeleterious. Both structural features and sequence properties are taken into account. The method has a prediction accuracy of 77% on all mutants and 88% on breast cancer mutations affecting WAF1 promoter binding. When compared with earlier methods, using the same dataset, our method clearly performs better. As a result of the severity score calculated for every mutant, valuable knowledge can be gained regarding p53, a protein that is believed to be involved in over 50% of all human cancers.

    Keywords
    Cancer; molecular modelling; mutations; p53; structural prediction
    National Category
    Medical and Health Sciences
    Identifiers
    urn:nbn:se:liu:diva-20141 (URN) 10.1111/j.1742-4658.2009.07124.x (DOI)
    Available from: 2009-08-31 Created: 2009-08-31 Last updated: 2017-12-13 Bibliographically approved
    3. A structural model of human steroid 11-beta-hydroxylase, CYP11B1, used to predict consequences of mutations
    2009 (English). Article in journal (Other academic) Submitted
    Abstract [en]

    A prediction method has been developed to estimate the severity of amino acid residue exchanges in human steroid 11-beta-hydroxylase, CYP11B1, due to mutations in the corresponding gene. The prediction is based both on structural and on sequence dependent parameters. The method uses two approaches; one with general molecular property weights and one with a consensus voting strategy based upon distribution of molecular properties, which does not require any training. Both methods are tested on known mutations in CYP11B1 and result in 85% prediction accuracy. The consensus voting method is then further evaluated on 9 proteins with an average of 81% prediction accuracy. A server utilizing the results from the consensus voting on CYP11B1 is provided where the user can extract information about new mutants. A similar server is also provided for mutants in human steroid 21-hydroxylase (CYP21).
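    A consensus voting scheme of the general kind mentioned in this abstract can be sketched as follows. This is an illustration only: the parameter names, the cutoffs and the rule that a parameter votes "severe" when its value exceeds the cutoff are hypothetical and are not taken from the paper. Returning `None` when the vote margin is too small corresponds to leaving a mutation unclassified.

```python
def consensus_vote(values, thresholds, min_margin=1):
    """Classify a mutation by majority vote over several parameters.

    values:     parameter name -> value computed for the mutant
    thresholds: parameter name -> cutoff above which the vote is "severe"
    min_margin: smallest vote difference accepted as a decision

    Returns True (severe), False (non-severe) or None (unclassified).
    """
    severe = sum(1 for name, cutoff in thresholds.items()
                 if values[name] > cutoff)
    benign = len(thresholds) - severe
    if abs(severe - benign) < min_margin:
        return None
    return severe > benign


# Hypothetical parameters and cutoffs, for illustration only.
thresholds = {"stability_change": 2.0, "conservation": 0.8, "buried_fraction": 0.5}
mutant = {"stability_change": 3.1, "conservation": 0.9, "buried_fraction": 0.2}
print(consensus_vote(mutant, thresholds))                # two of three vote severe
print(consensus_vote(mutant, thresholds, min_margin=2))  # margin too small
```

    No weights are fitted in such a scheme, which is consistent with the statement that the voting approach does not require training; raising `min_margin` trades coverage for confidence.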

    Keywords
    CYP11B1, steroid 11-beta-hydroxylase, molecular modeling, structural prediction, mutations
    National Category
    Natural Sciences
    Identifiers
    urn:nbn:se:liu:diva-51118 (URN)
    Available from: 2009-10-19 Created: 2009-10-19 Last updated: 2009-10-19 Bibliographically approved
    4. Disruption of the GDNF Binding Site in NCAM Dissociates Ligand Binding and Homophilic Cell Adhesion
    2007 (English) In: Journal of Biological Chemistry, ISSN 0021-9258, E-ISSN 1083-351X, Vol. 282, no 17, p. 12734-12740 Article in journal (Refereed) Published
    Abstract [en]

    Most plasma membrane proteins are capable of sensing multiple cell-cell and cell-ligand interactions, but the extent to which this functional versatility is founded on their modular design is less clear. We have identified the third immunoglobulin domain of the Neural Cell Adhesion Molecule (NCAM) as the necessary and sufficient determinant for its interaction with Glial Cell Line-derived Neurotrophic Factor (GDNF). Four charged contacts were identified by molecular modeling as the main contributors to binding energy. Their mutation abolished GDNF binding to NCAM but left intact the ability of NCAM to mediate cell adhesion, indicating that the two functions are genetically separable. The GDNF-NCAM interface allows complex formation with the GDNF family receptor α1, shedding light on the molecular architecture of a multicomponent GDNF receptor.

    Place, publisher, year, edition, pages
    Bethesda, MD: American Society for Biochemistry and Molecular Biology, 2007
    Keywords
    homology model, protein complex, interaction interface, mutagenesis
    National Category
    Bioinformatics and Systems Biology
    Identifiers
    urn:nbn:se:liu:diva-21306 (URN) 10.1074/jbc.M701588200 (DOI)
    Available from: 2009-09-30 Created: 2009-09-30 Last updated: 2017-12-13 Bibliographically approved
    5. Functionally Important Amino Acids in the Arabidopsis Thylakoid Phosphate Transporter: Homology Modeling and Site-directed Mutagenesis
    2010 (English) In: Biochemistry, ISSN 0006-2960, E-ISSN 1520-4995, Vol. 49, no 30, p. 6430-6439 Article in journal (Other academic) Published
    Abstract [en]

    The anion transporter 1 (ANTR1) from Arabidopsis thaliana, homologous to the mammalian SLC17 family, has recently been localized to the chloroplast thylakoid membrane. When expressed heterologously in Escherichia coli, ANTR1 mediates a Na+-dependent active transport of inorganic phosphate (Pi). The aim of this study was to identify amino acids involved in substrate binding/translocation by ANTR1 and in the Na+-dependence of its activity. A three-dimensional structural model of ANTR1 was constructed using the crystal structure of glycerol-3-phosphate/phosphate antiporter (GlpT) from E. coli as a template. Based on this model and multiple sequence alignments, five highly conserved residues in plant ANTRs and mammalian SLC17 homologues have been selected for site-directed mutagenesis, namely Arg-120, Ser-124 and Arg-201 inside the putative translocation pathway, and Arg-228 and Asp-382 exposed at the cytosolic surface of the protein. The activities of the wild type and mutant proteins have been analyzed using expression in E. coli and radioactive transport assays, and compared with bacterial cells carrying an empty plasmid. Based on Pi- and Na+-dependent kinetics, we propose that Arg-120, Arg-201 and Arg-228 are involved in binding and translocation of the substrate, Ser-124 functions as a periplasmic gate for Na+ ions, and finally Asp-382 participates in the turnover of the transporter via ionic interaction with either Arg-228 or Na+ ions. We also propose that the corresponding residues may have a similar function in other plant and mammalian SLC17 homologous transporters.

    National Category
    Medical and Health Sciences
    Identifiers
    urn:nbn:se:liu:diva-51119 (URN) 10.1021/bi100239j (DOI)
    Note
    On the day of the defence the status of this article was Manuscript.
    Available from: 2009-10-19 Created: 2009-10-19 Last updated: 2017-12-12 Bibliographically approved
    6. A folding study on IAPP (Islet Amyloid Polypeptide) using molecular dynamics simulations
    (English) Manuscript (preprint) (Other academic)
    Abstract [en]

    Amyloidosis is the largest group among the protein misfolding diseases, and includes well-known diseases such as Alzheimer’s disease and type 2 diabetes. In the latter, islet amyloid is present in the pancreas in almost all individuals. Today, more than 25 different proteins have been isolated from amyloid deposits in humans. Even though these proteins differ in size, charge and sequence, they all have the capacity to assemble into fibrillar structures with indistinguishable morphological appearance. Therefore, it can be assumed that the fibril formation process is based upon principles that are general for all proteins and that knowledge derived from one protein can be used for other amyloid proteins. In this paper, we study the process of amyloid formation in parts of islet amyloid polypeptide (residues 18-29 and 11-37) by analyzing mutations using three different in silico methods. Finally, we use the methods to predict the amyloidogenic properties of the native IAPP and 16 variants thereof and compare the results with in vitro measurements. Using a consensus prediction of the three methods, we managed to correctly classify all but two peptides. We have also provided further evidence for the importance of S28P for inhibiting amyloid fibre formation, found evidence for antiparallel stacking, and identified important regions for beta sheet stability.

    Keywords
    IAPP, molecular modeling, amyloid, prediction, molecular dynamics, Monte Carlo
    National Category
    Natural Sciences
    Identifiers
    urn:nbn:se:liu:diva-51120 (URN)
    Available from: 2009-10-19 Created: 2009-10-19 Last updated: 2010-01-14 Bibliographically approved
    7. Virtual screening for ligands to human alcohol dehydrogenase 3
    (English) Manuscript (preprint) (Other academic)
    Abstract [en]

    Alcohol dehydrogenase 3 (ADH3) has been suggested to play a role in nitric oxide homeostasis due to its function as an S-nitrosoglutathione (GSNO) reductase. This calls for a modulator of ADH3 activity to control GSNO levels. Today, virtual screening is frequently used in drug discovery to dock and rank large numbers of compounds. By molecular docking of more than 40,000 compounds into the active site pocket of human ADH3, we ranked compounds with a novel method. Six top-ranked compounds that were not known to interact with ADH3 were tested in vitro: two showed substrate activity (9-decen-1-ol and dodecyltetraglycol), two showed inhibition capacity (deoxycholic acid and doxorubicin) and two did not have any detectable effect. For the substrates, site-specific interactions and calculated binding scoring energies were determined with an extended docking simulation including flexible side chains of amino acid residues. The binding scoring energies correlated well with the logarithm of the substrates' kcat/Km values. Furthermore, based on these computational and experimental data, three different lines of specific inhibitors for ADH3 are suggested: fatty acids, glutathione analogs and deoxycholic acids.

    Keywords
    Alcohol dehydrogenase, Enzyme kinetics, Molecular docking, Virtual screening
    National Category
    Natural Sciences
    Identifiers
    urn:nbn:se:liu:diva-51121 (URN)
    Available from: 2009-10-19 Created: 2009-10-19 Last updated: 2010-01-14 Bibliographically approved
  • 29.
    Carlsson, Jonas
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Persson, Bengt
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Investigating protein variants using structural calculation techniques
    2012 In: Homology Modeling: Methods and Protocols / [ed] Andrew J. W. Orry and Ruben Abagyan, Springer, 2012, Vol. 857, p. 313-330 Chapter in book (Other academic)
    Abstract [en]

    Knowledge about protein tertiary structure can guide experiments, assist in the understanding of structure-function relationships, and aid the design of new therapeutics for disease. Homology modeling is an in silico method that predicts the tertiary structure of an amino acid sequence based on a homologous experimentally determined structure. In Homology Modeling: Methods and Protocols, experts in the field describe each homology modeling step from first principles, provide case studies for challenging modeling targets and describe methods for the prediction of how other molecules such as drugs can interact with the protein. Written in the highly successful Methods in Molecular Biology series format, the chapters include the kind of detailed description and implementation advice that is crucial for getting optimal results in the laboratory. Thorough and intuitive, Homology Modeling: Methods and Protocols guides scientists in the available homology modeling methods.

  • 30.
    Carlsson, Jonas
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Soussi, Thierry
    Department of Oncology-Pathology, Cancer Center Karolinska (CCK), Karolinska Institutet, Stockholm, Sweden.
    Persson, Bengt
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Investigation and prediction of the severity of p53 mutants using parameters from structural calculations
    2009 In: The FEBS Journal, ISSN 1742-464X, E-ISSN 1742-4658, Vol. 276, no 15, p. 4142-4155 Article in journal (Refereed)
    Abstract [en]

    A method has been developed to predict the effects of mutations in the p53 cancer suppressor gene. The new method uses novel parameters combined with previously established parameters. The most important parameter is the stability measure of the mutated structure calculated using molecular modelling. For each mutant, a severity score is reported, which can be used for classification into deleterious and nondeleterious. Both structural features and sequence properties are taken into account. The method has a prediction accuracy of 77% on all mutants and 88% on breast cancer mutations affecting WAF1 promoter binding. When compared with earlier methods, using the same dataset, our method clearly performs better. As a result of the severity score calculated for every mutant, valuable knowledge can be gained regarding p53, a protein that is believed to be involved in over 50% of all human cancers.

  • 31.
    Carlsson, Jonas
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Vahdat Shariatpanahi, Aida
    Schultz, Sebastian
    Linköping University, Department of Clinical and Experimental Medicine, Cell Biology. Linköping University, Faculty of Health Sciences.
    Westermark, Gunilla
    Linköping University, Department of Clinical and Experimental Medicine, Cell Biology. Linköping University, Faculty of Health Sciences.
    Persson, Bengt
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    A folding study on IAPP (Islet Amyloid Polypeptide) using molecular dynamics simulations
    Manuscript (preprint) (Other academic)
    Abstract [en]

    Amyloidosis is the largest group among the protein misfolding diseases, and includes well-known diseases such as Alzheimer’s disease and type 2 diabetes. In the latter, islet amyloid is present in the pancreas in almost all individuals. Today, more than 25 different proteins have been isolated from amyloid deposits in humans. Even though these proteins differ in size, charge and sequence, they all have the capacity to assemble into fibrillar structures with indistinguishable morphological appearance. Therefore, it can be assumed that the fibril formation process is based upon principles that are general for all proteins and that knowledge derived from one protein can be used for other amyloid proteins. In this paper, we study the process of amyloid formation in parts of islet amyloid polypeptide (residues 18-29 and 11-37) by analyzing mutations using three different in silico methods. Finally, we use the methods to predict the amyloidogenic properties of the native IAPP and 16 variants thereof and compare the results with in vitro measurements. Using a consensus prediction of the three methods, we managed to correctly classify all but two peptides. We have also provided further evidence for the importance of S28P for inhibiting amyloid fibre formation, found evidence for antiparallel stacking, and identified important regions for beta sheet stability.

  • 32.
    Carlsson, Jonas
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Wedell, Anna
    Department of Molecular Medicine and Surgery, CMM:02, Karolinska Institutet/Karolinska University Hospital, SE-171 76 Stockholm, Sweden.
    Persson, Bengt
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    A structural model of human steroid 11-beta-hydroxylase, CYP11B1, used to predict consequences of mutations
    2009 Article in journal (Other academic)
    Abstract [en]

    A prediction method has been developed to estimate the severity of amino acid residue exchanges in human steroid 11-beta-hydroxylase, CYP11B1, due to mutations in the corresponding gene. The prediction is based on both structural and sequence-dependent parameters. The method uses two approaches: one with general molecular property weights and one with a consensus voting strategy based upon the distribution of molecular properties, which does not require any training. Both methods are tested on known mutations in CYP11B1 and result in 85% prediction accuracy. The consensus voting method is then further evaluated on 9 proteins with an average of 81% prediction accuracy. A server utilizing the results from the consensus voting on CYP11B1 is provided where the user can extract information about new mutants. A similar server is also provided for mutants in human steroid 21-hydroxylase (CYP21).
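
    The consensus voting idea can be illustrated with a minimal sketch. The abstract does not state which molecular properties are used, how the distributions are compared, or what voting rule applies, so the property names, the z-score cut-off and the simple majority rule below are all hypothetical assumptions and not the authors' method. The sketch only shows how a training-free vote over property distributions might look.

```python
from statistics import mean, pstdev


def property_vote(value, reference_values, z_cutoff=2.0):
    """Return True ("severe") if value lies far outside the reference distribution.

    The z-score criterion and the cut-off are assumptions made for illustration.
    """
    mu = mean(reference_values)
    sigma = pstdev(reference_values)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_cutoff


def consensus_vote(mutant, reference, min_fraction=0.5):
    """Classify a mutant as severe if more than min_fraction of properties vote severe.

    mutant: dict mapping property name -> value for the mutant.
    reference: dict mapping property name -> list of values seen in a reference set.
    No weights are fitted, so no training step is needed.
    """
    votes = [property_vote(mutant[name], reference[name]) for name in mutant]
    return sum(votes) / len(votes) > min_fraction
```

    With three hypothetical properties, a mutant that is an outlier in two of them would be classified as severe, while a mutant close to the reference means in all three would not.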

  • 33.
    Cederlund, Ella
    et al.
    Karolinska Institute.
    Hedlund, Joel
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Hjelmqvist, Lars
    Karolinska Institute.
    Jonsson, Andreas
    Karolinska Institute.
    Shafqat, Jawed
    Karolinska Institute.
    Norin, Annika
    Karolinska Institute.
    Keung, Wing-Ming
    Harvard University.
    Persson, Bengt
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Jornvall, Hans
    Karolinska Institute.
    Characterization of new medium-chain alcohol dehydrogenases adds resolution to duplications of the class I/III and the sub-class I genes
    2011 In: Chemico-Biological Interactions, ISSN 0009-2797, E-ISSN 1872-7786, Vol. 191, no 03-jan Article in journal (Refereed)
    Abstract [en]

    Four additional variants of alcohol and aldehyde dehydrogenases have been purified and functionally characterized, and their primary structures have been determined. The results allow conclusions about the structural and evolutionary relationships within the large family of MDR alcohol dehydrogenases from characterizations of the pigeon (Columba livia) and dogfish (Scyliorhinus canicula) major liver alcohol dehydrogenases. The pigeon enzyme turns out to be of class I type and the dogfish enzyme of class III type. This result gives a third type of evidence, based on purifications and enzyme characterization in lower vertebrates, that the classical liver alcohol dehydrogenase originated by a gene duplication early in the evolution of vertebrates. It is discernable as the major liver form at about the level in-between cartilaginous and osseous fish. The results also show early divergence within the avian orders. Structures were determined by Edman degradations, making it appropriate to acknowledge the methodological contributions of Pehr Edman during the 65 years since his thesis at Karolinska Institutet, where also the present analyses were performed.

  • 34.
    Charalambidis, Georgios
    et al.
    University of Crete, Greece.
    Georgilis, Evangelos
    University of Crete, Greece; Fdn Research and Technology Hellas FORTH, Greece.
    Panda, Manas K.
    University of Crete, Greece; CSIR NIIST, India.
    Anson, Christopher E.
    Karlsruhe Institute Technology, Germany.
    Powell, Annie K.
    Karlsruhe Institute Technology, Germany.
    Doyle, Stephen
    Karlsruhe Institute Technology, Germany.
    Moss, David
    Karlsruhe Institute Technology, Germany.
    Jochum, Tobias
    Karlsruhe Institute Technology, Germany; Abcr GmbH, Germany.
    Horton, Peter N.
    University of Southampton, England.
    Coles, Simon J.
    University of Southampton, England.
    Linares, Mathieu
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Beljonne, David
    University of Mons UMONS, Belgium.
    Naubron, Jean-Valere
    Aix Marseille University, France.
    Conradt, Jonas
    Karlsruhe Institute Technology, Germany.
    Kalt, Heinz
    Karlsruhe Institute Technology, Germany.
    Mitraki, Anna
    University of Crete, Greece; Fdn Research and Technology Hellas FORTH, Greece.
    Coutsolelos, Athanassios G.
    University of Crete, Greece.
    Silviu Balaban, Teodor
    Aix Marseille University, France.
    A switchable self-assembling and disassembling chiral system based on a porphyrin-substituted phenylalanine-phenylalanine motif
    2016 In: Nature Communications, ISSN 2041-1723, E-ISSN 2041-1723, Vol. 7, no 12657 Article in journal (Refereed)
    Abstract [en]

    Artificial light-harvesting systems have until now not been able to self-assemble into structures with a large photon capture cross-section that upon a stimulus reversibly can switch into an inactive state. Here we describe a simple and robust FLFL-dipeptide construct to which a meso-tetraphenylporphyrin has been appended and which self-assembles to fibrils, platelets or nanospheres depending on the solvent composition. The fibrils, functioning as quenched antennas, give intense excitonic couplets in the electronic circular dichroism spectra which are mirror imaged if the unnatural FDFD-analogue is used. By slightly increasing the solvent polarity, these light-harvesting fibres disassemble to spherical structures with silent electronic circular dichroism spectra but which fluoresce. Upon further dilution with the nonpolar solvent, the intense Cotton effects are recovered, thus proving a reversible switching. A single crystal X-ray structure shows a head-to-head arrangement of porphyrins that explains both their excitonic coupling and quenched fluorescence.

  • 35.
    Durbeej, Bo
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Wang, Jun
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Oruganti, Baswanth
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Molecular Photoswitching Aided by Excited-State Aromaticity
    2018 In: ChemPlusChem, ISSN 2192-6506, Vol. 83, no 11, p. 958-967 Article in journal (Refereed)
    Abstract [en]

    Central to the development of optoelectronic devices is the availability of efficient synthetic molecular photoswitches, the design of which is an arena where the evolving concept of excited‐state aromaticity (ESA) is yet to make a big impact. The aim of this minireview is to illustrate the potential of this concept to become a key tool for the future design of photoswitches. The paper starts with a discussion of challenges facing the use of photoswitches for applications and continues with an account of how the ESA concept has progressed since its inception. Then, following some brief remarks on computational modeling of photoswitches and ESA, the paper describes two different approaches to improve the quantum yields and response times of switches driven by E/Z photoisomerization or photoinduced H‐atom/proton transfer reactions through simple ESA considerations. It is our hope that these approaches, verified by quantum chemical calculations and molecular dynamics simulations, will help stimulate the application of the ESA concept as a general tool for designing more efficient photoswitches and other functional molecules used in optoelectronic devices.

    The full text will be freely available from 2019-07-30 13:14
  • 36.
    Elfving, Eric
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics.
    Automated annotation of protein families
    2011 Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits Student thesis
    Abstract [en]

    Introduction: The great challenge in bioinformatics is data integration. The amount of available data is always increasing and there are no common unified standards of where, or how, the data should be stored. The aim of this work is to build an automated tool to annotate the different member families within the protein superfamily of medium-chain dehydrogenases/reductases (MDR), by finding common properties among the member proteins. The goal is to increase the understanding of the MDR superfamily as well as the different member families. This will add to the amount of knowledge gained for free when a new, unannotated, protein is matched as a member to a specific MDR member family.

    Method: The different types of data available all needed different handling. Textual data was mainly compared as strings while numeric data needed some special handling such as statistical calculations. Ontological data was handled as tree nodes where ancestry between terms had to be considered. This was implemented as a plugin-based system to make the tool easy to extend with additional data sources of different types.

    Results: The biggest challenge was data incompleteness yielding little (or no) results for some families and thus decreasing the statistical significance of the results. Results show that all the human and mouse MDR members have a Pfam ADH domain (ADH_N and/or ADH_zinc_N) and take part in an oxidation-reduction process, often with NAD or NADP as cofactor. Many of the proteins contain zinc and are expressed in liver tissue.

    Conclusions: A Python-based tool for automatic annotation has been created to annotate the different MDR member families. The tool is easily extendable to be used with new databases, and many of the results agree with information found in the literature. The utility and necessity of this system, as well as the quality of its produced results, are expected to only increase over time, even if no additional extensions are produced, as the system itself is able to make further and more detailed inferences as more and more data become available.
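
    The ontological comparison mentioned in the Method paragraph (terms handled as tree nodes, with ancestry taken into account) can be illustrated with a small sketch. This is not the code of the thesis: the toy `PARENTS` table, the term names and the function names are hypothetical, chosen only to show how two members annotated with different specific terms can still share a more general ancestor term.

```python
# Hypothetical toy ontology: each term maps to its direct parent terms.
PARENTS = {
    "alcohol dehydrogenase activity": ["oxidoreductase activity"],
    "quinone reductase activity": ["oxidoreductase activity"],
    "oxidoreductase activity": ["catalytic activity"],
    "zinc ion binding": ["metal ion binding"],
    "metal ion binding": ["binding"],
    "catalytic activity": [],
    "binding": [],
}


def ancestors(term, parents=PARENTS):
    """Return the term itself plus every term reachable through parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def common_terms(member_annotations, parents=PARENTS):
    """Return the terms shared by all members once ancestry is considered.

    member_annotations: one list of annotated terms per family member.
    """
    closures = []
    for terms in member_annotations:
        closure = set()
        for term in terms:
            closure |= ancestors(term, parents)
        closures.append(closure)
    return set.intersection(*closures) if closures else set()
```

    A plain string comparison of the two annotation lists in such a case would find nothing in common, whereas the ancestry-aware comparison recovers the shared general terms.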

  • 37.
    Elofsson, Arne
    et al.
    Stockholm Univ, Sweden.
    Joo, Keehyoung
    Korea Inst Adv Study, South Korea.
    Keasar, Chen
    Ben Gurion Univ Negev, Israel.
    Lee, Jooyoung
    Korea Inst Adv Study, South Korea.
    Maghrabi, Ali H. A.
    Univ Reading, England.
    Manavalan, Balachandran
    Korea Inst Adv Study, South Korea.
    McGuffin, Liam J.
    Univ Reading, England.
    Hurtado, David Menendez
    Stockholm Univ, Sweden.
    Mirabello, Claudio
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Pilstål, Robert
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Sidi, Tomer
    Ben Gurion Univ Negev, Israel.
    Uziela, Karolis
    Stockholm Univ, Sweden.
    Wallner, Björn
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Methods for estimation of model accuracy in CASP12
    2018 In: Proteins: Structure, Function, and Bioinformatics, ISSN 0887-3585, E-ISSN 1097-0134, Vol. 86, p. 361-373 Article in journal (Refereed)
    Abstract [en]

    Methods to reliably estimate the quality of 3D models of proteins are essential drivers for the wide adoption and serious acceptance of protein structure predictions by life scientists. In this article, the most successful groups in CASP12 describe their latest methods for estimates of model accuracy (EMA). We show that pure single model accuracy estimation methods have shown clear progress since CASP11; the 3 top methods (MESHI, ProQ3, SVMQA) all perform better than the top method of CASP11 (ProQ2). Although the pure single model accuracy estimation methods outperform quasi-single (ModFOLD6 variations) and consensus methods (Pcons, ModFOLDclust2, Pcomb-domain, and Wallner) in model selection, they are still not as good as those methods in absolute model quality estimation and predictions of local quality. Finally, we show that when using contact-based model quality measures (CAD, lDDT) the single model quality methods perform relatively better.

  • 38.
    Eriksson, Hanna
    et al.
    Karolinska University Hospital.
    Lengqvist, Johan
    Karolinska University Hospital.
    Hedlund, Joel
    Linköping University, The Institute of Technology. Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics.
    Uhlen, Kristina
    GE Healthcare Biosci AB.
    Orre, Lukas M
    Karolinska University Hospital.
    Bjellqvist, Bengt
    GE Healthcare Biosci AB.
    Persson, Bengt
    Karolinska University Hospital.
    Lehtio, Janne
    Karolinska University Hospital.
    Jakobsson, Per-Johan
    Karolinska University Hospital.
    Quantitative membrane proteomics applying narrow range peptide isoelectric focusing for studies of small cell lung cancer resistance mechanisms
    2008 In: Proteomics, ISSN 1615-9853, E-ISSN 1615-9861, Vol. 8, no 15, p. 3008-3018 Article in journal (Refereed)
    Abstract [en]

    Drug resistance is often associated with upregulation of membrane-associated drug-efflux systems, and thus global membrane proteomics methods are valuable tools in the search for novel components of drug resistance phenotypes. Herein we have compared the microsomal proteome from the lung cancer cell line H69 and its isogenic Doxorubicin-resistant subcell line H69AR. The method used includes microsome preparation, iTRAQ labeling followed by narrow range peptide IEF in an immobilized pH-gradient (IPG-IEF) and LC-MS/MS analysis. We demonstrate that the microsomal preparation and iTRAQ labeling are reproducible regarding protein content and composition. The rationale for using narrow range peptide IPG-IEF separation is demonstrated by its ability to (i) lower the complexity of the sample by two-thirds while keeping high proteome coverage (96%), (ii) provide high separation efficiency, and (iii) allow for peptide validation and possibly identification of post-transcriptional modifications. After analyzing one-fifth of the IEF fractions (effective pH range of 4.0-4.5), a total of 3704 proteins were identified, among which 527 were predicted to be membrane proteins. One of the proteins found to be differentially expressed was Serca 2, a calcium pump located in the ER membrane that potentially could result in changes of apoptotic response toward Doxorubicin.

  • 39.
    Falklöf, Olle
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Durbeej, Bo
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Computational Identification of Pyrrole Ring C as the Preferred Donor for Excited-State Proton Transfer in Bacteriophytochromes
    2018 In: ChemPhotoChem, ISSN 2367-0932, Vol. 2, no 6, p. 453-457 Article in journal (Refereed)
    Abstract [en]

    The engineering of bacteriophytochrome photoreceptors into near-infrared fluorescent proteins is a promising route toward deep-tissue imaging of living cells with many challenges ahead. One key objective is to increase the fluorescence quantum yields, which are limited by competing non-radiative relaxation processes involving not only the well-known double-bond photoisomerization of the tetrapyrrole chromophore, but also a potential excited-state proton transfer from the chromophore to the protein. Motivated by the lack of mechanistic knowledge about this proton transfer, we here use hybrid quantum mechanics/molecular mechanics methods to investigate three possible scenarios for how the process is initiated. Through calculated excited-state pKa values of the chromophore inside the protein matrix of Deinococcus radiodurans bacteriophytochrome, it is found that pyrrole ring C is a much more likely donor for excited-state proton transfer than rings A and B, which are also possible donors discussed in the literature. This finding offers a starting point for establishing a strategy to strengthen the fluorescence of engineered bacteriophytochromes through biochemical inhibition of the proton transfer.

  • 40.
    Franco, Irene
    et al.
    Karolinska Inst, Sweden.
    Johansson, Anna
    Uppsala Univ, Sweden.
    Olsson, Karl
    Karolinska Inst, Sweden.
    Vrtacnik, Peter
    Karolinska Inst, Sweden.
    Lundin, Par
    Karolinska Inst, Sweden; Stockholm Univ, Sweden.
    Helgadottir, Hafdis T.
    Karolinska Inst, Sweden.
    Larsson, Malin
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Revechon, Gwladys
    Karolinska Inst, Sweden.
    Bosia, Carla
    IIGM, Italy; Politecn Torino, Italy.
    Pagnani, Andrea
    IIGM, Italy; Politecn Torino, Italy.
    Provero, Paolo
    Mol Biotechnol Ctr, Italy; Ist Sci San Raffaele, Italy.
    Gustafsson, Thomas
    Karolinska Inst, Sweden.
    Fischer, Helene
    Karolinska Inst, Sweden.
    Eriksson, Maria
    Karolinska Inst, Sweden.
    Somatic mutagenesis in satellite cells associates with human skeletal muscle aging
    2018 In: Nature Communications, ISSN 2041-1723, E-ISSN 2041-1723, Vol. 9, article id 800 Article in journal (Refereed)
    Abstract [en]

    Human aging is associated with a decline in skeletal muscle (SkM) function and a reduction in the number and activity of satellite cells (SCs), the resident stem cells. To study the connection between SC aging and muscle impairment, we analyze the whole genome of single SC clones of the leg muscle vastus lateralis from healthy individuals of different ages (21-78 years). We find an accumulation rate of 13 somatic mutations per genome per year, consistent with proliferation of SCs in the healthy adult muscle. SkM-expressed genes are protected from mutations, but aging results in an increase in mutations in exons and promoters, targeting genes involved in SC activity and muscle function. In agreement with SC mutations affecting the whole tissue, we detect a missense mutation in a SC propagating to the muscle. Our results suggest somatic mutagenesis in SCs as a driving force in the age-related decline of SkM function.
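    The reported accumulation rate can be turned into a rough back-of-the-envelope estimate. The sketch below assumes a strictly linear accumulation with a zero baseline, which the abstract does not state; the function names are hypothetical and only illustrate the arithmetic.

```python
# Minimal sketch: linear somatic mutation accumulation.
# Assumptions (not from the abstract): accumulation is strictly linear in age
# and the baseline at age 0 defaults to zero.

RATE_PER_YEAR = 13.0  # somatic mutations per genome per year, as reported


def expected_mutations(age_years, rate_per_year=RATE_PER_YEAR, baseline=0.0):
    """Expected mutations per genome at a given age under the linear model."""
    if age_years < 0:
        raise ValueError("age_years must be non-negative")
    return baseline + rate_per_year * age_years


def expected_difference(age_young, age_old, rate_per_year=RATE_PER_YEAR):
    """Expected extra mutations in the older genome relative to the younger."""
    return rate_per_year * (age_old - age_young)


if __name__ == "__main__":
    # Youngest and oldest donors in the study: 21 and 78 years.
    print(expected_difference(21, 78))
```

    Under these assumptions, the youngest and oldest donors (21 and 78 years) would differ by 13 × 57 = 741 mutations per genome.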

  • 41.
    Fucile, Geoffrey
    et al.
    University of Toronto.
    Garcia, Christel
    University of Toronto.
    Carlsson, Jonas
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Sunnerhagen, Maria
    Linköping University, Department of Physics, Chemistry and Biology, Molecular Biotechnology. Linköping University, The Institute of Technology.
    Christendat, Dinesh
    University of Toronto.
    Structural and biochemical investigation of two Arabidopsis shikimate kinases: The heat-inducible isoform is thermostable. 2011. In: Protein Science, ISSN 0961-8368, E-ISSN 1469-896X, Vol. 20, no 7, p. 1125-1136. Article in journal (Refereed)
    Abstract [en]

    The expression of plant shikimate kinase (SK; EC 2.7.1.71), an intermediate step in the shikimate pathway to aromatic amino acid biosynthesis, is induced under specific conditions of environmental stress and developmental requirements in an isoform-specific manner. Despite their important physiological role, experimental structures of plant SKs have not been determined and the biochemical nature of plant SK regulation is unknown. The Arabidopsis thaliana genome encodes two SKs, AtSK1 and AtSK2. We demonstrate that AtSK2 is highly unstable and becomes inactivated at 37 degrees C whereas the heat-induced isoform, AtSK1, is thermostable and fully active under identical conditions at this temperature. We determined the crystal structure of AtSK2, the first SK structure from the plant kingdom, and conducted biophysical characterizations of both AtSK1 and AtSK2 towards understanding this mechanism of thermal regulation. The crystal structure of AtSK2 is generally conserved with bacterial SKs with the addition of a putative regulatory phosphorylation motif forming part of the adenosine triphosphate binding site. The heat-induced isoform, AtSK1, forms a homodimer in solution, the formation of which facilitates its relative thermostability compared to AtSK2. In silico analyses identified AtSK1 site variants that may contribute to AtSK1 stability. Our findings suggest that AtSK1 performs a unique function under heat stress conditions where AtSK2 could become inactivated. We discuss these findings in the context of regulating metabolic flux to competing downstream pathways through SK-mediated control of steady state concentrations of shikimate.

  • 42.
    Gustafsson, Mika
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Gawel, Danuta
    Linköping University, Department of Clinical and Experimental Medicine, Division of Clinical Sciences. Linköping University, Faculty of Medicine and Health Sciences.
    Alfredsson, Lars
    Karolinska Institute, Sweden.
    Baranzini, Sergio
    University of Calif San Francisco, CA, USA.
    Bjorkander, Janne
    County Council Jonköping, Sweden.
    Blomgran, Robert
    Linköping University, Department of Clinical and Experimental Medicine, Division of Microbiology and Molecular Medicine. Linköping University, Faculty of Medicine and Health Sciences.
    Hellberg, Sandra
    Linköping University, Department of Clinical and Experimental Medicine, Division of Neuro and Inflammation Science. Linköping University, Faculty of Medicine and Health Sciences.
    Eklund, Daniel
    Linköping University, Department of Clinical and Experimental Medicine, Division of Neuro and Inflammation Science. Linköping University, Faculty of Medicine and Health Sciences.
    Ernerudh, Jan
    Linköping University, Department of Clinical and Experimental Medicine, Division of Neuro and Inflammation Science. Linköping University, Faculty of Medicine and Health Sciences. Region Östergötland, Center for Diagnostics, Department of Clinical Immunology and Transfusion Medicine.
    Kockum, Ingrid
    Karolinska Institute, Sweden; Centre Molecular Med, Sweden.
    Konstantinell, Aelita
    Linköping University, Faculty of Medicine and Health Sciences. Linköping University, Department of Clinical and Experimental Medicine, Division of Clinical Sciences. Arctic University of Norway, Norway.
    Lahesmaa, Riita
    University of Turku, Finland; Abo Akad University, Finland.
    Lentini, Antonio
    Linköping University, Department of Clinical and Experimental Medicine, Division of Clinical Sciences. Linköping University, Faculty of Medicine and Health Sciences.
    Liljenström, H. Robert I.
    Linköping University, Department of Clinical and Experimental Medicine, Division of Clinical Sciences. Linköping University, Faculty of Medicine and Health Sciences.
    Mattson, Lina
    Linköping University, Department of Clinical and Experimental Medicine, Division of Clinical Sciences. Linköping University, Faculty of Medicine and Health Sciences.
    Matussek, Andreas
    County Council Jonköping, Sweden.
    Mellergård, Johan
    Linköping University, Department of Clinical and Experimental Medicine, Division of Neuro and Inflammation Science. Linköping University, Faculty of Medicine and Health Sciences. Region Östergötland, Local Health Care Services in Central Östergötland, Department of Neurology.
    Mendez, Melissa
    University of Peruana Cayetano Heredia, Peru.
    Olsson, Tomas
    Karolinska Institute, Sweden; Centre Molecular Med, Sweden.
    Pujana, Miguel A.
    Catalan Institute Oncol, Spain.
    Rasool, Omid
    University of Turku, Finland; Abo Akad University, Finland.
    Serra-Musach, Jordi
    Catalan Institute Oncol, Spain.
    Stenmarker, Margaretha
    County Council Jonköping, Sweden.
    Tripathi, Subhash
    University of Turku, Finland; Abo Akad University, Finland.
    Viitala, Miro
    University of Turku, Finland; Abo Akad University, Finland.
    Wang, Hui
    Linköping University, Department of Clinical and Experimental Medicine, Division of Clinical Sciences. Linköping University, Faculty of Medicine and Health Sciences. University of Texas MD Anderson Cancer Centre, TX 77030 USA.
    Zhang, Huan
    Linköping University, Department of Clinical and Experimental Medicine, Division of Clinical Sciences. Linköping University, Faculty of Medicine and Health Sciences.
    Nestor, Colm
    Linköping University, Department of Clinical and Experimental Medicine, Division of Clinical Sciences. Linköping University, Faculty of Medicine and Health Sciences.
    Benson, Mikael
    Linköping University, Department of Clinical and Experimental Medicine, Division of Clinical Sciences. Linköping University, Faculty of Medicine and Health Sciences. Region Östergötland, Heart and Medicine Center, Allergy Center.
    A validated gene regulatory network and GWAS identifies early regulators of T cell-associated diseases. 2015. In: Science Translational Medicine, ISSN 1946-6234, E-ISSN 1946-6242, Vol. 7, no 313, article id 313ra178. Article in journal (Refereed)
    Abstract [en]

    Early regulators of disease may increase understanding of disease mechanisms and serve as markers for presymptomatic diagnosis and treatment. However, early regulators are difficult to identify because patients generally present after they are symptomatic. We hypothesized that early regulators of T cell-associated diseases could be found by identifying upstream transcription factors (TFs) in T cell differentiation and by prioritizing hub TFs that were enriched for disease-associated polymorphisms. A gene regulatory network (GRN) was constructed by time series profiling of the transcriptomes and methylomes of human CD4(+) T cells during in vitro differentiation into four helper T cell lineages, in combination with sequence-based TF binding predictions. The TFs GATA3, MAF, and MYB were identified as early regulators and validated by ChIP-seq (chromatin immunoprecipitation sequencing) and small interfering RNA knockdowns. Differential mRNA expression of the TFs and their targets in T cell-associated diseases supports their clinical relevance. To directly test if the TFs were altered early in disease, T cells from patients with two T cell-mediated diseases, multiple sclerosis and seasonal allergic rhinitis, were analyzed. Strikingly, the TFs were differentially expressed during asymptomatic stages of both diseases, whereas their targets showed altered expression during symptomatic stages. This analytical strategy to identify early regulators of disease by combining GRNs with genome-wide association studies may be generally applicable for functional and clinical studies of early disease development.

  • 43.
    Hederos, Sofia
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Organic Chemistry. Linköping University, The Institute of Technology.
    Tegler, Lotta
    Carlsson, Jonas
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Persson, Bengt
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Viljanen, Johan
    Linköping University, Department of Physics, Chemistry and Biology, Organic Chemistry. Linköping University, The Institute of Technology.
    Broo, Kerstin S.
    Linköping University, Department of Physics, Chemistry and Biology, Organic Chemistry. Linköping University, The Institute of Technology.
    A Promiscuous Glutathione Transferase Transformed into a Selective Thiolester Hydrolase. 2006. In: Organic and Biomolecular Chemistry, ISSN 1477-0520, E-ISSN 1477-0539, Vol. 4, no 1, p. 90-97. Article in journal (Refereed)
  • 44.
    Hedlund, Joel
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Bioinformatic protein family characterisation. 2010. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Biological research is necessary; not only to further our understanding of the processes of life, but also to combat disease, hunger and environmental damage.

    Bioinformatics is the science of handling biological information. It entails integrating, structuring and analysing the ever-increasing amounts of available biological data. In practice it means using computers to analyse huge amounts of very complicated data taken from a field that is only partially understood, to see the hidden trends and connections, and to draw useful conclusions.

    My thesis work has mainly concerned the study of protein families, which are groups of evolutionarily related proteins. I have analysed known protein families and created predictive models for them, and developed algorithms for defining new protein families. My principal techniques have been sequence alignments and hidden Markov models (HMM). To aid my work, I have written a lot of software, including MSAView, a visualiser for multiple sequence alignments (MSA).

    In this thesis, the protein family of inorganic pyrophosphatases (H+-PPases) is studied, as well as the two protein superfamilies BRICHOS and MDR (medium-chain dehydrogenases/reductases). The H+-PPases are tightly membrane-bound, proton-pumping, dimeric enzymes with ~700-residue subunits; they are found in bacteria, plants and eukaryotic parasites, and use pyrophosphate as an alternative to ATP. The BRICHOS superfamily is only present in higher eukaryotes, but encompasses at least 8 protein families with a wide range of functions and disease associations, such as respiratory distress syndrome, dementia and cancer. The sequences are typically ~200 residues with even shorter functional forms. Finally, MDR is a large and complex protein superfamily: it currently has over 16000 members, it is present in all kingdoms of life, the pairwise sequence identity is typically around 25 %, the chain lengths and the oligomericity vary, and the members take part in a multitude of biological processes. The member families include the classical liver alcohol dehydrogenase (ADH), quinone reductase, leukotriene B4 dehydrogenase, and many more forms. There are at least 25 human MDR genes excluding close homologues. There are HMMs available for detecting MDR superfamily membership, but none for the individual families.
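    To make the "pairwise sequence identity" figure concrete, here is a minimal sketch of one common way to compute it from two already-aligned sequences. The function name is hypothetical, and the choice to ignore gapped columns is an assumption: other conventions divide by the alignment length or by the shorter sequence length, and the thesis abstract does not say which one was used.

```python
# Minimal sketch: percent identity between two pre-aligned sequences.
# Assumption: columns where either sequence has a gap are ignored.


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return the percentage of identical residues over ungapped columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0  # nothing comparable
    return 100.0 * matches / compared


if __name__ == "__main__":
    print(percent_identity("ACDE", "ACDF"))  # 3 of 4 columns identical
```

    With this convention, two family members at the typical MDR level would share an identical residue in roughly one of every four compared columns.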

    For the H+-PPase family, we characterised member sequences found using an HMM of a conserved 57-residue region thought to form part of the active site. This region was found to contain two highly conserved nonapeptides, mainly consisting of the four “very early” residues Gly, Ala, Val and Asp, compatible with an ancient origin of the family. The two patterns have charged amino acid residues at positions 1, 5 and 9, are apparent binding sites for the substrate and parts of the active site, and were shown to be so specific for these enzymes that they can be used for automated annotation of new sequences.

    For the BRICHOS superfamily, we were able to find three previously unknown member families; group A, which may be ancestral to the ITM2 families (integral membrane protein 2); group B, which is a close relative to the gastrokine families, and group C, which appears to be a truly novel, disjoint BRICHOS family. The C-terminal region of group C has nearly identical sequences in all species ranging from fish to man and is seemingly unique to this family, indicating critical functional or structural properties.

    For the MDR superfamily, we characterised and built stable HMMs for 17 member families using an empiric approach. From our experiences we were able to develop an algorithm for automated HMM refinement that uses relationships in data to produce stable and reliable classifiers, and we used it to produce HMMs for 86 distinct MDR families. We have made the program freely available and it can be readily applied to other protein families. We also developed a web site (http://mdr-enzymes.org) that makes our findings directly useful also for non-bioinformaticians.

    In our analyses of the 86 families, we found that MDR forms with 2 Zn2+ ions in general are dehydrogenases, while MDR forms with no Zn2+ in general are reductases. Furthermore, in Bacteria, MDRs without Zn2+ are more frequent than those with Zn2+, while the opposite is true for eukaryotic MDRs, indicating that Zn2+ has been recruited into the MDR superfamily after the initial life kingdom separations.

    Multiple sequence alignments (MSA) play a central part in most work on protein families, and are integral to many bioinformatic methods. With the ongoing explosive increase of available sequence data, the scales of bioinformatic projects are growing, and efficient and human-friendly data visualisation becomes increasingly challenging, but is still essential for making new interpretations and discovering unexpected properties of the data.

    Ideally, visualisation should be comprehensive and detailed, and never distract with irrelevant information. It needs to offer natural and responsive ways of exploring the data, as well as provide consistent views in order to facilitate comparisons between datasets. I therefore developed MSAView, which is a fast, modular, configurable and extensible package for analysing and visualising MSAs and sequence features. It has a graphical user interface and a powerful command line client, and can be imported as a package into any Python program. It has a plugin architecture and a user extendable preset library. It can integrate and display data from online sources and launch external viewers for showing additional details. It also includes two new conservation measures; alignment divergences, which indicate atypical residues or deletions, and sequence conformances, which highlight sequences that differ from their siblings at crucial positions.

    In conclusion, this thesis details my work in analysing two protein superfamilies and one protein family using bioinformatic methods; developing an algorithm for automated generation of stable and reliable HMMs, as well as a new conservation measure, and a software platform for working with aligned sequences.

    List of papers
    1. Analysis of ancient sequence motifs in the H+-PPase family
    2006 (English). In: The FEBS Journal, ISSN 1742-464X, E-ISSN 1742-4658, Vol. 273, p. 5183-5193. Article in journal (Refereed). Published
    Abstract [en]

    The unique family of membrane-bound proton-pumping inorganic pyrophosphatases, involving pyrophosphate as the alternative to ATP, was investigated by characterizing 166 members of the UniProtKB/Swiss-Prot + UniProtKB/TrEMBL databases and available completed genomes, using sequence comparisons and a hidden Markov model based upon a conserved 57-residue region in the loop between transmembrane segments 5 and 6. The hidden Markov model was also used to search the approximately one million sequences recently reported from a large-scale sequencing project of organisms in the Sargasso Sea, resulting in an additional 164 partial pyrophosphatase sequences. The strongly conserved 57-residue region was found to contain two nonapeptidyl sequences, mainly consisting of the four ‘very early’ proteinaceous amino acid residues Gly, Ala, Val and Asp, compatible with an ancient origin of the inorganic pyrophosphatases. The nonapeptide patterns have charged amino acid residues at positions 1, 5 and 9, are apparent binding sites for the substrate and parts of the active site, and were shown to be so specific for these enzymes that they can be used for functional assignments of unannotated genomes.
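    The positional constraint described above (charged residues at positions 1, 5 and 9 of a nonapeptide) can be illustrated with a simple sliding-window scan. This is only a sketch: the actual nonapeptide patterns are not given in the abstract, the charged set D/E/K/R (excluding His) is an assumption, and the function names are hypothetical. A filter this loose would match many unrelated sequences, so it does not reproduce the specificity reported for the real patterns.

```python
# Minimal sketch: find 9-residue windows with charged residues at
# positions 1, 5 and 9, and report how much of each window consists of
# the four 'very early' residues Gly, Ala, Val and Asp.
# Assumption: "charged" means D, E, K or R (His is not counted).

CHARGED = set("DEKR")
VERY_EARLY = set("GAVD")


def charged_159_nonapeptides(sequence):
    """Return (1-based start, window) for each matching 9-mer."""
    seq = sequence.upper()
    hits = []
    for start in range(len(seq) - 8):
        window = seq[start:start + 9]
        if (window[0] in CHARGED and window[4] in CHARGED
                and window[8] in CHARGED):
            hits.append((start + 1, window))
    return hits


def very_early_fraction(window):
    """Fraction of residues in the window that are Gly, Ala, Val or Asp."""
    if not window:
        return 0.0
    return sum(1 for res in window.upper() if res in VERY_EARLY) / len(window)


if __name__ == "__main__":
    # Toy sequence, not taken from any real pyrophosphatase.
    for position, peptide in charged_159_nonapeptides("GDAAAEGGGKA"):
        print(position, peptide, very_early_fraction(peptide))
```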

    Keywords
    bioinformatics; hidden Markov models; molecular evolution; proteinaceous amino acids; pyrophosphatase
    National Category
    Natural Sciences
    Identifiers
    urn:nbn:se:liu:diva-36165 (URN); 10.1111/j.1742-4658.2006.05514.x (DOI); 30276 (Local ID); 30276 (Archive number); 30276 (OAI)
    Available from: 2009-10-10 Created: 2009-10-10 Last updated: 2018-03-21
    2. BRICHOS - a superfamily of multidomain proteins with diverse functions.
    2009 (English). In: BMC Research Notes, ISSN 1756-0500, Vol. 2, p. 180. Article in journal (Refereed). Published
    Abstract [en]

    Background: The BRICHOS domain has been found in 8 protein families with a wide range of functions and a variety of disease associations, such as respiratory distress syndrome, dementia and cancer. The domain itself is thought to have a chaperone function, and indeed three of the families are associated with amyloid formation, but its structure and many of its functional properties are still unknown.

    Findings: The proteins in the BRICHOS superfamily have four regions with distinct properties. We have analysed the BRICHOS proteins focusing on sequence conservation, amino acid residue properties, native disorder and secondary structure predictions. Residue conservation shows large variations between the regions, and the spread of residue conservation between different families can vary greatly within the regions. The secondary structure predictions for the BRICHOS proteins show remarkable coherence even where sequence conservation is low, and there seems to be little native disorder.

    Conclusions: The greatly variant rates of conservation indicate different functional constraints among the regions and among the families. We present three previously unknown BRICHOS families; group A, which may be ancestral to the ITM2 families; group B, which is a close relative to the gastrokine families, and group C, which appears to be a truly novel, disjoint BRICHOS family. The C-terminal region of group C has nearly identical sequences in all species ranging from fish to man and is seemingly unique to this family, indicating critical functional or structural properties.

    National Category
    Natural Sciences
    Identifiers
    urn:nbn:se:liu:diva-21338 (URN); 10.1186/1756-0500-2-180 (DOI); 19747390 (PubMedID)