liu.se: Search for publications in DiVA
  • 1.
    Ballber Torres, Nuria
    et al.
    University of Politecn Cataluna, Spain.
    Altafini, Claudio
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, Faculty of Science & Engineering.
    Drug combinatorics and side effect estimation on the signed human drug-target network. 2016. In: BMC Systems Biology, ISSN 1752-0509, E-ISSN 1752-0509, Vol. 10, no 74. Article in journal (Refereed)
    Abstract [en]

    Background: The mode of action of a drug on its targets can often be classified as being positive (activator, potentiator, agonist, etc.) or negative (inhibitor, blocker, antagonist, etc.). The signed edges of a drug-target network can be used to investigate the combined mechanisms of action of multiple drugs on the ensemble of common targets. Results: In this paper it is shown that for the signed human drug-target network the majority of drug pairs tend to have synergistic effects on the common targets, i.e., drug pairs tend to have modes of action with the same sign on most of the shared targets, especially for the principal pharmacological targets of a drug. Methods are proposed to compute this synergism, as well as to estimate the influence of the drugs on the side effect of another drug. Conclusions: Enriching a drug-target network with information of a functional nature, like the sign of the interactions, makes it possible to explore in a systematic way a series of network properties of key importance in the context of computational drug combinatorics.
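
    The idea of "modes of action with the same sign on most of the shared targets" can be illustrated with a minimal sketch. It is not the method proposed in the paper, which the abstract does not spell out; the function name `sign_concordance`, the drug and target labels, and the toy network are all hypothetical.

```python
from itertools import combinations


def sign_concordance(targets_a, targets_b):
    """Fraction of shared targets on which two drugs act with the same sign.

    Each argument maps a target name to +1 (activating) or -1 (inhibiting).
    Returns None when the two drugs share no target.
    """
    shared = targets_a.keys() & targets_b.keys()
    if not shared:
        return None
    same = sum(1 for t in shared if targets_a[t] == targets_b[t])
    return same / len(shared)


# Hypothetical toy signed drug-target network (illustration only).
network = {
    "drugA": {"T1": +1, "T2": -1, "T3": -1},
    "drugB": {"T1": +1, "T2": -1, "T4": +1},
    "drugC": {"T2": +1, "T3": -1},
}

# Score every unordered drug pair.
pair_scores = {
    (a, b): sign_concordance(network[a], network[b])
    for a, b in combinations(sorted(network), 2)
}
```

    A score close to 1 means the pair acts in the same direction on nearly all shared targets; a score close to 0 means the modes of action mostly oppose each other.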

  • 2.
    Bartoszek, Krzysztof
    Gdansk University of Technology, Poland.
    A Graph – String Model of Gene Assembly in Ciliates [Grafowo-tekstowy model rekombinacji DNA u orzęsek]. 2006. In: Zeszyty Naukowe Wydzialu ETI Politechniki Gdanskiej, 2006, p. 521-534. Conference paper (Refereed)
    Abstract [en]

    The ciliates are a family of unicellular organisms that are characterized by having two types of nuclei, micro- and macronuclei. During cell mating the genetic material must change from the micronuclear to the macronuclear form. The paper summarises a formal model for this change. The model, which is described in recent works, is based on strings and graphs. It shows that complex computational operations have to take place inside the cell.

  • 3.
    Bartoszek, Krzysztof
    Gdansk University of Technology, Poland.
    The Bootstrap and Other Methods of Testing Phylogenetic Trees. 2007. In: Zeszyty Naukowe Wydzialu ETI Politechniki Gdanskiej, 2007, p. 103-108. Conference paper (Refereed)
    Abstract [en]

    The final step of a phylogenetic analysis is the test of the generated tree. This is not an easy task, and there is no obvious methodology for it, because we do not know the full probabilistic model of evolution. A number of methods have been proposed, but there is a wide debate concerning the interpretation of the results they produce.

  • 4.
    Bartoszek, Krzysztof
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Trait evolution with jumps: illusionary normality. 2017. In: Proceedings of the XXIII National Conference on Applications of Mathematics in Biology and Medicine, 2017, p. 23-28. Conference paper (Refereed)
    Abstract [en]

    Phylogenetic comparative methods for real-valued traits usually make use of stochastic processes whose trajectories are continuous. This is despite biological intuition that evolution is punctuated rather than gradual. On the other hand, there have been a number of recent proposals of evolutionary models with jump components. However, as we are only beginning to understand the behaviour of branching Ornstein-Uhlenbeck (OU) processes, the asymptotics of branching OU processes with jumps is an even greater unknown. In this work we build on a previous study concerning OU with jumps evolution on a pure birth tree. We introduce an extinction component and explore, via simulations, its effects on the weak convergence of such a process. Furthermore, we use this work to illustrate the simulation and graphic generation possibilities of the mvSLOUCH package.

  • 5.
    Bartoszek, Krzysztof
    et al.
    Mathematical Statistics, Chalmers University of Technology and University of Gothenburg, Gothenburg, Sweden.
    Liò, Pietro
    Computer Laboratory, University of Cambridge, Cambridge, United Kingdom.
    Sorathiya, Anil
    Computer Laboratory, University of Cambridge, Cambridge, United Kingdom.
    Influenza differentiation and evolution. 2010. In: Acta Physica Polonica B Proceedings Supplement, 2010, Vol. 3, p. 417-452, article id 2. Conference paper (Refereed)
    Abstract [en]

    The aim of the study is to do a very wide analysis of HA, NA and M influenza gene segments to find short nucleotide regions which differentiate between strains (i.e. H1, H2, etc.), hosts, geographic regions, time when the sequence was found, and the combination of time and region, using a simple methodology. Finding regions differentiating between strains has as its goal the construction of a Luminex microarray which will allow quick and efficient strain recognition. Discovery for the other splitting factors could shed light on structures significant for host specificity and on the history of influenza evolution. A large number of places in the HA, NA and M gene segments were found that can differentiate between hosts, regions, time and the combination of time and region. Very good differentiation between different Hx strains can also be seen. We link one of our findings to a proposed stochastic model of creation of viral phylogenetic trees.

  • 6.
    Bartoszek, Krzysztof
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences. Uppsala Univ, Sweden.
    Majchrzak, Marta
    Polish Acad Sci, Poland.
    Sakowski, Sebastian
    Univ Lodz, Poland.
    Kubiak-Szeligowska, Anna B.
    Polish Acad Sci, Poland.
    Kaj, Ingemar
    Uppsala Univ, Sweden.
    Parniewski, Pawel
    Polish Acad Sci, Poland.
    Predicting pathogenicity behavior in Escherichia coli population through a state dependent model and TRS profiling. 2018. In: PLoS Computational Biology, ISSN 1553-734X, E-ISSN 1553-7358, Vol. 14, no 1, article id e1005931. Article in journal (Refereed)
    Abstract [en]

    The Binary State Speciation and Extinction (BiSSE) model is a branching process based model that allows the diversification rates to be controlled by a binary trait. We develop a general approach, based on the BiSSE model, for predicting pathogenicity in bacterial populations from microsatellites profiling data. A comprehensive approach for predicting pathogenicity in E. coli populations is proposed using the state-dependent branching process model combined with microsatellites TRS-PCR profiling. Additionally, we have evaluated the possibility of using the BiSSE model for estimating parameters from genetic data. We analyzed a real dataset (from 251 E. coli strains) and confirmed previous biological observations demonstrating a prevalence of some virulence traits in specific bacterial sub-groups. The method may be used to predict pathogenicity of other bacterial taxa.

  • 7.
    Bartoszek, Krzysztof
    et al.
    Department of Mathematics, Uppsala University, Uppsala, Sweden.
    Liò, Pietro
    Computer Laboratory, University of Cambridge, Cambridge, United Kingdom.
    A novel algorithm to reconstruct phylogenies using gene sequences and expression data. 2014. In: International Proceedings of Chemical, Biological & Environmental Engineering; Environment, Energy and Biotechnology III, 2014, Vol. 70, p. 8-12. Conference paper (Refereed)
    Abstract [en]

    Phylogenies based on single loci should be viewed with caution, and the best approach for obtaining robust trees is to examine numerous loci across the genome. It often happens that, for the same set of species, trees derived from different genes conflict with each other. There are several methods that combine information from different genes in order to infer the species tree. One novel approach is to use information from different -omics. Here we describe a phylogenetic method based on an Ornstein–Uhlenbeck process that combines sequence and gene expression data. We test our method on genes belonging to the histidine biosynthetic operon. We found that the method provides interesting insights into selection pressures and adaptive hypotheses concerning gene expression levels.

  • 8.
    Basu, Sankar Chandra
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Wallner, Björn
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    DockQ: A Quality Measure for Protein-Protein Docking Models. 2016. In: PLoS ONE, ISSN 1932-6203, E-ISSN 1932-6203, Vol. 11, no 8, p. e0161879. Article in journal (Refereed)
    Abstract [en]

    The state of the art for assessing the structural quality of docking models is currently based on three related yet independent quality measures: F-nat, LRMS, and iRMS, as proposed and standardized by CAPRI. These quality measures quantify different aspects of the quality of a particular docking model and need to be viewed together to reveal the true quality, e.g. a model with relatively poor LRMS (> 10 Å) might still qualify as acceptable with a decent F-nat (> 0.50) and iRMS (< 3.0 Å). This is also the reason why the so-called CAPRI criteria for assessing the quality of docking models are defined by applying various ad-hoc cutoffs on these measures to classify a docking model into the four classes: Incorrect, Acceptable, Medium, or High quality. This classification has been useful in CAPRI, but since models are grouped in only four bins it is also rather limiting, making it difficult to rank models, correlate with scoring functions or use it as a target function in machine learning algorithms. Here, we present DockQ, a continuous protein-protein docking model quality measure derived by combining F-nat, LRMS, and iRMS into a single score in the range [0, 1] that can be used to assess the quality of protein docking models. By using DockQ on CAPRI models it is possible to almost completely reproduce the original CAPRI classification into Incorrect, Acceptable, Medium and High quality. An average PPV of 94% at 90% recall demonstrates that there is no need to apply predefined ad-hoc cutoffs to classify docking models. Since DockQ recapitulates the CAPRI classification almost perfectly, it can be viewed as a higher resolution version of the CAPRI classification, making it possible to estimate model quality in a more quantitative way using Z-scores or sum of top ranked models, which has been so valuable for the CASP community. The possibility to directly correlate a quality measure to a scoring function has been crucial for the development of scoring functions for protein structure prediction, and DockQ should be useful in a similar development in the protein docking field.
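
    A minimal sketch of how three such measures can be folded into one score in [0, 1]. The abstract does not give the formula; the averaging form and the scaling constants below (1.5 Å for iRMS, 8.5 Å for LRMS) are what is recalled from the DockQ publication and should be treated as assumptions to verify against the paper. The function names are illustrative.

```python
def scaled_rms(rms, d):
    """Map an RMS deviation in angstroms to (0, 1]; 0 A gives 1.0, large values approach 0."""
    return 1.0 / (1.0 + (rms / d) ** 2)


def dockq(fnat, irms, lrms, d_irms=1.5, d_lrms=8.5):
    """Average of F-nat and the two scaled RMS terms.

    fnat is a fraction in [0, 1]; irms and lrms are in angstroms.
    d_irms and d_lrms are assumed scaling constants (see lead-in).
    """
    return (fnat + scaled_rms(irms, d_irms) + scaled_rms(lrms, d_lrms)) / 3.0


# The borderline case mentioned in the abstract: poor LRMS, decent F-nat and iRMS.
borderline = dockq(fnat=0.50, irms=3.0, lrms=10.0)
perfect = dockq(fnat=1.0, irms=0.0, lrms=0.0)
```

    Because each term lies in [0, 1], so does their mean, which is what allows models to be ranked on a continuous scale rather than sorted into four bins.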

  • 9.
    Basu, Sankar Chandra
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Wallner, Björn
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Finding correct protein-protein docking models using ProQDock. 2016. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 32, no 12, p. 262-270. Article in journal (Refereed)
    Abstract [en]

    Motivation: Protein-protein interactions are key to virtually all biological processes. For a detailed understanding of the biological processes, the structure of the protein complex is essential. Given the current experimental techniques for structure determination, the vast majority of all protein complexes will never be solved by experimental techniques. In the absence of experimental data, computational docking methods can be used to predict the structure of the protein complex. A common strategy is to generate many alternative docking solutions (atomic models) and then use a scoring function to select the best. The success of the computational docking technique is, to a large degree, dependent on the ability of the scoring function to accurately rank and score the many alternative docking models. Results: Here, we present ProQDock, a scoring function that predicts the absolute quality of a docking model as measured by a novel protein docking quality score (DockQ). ProQDock uses support vector machines trained to predict the quality of protein docking models using features that can be calculated from the docking model itself. By combining different types of features describing both the protein-protein interface and the overall physical chemistry, it was possible to improve the correlation with DockQ from 0.25 for the best individual feature (electrostatic complementarity) to 0.49 for the final version of ProQDock. ProQDock performed better than the state-of-the-art methods ZRANK and ZRANK2 in terms of correlations, ranking and finding correct models on an independent test set. Finally, we also demonstrate that it is possible to combine ProQDock with ZRANK and ZRANK2 to improve performance even further.

  • 10.
    Bhattacharyya, Dhananjay
    et al.
    Saha Institute Nucl Phys, India.
    Halder, Sukanya
    Saha Institute Nucl Phys, India.
    Basu, Sankar Chandra
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering. University of Calcutta, India.
    Mukherjee, Debasish
    Saha Institute Nucl Phys, India.
    Kumar, Prasun
    Indian Institute Science, India.
    Bansal, Manju
    Indian Institute Science, India.
    RNAHelix: computational modeling of nucleic acid structures with Watson-Crick and non-canonical base pairs. 2017. In: Journal of Computer-Aided Molecular Design, ISSN 0920-654X, E-ISSN 1573-4951, Vol. 31, no 2, p. 219-235. Article in journal (Refereed)
    Abstract [en]

    Comprehensive analyses of structural features of non-canonical base pairs within a nucleic acid double helix are limited by the availability of a small number of three dimensional structures. Therefore, a procedure for model building of double helices containing any given nucleotide sequence and base pairing information, either canonical or non-canonical, is seriously needed. Here we describe a program RNAHelix, which is an updated version of our widely used software, NUCGEN. The program can regenerate duplexes using the dinucleotide step and base pair orientation parameters for a given double helical DNA or RNA sequence with defined Watson-Crick or non-Watson-Crick base pairs. The original structure and the corresponding regenerated structure of double helices were found to be very close, as indicated by the small RMSD values between positions of the corresponding atoms. Structures of several usual and unusual double helices have been regenerated and compared with their original structures in terms of base pair RMSD, torsion angles and electrostatic potentials, and very high agreement has been noted. RNAHelix can also be used to generate a structure with a sequence completely different from an experimentally determined one, or to introduce single or multiple mutations, but with the same set of parameters, and hence can also be an important tool in homology modeling and the study of mutation induced structural changes.

  • 11.
    Björkholm, Patrik
    Linköping University, The Department of Physics, Chemistry and Biology.
    Method for recognizing local descriptors of protein structures using Hidden Markov Models. 2008. Independent thesis Basic level (professional degree), 20 points / 30 hp. Student thesis
    Abstract [en]

    Being able to predict the sequence-structure relationship in proteins will extend the scope of many bioinformatics tools relying on structure information. Here we use Hidden Markov models (HMM) to recognize and pinpoint the location in target sequences of local structural motifs (local descriptors of protein structure, LDPS). These substructures are composed of three or more segments of amino acid backbone structures that are in proximity with each other in space but not necessarily along the amino acid sequence. We were able to align descriptors to their proper locations in 41.1% of the cases when using models solely built from amino acid information. Using models that also incorporated secondary structure information, we were able to assign 57.8% of the local descriptors to their proper location. A further enhancement in performance was obtained when threading a profile through the Hidden Markov models together with the secondary structure; with this material we were able to assign 58.5% of the descriptors to their proper locations. Hidden Markov models were shown to be able to locate LDPS in target sequences, and the accuracy increases when secondary structure and the profile for the target sequence are used in the models.

  • 12.
    Borg, Ann-Louise
    Linköping University, Department of Physics, Chemistry and Biology.
    Investigation of a Method for Determination of Anticomplementary Activity (ACA) in Octagam. 2009. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    This Master Thesis was conducted at Octapharma AB in Stockholm.

    Anticomplementary activity (ACA) is a measure of the product’s ability to activate the complement system. IgG aggregates are mainly responsible for this activation. Two different variants of a method for determination of ACA in Octagam® are available. Both are based on the reference method for test of ACA in immunoglobulins in the European Pharmacopoeia Commission Guideline 6.0 (chapter 2.6.17). The method is carried out either in test tubes or on microtiter plates. The test tube method can be performed either manually or in a modified, more automated form. The latter has been applied in this study. The plate method is more automated than both of the tube methods. The plate method and the manual tube method have earlier seemed to result in different outcomes, which was the basis for this thesis.

    The plate method and the modified test tube method have been compared, and robustness parameters have been studied in order to see which factors influence the end result. The adequacy of using Human Biological Reference Preparation (human BRP) as a control for the ACA method in general has also been investigated. Samples of the product are outside the scope of this thesis and have not been investigated.

    According to this study, the plate method and the modified tube method are not comparable with regard to complement titration results and to ACA of the BRP control. A higher precision is gained with the plate method. This in combination with the higher degree of automation makes the plate method advantageous in several aspects. When it comes to the robustness of the ACA method in general, the sheep red blood cells (SRBC) used are critical. Haemolysin dilution and complement activity seem to be critical as well.

    Human BRP is, according to this study, more adequate as a reference for the plate method than for the tube method. An in-house control is believed to be more representative of the ACA method in general, as it is of the same nature as the samples analysed, in contrast to the human BRP.

  • 13.
    Bresell, Anders
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, The Institute of Technology.
    Characterization of protein families, sequence patterns, and functional annotations in large data sets. 2008. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Bioinformatics involves storing, analyzing and making predictions on massive amounts of protein and nucleotide sequence data. The thesis consists of six papers and is focused on proteins. It describes the utilization of bioinformatics techniques to characterize protein families and to detect patterns in gene expression and in polypeptide occurrences. Two protein families were bioinformatically characterized - the membrane associated proteins in eicosanoid and glutathione metabolism (MAPEG) and the Tripartite motif (TRIM) protein families.

    In the study of the MAPEG super-family, application of different bioinformatic methods made it possible to characterize many new members, leading to a doubling of the family size. Furthermore, the MAPEG members were subdivided into families. Remarkably, in six families with previously predominantly mammalian members, fish representatives were also now detected, which dated the origin of these families back to the Cambrian “species explosion”, thus earlier than previously anticipated. Sequence comparisons made it possible to define diagnostic sequence patterns that can be used in genome annotations. Upon publication of several MAPEG structures, these patterns were confirmed to be part of the active sites.

    In the TRIM study, the bioinformatic analyses made it possible to subdivide the proteins into three subtypes and to characterize a large number of members. In addition, the analyses showed crucial structural dependencies between the RING and the B-box domains of the TRIM member Ro52. The linker region between the two domains, denoted RBL, is known to be disease associated. Now, an amphipathic helix was found to be a characteristic feature of the RBL region, which also was used to divide the family into three subtypes.

    The ontology annotation treebrowser (OAT) tool was developed to detect functional similarities or common concepts in long lists of proteins or genes, typically generated from proteomics or microarray experiments. OAT was the first annotation browser to include both Gene Ontology (GO) and Medical Subject Headings (MeSH) into the same framework. The complementarity of these two ontologies was demonstrated. OAT was used in the TRIM study to detect differences in functional annotations between the subtypes.

    In the oligopeptide study, we investigated pentapeptide patterns that were over- or under-represented in the current de facto standard database of protein knowledge and a set of completed genomes, compared to what could be expected from amino acid compositions. We found three predominant categories of patterns: (i) patterns originating from frequently occurring families, e.g. respiratory chain-associated proteins and translation machinery proteins; (ii) proteins with structurally and/or functionally favored patterns; (iii) multicopy species-specific retrotransposons, only found in the genome set. Such patterns may influence amino acid residue based prediction algorithms. These findings in the oligopeptide study were utilized for development of a new method that detects translated introns in unverified protein predictions, which are available in great numbers due to the many completed and ongoing genome projects.
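
    The comparison of observed pattern counts with "what could be expected from amino acid compositions" can be sketched as follows. This is an illustrative baseline only, assuming residues are independent and using the pooled composition of the input set; the thesis may use a different null model, and the function name `kmer_ratios` is hypothetical.

```python
from collections import Counter
from math import prod


def kmer_ratios(sequences, k=5):
    """Observed/expected ratio for every k-mer present in the sequences.

    Expected counts assume independent residues drawn from the pooled
    amino acid composition (an assumed null model). A ratio above 1 marks
    an over-represented pattern, below 1 an under-represented one.
    """
    residue_counts = Counter("".join(sequences))
    total = sum(residue_counts.values())
    freq = {aa: n / total for aa, n in residue_counts.items()}

    # Count every overlapping window of length k.
    observed = Counter(
        seq[i:i + k] for seq in sequences for i in range(len(seq) - k + 1)
    )
    windows = sum(observed.values())

    return {
        kmer: count / (windows * prod(freq[aa] for aa in kmer))
        for kmer, count in observed.items()
    }


# Toy input (hypothetical); k=2 keeps the arithmetic easy to follow.
toy = kmer_ratios(["ACACAC"], k=2)
```

    With k=5 and a real protein set, patterns with extreme ratios are the candidates for the three categories listed above.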

    A new comprehensive database of protein sequences from completed genomes was developed, denoted genomeLKPG. This database was of central importance in the MAPEG, TRIM and oligopeptide studies. The new sequence database has also been proven useful in several other studies.

    List of papers
    1. Bioinformatic and enzymatic characterization of the MAPEG superfamily
    2005 (English). In: The FEBS Journal, ISSN 1742-464X, E-ISSN 1742-4658, Vol. 272, no 7, p. 1688-1703. Article in journal (Refereed). Published
    Abstract [en]

    The membrane associated proteins in eicosanoid and glutathione metabolism (MAPEG) superfamily includes structurally related membrane proteins with diverse functions of widespread origin. A total of 136 proteins belonging to the MAPEG superfamily were found in database and genome screenings. The members were found in prokaryotes and eukaryotes, but not in any archaeal organism. Multiple sequence alignments and calculations of evolutionary trees revealed a clear subdivision of the eukaryotic MAPEG members, corresponding to the six families of microsomal glutathione transferases (MGST) 1, 2 and 3, leukotriene C4 synthase (LTC4), 5-lipoxygenase activating protein (FLAP), and prostaglandin E synthase. Prokaryotes contain at least two distinct potential ancestral subfamilies, of which one is unique, whereas the other most closely resembles enzymes that belong to the MGST2/FLAP/LTC4 synthase families. The insect members are most similar to MGST1/prostaglandin E synthase. With the new data available, we observe that fish enzymes are present in all six families, showing an early origin for MAPEG family differentiation. Thus, the evolutionary origins and relationships of the MAPEG superfamily can be defined, including distinct sequence patterns characteristic for each of the subfamilies. We have further investigated and functionally characterized representative gene products from Escherichia coli, Synechocystis sp., Arabidopsis thaliana and Drosophila melanogaster, and the fish liver enzyme, purified from pike (Esox lucius). Protein overexpression and enzyme activity analysis demonstrated that all proteins catalyzed the conjugation of 1-chloro-2,4-dinitrobenzene with reduced glutathione. The E. coli protein displayed glutathione transferase activity of 0.11 µmol·min⁻¹·mg⁻¹ in the membrane fraction from bacteria overexpressing the protein. Partial purification of the Synechocystis sp. protein yielded an enzyme of the expected molecular mass and an N-terminal amino acid sequence that was at least 50% pure, with a specific activity towards 1-chloro-2,4-dinitrobenzene of 11 µmol·min⁻¹·mg⁻¹. Yeast microsomes expressing the Arabidopsis enzyme showed an activity of 0.02 µmol·min⁻¹·mg⁻¹, whereas the Drosophila enzyme expressed in E. coli was highly active at 3.6 µmol·min⁻¹·mg⁻¹. The purified pike enzyme is the most active MGST described so far with a specific activity of 285 µmol·min⁻¹·mg⁻¹. Drosophila and pike enzymes also displayed glutathione peroxidase activity towards cumene hydroperoxide (0.4 and 2.2 µmol·min⁻¹·mg⁻¹, respectively). Glutathione transferase activity can thus be regarded as a common denominator for a majority of MAPEG members throughout the kingdoms of life whereas glutathione peroxidase activity occurs in representatives from the MGST1, 2 and 3 and PGES subfamilies.

    Keywords
    MAPEG, microsomal glutathione transferase, prostaglandin, leukotriene
    National Category
    Natural Sciences
    Identifiers
    urn:nbn:se:liu:diva-12886 (URN); 10.1111/j.1742-4658.2005.04596.x (DOI)
    Available from: 2008-01-28 Created: 2008-01-28 Last updated: 2017-12-14. Bibliographically approved
    2. The fellowship of the RING: The RING-B-box linker region (RBL) interacts with the RING in TRIM21/Ro52, contributes to an autoantigenic epitope in Sjögren's syndrome, and is an integral and conserved region in TRIM proteins
    2008 (English). In: Journal of Molecular Biology, ISSN 0022-2836, E-ISSN 1089-8638, Vol. 377, no 2, p. 431-449. Article in journal (Refereed). Published
    Abstract [en]

    Ro52 is a major autoantigen that is targeted in the autoimmune disease Sjögren syndrome and belongs to the tripartite motif (TRIM) protein family. Disease-related antigenic epitopes are mainly found in the coiled-coil domain of Ro52, but one such epitope is located in the Zn2+-binding region, which comprises an N-terminal RING followed by a B-box, separated by a ∼40-residue linker peptide. In the present study, we extend the structural, biophysical, and immunological knowledge of this RING-B-box linker (RBL) by employing an array of methods. Our bioinformatic investigations show that the RBL sequence motif is unique to TRIM proteins and can be classified into three distinct subtypes. The RBL regions of all three subtypes are as conserved as their known flanking domains, and all are predicted to comprise an amphipathic helix. This helix formation is confirmed by circular dichroism spectroscopy and is dependent on the presence of the RING. Immunological studies show that the RBL is part of a conformation-dependent epitope, and its antigenicity is likewise dependent on a structured RING domain. Recombinant Ro52 RING-RBL exists as a monomer in vitro, and binding of two Zn2+ increases its stability. Regions stabilized by Zn2+ binding are identified by limited proteolysis and matrix-assisted laser desorption/ionization mass spectrometry. Furthermore, the residues of the RING and linker that interact with each other are identified by analysis of protection patterns, which, together with bioinformatic and biophysical data, enabled us to propose a structural model of the RING-RBL based on modeling and docking experiments. Sequence similarities and evolutionary sequence patterns suggest that the results obtained from Ro52 are extendable to the entire TRIM protein family.

    Keywords
    Ro52; TRIM21; RING; linker; zinc binding
    National Category
    Natural Sciences
    Identifiers
    urn:nbn:se:liu:diva-12887 (URN); 10.1016/j.jmb.2008.01.005 (DOI)
    Available from: 2008-01-28 Created: 2008-01-28 Last updated: 2017-12-14. Bibliographically approved
    3. Ontology annotation treebrowser: an interactive tool where the complementarity of medical subject headings and gene ontology improves the interpretation of gene lists
    2006 (English). In: Applied Bioinformatics, ISSN 1175-5636, Vol. 5, no 4, p. 225-236. Article in journal (Refereed). Published
    Abstract [en]

    Gene expression and proteomics analysis allow the investigation of thousands of biomolecules in parallel. This results in a long list of interesting genes or proteins and a list of annotation terms in the order of thousands. It is not a trivial task to understand such a gene list and it would require extensive efforts to bring together the overwhelming amounts of associated information from the literature and databases. Thus, it is evident that we need ways of condensing and filtering this information. An excellent way to represent knowledge is to use ontologies, where it is possible to group genes or terms with overlapping context, rather than studying one-dimensional lists of keywords. Therefore, we have built the ontology annotation treebrowser (OAT) to represent, condense, filter and summarise the knowledge associated with a list of genes or proteins.

    The OAT system consists of two separate parts: a MySQL® database named OATdb, and a treebrowser engine that is implemented as a web interface. The OAT system is implemented using Perl scripts on an Apache web server and the gene, ontology and annotation data are stored in a relational MySQL® database. In OAT, we have harmonized the two ontologies of medical subject headings (MeSH) and gene ontology (GO), to enable us to use knowledge both from the literature and the annotation projects in the same tool. OAT includes multiple gene identifier sets, which are merged internally in the OAT database. We have also generated novel MeSH annotations by mapping accession numbers to MEDLINE entries.

    The ontology browser OAT was created to facilitate the analysis of gene lists. It can be browsed dynamically, so that a scientist can interact with the data and govern the outcome. Test statistics show which branches are enriched. We also show that the two ontologies complement each other, with surprisingly low overlap, by mapping annotations to the Unified Medical Language System®.

    We have developed a novel interactive annotation browser that is the first to incorporate both MeSH and GO for improved interpretation of gene lists. With OAT, we illustrate the benefits of combining MeSH and GO for understanding gene lists. OAT is available as a public web service at: http://www.ifm.liu.se/bioinfo/oat

    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-12888 (URN)
    Available from: 2008-01-28 Created: 2008-01-28 Last updated: 2009-11-07. Bibliographically approved
    4. Characterization of oligopeptide patterns in large protein sets
    2007 (English). In: BMC Genomics, ISSN 1471-2164, E-ISSN 1471-2164, Vol. 8, no 346, p. 1-15. Article in journal (Refereed), Published
    Abstract [en]

    Background: Recent sequencing projects and the growth of sequence data banks enable oligopeptide patterns to be characterized on a genome or kingdom level. Several studies have focused on kingdom or habitat classifications based on the abundance of short peptide patterns. There have also been efforts at local structural prediction based on short sequence motifs. Oligopeptide patterns undoubtedly carry valuable information content. Therefore, it is important to characterize these informational peptide patterns to shed light on possible new applications and the pitfalls implicit in neglecting bias in peptide patterns.

    Results: We have studied four classes of pentapeptide patterns (designated POP, NEP, ORP and URP) in the kingdoms archaea, bacteria and eukaryotes. POP are highly abundant patterns statistically not expected to exist; NEP are patterns that do not exist but are statistically expected to; ORP are patterns unique to a kingdom; and URP are patterns excluded from a kingdom. We used two data sources: the de facto standard of protein knowledge Swiss-Prot, and a set of 386 completely sequenced genomes. For each class of peptides we looked at the 100 most extreme and found both known and unknown sequence features. Most of the known sequence motifs can be explained on the basis of the protein families from which they originate.

    Conclusion: We find an inherent bias of certain oligopeptide patterns in naturally occurring proteins that cannot be explained solely on the basis of residue distribution in single proteins, kingdoms or databases. We see three predominant categories of patterns: (i) patterns widespread in a kingdom such as those originating from respiratory chain-associated proteins and translation machinery; (ii) proteins with structurally and/or functionally favored patterns, which have not yet been ascribed this role; (iii) multicopy species-specific retrotransposons, only found in the genome set. These categories will affect the accuracy of sequence pattern algorithms that rely mainly on amino acid residue usage. Methods presented in this paper may be used to discover targets for antibiotics, as we identify numerous examples of kingdom-specific antigens among our peptide classes. The methods may also be useful for detecting coding regions of genes.
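
    The pattern classes described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the expected count assumes independent residue frequencies (the abstract does not state which background model was used), and "excluded from a kingdom" is read here as "present in every other kingdom but absent from this one".

```python
from collections import Counter


def kmer_counts(sequences, k=5):
    """Count every overlapping k-residue pattern in a set of protein sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def expected_count(pattern, residue_freq, n_windows):
    """Expected occurrences of a pattern if residues were independent (assumed model)."""
    p = 1.0
    for aa in pattern:
        p *= residue_freq.get(aa, 0.0)
    return p * n_windows


def unique_and_excluded(counts_by_kingdom):
    """Return (ORP, URP) pattern sets per kingdom under the reading stated above."""
    orp, urp = {}, {}
    for kingdom, counts in counts_by_kingdom.items():
        others = [set(c) for name, c in counts_by_kingdom.items() if name != kingdom]
        seen_elsewhere = set().union(*others)
        in_all_others = set.intersection(*others) if others else set()
        orp[kingdom] = set(counts) - seen_elsewhere   # unique to this kingdom
        urp[kingdom] = in_all_others - set(counts)    # missing only from this kingdom
    return orp, urp


# Toy data with k=2 for readability; the study itself used pentapeptides (k=5).
toy = {
    "archaea": kmer_counts(["ACDA"], k=2),
    "bacteria": kmer_counts(["ACDE"], k=2),
    "eukaryotes": kmer_counts(["CDEA"], k=2),
}
orp, urp = unique_and_excluded(toy)
```

    Comparing observed counts against `expected_count` would then separate over-represented patterns (POP-like) from absent-but-expected ones (NEP-like); the thresholds for "most extreme" are left open here.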

    National Category
    Natural Sciences
    Identifiers
    urn:nbn:se:liu:diva-12889 (URN), 10.1186/1471-2164-8-346 (DOI)
    Available from: 2008-01-28 Created: 2008-01-28 Last updated: 2017-12-14. Bibliographically approved
    5. Using SVM and tripeptide patterns to detect translated introns
    2007 (English). In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105. Article in journal (Refereed), Submitted
    National Category
    Natural Sciences
    Identifiers
    urn:nbn:se:liu:diva-12890 (URN)
    Available from: 2008-01-28 Created: 2008-01-28 Last updated: 2017-12-14
    6. GenomeLKPG: A comprehensive proteome sequence database for taxonomy studies
    2008 (English). Article in journal (Refereed), Submitted
    Abstract [en]

    Background: In order to perform taxonomically unbiased analyses of protein relationships, there is a need for complete proteomes rather than databases with bias towards well characterized protein families. However, no comprehensive resource of completed proteomes is currently available. Instead, the proteomes need to be downloaded manually from different servers, all using different filename conventions and fasta header formats.

    Results: We have developed a semi-automatic algorithm that retrieves complete proteomes from multiple FTP servers and maps the species-specific sequence entries to the NCBI taxonomy. The compiled data is provided in a sequence database named genomeLKPG.

    Conclusions: The usefulness of genomeLKPG is proven in several published taxonomical studies.

    National Category
    Natural Sciences
    Identifiers
    urn:nbn:se:liu:diva-52933 (URN)
    Available from: 2010-01-13 Created: 2010-01-13 Last updated: 2010-01-13
  • 14.
    Cedersund, Gunnar
    Linköping University, Department of Biomedical Engineering. Linköping University, Department of Clinical and Experimental Medicine. Linköping University, Faculty of Science & Engineering.
    Prediction uncertainty estimation despite unidentifiability: an overview of recent developments. 2016. In: Uncertainty in Biology: a computational modeling approach / [ed] Liesbet Geris and David Gomez-Cabrero, Springer, 2016, p. 449-466. Chapter in book (Refereed)
    Abstract [en]

    One of the most important properties of a mathematical model is the ability to make predictions: to predict that which has not yet been measured. Such predictions can sometimes be obtained from a simple simulation, but that requires that the parameters in the model are known from before. In biology, the parameters are usually both not known from before and not identifiable, i.e. the parameter values cannot be determined uniquely from available data. In such cases of unidentifiability, the space of acceptable parameters is large, often infinite in certain directions. For such large spaces, sampling-based approaches that try to characterize the entire space have difficulties. Recently, a new type of alternative approaches that circumvent this characterization problem has been proposed: where one only searches those directions in the space of acceptable parameters that are relevant for the uncertainty of a particular prediction. In this review chapter, these recently proposed methods are compared and contrasted, both regarding theoretical properties, and regarding user experience. The focus is on methods from the field of systems biology, but also methods from biostatistics, pharmacodynamics, and biochemometrics are discussed. The hope is that this review will increase the usefulness and understanding of already proposed methods, and thereby help foster a tradition where predictions only are deemed interesting if their uncertainties have been determined.

  • 15.
    Elofsson, Arne
    et al.
    Stockholm Univ, Sweden.
    Joo, Keehyoung
    Korea Inst Adv Study, South Korea.
    Keasar, Chen
    Ben Gurion Univ Negev, Israel.
    Lee, Jooyoung
    Korea Inst Adv Study, South Korea.
    Maghrabi, Ali H. A.
    Univ Reading, England.
    Manavalan, Balachandran
    Korea Inst Adv Study, South Korea.
    McGuffin, Liam J.
    Univ Reading, England.
    Hurtado, David Menendez
    Stockholm Univ, Sweden.
    Mirabello, Claudio
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Pilstål, Robert
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Sidi, Tomer
    Ben Gurion Univ Negev, Israel.
    Uziela, Karolis
    Stockholm Univ, Sweden.
    Wallner, Björn
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Methods for estimation of model accuracy in CASP12. 2018. In: Proteins: Structure, Function, and Bioinformatics, ISSN 0887-3585, E-ISSN 1097-0134, Vol. 86, p. 361-373. Article in journal (Refereed)
    Abstract [en]

    Methods to reliably estimate the quality of 3D models of proteins are essential drivers for the wide adoption and serious acceptance of protein structure predictions by life scientists. In this article, the most successful groups in CASP12 describe their latest methods for estimates of model accuracy (EMA). We show that pure single model accuracy estimation methods have shown clear progress since CASP11; the 3 top methods (MESHI, ProQ3, SVMQA) all perform better than the top method of CASP11 (ProQ2). Although the pure single model accuracy estimation methods outperform quasi-single (ModFOLD6 variations) and consensus methods (Pcons, ModFOLDclust2, Pcomb-domain, and Wallner) in model selection, they are still not as good as those methods in absolute model quality estimation and predictions of local quality. Finally, we show that when using contact-based model quality measures (CAD, lDDT) the single model quality methods perform relatively better.

  • 16.
    Feldman Barrett, Lisa
    et al.
    Northeastern University, MA 02115 USA; Massachusetts Gen Hospital, MA 02129 USA; Harvard Medical Sch, MA 02129 USA; Massachusetts Gen Hospital, MA 02114 USA; Harvard Medical Sch, MA 02115 USA.
    Quigley, Karen S.
    Northeastern University, MA 02115 USA.
    Hamilton, Paul
    Linköping University, Department of Clinical and Experimental Medicine, Division of Neuro and Inflammation Science. Linköping University, Faculty of Medicine and Health Sciences. Linköping University, Center for Social and Affective Neuroscience (CSAN).
    An active inference theory of allostasis and interoception in depression. 2016. In: Philosophical Transactions of the Royal Society of London. Biological Sciences, ISSN 0962-8436, E-ISSN 1471-2970, Vol. 371, no 1708, article id 20160011. Article in journal (Refereed)
    Abstract [en]

    In this paper, we integrate recent theoretical and empirical developments in predictive coding and active inference accounts of interoception (including the Embodied Predictive Interoception Coding model) with working hypotheses from the theory of constructed emotion to propose a biologically plausible unified theory of the mind that places metabolism and energy regulation (i.e. allostasis), as well as the sensory consequences of that regulation (i.e. interoception), at its core. We then consider the implications of this approach for understanding depression. We speculate that depression is a disorder of allostasis, whose myriad symptoms result from a locked in brain that is relatively insensitive to its sensory context. We conclude with a brief discussion of the ways our approach might reveal new insights for the treatment of depression. This article is part of the themed issue Interoception beyond homeostasis: affect, cognition and mental health.

  • 17.
    Fransson, Martin
    Linköping University, Department of Computer and Information Science, PELAB - Programming Environment Laboratory. Linköping University, The Institute of Technology.
    Towards Individualized Drug Dosage - General Methods and Case Studies. 2007. Licentiate thesis, comprehensive summary (Other academic)
    Abstract [en]

    Progress in individualized drug treatment is of increasing importance, promising to avoid much human suffering and to reduce medical treatment costs for society. The strategy is to maximize the therapeutic effects and minimize the negative side effects of a drug on an individual or group basis. To reach the goal, interactions between the human body and different drugs must be further clarified, for instance by using mathematical models. Whether clinical studies or laboratory experiments are used as primary sources of information greatly influences the possibilities of obtaining data. This must be considered both prior to and during model development, and different strategies must be used. The character of the data may also restrict the level of complexity for the models, thus limiting their usage as tools for individualized treatment.

    In this thesis work two case studies have been made, each with the aim to develop a model for a specific human-drug interaction. The first case study concerns treatment of inflammatory bowel disease with thiopurines, whereas the second is about treatment of ovarian cancer with paclitaxel. Although both case studies make use of similar amounts of experimental data, model development depends considerably on prior knowledge about the systems, the character of the data and the choice of modelling tools. All these factors are presented for each of the case studies along with current results. Further, a system for classifying different but related models is also proposed with the intention that an increased understanding will contribute to advancement in individualized drug dosage.

    List of papers
    1. A preliminary study of modeling and simulation in individualized drug dosage – azathioprine on inflammatory bowel disease
    2007 (English). In: SIMS 2006: Proceedings of the 47th Conference on Simulation and Modelling, Helsinki, Finland, Helsinki: Kopio Niini Oy, 2007, p. 216-220. Conference paper, Published paper (Refereed)
    Abstract [en]

    Individualized drug dosage based on population pharmacokinetic/dynamic models is an important future technology used to reduce or eliminate side effects of certain drugs, e.g. cancer drugs. In this paper we report preliminary results from work-in-progress: a simplified linear model of the metabolism of a cancer treatment drug was estimated from experimental data. The model was then validated against the same data as a test of the adequacy of the model structure. From this investigation it became apparent that the model structure could not be used due to its inability to recreate the dynamic properties of the system.

    Place, publisher, year, edition, pages
    Helsinki: Kopio Niini Oy, 2007
    Keywords
    azathioprine, inflammatory bowel disease, pharmacokinetic
    National Category
    Bioinformatics (Computational Biology)
    Identifiers
    urn:nbn:se:liu:diva-10250 (URN), 9525183300 (ISBN)
    Available from: 2007-11-16 Created: 2007-11-16 Last updated: 2018-01-13
    2. Comparison of two types of population pharmacokinetic model structures of paclitaxel
    2008 (English). In: European Journal of Pharmaceutical Sciences, ISSN 0928-0987, E-ISSN 1879-0720, Vol. 33, no 2, p. 128-137. Article in journal (Refereed), Published
    Abstract [en]

    Two main types of model structures have been proposed for the pharmacokinetics of paclitaxel: an empirical model structure based on total plasma concentrations of paclitaxel, and a mechanism-based model structure derived from both total and unbound paclitaxel concentrations and concentrations of the formulation vehicle Cremophor EL. The purpose was to compare the two pharmacokinetic model structures when only total paclitaxel concentrations are available. To support the mechanism-based model structure with Cremophor EL concentrations, in silico concentrations were obtained from simulations of a pharmacokinetic model available in the literature. Local algebraic observability was tested on both model structures; the mechanism-based model structure was found, with high probability, not to be algebraically observable if total paclitaxel concentration is considered to be the only model output, and if no kind of prior information is used. Sensitivity analysis was performed to reveal which parameter should be fixed in order to make it locally observable. Parameter estimation was then performed on both model structures using nonlinear mixed effects and data from a clinical study. The estimated mechanism-based model turned out to have a somewhat better fit to data than the corresponding empirical model, , where AIC is the Akaike Information Criterion. Hold-out validation was performed on three patients, but did not favour any of the models. In conclusion, since the mechanism-based model structure performed at least as well as the empirical model structure, it is suggested that the former model structure should be used since it offers a more accurate description of the disposition.
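
    For readers unfamiliar with the criterion named in this abstract, its standard definition is sketched below. The symbols k (number of estimated parameters) and \hat{L} (maximised likelihood), and the sign convention for the difference, are notation assumed here rather than taken from the paper.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L},
\qquad
\Delta\mathrm{AIC} = \mathrm{AIC}_{\text{empirical}} - \mathrm{AIC}_{\text{mechanism-based}}
```

    Under this convention a positive difference favours the mechanism-based model, since a lower AIC indicates a better trade-off between fit and number of parameters.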

    Keywords
    Paclitaxel, Model structure, Observability, NONMEM, Simulation
    National Category
    Medical and Health Sciences
    Identifiers
    urn:nbn:se:liu:diva-12762 (URN), 10.1016/j.ejps.2007.10.005 (DOI)
    Available from: 2007-11-16 Created: 2007-11-16 Last updated: 2017-12-14
  • 18.
    Fransson, Martin
    et al.
    Linköping University, Department of Computer and Information Science, PELAB - Programming Environment Laboratory. Linköping University, The Institute of Technology.
    Fritzson, Peter
    Linköping University, Department of Computer and Information Science, PELAB - Programming Environment Laboratory. Linköping University, The Institute of Technology.
    Lindqvist Appell, Malin
    Linköping University, Department of Medicine and Care. Linköping University, Faculty of Health Sciences.
    Hindorf, Ulf
    Almer, Sven
    Linköping University, Department of Clinical and Experimental Medicine, Gastroenterology and Hepatology. Linköping University, Faculty of Health Sciences. Östergötlands Läns Landsting, Centre for Medicine, Department of Endocrinology and Gastroenterology UHL.
    Peterson, Curt
    Linköping University, Department of Medical and Health Sciences, Clinical Pharmacology. Linköping University, Faculty of Health Sciences. Östergötlands Läns Landsting, Centre of Surgery and Oncology, Department of Oncology UHL.
    A preliminary study of modeling and simulation in individualized drug dosage – azathioprine on inflammatory bowel disease. 2007. In: SIMS 2006: Proceedings of the 47th Conference on Simulation and Modelling, Helsinki, Finland, Helsinki: Kopio Niini Oy, 2007, p. 216-220. Conference paper (Refereed)
    Abstract [en]

    Individualized drug dosage based on population pharmacokinetic/dynamic models is an important future technology used to reduce or eliminate side effects of certain drugs, e.g. cancer drugs. In this paper we report preliminary results from work-in-progress: a simplified linear model of the metabolism of a cancer treatment drug was estimated from experimental data. The model was then validated against the same data as a test of the adequacy of the model structure. From this investigation it became apparent that the model structure could not be used due to its inability to recreate the dynamic properties of the system.

  • 19.
    Giordano, Giulia
    et al.
    Lund University, Sweden.
    Altafini, Claudio
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, Faculty of Science & Engineering.
    Qualitative and quantitative responses to press perturbations in ecological networks. 2017. In: Scientific Reports, ISSN 2045-2322, E-ISSN 2045-2322, Vol. 7, article id 11378. Article in journal (Refereed)
    Abstract [en]

    Predicting the sign of press perturbation responses in ecological networks is challenging, due to the poor knowledge of the strength of the direct interactions among the species, and to the entangled coexistence of direct and indirect effects. We show in this paper that, for a class of networks that includes mutualistic and monotone networks, the sign of press perturbation responses can be qualitatively determined based only on the sign pattern of the community matrix, without any knowledge of parameter values. For other classes of networks, we show that a semi-qualitative approach yields sufficient conditions for community matrices with a given sign pattern to exhibit mutualistic responses to press perturbations; quantitative conditions can be provided as well for community matrices that are eventually nonnegative. We also present a computational test that can be applied to any class of networks so as to check whether the sign of the responses to press perturbations is constant in spite of parameter variations.
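
    As a numerical illustration of what a "press perturbation response" is, the sketch below assumes the standard linearisation around a stable equilibrium, in which a sustained perturbation u shifts the steady state by -A^{-1} u for community matrix A. It evaluates the signs for one concrete matrix only; it is not the qualitative, parameter-free test developed in the paper, and the function names are hypothetical.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a square matrix exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def press_response_signs(community_matrix):
    """Sign (+1, 0, -1) of each entry of -A^{-1}: effect of pressing species j on species i."""
    inverse = invert(community_matrix)
    return [[(-x > 0) - (-x < 0) for x in row] for row in inverse]


# Two mutualists with self-regulation: every response is positive.
mutualistic = [[-2, 1], [1, -2]]
# Prey (row 0) and predator (row 1): pressing the predator lowers the prey.
predator_prey = [[-1, -1], [1, -1]]
```

    Repeating the computation over many magnitudes that share one sign pattern would give an empirical, sampling-based check of whether the response signs stay constant; that is a cruder substitute for the conditions stated in the abstract.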

  • 20.
    Guerrero-Bosagna, Carlos
    Linköping University, Department of Physics, Chemistry and Biology, Biology. Linköping University, Faculty of Science & Engineering.
    High type II error and interpretation inconsistencies when attempting to refute transgenerational epigenetic inheritance. 2016. In: Genome Biology, ISSN 1465-6906, E-ISSN 1474-760X, Vol. 17, no 1, article id 153. Article in journal (Refereed)
    Abstract [en]

    A recently published article in Genome Biology attempts to refute important aspects of the phenomenon of transgenerational epigenetic inheritance (TEI). An alternative explanation of the data is offered here, showing that TEI is indeed not contradicted. Please see related Correspondence article: www.dx.doi.org/10.1186/s13059-016-0981-5 and related Research article: http://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0619-z.

  • 21.
    Hedman, Johannes
    et al.
    Lunds Universitet/Lunds Tekniska Högskola.
    Ansell, Ricky
    Linköping University, Department of Physics, Chemistry and Biology, Molecular genetics. Linköping University, The Institute of Technology.
    Nordgaard, Anders
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Arts and Sciences.
    A ranking index for quality assessment of forensic DNA profiles. 2010. In: BMC Research Notes, ISSN 1756-0500, Vol. 3, no 290. Article in journal (Refereed)
    Abstract [en]

    Background

    Assessment of DNA profile quality is vital in forensic DNA analysis, both in order to determine the evidentiary value of DNA results and to compare the performance of different DNA analysis protocols. Generally the quality assessment is performed through manual examination of the DNA profiles based on empirical knowledge, or by comparing the intensities (allelic peak heights) of the capillary electrophoresis electropherograms.

    Results

    We recently developed a ranking index for unbiased and quantitative quality assessment of forensic DNA profiles, the forensic DNA profile index (FI) (Hedman et al. Improved forensic DNA analysis through the use of alternative DNA polymerases and statistical modeling of DNA profiles, Biotechniques 47 (2009) 951-958). FI uses electropherogram data to combine the intensities of the allelic peaks with the balances within and between loci, using Principal Components Analysis. Here we present the construction of FI. We explain the mathematical and statistical methodologies used and present details about the applied data reduction method. Thereby we show how to adapt the ranking index for any Short Tandem Repeat-based forensic DNA typing system through validation against a manual grading scale and calibration against a specific set of DNA profiles.

    Conclusions

    The developed tool provides unbiased quality assessment of forensic DNA profiles. It can be applied for any DNA profiling system based on Short Tandem Repeat markers. Apart from crime related DNA analysis, FI can therefore be used as a quality tool in paternal or familial testing as well as in disaster victim identification.

  • 22.
    Hennerdal, Aron
    Linköping University, The Department of Physics, Chemistry and Biology.
    Investigation of multivariate prediction methods for the analysis of biomarker data. 2006. Independent thesis Basic level (professional degree), 20 points / 30 hp. Student thesis
    Abstract [en]

    The paper describes predictive modelling of biomarker data stemming from patients suffering from multiple sclerosis. Improvements of multivariate analyses of the data are investigated with the goal of increasing the capability to assign samples to correct subgroups from the data alone.

    The effects of different preceding scalings of the data are investigated and combinations of multivariate modelling methods and variable selection methods are evaluated. Attempts at merging the predictive capabilities of the method combinations through voting-procedures are made. A technique for improving the result of PLS-modelling, called bagging, is evaluated.

    Of the multivariate analysis methods tried, the best are found to be Partial least squares (PLS) and Support vector machines (SVM). It is concluded that the scaling has little effect on the prediction performance for most methods. The method combinations have interesting properties: the default variable selections of the multivariate methods are not always the best. Bagging improves performance, but at a high cost. No reasons for drastically changing the work flows of the biomarker data analysis are found, but slight improvements are possible. Further research is needed.

  • 23.
    Jakoniené, Vaida
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, The Institute of Technology.
    Lambrix, Patrick
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, The Institute of Technology.
    Tool for Evaluating Strategies for Grouping of Biological Data. 2007. In: Journal of Integrative Bioinformatics, ISSN 1613-4516, Vol. 4, no 3. Article in journal (Refereed)
    Abstract [en]

    During the last decade an enormous amount of biological data has been generated and techniques and tools to analyze this data have been developed. Many of these tools use some form of grouping and are used in, for instance, data integration, data cleaning, prediction of protein functionality, and correlation of genes based on microarray data. A number of aspects influence the quality of the grouping results: the data sources, the grouping attributes and the algorithms implementing the grouping procedure. Many methods exist, but it is often not clear which methods perform best for which grouping tasks. The study of the properties, and the evaluation and the comparison of the different aspects that influence the quality of the grouping results, would give us valuable insight in how the grouping procedures could be used in the best way. It would also lead to recommendations on how to improve the current procedures and develop new procedures. To be able to perform such studies and evaluations we need environments that allow us to compare and evaluate different grouping strategies. In this paper we present a framework, KitEGA, for such an environment, and present its current prototype implementation. We illustrate its use by comparing grouping strategies for classifying proteins regarding biological function and isozymes.

  • 24.
    Jauhiainen, Alexandra
    Linköping University, The Department of Physics, Chemistry and Biology.
    Evaluation and Development of Methods for Identification of Biochemical Networks. 2005. Independent thesis Basic level (professional degree). Student thesis
    Abstract [en]

    Systems biology is an area concerned with understanding biology on a systems level, where the structure and dynamics of the system are in focus. Knowledge about the structure and dynamics of biological systems is fundamental information about cells and interactions within cells, and also plays an increasingly important role in medical applications.

    System identification deals with the problem of constructing a model of a system from data, and an extensive theory exists, particularly for the identification of linear systems.

    This is a master thesis in systems biology treating identification of biochemical systems. Methods based on both local parameter perturbation data and time series data have been tested and evaluated in silico.

    The advantage of local parameter perturbation data methods proved to be that they demand less complex data, but the drawbacks are the reduced information content of this data and sensitivity to noise. Methods employing time series data are generally more robust to noise but the lack of available data limits the use of these methods.

    The work has been conducted at the Fraunhofer-Chalmers Research Centre for Industrial Mathematics in Göteborg, and at the division of Computational Biology at the Department of Physics and Measurement Technology, Biology, and Chemistry at Linköping University during the autumn of 2004.

  • 25.
    Johansson-Åkhe, Isak
    Linköping University, Department of Physics, Chemistry and Biology.
    PePIP: a Pipeline for Peptide-Protein Interaction-site Prediction. 2017. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Protein-peptide interactions play a major role in several biological processes, such as cell proliferation and cancer cell life-cycles. Accurate computational methods for predicting protein-protein interactions exist, but few of these methods can be extended to predicting interactions between a protein and a particularly small or intrinsically disordered peptide.

    In this thesis, PePIP is presented. PePIP is a pipeline for predicting where on a given protein a given peptide will most probably bind. The pipeline utilizes structural alignment to peruse the Protein Data Bank for possible templates for the interaction to be predicted, using the larger chain as the query. The possible templates are then evaluated as to whether they can represent the query protein and peptide using a Random Forest classifier machine learning algorithm, and the best templates are found by using the evaluation from the Random Forest in combination with hierarchical clustering. These final templates are then combined to give a prediction of the binding site.

    PePIP is proven to be highly accurate when tested on a set of 502 experimentally determined protein-peptide structures, suggesting a binding site on the correct part of the protein surface roughly 4 out of 5 times.

  • 26.
    Klingström, Tomas
    et al.
    Swedish University of Agricultural Sciences, Uppsala, Sweden.
    Soldatova, Larissa
    Aberystwyth University, UK.
    Stevens, Robert
    University of Manchester, UK.
    Roos, T. Erik
    University of Groningen, The Netherlands.
    Swertz, Morris A.
    University of Groningen, The Netherlands.
    Müller, Kristian M.
    University of Potsdam, Germany.
    Kalas, Matus
    University of Bergen, Norway.
    Lambrix, Patrick
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, The Institute of Technology.
    Taussig, Michael J.
    Babraham Bioscience Technologies, Cambridge, UK.
    Litton, Jan-Eric
    Karolinska Institutet, Stockholm, Sweden.
    Landegren, Ulf
    Uppsala University, Sweden.
    Bongcam-Rudloff, Erik
    Swedish University of Agricultural Sciences and Uppsala University, Sweden.
    Workshop on laboratory protocol standards for the molecular methods database2013In: New Biotechnology, ISSN 1871-6784, E-ISSN 1876-4347, Vol. 30, no 2, p. 109-113Article in journal (Refereed)
    Abstract [en]

    Management of data to produce scientific knowledge is a key challenge for biological research in the 21st century. Emerging high-throughput technologies allow life science researchers to produce big data at speeds and in amounts that were unthinkable just a few years ago. This places high demands on all aspects of the workflow: from data capture (including the experimental constraints of the experiment), analysis and preservation, to peer-reviewed publication of results. Failure to recognise the issues at each level can lead to serious conflicts and mistakes; research may then be compromised as a result of the publication of non-coherent protocols, or the misinterpretation of published data. In this report, we present the results from a workshop that was organised to create an ontological data-modelling framework for Laboratory Protocol Standards for the Molecular Methods Database (MolMeth). The workshop provided a set of short- and long-term goals for the MolMeth database, the most important being the decision to use the established EXACT description of biomedical ontologies as a starting point.

  • 27.
    Lambrix, Patrick
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, The Institute of Technology.
    Semantic Web, Ontologies and Linked Data2014In: Comprehensive Biomedical Physics: Volume 6: Bioinformatics / [ed] Brahme (ed in chief), Persson, (ed), Amsterdam: Elsevier, 2014, p. 67-76Chapter in book (Refereed)
    Abstract [en]

    Researchers in various areas in the life sciences use biomedical data sources and tools for their research. However, with the explosion of the amount of available data sources and tools, researchers also face the difficulties of finding and retrieving relevant information and tools as well as integrating information from different sources. The vision of the Semantic Web alleviates these difficulties. In this chapter, we introduce the Semantic Web and discuss steps that have been taken toward this vision. We discuss ontologies as a key technology as well as the recent development of Linked Data. Further, for each of these, we list issues for future research.

  • 28.
    Lambrix, Patrick
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, The Institute of Technology.
    Edberg, Anna
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, The Institute of Technology.
    Evaluation of ontology merging tools2003In: Pacific Symposium on Biocomputing, World Scientific, 2003, p. 589-600Conference paper (Refereed)
    Abstract [en]

    Ontologies are being used nowadays in many areas, including bioinformatics. One of the issues in ontology research is the aligning and merging of ontologies. Tools have been developed for ontology merging, but they have not been evaluated for their use in bioinformatics. In this paper we evaluate two of the most well-known ontology merging tools with a bioinformatics perspective. As test ontologies we have used Gene Ontology and Signal-Ontology.

  • 29.
    Lambrix, Patrick
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, The Institute of Technology.
    Habbouche, Manal
    Linköping University, The Institute of Technology. Linköping University, Department of Computer and Information Science, Database and information techniques.
    Perez, Marta
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, The Institute of Technology.
    Evaluation of ontology development tools for bioinformatics2003In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 19, no 12, p. 1564-1571Article in journal (Refereed)
    Abstract [en]

    Ontologies are being used nowadays in many areas, including bioinformatics. To assist users in developing and maintaining ontologies a number of tools have been developed. In this paper we compare four such tools, Protégé-2000, Chimaera, DAG-Edit and OilEd. As test ontologies we have used ontologies from the Gene Ontology Consortium. No system is preferred in all situations, but each system has its own strengths and weaknesses.

  • 30.
    Lambrix, Patrick
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, The Institute of Technology.
    Kemp, Graham
    Chalmers University of Technology.
    Proceedings of the Seventh International Conference on Data Integration in the Life Sciences2010Conference proceedings (editor) (Refereed)
  • 31.
    Lambrix, Patrick
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, The Institute of Technology. Swedish e-Science Research Centre, Linköping University, Sweden.
    Wei-Kleiner, Fang
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, The Institute of Technology.
    Dragisic, Zlatan
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, The Institute of Technology.
    Completing the is-a structure in light-weight ontologies2015In: Journal of Biomedical Semantics, ISSN 2041-1480, E-ISSN 2041-1480, Vol. 6, article id 12Article in journal (Refereed)
    Abstract [en]

    Background: With the increasing presence of biomedical data sources on the Internet more and more research effort is put into finding possible ways for integrating and searching such often heterogeneous sources. Ontologies are a key technology in this effort. However, developing ontologies is not an easy task and often the resulting ontologies are not complete. In addition to being problematic for the correct modelling of a domain, such incomplete ontologies, when used in semantically-enabled applications, can lead to valid conclusions being missed.

    Results: We consider the problem of repairing missing is-a relations in ontologies. We formalize the problem as a generalized TBox abduction problem. Based on this abduction framework, we present complexity results for the existence, relevance and necessity decision problems for the generalized TBox abduction problem with and without some specific preference relations for ontologies that can be represented using a member of the EL family of description logics. Further, we present algorithms for finding solutions, a system as well as experiments.

    Conclusions: Semantically-enabled applications need high quality ontologies and one key aspect is their completeness. We have introduced a framework and system that provides an environment for supporting domain experts to complete the is-a structure of ontologies. We have shown the usefulness of the approach in different experiments. For the two Anatomy ontologies from the Ontology Alignment Evaluation Initiative, we repaired 94 and 58 initially given missing is-a relations, respectively, and detected and repaired an additional 47 and 10 missing is-a relations. In an experiment with BioTop without given missing is-a relations, we detected and repaired 40 new missing is-a relations.

  • 32.
    Lundengård, Karin
    et al.
    Linköping University, Department of Medical and Health Sciences, Division of Radiological Sciences. Linköping University, Faculty of Medicine and Health Sciences. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Cedersund, Gunnar
    Linköping University, Department of Biomedical Engineering. Linköping University, Faculty of Science & Engineering. Linköping University, Faculty of Medicine and Health Sciences. Linköping University, Department of Clinical and Experimental Medicine, Division of Cell Biology.
    Sten, Sebastian
    Linköping University, Department of Medical and Health Sciences, Division of Radiological Sciences. Linköping University, Faculty of Medicine and Health Sciences.
    Leong, Felix
    Linköping University, Department of Medical and Health Sciences. Linköping University, Faculty of Medicine and Health Sciences.
    Smedberg, Alexander
    Linköping University, Department of Medical and Health Sciences. Linköping University, Faculty of Medicine and Health Sciences.
    Elinder, Fredrik
    Linköping University, Department of Clinical and Experimental Medicine, Division of Cell Biology. Linköping University, Faculty of Medicine and Health Sciences.
    Engström, Maria
    Linköping University, Department of Medical and Health Sciences, Division of Radiological Sciences. Linköping University, Faculty of Medicine and Health Sciences. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Mechanistic Mathematical Modeling Tests Hypotheses of the Neurovascular Coupling in fMRI2016In: PLoS Computational Biology, ISSN 1553-734X, E-ISSN 1553-7358, Vol. 12, no 6, article id e1004971Article in journal (Refereed)
    Abstract [en]

    Functional magnetic resonance imaging (fMRI) measures brain activity by detecting the blood-oxygen-level dependent (BOLD) response to neural activity. The BOLD response depends on the neurovascular coupling, which connects cerebral blood flow, cerebral blood volume, and deoxyhemoglobin level to neuronal activity. The exact mechanisms behind this neurovascular coupling are not yet fully investigated. There are at least three different ways in which these mechanisms are being discussed. Firstly, mathematical models involving the so-called Balloon model describe the relation between oxygen metabolism, cerebral blood volume, and cerebral blood flow. However, the Balloon model does not describe cellular and biochemical mechanisms. Secondly, the metabolic feedback hypothesis, which is based on experimental findings on metabolism associated with brain activation, and thirdly, the neurotransmitter feed-forward hypothesis which describes intracellular pathways leading to vasoactive substance release. Both the metabolic feedback and the neurotransmitter feed-forward hypotheses have been extensively studied, but only experimentally. These two hypotheses have never been implemented as mathematical models. Here we investigate these two hypotheses by mechanistic mathematical modeling using a systems biology approach; these methods have been used in biological research for many years but have never been applied to the BOLD response in fMRI. In the current work, model structures describing the metabolic feedback and the neurotransmitter feed-forward hypotheses were applied to measured BOLD responses in the visual cortex of 12 healthy volunteers. Evaluating each hypothesis separately shows that neither hypothesis alone can describe the data in a biologically plausible way. However, by adding metabolism to the neurotransmitter feed-forward model structure, we obtained a new model structure which is able to fit the estimation data and successfully predict new, independent validation data. These results open the door to a new type of fMRI analysis that more accurately reflects the true neuronal activity.

  • 33.
    Lykiardopoulos, Byron
    et al.
    Linköping University, Department of Medical and Health Sciences. Linköping University, Faculty of Medicine and Health Sciences.
    Hagström, Hannes
    Karolinska Institute, Sweden.
    Fredrikson, Mats
    Linköping University, Department of Clinical and Experimental Medicine, Division of Neuro and Inflammation Science. Linköping University, Faculty of Medicine and Health Sciences.
    Ignatova, Simone
    Linköping University, Department of Clinical and Experimental Medicine, Division of Cell Biology. Linköping University, Faculty of Medicine and Health Sciences.
    Stal, Per
    Karolinska Institute, Sweden.
    Hultcrantz, Rolf
    Karolinska Institute, Sweden.
    Ekstedt, Mattias
    Linköping University, Department of Medical and Health Sciences, Division of Cardiovascular Medicine. Linköping University, Faculty of Medicine and Health Sciences. Region Östergötland, Heart and Medicine Center, Department of Gastroenterology.
    Kechagias, Stergios
    Linköping University, Department of Medical and Health Sciences, Division of Cardiovascular Medicine. Linköping University, Faculty of Medicine and Health Sciences. Region Östergötland, Heart and Medicine Center, Department of Gastroenterology.
    Development of Serum Marker Models to Increase Diagnostic Accuracy of Advanced Fibrosis in Nonalcoholic Fatty Liver Disease: The New LINKI Algorithm Compared with Established Algorithms2016In: PLoS ONE, ISSN 1932-6203, E-ISSN 1932-6203, Vol. 11, no 12, article id e0167776Article in journal (Refereed)
    Abstract [en]

    Background and Aim: Detection of advanced fibrosis (F3-F4) in nonalcoholic fatty liver disease (NAFLD) is important for ascertaining prognosis. Serum markers have been proposed as alternatives to biopsy. We attempted to develop a novel algorithm for detection of advanced fibrosis based on a more efficient combination of serological markers and to compare this with established algorithms. Methods: We included 158 patients with biopsy-proven NAFLD. Of these, 38 had advanced fibrosis. The following fibrosis algorithms were calculated: NAFLD fibrosis score, BARD, NIKEI, NASH-CRN regression score, APRI, FIB-4, Kings score, GUCI, Lok index, Forns score, and ELF. The study population was randomly divided into a training and a validation group. A multiple logistic regression analysis using bootstrapping methods was applied to the training group. Among the many variables analyzed, age, fasting glucose, hyaluronic acid and AST were included, and a model (LINKI-1) for predicting advanced fibrosis was created. Moreover, these variables were combined with platelet count in a mathematical way exaggerating the opposing effects, and alternative models (LINKI-2) were also created. Models were compared using area under the receiver operator characteristic curves (AUROC). Results: Of the established algorithms, FIB-4 and Kings score had the best diagnostic accuracy, with AUROCs of 0.84 and 0.83, respectively. Higher accuracy was achieved with the novel LINKI algorithms. The AUROC in the total cohort was 0.91 for LINKI-1 and 0.89 for the LINKI-2 models. Conclusion: The LINKI algorithms for detection of advanced fibrosis in NAFLD showed better accuracy than established algorithms and should be validated in further studies including larger cohorts.
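
The AUROC used above to compare the fibrosis algorithms can be illustrated with a minimal sketch. It relies on the standard identity that the area under the ROC curve equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative case (ties counted as one half). The function name `auroc` and the toy scores and labels are hypothetical and are not taken from the study.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = advanced fibrosis, 0 = no advanced fibrosis.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auroc(toy_scores, toy_labels)
```

The pairwise form is quadratic in the number of cases, which is acceptable for a cohort of this size; a rank-based formulation gives the same value for larger data.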

  • 34.
    Malm, Patrik
    Linköping University, Department of Physics, Chemistry and Biology.
    Development of a hierarchical k-selecting clustering algorithm – application to allergy.2007Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    The objective of this Master’s thesis was to develop, implement and evaluate an iterative procedure for hierarchical clustering with good overall performance which also merges features of certain already described algorithms into a single integrated package. An accordingly built tool was then applied to an allergen IgE-reactivity data set. The finally implemented algorithm uses a hierarchical approach which illustrates the emergence of patterns in the data. At each level of the hierarchical tree a partitional clustering method is used to divide data into k groups, where the number k is decided through application of cluster validation techniques. The cross-reactivity analysis, by means of the new algorithm, largely arrives at anticipated cluster formations in the allergen data, which strengthens results obtained through previous studies on the subject. Notably, though, certain unexpected findings presented in the former analysis were aggregated differently, and more in line with phylogenetic and protein family relationships, by the novel clustering package.

  • 35.
    Morgan, Daniel
    et al.
    Stockholm Univ, Sweden.
    Tjärnberg, Andreas
    Linköping University, Department of Physics, Chemistry and Biology, Bioinformatics. Linköping University, Faculty of Science & Engineering.
    Nordling, Torbjorn E. M.
    Natl Cheng Kung Univ, Taiwan.
    Sonnhammer, Erik L. L.
    Stockholm Univ, Sweden.
    A generalized framework for controlling FDR in gene regulatory network inference2019In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 35, no 6, p. 1026-1032Article in journal (Refereed)
    Abstract [en]

    Motivation: Inference of gene regulatory networks (GRNs) from perturbation data can give detailed mechanistic insights of a biological system. Many inference methods exist, but the resulting GRN is generally sensitive to the choice of method-specific parameters. Even though the inferred GRN is optimal given the parameters, many links may be wrong or missing if the data is not informative. To make GRN inference reliable, a method is needed to estimate the support of each predicted link as the method parameters are varied. Results: To achieve this we have developed a method called nested bootstrapping, which applies a bootstrapping protocol to GRN inference, and by repeated bootstrap runs assesses the stability of the estimated support values. To translate bootstrap support values to false discovery rates we run the same pipeline with shuffled data as input. This provides a general method to control the false discovery rate of GRN inference that can be applied to any setting of inference parameters, noise level, or data properties. We evaluated nested bootstrapping on a simulated dataset spanning a range of such properties, using the LASSO, Least Squares, RNI, GENIE3 and CLR inference methods. An improved inference accuracy was observed in almost all situations. Nested bootstrapping was incorporated into the GeneSPIDER package, which was also used for generating the simulated networks and data, as well as running and analyzing the inferences. Availability and implementation: https://bitbucket.org/sonnhammergrni/genespider/src/NB/%2B Methods/NestBoot.m
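
The step of translating bootstrap support values into false discovery rates by rerunning the pipeline on shuffled data can be sketched roughly as follows. This is not the NestBoot implementation: it is a simplified illustration that assumes the real and shuffled runs score the same number of candidate links, and that the FDR at a support threshold is estimated as the number of shuffled-data links at or above the threshold divided by the number of real-data links at or above it. All function names and the toy support values are hypothetical.

```python
def empirical_fdr(real_support, null_support, threshold):
    """Estimated FDR at a threshold: (#shuffled links >= t) / (#real links >= t)."""
    n_real = sum(1 for s in real_support if s >= threshold)
    n_null = sum(1 for s in null_support if s >= threshold)
    if n_real == 0:
        return 0.0  # nothing is called, so there are no false discoveries
    return min(1.0, n_null / n_real)


def lowest_threshold_within_fdr(real_support, null_support, target_fdr):
    """Smallest observed support value whose estimated FDR is within the target, or None."""
    for t in sorted(set(real_support)):
        if empirical_fdr(real_support, null_support, t) <= target_fdr:
            return t
    return None


# Hypothetical support values (fraction of bootstrap runs in which a link appears).
real = [0.95, 0.9, 0.85, 0.6, 0.4, 0.3]   # from the measured data
null = [0.7, 0.5, 0.35, 0.3, 0.2, 0.1]    # from shuffled data
cutoff = lowest_threshold_within_fdr(real, null, target_fdr=0.05)
```

Links whose support is at or above `cutoff` would then be kept; with these toy values the first threshold that no shuffled-data link reaches is the one selected.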

  • 36.
    Muthumanickam, Prithiviraj
    et al.
    Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering.
    Vrotsou, Katerina
    Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering.
    Cooper, Matthew
    Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering.
    Johansson, Jimmy
    Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering.
    Shape Grammar Extraction for Efficient Query-by-Sketch Pattern Matching in Long Time Series2016Conference paper (Refereed)
    Abstract [en]

    Long time-series, involving thousands or even millions of time steps, are common in many application domains but remain very difficult to explore interactively. Often the analytical task in such data is to identify specific patterns, but this is a very complex and computationally difficult problem and so focusing the search in order to only identify interesting patterns is a common solution. We propose an efficient method for exploring user-sketched patterns, incorporating the domain expert’s knowledge, in time series data through a shape grammar based approach. The shape grammar is extracted from the time series by considering the data as a combination of basic elementary shapes positioned across different amplitudes. We represent these basic shapes using a ratio value, perform binning on ratio values and apply a symbolic approximation. Our proposed method for pattern matching is amplitude-, scale- and translation-invariant and, since the pattern search and pattern constraint relaxation happen at the symbolic level, is very efficient permitting its use in a real-time/online system. We demonstrate the effectiveness of our method in a case study on stock market data although it is applicable to any numeric time series data.

  • 37.
    Nalenz, Malte
    Linköping University, Department of Computer and Information Science, Statistics.
    Horseshoe RuleFit: Learning Rule Ensembles via Bayesian Regularization2016Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    This work proposes Hs-RuleFit, a learning method for regression and classification, which combines rule ensemble learning based on the RuleFit algorithm with Bayesian regularization through the horseshoe prior. To this end theoretical properties and potential problems of this combination are studied. A second step is the implementation, which utilizes recent sampling schemes to make the Hs-RuleFit computationally feasible. Additionally, changes to the RuleFit algorithm are proposed such as Decision Rule post-processing and the usage of Decision rules generated via Random Forest.

    Hs-RuleFit addresses the problem of finding highly accurate and yet interpretable models. The method is shown to be capable of finding compact sets of informative decision rules that give good insight into the data. Through the careful choice of prior distributions the horseshoe prior is shown to be superior to the Lasso in this context. In an empirical evaluation on 16 real data sets Hs-RuleFit shows excellent performance in regression and outperforms the popular methods Random Forest, BART and RuleFit in terms of prediction error. The interpretability is demonstrated on selected data sets. This makes Hs-RuleFit a good choice for science domains in which interpretability is desired.

    Problems are found in classification, regarding the usage of the horseshoe prior and rule ensemble learning in general. A simulation study is performed to isolate the problems and potential solutions are discussed.

    Arguments are presented that the horseshoe prior could be a good choice in other machine learning areas, such as artificial neural networks and support vector machines.

  • 38.
    Ngaruye, Innocent
    Linköping University, Department of Mathematics, Mathematical Statistics. Linköping University, Faculty of Science & Engineering.
    Contributions to Small Area Estimation: Using Random Effects Growth Curve Model2017Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    This dissertation considers Small Area Estimation with a main focus on estimation and prediction for repeated measures data. The demand for small area statistics is for both cross-sectional and repeated measures data. For instance, small area estimates for repeated measures data may be useful for public policy makers for different purposes such as funds allocation, new educational or health programs, etc., where decision makers might be interested in the trend of estimates for a specific characteristic of interest for a given category of the target population as a basis of their planning.

    It has been shown that the multivariate approach for model-based methods in small area estimation may achieve substantial improvement over the usual univariate approach. In this work, we consider repeated surveys taken on the same subjects at different time points. The population from which a sample has been drawn is partitioned into several non-overlapping subpopulations and within all subpopulations there is the same number of group units. The aim is to propose a model that borrows strength across small areas and over time with a particular interest in growth profiles over time. The model accounts for repeated surveys, group individuals and random effects variations.

    Firstly, a multivariate linear model for repeated measures data is formulated under small area estimation settings. The estimation of model parameters is discussed within a likelihood based approach, and the prediction of random effects and the prediction of small area means across time points, per group units and for all time points are obtained. In particular, as an application of the proposed model, an empirical study is conducted to produce district level estimates of beans in Rwanda during agricultural seasons 2014 which comprise two varieties, bush beans and climbing beans.

    Secondly, the thesis develops the properties of the proposed estimators and discusses the computation of their first and second moments. Through a method based on parametric bootstrap, these moments are used to estimate the mean-squared errors for the predicted small area means. Finally, a particular case of incomplete multivariate repeated measures data that follow a monotonic sample pattern for small area estimation is studied. By using a conditional likelihood based approach, the estimators of model parameters are derived. The prediction of random effects and predicted small area means are also produced.

    List of papers
    1. Small Area Estimation under a Multivariate Linear Model for Repeated measures Data
    2017 (English)In: Communications in Statistics - Theory and Methods, ISSN 0361-0926, E-ISSN 1532-415X, Vol. 46, no 21, p. 10835-10850Article in journal (Refereed) Published
    Abstract [en]

    In this article, Small Area Estimation under a Multivariate Linear model for repeated measures data is considered. The proposed model aims to get a model which borrows strength both across small areas and over time. The model accounts for repeated surveys, grouped response units and random effects variations. Estimation of model parameters is discussed within a likelihood based approach. Prediction of random effects, small area means across time points and per group units are derived. A parametric bootstrap method is proposed for estimating the mean squared error of the predicted small area means. Results are supported by a simulation study.

    Place, publisher, year, edition, pages
    New York: Taylor & Francis, 2017
    National Category
    Probability Theory and Statistics Control Engineering Applied Mechanics Geophysics
    Identifiers
    urn:nbn:se:liu:diva-137116 (URN) 10.1080/03610926.2016.1248784 (DOI) 000415766400033 ()
    Note

    Funding agencies: Swedish International Development and Cooperation Agency (SIDA); University of Rwanda; Swedish Foundation for Humanities and Social Sciences

    Available from: 2017-05-05 Created: 2017-05-05 Last updated: 2018-10-23 Bibliographically approved
    2. Crop yield estimation at district level for agricultural seasons 2014 in Rwanda
    2016 (English)In: African Journal of Applied Statistics, ISSN 2316-0861, Vol. 3, no 1, p. 69-90Article in journal (Refereed) Published
    Abstract [en]

    In this paper, we discuss an application of Small Area Estimation (SAE) techniques under a multivariate linear regression model for repeated measures data to produce district level estimates of crop yield for beans which comprise two varieties, bush beans and climbing beans in Rwanda during agricultural seasons 2014. By using the micro data of National Institute of Statistics of Rwanda (NISR) obtained from the Seasonal Agricultural Survey (SAS) 2014 we derive efficient estimates which show considerable gain. The considered model and its estimates may be useful for policy-makers or for further analyses.

    National Category
    Probability Theory and Statistics
    Identifiers
    urn:nbn:se:liu:diva-136721 (URN) 10.16929/ajas/2016.69.203 (DOI)
    Available from: 2017-04-21 Created: 2017-04-21 Last updated: 2017-05-09 Bibliographically approved
    3. Mean-squared errors of small area estimators under a multivariate linear model for repeated measures data
    2017 (English)Report (Other academic)
    Abstract [en]

    In this paper, we discuss the derivation of the first and second moments for the proposed small area estimators under a multivariate linear model for repeated measures data. The aim is to use these moments to estimate the mean-squared errors (MSE) for the predicted small area means as a measure of precision. A two stage estimator of MSE is obtained. At the first stage, we derive the MSE when the covariance matrices are known. To obtain an unbiased estimator of the MSE, at the second stage, a method based on parametric bootstrap is proposed for bias correction and for prediction error that reflects the uncertainty when the unknown covariance is replaced by its suitable estimator.

    Place, publisher, year, edition, pages
    Linköping: Linköping University Electronic Press, 2017. p. 19
    Series
    LiTH-MAT-R, ISSN 0348-2960 ; 2017:05
    Keywords
    Mean-squared errors, Multivariate linear model, Repeated measures data, Small area estimation
    National Category
    Probability Theory and Statistics
    Identifiers
    urn:nbn:se:liu:diva-137113 (URN) LiTH-MAT-R--2017/05--SE (ISRN)
    Available from: 2017-05-05 Created: 2017-05-05 Last updated: 2017-11-02 Bibliographically approved
    4. Small area estimation under a multivariate linear model for incomplete repeated measures data
    2017 (English)Report (Other academic)
    Abstract [en]

    In this paper, the issue of analysis of multivariate repeated measures data that follow a monotonic sample pattern for small area estimation is addressed. Random effects growth curve models with covariates for both complete and incomplete data are formulated. A conditional likelihood based approach is proposed for estimation of the mean parameters and covariances. Further, the prediction of random effects and predicted small area means are also discussed. The proposed techniques may be useful for small area estimation under longitudinal surveys with grouped response units and drop outs.

    Place, publisher, year, edition, pages
    Linköping: Linköping University Electronic Press, 2017. p. 12
    Series
    LiTH-MAT-R, ISSN 0348-2960 ; 2017:06
    Keywords
    Conditional likelihood, Multivariate linear model, Monotone sample, Repeated measures data.
    National Category
    Probability Theory and Statistics
    Identifiers
    urn:nbn:se:liu:diva-137118 (URN) LiTH-MAT-R--2017/06--SE (ISRN)
    Available from: 2017-05-05 Created: 2017-05-05 Last updated: 2017-11-02 Bibliographically approved
  • 39.
    Nilsson, Roland
    Linköping University, Department of Physics, Chemistry and Biology, Computational Physics. Linköping University, The Institute of Technology.
    Statistical Feature Selection: With Applications in Life Science2007Doctoral thesis, monograph (Other academic)
    Abstract [en]

    The sequencing of the human genome has changed life science research in many ways. Novel measurement technologies such as microarray expression analysis, genome-wide SNP typing and mass spectrometry are now producing experimental data of extremely high dimensions. While these techniques provide unprecedented opportunities for exploratory data analysis, the increase in dimensionality also introduces many difficulties. A key problem is to discover the most relevant variables, or features, among the tens of thousands of parallel measurements in a particular experiment. This is referred to as feature selection.

    For feature selection to be principled, one needs to decide exactly what it means for a feature to be “relevant”. This thesis considers relevance from a statistical viewpoint, as a measure of statistical dependence on a given target variable. The target variable might be continuous, such as a patient’s blood glucose level, or categorical, such as “smoker” vs. “non-smoker”. Several forms of relevance are examined and related to each other to form a coherent theory. Each form of relevance then defines a different feature selection problem.

    The predictive features are those that allow an accurate predictive model, for example for disease diagnosis. I prove that finding predictive features is a tractable problem, in that consistent estimates can be computed in polynomial time. This is a substantial improvement upon current theory. However, I also demonstrate that selecting features to optimize prediction accuracy does not control feature error rates. This is a severe drawback in life science, where the selected features per se are important, for example as candidate drug targets. To address this problem, I propose a statistical method which to my knowledge is the first to achieve error control. Moreover, I show that in high dimensions, feature sets can be impossible to replicate in independent experiments even with controlled error rates. This finding may explain the lack of agreement among genome-wide association studies and molecular signatures of disease.

    The most predictive features may not always be the most relevant ones from a biological perspective, since the predictive power of a given feature may depend on measurement noise rather than biological properties. I therefore consider a wider definition of relevance that avoids this problem. The resulting feature selection problem is shown to be asymptotically intractable in the general case; however, I derive a set of simplifying assumptions which admit an intuitive, consistent polynomial-time algorithm. Moreover, I present a method that controls error rates also for this problem. This algorithm is evaluated on microarray data from case studies in diabetes and cancer.

    In some cases, however, I find that these statistical relevance concepts are insufficient to prioritize among candidate features in a biologically reasonable manner. Therefore, effective feature selection for life science requires both a careful definition of relevance and a principled integration of existing biological knowledge.

  • 40.
    Pham, Tuan D
    James Cook Univ., Townsville.
    Predictive Modeling in Proteomics-based Disease Detection. 2007. Conference paper (Refereed)
    Abstract [en]

    The recent advent of mass-spectrometry data generated by proteomic technology provides a new type of biological information that is very promising in the search for diagnostic and therapeutic approaches that enable the early detection of fatal diseases and the development of personalized medicine. Successful analysis of such high-throughput proteomic data relies heavily on signal-processing and pattern-recognition techniques. This paper addresses the application of prediction models for cancer detection using mass spectral data.

  • 41.
    Pham, Tuan D
    James Cook University, Townsville, QLD 4811, Australia.
    Spatial linear predictive coding and its error matching for signal classification. 2006. Conference paper (Refereed)
    Abstract [en]

    Mathematical analysis of the behavior of general dynamic systems based on linear prediction plays an essential role in many fields of science and engineering concerning the processing and representation of complex signals. This paper addresses the parameter estimation of the all-pole model of linear predictive coding in the case where the signal has both deterministic and random properties. An estimate of the model variance error is used as a basis for the derivation of a spatial distortion measure which can be used for matching spectral patterns.

  • 42.
    Pham, Tuan D
    et al.
    Griffith University, Nathan Campus, QLD 4111, Australia.
    Crane, Denis I
    Griffith University, Nathan Campus, QLD 4111, Australia.
    Tannock, David
    Griffith University, Nathan Campus, QLD 4111, Australia.
    Beck, Dominik
    Griffith University, Nathan Campus, QLD 4111, Australia.
    Kullback-Leibler dissimilarity of Markov models for phylogenetic tree reconstruction. 2004. In: Intelligent Multimedia, Video and Speech Processing, 2004. Proceedings of 2004 International Symposium on, 2004, p. 157-160. Conference paper (Refereed)
    Abstract [en]

    We introduce the Kullback-Leibler dissimilarity measure of Markov-chain models for unaligned DNA sequences with application to the phylogenetic tree reconstruction of complete mammalian mitochondrial genomes. The tree obtained by our approach is generally in agreement with those obtained from other methods. Our proposed method is computationally efficient and easy to implement.
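
    As an illustration of the idea in this abstract, the following is a minimal sketch of a Kullback-Leibler dissimilarity between Markov-chain models fitted to two unaligned DNA sequences. The abstract does not give the model order, the estimator, or the exact form of the measure, so the first-order default, the add-one smoothing, the uniform averaging over contexts, and the symmetrisation are all assumptions; the names `transition_probs` and `kl_dissimilarity` are hypothetical.

```python
import math
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def transition_probs(seq, order=1, pseudocount=1.0):
    """Estimate smoothed Markov transition probabilities P(next | context).

    Assumption: add-`pseudocount` smoothing so that every probability is
    strictly positive and the KL divergence below stays finite.
    """
    counts = Counter()
    for i in range(len(seq) - order):
        ctx, nxt = seq[i:i + order], seq[i + order]
        # Skip windows containing symbols outside the DNA alphabet (e.g. N).
        if set(ctx + nxt) <= set(ALPHABET):
            counts[(ctx, nxt)] += 1
    probs = {}
    for ctx in map("".join, product(ALPHABET, repeat=order)):
        total = sum(counts[(ctx, b)] for b in ALPHABET) + pseudocount * len(ALPHABET)
        for b in ALPHABET:
            probs[(ctx, b)] = (counts[(ctx, b)] + pseudocount) / total
    return probs


def kl_dissimilarity(seq_a, seq_b, order=1):
    """Symmetrised KL divergence between two fitted Markov-chain models.

    Assumption: the per-context divergences are averaged with uniform
    weights; the paper may weight contexts differently.
    """
    p = transition_probs(seq_a, order)
    q = transition_probs(seq_b, order)
    n_contexts = len(ALPHABET) ** order

    def kl(u, v):
        return sum(u[k] * math.log(u[k] / v[k]) for k in u) / n_contexts

    return 0.5 * (kl(p, q) + kl(q, p))
```

    A matrix of such pairwise values could then be passed to any distance-based tree-building method; which method the authors used is not stated in the abstract.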

  • 43.
    Pham, Tuan D
    et al.
    School of Engineering and Information Technology, University of New South Wales, Canberra, ACT, Australia.
    To, Cuong C
    School of Engineering and Information Technology, University of New South Wales, Canberra, ACT, Australia.
    Wang, Honghui
    Clinical Center, National Institutes of Health, Bethesda, USA.
    Zhou, Xiaobo
    Center for Biotechnology and Informatics, The Methodist Hospital Research Institute, Weill Cornell Medical College, Houston, USA.
    Analysis of Major Adverse Cardiac Events with Entropy-Based Complexity. 2010. In: Information Technologies in Biomedicine: Volume 2 / [ed] Ewa Pietka and Jacek Kawa, Springer Berlin/Heidelberg, 2010, p. 261-272. Chapter in book (Refereed)
    Abstract [en]

    Major adverse cardiac events (MACE) are referred to as unsuspected heart attacks that include death, myocardial infarction and target lesion revascularization. Feature extraction and classification methods for such cardiac events are useful tools that can be applied for biomarker discovery to allow preventive treatment and healthy-life maintenance. In this study we present an entropy-based analysis of the complexity of MACE-related mass spectrometry signals, and an effective model for classifying MACE and control complexity-based features. In particular, the geostatistical entropy is analytically rigorous and can provide better information about the predictability of this type of MACE data than other entropy-based methods for complexity analysis of biosignals. Information on the complexity of this type of time-series data can expand our knowledge about the dynamical behavior of a cardiac model and be useful as a novel feature for early prediction.

  • 44.
    Pham, Tuan D
    et al.
    Systems Engineering Division, School of Engineering, Cardiff University, Cardiff CF24 OYF, UK.
    Wang, Z
    Systems Engineering Division, School of Engineering, Cardiff University, Cardiff CF24 OYF, UK.
    Yang, M
    Systems Engineering Division, School of Engineering, Cardiff University, Cardiff CF24 OYF, UK.
    Packianather, M S
    Systems Engineering Division, School of Engineering, Cardiff University, Cardiff CF24 OYF, UK.
    Statistical Analysis of Signal-to-Noise Ratios in Fringe Pattern Matching. 2002. In: IEEE 6th International Conference on Signal Processing Proceedings, Vol. 1, p. 636-639. Article in journal (Refereed)
    Abstract [en]

    The paper presents a statistical analysis of signal-to-noise ratios (SNRs) in fringe pattern matching. It shows theoretically that the SNR of interference fringes can be significantly improved by fringe pattern matching or mean square difference calculation based on statistical analysis. Computer simulation and experimental results have confirmed that the high accuracy of fringe pattern matching is due to the significant SNR improvements achieved.

  • 45.
    Pértille, Fábio
    et al.
    Animal Biotechnology Laboratory, Animal Science and Pastures Department, University of São Paulo (USP)/Luiz de Queiroz College of Agriculture (ESALQ), Piracicaba, São Paulo, Brazil.
    Guerrero-Bosagna, Carlos
    Linköping University, Department of Physics, Chemistry and Biology, Biology. Linköping University, Faculty of Science & Engineering.
    da Silva, Vinicius Henrique
    Animal Biotechnology Laboratory, Animal Science and Pastures Department, University of São Paulo (USP)/Luiz de Queiroz College of Agriculture (ESALQ), Piracicaba, São Paulo, Brazil.
    Boschiero, Clarissa
    Animal Biotechnology Laboratory, Animal Science and Pastures Department, University of São Paulo (USP)/Luiz de Queiroz College of Agriculture (ESALQ), Piracicaba, São Paulo, Brazil.
    da Silva Nunes, José de Ribamar
    Animal Biotechnology Laboratory, Animal Science and Pastures Department, University of São Paulo (USP)/Luiz de Queiroz College of Agriculture (ESALQ), Piracicaba, São Paulo, Brazil.
    Corrêa Ledur, Mônica
    Brazilian Agricultural Research Corporation (EMBRAPA) Swine & Poultry, Concórdia, Santa Catarina, Brazil.
    Jensen, Per
    Linköping University, Department of Physics, Chemistry and Biology, Biology. Linköping University, Faculty of Science & Engineering.
    Lehmann Coutinho, Luiz
    Animal Biotechnology Laboratory, Animal Science and Pastures Department, University of São Paulo (USP)/Luiz de Queiroz College of Agriculture (ESALQ), Piracicaba, São Paulo, Brazil.
    High-throughput and Cost-effective Chicken Genotyping Using Next-Generation Sequencing. 2016. In: Scientific Reports, ISSN 2045-2322, E-ISSN 2045-2322, Vol. 6, article id 26929. Article in journal (Refereed)
    Abstract [en]

    Chicken genotyping is becoming common practice in conventional animal breeding improvement. Despite the power of high-throughput methods for genotyping, their high cost limits large scale use in animal breeding and selection. In the present paper we optimized the CornellGBS, an efficient and cost-effective genotyping by sequence approach developed in plants, for its application in chickens. Here we describe the successful genotyping of a large number of chickens (462) using CornellGBS approach. Genomic DNA was cleaved with the PstI enzyme, ligated to adapters with barcodes identifying individual animals, and then sequenced on Illumina platform. After filtering parameters were applied, 134,528 SNPs were identified in our experimental population of chickens. Of these SNPs, 67,096 had a minimum taxon call rate of 90% and were considered ‘unique tags’. Interestingly, 20.7% of these unique tags have not been previously reported in the dbSNP. Moreover, 92.6% of these SNPs were concordant with a previous Whole Chicken-genome re-sequencing dataset used for validation purposes. The application of CornellGBS in chickens showed high performance to infer SNPs, particularly in exonic regions and microchromosomes. This approach represents a cost-effective (~US$50/sample) and powerful alternative to current genotyping methods, which has the potential to improve whole-genome selection (WGS), and genome-wide association studies (GWAS) in chicken production.

  • 46.
    Royle, Stephen J
    et al.
    School of Biomedical Sciences, University of Liverpool, Liverpool, UK.
    Granseth, Björn
    MRC Laboratory of Molecular Biology, Cambridge, UK.
    Odermatt, Benjamin
    MRC Laboratory of Molecular Biology, Cambridge, UK.
    Derevier, Aude
    MRC Laboratory of Molecular Biology, Cambridge, UK.
    Lagnado, Leon
    MRC Laboratory of Molecular Biology, Cambridge, UK.
    Imaging phluorin-based probes at hippocampal synapses. 2008. In: Membrane Trafficking / [ed] Ales Vancura, Humana Press, 2008, Vol. 457, p. 293-303. Chapter in book (Other academic)
    Abstract [en]

    Accurate measurement of synaptic vesicle exocytosis and endocytosis is crucial to understanding the molecular basis of synaptic transmission. The fusion of a pH-sensitive green fluorescent protein (pHluorin) to various synaptic vesicle proteins has allowed the study of synaptic vesicle recycling in real time. Two such probes, synaptopHluorin and sypHy, have been imaged at synapses of hippocampal neurons in culture. The combination of these reporters with techniques for molecular interference, such as RNAi, allows for the study of molecules involved in synaptic vesicle recycling. Here the authors describe methods for the culture and transfection of hippocampal neurons, imaging of pHluorin-based probes at synapses and analysis of pHluorin signals down to the resolution of individual synaptic vesicles.

  • 47.
    Rundqvist, David
    Linköping University, Department of Computer and Information Science.
    Grouping Biological Data. 2006. Independent thesis Basic level (professional degree), 20 points / 30 hp. Student thesis
    Abstract [en]

    Today, scientists in various biomedical fields rely on biological data sources in their research. Large amounts of information concerning, for instance, genes, proteins and diseases are publicly available on the internet, and are used daily for acquiring knowledge. Typically, biological data is spread across multiple sources, which has led to heterogeneity and redundancy.

    The current thesis suggests grouping as one way of computationally managing biological data. A conceptual model for this purpose is presented, which takes properties specific to biological data into account. The model defines sub-tasks and key issues where multiple solutions are possible, and describes which approaches to these have been used in earlier work. Further, an implementation of this model is described, as well as test cases which show that the model is indeed useful.

    Since the use of ontologies is relatively new in the management of biological data, the main focus of the thesis is on how semantic similarity of ontological annotations can be used for grouping. The results of the test cases show for example that the implementation of the model, using Gene Ontology, is capable of producing groups of data entries with similar molecular functions.

  • 48.
    Rönnbrant, Anders
    Linköping University, Department of Biomedical Engineering.
    Implementing a visualization tool for myocardial strain tensors. 2005. Independent thesis Basic level (professional degree), 20 points / 30 hp. Student thesis
    Abstract [en]

    The heart is a complex three-dimensional structure with mechanical properties that are inhomogeneous, non-linear, time-variant and anisotropic. These properties affect major physiological factors within the heart, such as the pumping performance of the ventricles, the oxygen demand in the tissue and the distribution of coronary blood flow.

    During the cardiac cycle the heart muscle tissue is deformed as a consequence of the active contraction of the muscle fibers and their relaxation respectively. A mapping of this deformation would give increased understanding of the mechanical properties of the heart. The deformation induces strain and stress in the tissue which are both mechanical properties and can be described with a mathematical tensor object.

    The aim of this master's thesis is to develop a visualization tool for the strain tensor objects that can help a user see and/or understand various differences between different hearts and spatial and temporal differences within the same heart. Preferably, the tool should be general enough for use with different types of data.

  • 49.
    Schwende, Isabel
    et al.
    University of Aizu, Japan/University of Greifswald, Germany.
    Pham, Tuan D
    University of Aizu, Japan.
    Pattern recognition and probabilistic measures in alignment-free sequence analysis. 2014. In: Briefings in Bioinformatics, ISSN 1467-5463, E-ISSN 1477-4054, Vol. 15, no 3, p. 354-368. Article in journal (Refereed)
    Abstract [en]

    With the massive production of genomic and proteomic data, the number of available biological sequences in databases has reached a level at which exact alignment is no longer feasible, even when just a fraction of all sequences is used. To overcome this inevitable time complexity, ultrafast alignment-free methods are studied. Within the past two decades, a broad variety of nonalignment methods have been proposed, including dissimilarity measures on classical representations of sequences like k-words or Markov models. Furthermore, articles were published that describe distance measures on alternative representations such as compression complexity, spectral time series or chaos game representation. However, alignments are still the standard method for real-world applications in biological sequence analysis, and the time-efficient alignment-free approaches are usually applied in cases when the accustomed algorithms turn out to fail or be too inconvenient.
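
    To make the k-word representation mentioned in this abstract concrete, here is a minimal sketch of one simple alignment-free dissimilarity: the Euclidean distance between normalised k-word frequency vectors. The review covers many measures and this is only one common textbook choice, not necessarily one the authors recommend; the function names and the default `k=3` are hypothetical.

```python
import math
from collections import Counter


def kword_counts(seq, k):
    """Count all overlapping words of length k in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kword_distance(seq_a, seq_b, k=3):
    """Euclidean distance between relative k-word frequency vectors.

    No alignment is computed: each sequence is reduced to a vector of
    word frequencies, and only those vectors are compared.
    """
    counts_a, counts_b = kword_counts(seq_a, k), kword_counts(seq_b, k)
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("both sequences must contain at least k symbols")
    words = set(counts_a) | set(counts_b)
    return math.sqrt(
        sum((counts_a[w] / total_a - counts_b[w] / total_b) ** 2 for w in words)
    )
```

    Counting words takes time linear in the sequence length, which is the source of the speed advantage over exact alignment that the abstract refers to.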

  • 50.
    Schwende, Isabel
    et al.
    University of Aizu, Japan.
    Pham, Tuan D
    The Aizu Research Cluster for Medical Engineering and Informatics (ARC-Medical), Research Center for Advanced Information Science and Technology, The University of Aizu, Japan.
    Pattern recognition and probabilistic measures in alignment-free sequence analysis. 2013. In: Briefings in Bioinformatics, ISSN 1467-5463, E-ISSN 1477-4054, Vol. 15, no 3, p. 354-368. Article in journal (Refereed)
    Abstract [en]

    With the massive production of genomic and proteomic data, the number of available biological sequences in databases has reached a level at which exact alignment is no longer feasible, even when just a fraction of all sequences is used. To overcome this inevitable time complexity, ultrafast alignment-free methods are studied. Within the past two decades, a broad variety of nonalignment methods have been proposed, including dissimilarity measures on classical representations of sequences like k-words or Markov models. Furthermore, articles were published that describe distance measures on alternative representations such as compression complexity, spectral time series or chaos game representation. However, alignments are still the standard method for real-world applications in biological sequence analysis, and the time-efficient alignment-free approaches are usually applied in cases when the accustomed algorithms turn out to fail or be too inconvenient.
