A Bayesian random fragment insertion model for de novo detection of DNA regulatory binding regions
Department of Mathematics, Åbo Akademi University, Åbo, Finland.
Linköpings universitet, Matematiska institutionen, Matematisk statistik. Linköpings universitet, Tekniska högskolan.
Linköpings universitet, Matematiska institutionen, Matematisk statistik. Linköpings universitet, Tekniska högskolan.
2007 (English). Manuscript (preprint) (Other academic)
Abstract [en]

Identification of regulatory binding motifs within DNA sequences is a commonly occurring problem in computational bioinformatics. A wide variety of statistical approaches have been proposed in the literature to either scan for previously known motif types or to attempt de novo identification of a fixed number (typically one) of putative motifs. Most approaches assume the existence of reliable biodatabase information to build a probabilistic a priori description of the motif classes. No method has been previously proposed for finding the number of putative de novo motif types and their positions within a set of DNA sequences. As the number of sequenced genomes from a wide variety of organisms is constantly increasing, there is a clear need for such methods. Here we introduce a Bayesian unsupervised approach for this purpose, using recent advances in the theory of predictive classification and Markov chain Monte Carlo computation. Our modelling framework enables formal statistical inference in large-scale sequence screening, and we illustrate it with a set of examples.

Place, publisher, year, edition, pages
2007.
Identifiers
URN: urn:nbn:se:liu:diva-13107
OAI: oai:DiVA.org:liu-13107
DiVA, id: diva2:17845
Available from: 2008-03-31 Created: 2008-03-31 Last updated: 2012-11-21
In thesis
1. On approximations and computations in probabilistic classification and in learning of graphical models
2007 (English). Doctoral thesis, comprising papers (Other academic)
Abstract [en]

Model-based probabilistic classification is heavily used in data mining and machine learning. For computational learning, however, these models may need approximation steps. One popular approximation in classification is to model the class conditional densities by factorization, which in the independence case is usually called the ’Naïve Bayes’ classifier. In general, probabilistic independence cannot model all distributions exactly, and not much has been published on how much a discrete distribution can differ from the independence assumption. In this dissertation the approximation quality of factorizations is analyzed in two articles.

One specific class of factorizations consists of those represented by graphical models. Several challenges arise from the use of statistical methods for learning graphical models from data. Examples include the growth in the number of graphical model structures as a function of the number of nodes, and the equivalence of statistical models determined by different graphical models. In one article an algorithm for learning graphical models is presented. In the final article an algorithm for clustering parts of DNA strings is developed, and a graphical representation for the remaining DNA part is learned.

Place, publisher, year, edition, pages
Matematiska institutionen, 2007. p. 22
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 1141
Keywords
Mathematical statistics, factorizations, probabilistic classification, nodes, DNA strings
Identifiers
URN: urn:nbn:se:liu:diva-11429
ISBN: 978-91-85895-58-8
Public defence
2007-12-14, Visionen, Hus B, Campus Valla, Linköpings universitet, Linköping, 10:15 (English)
Available from: 2008-03-31 Created: 2008-03-31 Last updated: 2012-11-21

Open Access in DiVA

No full text in DiVA

Authority records

Ekdahl, Magnus; Koski, Timo
