liu.se: Search publications in DiVA
1 - 9 of 9
  • 1.
    Corander, Jukka
    Department of Mathematics, Åbo Akademi University, Åbo, Finland.
    Ekdahl, Magnus
    Linköpings universitet, Matematiska institutionen, Matematisk statistik. Linköpings universitet, Tekniska högskolan.
    Koski, Timo
    Linköpings universitet, Matematiska institutionen, Matematisk statistik. Linköpings universitet, Tekniska högskolan.
    A bayesian random fragment insertion model for de novo detection of DNA regulatory binding regions. 2007. Manuscript (preprint) (Other academic)
    Abstract [en]

    Identification of regulatory binding motifs within DNA sequences is a commonly occurring problem in computational bioinformatics. A wide variety of statistical approaches have been proposed in the literature to either scan for previously known motif types or to attempt de novo identification of a fixed number (typically one) of putative motifs. Most approaches assume the existence of reliable biodatabase information to build a probabilistic a priori description of the motif classes. No method has been previously proposed for finding the number of putative de novo motif types and their positions within a set of DNA sequences. As the number of sequenced genomes from a wide variety of organisms is constantly increasing, there is a clear need for such methods. Here we introduce a Bayesian unsupervised approach for this purpose by using recent advances in the theory of predictive classification and Markov chain Monte Carlo computation. Our modelling framework enables formal statistical inference in a large-scale sequence screening and we illustrate it by a set of examples.

  • 2.
    Corander, Jukka
    Department of Mathematics, Åbo Akademi University, Åbo, Finland.
    Ekdahl, Magnus
    Linköpings universitet, Matematiska institutionen, Matematisk statistik. Linköpings universitet, Tekniska högskolan.
    Koski, Timo
    Department of Mathematics, Royal Institute of Technology, Stockholm, Sweden.
    Parallell interacting MCMC for learning of topologies of graphical models. 2008. In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 17, no. 3, pp. 431-456. Article in journal (Refereed)
    Abstract [en]

    Automated statistical learning of graphical models from data has attained a considerable degree of interest in the machine learning and related literature. Many authors have discussed and/or demonstrated the need for consistent stochastic search methods that would not be as prone to yield locally optimal model structures as simple greedy methods. However, at the same time most of the stochastic search methods are based on a standard Metropolis–Hastings theory that necessitates the use of relatively simple random proposals and prevents the utilization of intelligent and efficient search operators. Here we derive an algorithm for learning topologies of graphical models from samples of a finite set of discrete variables by utilizing and further enhancing a recently introduced theory for non-reversible parallel interacting Markov chain Monte Carlo-style computation. In particular, we illustrate how the non-reversible approach allows for a novel type of creativity in the design of search operators. Also, the parallel aspect of our method illustrates well the advantages of the adaptive nature of search operators to avoid trapping states in the vicinity of locally optimal network topologies.

  • 3.
    Ekdahl, Magnus
    Linköpings universitet, Matematiska institutionen, Matematisk statistik. Linköpings universitet, Tekniska högskolan.
    Approximations of Bayes Classifiers for Statistical Learning of Clusters. 2006. Licentiate thesis, monograph (Other academic)
    Abstract [en]

    It is rarely possible to use an optimal classifier. Often the classifier used for a specific problem is an approximation of the optimal classifier. Methods are presented for evaluating the performance of an approximation in the model class of Bayesian Networks. Specifically for the approximation of class conditional independence a bound for the performance is sharpened.

    The class conditional independence approximation is connected to the minimum description length principle (MDL), which is connected to Jeffreys’ prior through commonly used assumptions. One algorithm for unsupervised classification is presented and compared against other unsupervised classifiers on three data sets.

  • 4.
    Ekdahl, Magnus
    Linköpings universitet, Matematiska institutionen, Matematisk statistik. Linköpings universitet, Tekniska högskolan.
    On approximations and computations in probabilistic classification and in learning of graphical models. 2007. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Model based probabilistic classification is heavily used in data mining and machine learning. For computational learning, however, these models may need approximation steps. One popular approximation in classification is to model the class conditional densities by factorization, which in the independence case is usually called the 'Naïve Bayes' classifier. In general, probabilistic independence cannot model all distributions exactly, and not much has been published on how much a discrete distribution can differ from the independence assumption. In this dissertation the approximation quality of factorizations is analyzed in two articles.

    A specific class of factorizations is the factorizations represented by graphical models. Several challenges arise from the use of statistical methods for learning graphical models from data. Examples of problems include the increase in the number of graphical model structures as a function of the number of nodes, and the equivalence of statistical models determined by different graphical models. In one article an algorithm for learning graphical models is presented. In the final article an algorithm for clustering parts of DNA strings is developed, and a graphical representation for the remaining DNA part is learned.

    List of papers
    1. Bounds for the Loss in Probability of Correct Classification Under Model Based Approximation
    2006 (English). In: Journal of Machine Learning Research, ISSN 1532-4435, Vol. 7, pp. 2449-2480. Article in journal (Refereed). Published
    Abstract [en]

    In many pattern recognition/classification problems the true class conditional model and class probabilities are approximated for reasons of reducing complexity and/or of statistical estimation. The approximated classifier is expected to have worse performance, here measured by the probability of correct classification. We present an analysis valid in general, and easily computable formulas for estimating the degradation in probability of correct classification when compared to the optimal classifier. An example of an approximation is the Naïve Bayes classifier. We show that the performance of the Naïve Bayes depends on the degree of functional dependence between the features and labels. We provide a sufficient condition for zero loss of performance, too.

    Keywords
    Bayesian networks, naïve Bayes, plug-in classifier, Kolmogorov distance of variation, variational learning
    National Category
    Mathematics
    Identifiers
    urn:nbn:se:liu:diva-13104 (URN)
    Available from: 2008-03-31 Created: 2008-03-31
    2. Concentrated or non-concentrated discrete distributions are almost independent
    2007 (English). Manuscript (preprint) (Other academic)
    Abstract [en]

    The task of approximating a simultaneous distribution with a product of distributions in a single variable is important in the theory and applications of classification and learning, probabilistic reasoning, and random algorithms. The evaluation of the goodness of this approximation by statistical independence amounts to bounding uniformly upwards the difference between a joint distribution and the product of the distributions (marginals). In this paper we develop a bound that uses information about the most probable state to find a sharp estimate, which is often as sharp as possible. We also examine the extreme cases of concentration and non-concentration, respectively, of the approximated distribution.

    National Category
    Mathematics
    Identifiers
    urn:nbn:se:liu:diva-13105 (URN)
    Available from: 2008-03-31 Created: 2008-03-31 Last updated: 2014-09-29
    3. Parallell interacting MCMC for learning of topologies of graphical models
    2008 (English). In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 17, no. 3, pp. 431-456. Article in journal (Refereed). Published
    Abstract [en]

    Automated statistical learning of graphical models from data has attained a considerable degree of interest in the machine learning and related literature. Many authors have discussed and/or demonstrated the need for consistent stochastic search methods that would not be as prone to yield locally optimal model structures as simple greedy methods. However, at the same time most of the stochastic search methods are based on a standard Metropolis–Hastings theory that necessitates the use of relatively simple random proposals and prevents the utilization of intelligent and efficient search operators. Here we derive an algorithm for learning topologies of graphical models from samples of a finite set of discrete variables by utilizing and further enhancing a recently introduced theory for non-reversible parallel interacting Markov chain Monte Carlo-style computation. In particular, we illustrate how the non-reversible approach allows for a novel type of creativity in the design of search operators. Also, the parallel aspect of our method illustrates well the advantages of the adaptive nature of search operators to avoid trapping states in the vicinity of locally optimal network topologies.

    Keywords
    MCMC, Equivalence search, Learning graphical models
    National Category
    Mathematics
    Identifiers
    urn:nbn:se:liu:diva-13106 (URN), 10.1007/s10618-008-0099-9 (DOI)
    Available from: 2008-03-31 Created: 2008-03-31 Last updated: 2017-12-13
    4. A bayesian random fragment insertion model for de novo detection of DNA regulatory binding regions
    2007 (English). Manuscript (preprint) (Other academic)
    Abstract [en]

    Identification of regulatory binding motifs within DNA sequences is a commonly occurring problem in computational bioinformatics. A wide variety of statistical approaches have been proposed in the literature to either scan for previously known motif types or to attempt de novo identification of a fixed number (typically one) of putative motifs. Most approaches assume the existence of reliable biodatabase information to build a probabilistic a priori description of the motif classes. No method has been previously proposed for finding the number of putative de novo motif types and their positions within a set of DNA sequences. As the number of sequenced genomes from a wide variety of organisms is constantly increasing, there is a clear need for such methods. Here we introduce a Bayesian unsupervised approach for this purpose by using recent advances in the theory of predictive classification and Markov chain Monte Carlo computation. Our modelling framework enables formal statistical inference in a large-scale sequence screening and we illustrate it by a set of examples.

    National Category
    Mathematics
    Identifiers
    urn:nbn:se:liu:diva-13107 (URN)
    Available from: 2008-03-31 Created: 2008-03-31 Last updated: 2012-11-21
  • 5.
    Ekdahl, Magnus
    Linköpings universitet, Tekniska högskolan. Linköpings universitet, Matematiska institutionen, Matematisk statistik.
    Stokastisk komplexitet i klustringsanalys. 2004. In: Workshop i tillämpad matematik, 2004, 2004. Conference paper (Other academic)
  • 6.
    Ekdahl, Magnus
    Linköpings universitet, Matematiska institutionen, Matematisk statistik. Linköpings universitet, Tekniska högskolan.
    Koski, Timo
    Linköpings universitet, Matematiska institutionen, Matematisk statistik. Linköpings universitet, Tekniska högskolan.
    Bounds for the Loss in Probability of Correct Classification Under Model Based Approximation. 2006. In: Journal of Machine Learning Research, ISSN 1532-4435, Vol. 7, pp. 2449-2480. Article in journal (Refereed)
    Abstract [en]

    In many pattern recognition/classification problems the true class conditional model and class probabilities are approximated for reasons of reducing complexity and/or of statistical estimation. The approximated classifier is expected to have worse performance, here measured by the probability of correct classification. We present an analysis valid in general, and easily computable formulas for estimating the degradation in probability of correct classification when compared to the optimal classifier. An example of an approximation is the Naïve Bayes classifier. We show that the performance of the Naïve Bayes depends on the degree of functional dependence between the features and labels. We provide a sufficient condition for zero loss of performance, too.

  • 7.
    Ekdahl, Magnus
    Linköpings universitet, Matematiska institutionen, Matematisk statistik. Linköpings universitet, Tekniska högskolan.
    Koski, Timo
    Linköpings universitet, Matematiska institutionen, Matematisk statistik. Linköpings universitet, Tekniska högskolan.
    On Concentration of Discrete Distributions with Applications to Supervised Learning of Classifiers. 2007. In: Machine Learning and Data Mining in Pattern Recognition: 5th International Conference, MLDM 2007, Leipzig, Germany, July 18-20, 2007. Proceedings / [ed] Petra Perner, Springer Berlin/Heidelberg, 2007, pp. 2-16. Chapter in book, part of anthology (Refereed)
    Abstract [en]

    Computational procedures using independence assumptions in various forms are popular in machine learning, although checks on empirical data have given inconclusive results about their impact. Some theoretical understanding of when they work is available, but a definite answer seems to be lacking. This paper derives distributions that maximize the statewise difference to the respective product of marginals. These distributions are, in a sense, the worst distributions for predicting an outcome of the data generating mechanism by independence. We also restrict the scope of new theoretical results by showing explicitly that, depending on context, independent ('Naïve') classifiers can be as bad as tossing coins. Regardless of this, independence may beat the generating model in learning supervised classification and we explicitly provide one such scenario.

  • 8.
    Ekdahl, Magnus
    Linköpings universitet, Tekniska högskolan. Linköpings universitet, Matematiska institutionen, Matematisk statistik.
    Koski, Timo
    Linköpings universitet, Tekniska högskolan. Linköpings universitet, Matematiska institutionen, Matematisk statistik.
    On the Performance of Approximations of Bayesian Networks in Model-. 2006. In: The Annual Workshop of the Swedish Artificial Intelligence Society, 2006, Umeå: SAIS, 2006, p. 73-. Conference paper (Refereed)
    Abstract [en]

    When the true class conditional model and class probabilities are approximated in a pattern recognition/classification problem the performance of the optimal classifier is expected to deteriorate. But calculating this reduction is far from trivial in the general case. We present one generalization, and easily computable formulas for estimating the degradation in performance with respect to the optimal classifier. An example of an approximation is the Naive Bayes classifier. We generalize and sharpen results for evaluating this classifier.

  • 9.
    Ekdahl, Magnus
    Linköpings universitet, Matematiska institutionen, Matematisk statistik. Linköpings universitet, Tekniska högskolan.
    Koski, Timo
    Linköpings universitet, Matematiska institutionen, Matematisk statistik. Linköpings universitet, Tekniska högskolan.
    Ohlson, Martin
    Linköpings universitet, Matematiska institutionen, Matematisk statistik. Linköpings universitet, Tekniska högskolan.
    Concentrated or non-concentrated discrete distributions are almost independent. 2007. Manuscript (preprint) (Other academic)
    Abstract [en]

    The task of approximating a simultaneous distribution with a product of distributions in a single variable is important in the theory and applications of classification and learning, probabilistic reasoning, and random algorithms. The evaluation of the goodness of this approximation by statistical independence amounts to bounding uniformly upwards the difference between a joint distribution and the product of the distributions (marginals). In this paper we develop a bound that uses information about the most probable state to find a sharp estimate, which is often as sharp as possible. We also examine the extreme cases of concentration and non-concentration, respectively, of the approximated distribution.
