Koski, Timo
Publications (10 of 28)
Ohlson, M. & Koski, T. (2012). On the Distribution of Matrix Quadratic Forms. Communications in Statistics - Theory and Methods, 41(18), 3403-3415
On the Distribution of Matrix Quadratic Forms
2012 (English). In: Communications in Statistics - Theory and Methods, ISSN 0361-0926, E-ISSN 1532-415X, Vol. 41, no 18, p. 3403-3415. Article in journal (Refereed). Published
Abstract [en]

A characterization of the distribution of the multivariate quadratic form given by XAX′, where X is a p×n normally distributed matrix and A is an n×n symmetric real matrix, is presented. We show that the distribution of the quadratic form is the same as the distribution of a weighted sum of noncentral Wishart distributed matrices. This is applied to derive the distribution of the sample covariance between the rows of X when the expectation is the same for every column and is estimated with the regular mean.
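
The identity behind this characterization is elementary to check numerically. Below is a minimal sketch (ours, not from the paper) using the spectral decomposition A = Σᵢ λᵢ vᵢvᵢ′: then XAX′ = Σᵢ λᵢ (Xvᵢ)(Xvᵢ)′, and each Xvᵢ is normally distributed, so the quadratic form is a weighted sum of rank-one Wishart-type terms. A zero-mean X with identity covariances is assumed for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 3, 6

# Symmetric real n x n matrix A and a p x n matrix X with i.i.d. N(0, 1)
# entries (identity row and column covariances, for simplicity).
A = rng.standard_normal((n, n))
A = (A + A.T) / 2
X = rng.standard_normal((p, n))

# Spectral decomposition A = sum_i lam_i v_i v_i'.
lam, V = np.linalg.eigh(A)

# XAX' equals the weighted sum of the rank-one terms (X v_i)(X v_i)'.
Q = X @ A @ X.T
S = sum(l * np.outer(X @ v, X @ v) for l, v in zip(lam, V.T))
print(np.allclose(Q, S))  # True
```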

Place, publisher, year, edition, pages
Taylor & Francis, 2012
Keywords
Quadratic form; Spectral decomposition; Eigenvalues; Singular matrix normal distribution; Non-central Wishart distribution
National Category
Mathematics
Identifiers
urn:nbn:se:liu:diva-18513 (URN)
10.1080/03610926.2011.563009 (DOI)
000308465400007 ()
Available from: 2009-05-29 Created: 2009-05-29 Last updated: 2017-12-13
Koski, T. & Noble, J. (2009). Bayesian Networks: An Introduction (1st ed.). United Kingdom: Wiley
Bayesian Networks: An Introduction
2009 (English). Book (Other academic)
Abstract [en]

Bayesian Networks: An Introduction provides a self-contained introduction to the theory and applications of Bayesian networks, a topic of interest and importance for statisticians, computer scientists and those involved in modelling complex data sets. The material has been extensively tested in classroom teaching and assumes a basic knowledge of probability, statistics and mathematics. All notions are carefully explained, and exercises feature throughout.

Features include:

  • An introduction to the Dirichlet distribution, exponential families and their applications.
  • A detailed description of learning algorithms and conditional Gaussian distributions using junction tree methods.
  • A discussion of Pearl's intervention calculus, with an introduction to the notions of 'see' and 'do' conditioning (a minimal sketch follows below).
  • All concepts are clearly defined and illustrated with examples and exercises. Solutions are provided online.

This book will prove a valuable resource for postgraduate students of statistics, computer engineering, mathematics, data mining, artificial intelligence, and biology.

Researchers and users of comparable modelling or statistical techniques such as neural networks will also find this book of interest.
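
As a taste of the intervention-calculus material, here is a minimal sketch (our own illustration, not taken from the book) of the 'see' versus 'do' distinction on a two-node network X → Y: observing Y changes beliefs about X via Bayes' rule, whereas intervening on Y cuts the arrow into Y and leaves X untouched. The numbers are arbitrary.

```python
import numpy as np

# Two-node network X -> Y with binary variables.
p_x = np.array([0.3, 0.7])            # P(X)
p_y_given_x = np.array([[0.9, 0.1],   # P(Y | X = 0)
                        [0.2, 0.8]])  # P(Y | X = 1)

# "See" conditioning: observing Y = 1 updates beliefs about X by Bayes' rule.
joint = p_x[:, None] * p_y_given_x    # P(X, Y)
p_x_see = joint[:, 1] / joint[:, 1].sum()

# "Do" conditioning: the intervention do(Y = 1) removes the arrow into Y,
# so the distribution of X is unchanged.
p_x_do = p_x

print("P(X | see Y=1):", p_x_see)     # shifted towards X = 1
print("P(X | do  Y=1):", p_x_do)      # identical to the prior P(X)
```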

Place, publisher, year, edition, pages
United Kingdom: Wiley, 2009. 347 p. Edition: 1
Series
Wiley Series in Probability and Statistics
Keywords
Bayesian Networks, Graphical Models
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:liu:diva-52108 (URN)978-0-470-74304-1 (ISBN)
Available from: 2009-12-04 Created: 2009-12-04 Last updated: 2013-05-07. Bibliographically approved
Ohlson, M. & Koski, T. (2009). The Likelihood Ratio Statistic for Testing Spatial Independence using a Separable Covariance Matrix. Linköping: Linköping University Electronic Press
The Likelihood Ratio Statistic for Testing Spatial Independence using a Separable Covariance Matrix
2009 (English). Report (Other academic)
Abstract [en]

This paper deals with the problem of testing spatial independence for dependent observations. The sample observation matrix is assumed to follow a matrix normal distribution with a separable covariance matrix; in other words, it can be written as a Kronecker product of two positive definite matrices. Two cases are considered: when the temporal covariance is known and when it is unknown. When the temporal covariance is known, the maximum likelihood estimates are computed and the asymptotic null distribution is given. In the case when the temporal covariance is unknown, the maximum likelihood estimates of the parameters are found by an iterative alternating algorithm.
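
For readers who want to experiment, the sketch below illustrates the kind of alternating ("flip-flop") likelihood iteration commonly used to fit separable (Kronecker) covariance models. It is a generic illustration under simplifying assumptions (zero mean, fixed iteration count, no convergence check), not the paper's exact algorithm or its test statistic.

```python
import numpy as np

rng = np.random.default_rng(1)
p, n, N = 4, 5, 500

# True separable covariance: a spatial p x p factor Sigma and a temporal
# n x n factor Psi (AR(1)-type), i.e. Cov(vec X) = Psi (x) Sigma.
Sigma = np.diag([1.0, 2.0, 0.5, 1.5])
Psi = 0.6 ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))

Ls, Lp = np.linalg.cholesky(Sigma), np.linalg.cholesky(Psi)
X = np.array([Ls @ rng.standard_normal((p, n)) @ Lp.T for _ in range(N)])

# Alternating ("flip-flop") maximum likelihood updates.
Sig_hat, Psi_hat = np.eye(p), np.eye(n)
for _ in range(50):
    Pi = np.linalg.inv(Psi_hat)
    Sig_hat = sum(x @ Pi @ x.T for x in X) / (N * n)
    Si = np.linalg.inv(Sig_hat)
    Psi_hat = sum(x.T @ Si @ x for x in X) / (N * p)

# The factors are identified only up to a scalar, so compare the
# normalized Kronecker products.
est, tru = np.kron(Psi_hat, Sig_hat), np.kron(Psi, Sigma)
print(np.max(np.abs(est / est[0, 0] - tru / tru[0, 0])))  # small
```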

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2009. p. 17
Series
LiTH-MAT-R, ISSN 0348-2960 ; 2009:06
Keywords
Maximum likelihood estimation, Matrix normal distribution, Testing independence
National Category
Mathematics
Identifiers
urn:nbn:se:liu:diva-18225 (URN)LiTH-MAT-R-2009-06 (ISRN)
Available from: 2009-05-12 Created: 2009-05-12 Last updated: 2018-10-02. Bibliographically approved
Corander, J., Ekdahl, M. & Koski, T. (2008). Parallell interacting MCMC for learning of topologies of graphical models. Data mining and knowledge discovery, 17(3), 431-456
Parallell interacting MCMC for learning of topologies of graphical models
2008 (English). In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 17, no 3, p. 431-456. Article in journal (Refereed). Published
Abstract [en]

Automated statistical learning of graphical models from data has attained a considerable degree of interest in the machine learning and related literature. Many authors have discussed and/or demonstrated the need for consistent stochastic search methods that would not be as prone to yield locally optimal model structures as simple greedy methods. However, at the same time most stochastic search methods are based on a standard Metropolis–Hastings theory that necessitates the use of relatively simple random proposals and prevents the utilization of intelligent and efficient search operators. Here we derive an algorithm for learning topologies of graphical models from samples of a finite set of discrete variables by utilizing and further enhancing a recently introduced theory for non-reversible parallel interacting Markov chain Monte Carlo-style computation. In particular, we illustrate how the non-reversible approach allows for a novel type of creativity in the design of search operators. The parallel aspect of our method also illustrates the advantage of adaptive search operators in avoiding trapping states in the vicinity of locally optimal network topologies.
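
The interacting-chains idea can be caricatured in a few lines. The sketch below is a toy stand-in, not the authors' algorithm: several chains perform Metropolis-style edge flips on adjacency matrices, the score function is an arbitrary placeholder for a real marginal likelihood, and a non-reversible interaction occasionally restarts a chain from the best topology found by any chain.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n_chains, n_steps = 5, 4, 2000

# Arbitrary stand-in score over undirected topologies on d nodes; a real
# application would use the marginal likelihood of the data instead.
target = np.triu((rng.random((d, d)) < 0.4).astype(float), 1)
def score(adj):
    return -np.abs(adj - target).sum()

chains = [np.zeros((d, d)) for _ in range(n_chains)]
best = chains[0].copy()

for step in range(n_steps):
    for c in range(n_chains):
        i, j = sorted(rng.choice(d, size=2, replace=False))
        prop = chains[c].copy()
        prop[i, j] = 1 - prop[i, j]          # flip one edge
        if np.log(rng.random()) < score(prop) - score(chains[c]):
            chains[c] = prop                 # Metropolis-style acceptance
        if score(chains[c]) > score(best):
            best = chains[c].copy()
    # Non-reversible interaction: occasionally restart a random chain
    # from the best topology found by any chain so far.
    if step % 100 == 0:
        chains[rng.integers(n_chains)] = best.copy()

print(score(best))  # approaches 0, the score of the top topology
```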

Keywords
MCMC, Equivalence search, Learning graphical models
National Category
Mathematics
Identifiers
urn:nbn:se:liu:diva-13106 (URN)10.1007/s10618-008-0099-9 (DOI)
Available from: 2008-03-31 Created: 2008-03-31 Last updated: 2017-12-13
Corander, J., Ekdahl, M. & Koski, T. (2007). A Bayesian random fragment insertion model for de novo detection of DNA regulatory binding regions.
A Bayesian random fragment insertion model for de novo detection of DNA regulatory binding regions
2007 (English). Manuscript (preprint) (Other academic)
Abstract [en]

Identification of regulatory binding motifs within DNA sequences is a commonly occurring problem in computational bioinformatics. A wide variety of statistical approaches have been proposed in the literature to either scan for previously known motif types or to attempt de novo identification of a fixed number (typically one) of putative motifs. Most approaches assume the existence of reliable biodatabase information to build a probabilistic a priori description of the motif classes. No method has been previously proposed for finding the number of putative de novo motif types and their positions within a set of DNA sequences. As the number of sequenced genomes from a wide variety of organisms is constantly increasing, there is a clear need for such methods. Here we introduce a Bayesian unsupervised approach for this purpose by using recent advances in the theory of predictive classification and Markov chain Monte Carlo computation. Our modelling framework enables formal statistical inference in large-scale sequence screening, and we illustrate it by a set of examples.
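
The full fragment-insertion model is beyond a snippet, but the basic scanning computation that motif methods build on can be illustrated. The sketch below is a generic position-weight-matrix scan, not the paper's model: each window of a sequence is scored under a product-multinomial motif model against a uniform background, and the planted occurrence gets the top score.

```python
import numpy as np

# Toy position weight matrix (PWM) for a motif of width 4 over A, C, G, T;
# purely illustrative, not the paper's fragment-insertion model.
pwm = np.array([[0.80, 0.10, 0.05, 0.05],
                [0.05, 0.80, 0.10, 0.05],
                [0.05, 0.05, 0.80, 0.10],
                [0.10, 0.05, 0.05, 0.80]])
background = np.full(4, 0.25)
alphabet = {c: k for k, c in enumerate("ACGT")}

def log_odds(window):
    """Log-likelihood ratio of motif versus background for one window."""
    idx = [alphabet[c] for c in window]
    return sum(np.log(pwm[i, j] / background[j]) for i, j in enumerate(idx))

seq = "TTTTACGTTTTT"
w = pwm.shape[0]
scores = [log_odds(seq[k:k + w]) for k in range(len(seq) - w + 1)]
print(int(np.argmax(scores)))  # 4: the position of the planted ACGT
```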

National Category
Mathematics
Identifiers
urn:nbn:se:liu:diva-13107 (URN)
Available from: 2008-03-31 Created: 2008-03-31 Last updated: 2012-11-21
Ekdahl, M., Koski, T. & Ohlson, M. (2007). Concentrated or non-concentrated discrete distributions are almost independent.
Concentrated or non-concentrated discrete distributions are almost independent
2007 (English). Manuscript (preprint) (Other academic)
Abstract [en]

The task of approximating a simultaneous distribution with a product of distributions in a single variable is important in the theory and applications of classification and learning, probabilistic reasoning, and random algorithms. The evaluation of the goodness of this approximation by statistical independence amounts to bounding uniformly upwards the difference between a joint distribution and the product of the distributions (marginals). In this paper we develop a bound that uses information about the most probable state to find a sharp estimate, which is often as sharp as possible. We also examine the extreme cases of concentration and non-concentration, respectively, of the approximated distribution.
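
The quantity being bounded is easy to compute exactly for small state spaces. Here is a minimal sketch (our own example, not from the manuscript) of the maximal statewise difference between a joint pmf and the product of its marginals, including the concentrated case where the difference collapses to zero.

```python
import numpy as np

rng = np.random.default_rng(3)

# A random joint pmf over two discrete variables with 3 and 4 states.
P = rng.random((3, 4))
P /= P.sum()

# Product of the marginals and the uniform (statewise) distance to it.
Q = np.outer(P.sum(axis=1), P.sum(axis=0))
print(np.max(np.abs(P - Q)))

# A pmf concentrated on a single state is exactly independent, so the
# distance collapses to zero.
P1 = np.zeros((3, 4)); P1[0, 0] = 1.0
Q1 = np.outer(P1.sum(axis=1), P1.sum(axis=0))
print(np.max(np.abs(P1 - Q1)))  # 0.0
```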

National Category
Mathematics
Identifiers
urn:nbn:se:liu:diva-13105 (URN)
Available from: 2008-03-31 Created: 2008-03-31 Last updated: 2014-09-29
Ekdahl, M. & Koski, T. (2007). On Concentration of Discrete Distributions with Applications to Supervised Learning of Classifiers. In: Petra Perner (Ed.), Machine Learning and Data Mining in Pattern Recognition: 5th International Conference, MLDM 2007, Leipzig, Germany, July 18-20, 2007. Proceedings (pp. 2-16). Springer Berlin/Heidelberg
On Concentration of Discrete Distributions with Applications to Supervised Learning of Classifiers
2007 (English). In: Machine Learning and Data Mining in Pattern Recognition: 5th International Conference, MLDM 2007, Leipzig, Germany, July 18-20, 2007. Proceedings / [ed] Petra Perner, Springer Berlin/Heidelberg, 2007, p. 2-16. Chapter in book (Refereed)
Abstract [en]

Computational procedures using independence assumptions in various forms are popular in machine learning, although checks on empirical data have given inconclusive results about their impact. Some theoretical understanding of when they work is available, but a definite answer seems to be lacking. This paper derives distributions that maximize the statewise difference to the respective product of marginals. These distributions are, in a sense, the worst distributions for predicting an outcome of the data-generating mechanism by independence. We also delimit the scope of the new theoretical results by showing explicitly that, depending on context, independent ('Naïve') classifiers can be as bad as tossing coins. Regardless of this, independence may beat the generating model in learning supervised classification, and we explicitly provide one such scenario.
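
The coin-tossing claim can be made concrete with the classic XOR construction (our example; the paper's scenario may differ): the class is the parity of two fair bits, so the full joint classifies perfectly while the class-conditional marginals carry no information at all.

```python
import itertools

# Class Y = X1 XOR X2 with independent fair bits X1 and X2.
joint = {(x1, x2, x1 ^ x2): 0.25
         for x1, x2 in itertools.product([0, 1], [0, 1])}

def marginal(axis, value, y):
    """Class-conditional marginal, e.g. P(X1 = value | Y = y) for axis 0."""
    num = sum(p for (x1, x2, yy), p in joint.items()
              if (x1, x2)[axis] == value and yy == y)
    den = sum(p for (x1, x2, yy), p in joint.items() if yy == y)
    return num / den

for y in (0, 1):
    print(y, [marginal(0, v, y) for v in (0, 1)],
          [marginal(1, v, y) for v in (0, 1)])

# Both classes yield identical marginals [0.5, 0.5], so the naive Bayes
# score is tied for every input (a fair coin toss), while the argmax over
# the full joint P(y | x1, x2) classifies every input correctly.
```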

Place, publisher, year, edition, pages
Springer Berlin/Heidelberg, 2007
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 4571
Keywords
independence, classification, supervised learning, pattern recognition, prediction
National Category
Mathematics
Identifiers
urn:nbn:se:liu:diva-38249 (URN)
10.1007/978-3-540-73499-4_2 (DOI)
000248523200001 ()
43265 (Local ID)
978-3-540-73498-7 (ISBN)
978-3-540-73499-4 (ISBN)
3-540-73498-8 (ISBN)
43265 (Archive number)
43265 (OAI)
Available from: 2009-10-10 Created: 2009-10-10 Last updated: 2018-01-26. Bibliographically approved
Koski, T., Corander, J. & Gyllenberg, M. (2007). Random Partition Models and Exchangeability for Bayesian Identification of Population Structure. Bulletin of Mathematical Biology, 69(3), 797-815
Random Partition Models and Exchangeability for Bayesian Identification of Population Structure
2007 (English). In: Bulletin of Mathematical Biology, ISSN 0092-8240, E-ISSN 1522-9602, Vol. 69, no 3, p. 797-815. Article in journal (Refereed). Published
Abstract [en]

We introduce a statistical model for learning the genetic structure of populations, together with a novel MCMC-type estimation method.

Keywords
ecological genetics
National Category
Mathematics
Identifiers
urn:nbn:se:liu:diva-37836 (URN)
10.1007/s11538-006-9161-1 (DOI)
39565 (Local ID)
39565 (Archive number)
39565 (OAI)
Available from: 2009-10-10 Created: 2009-10-10 Last updated: 2017-12-13
Koski, T., Corander, J. & Gyllenberg, M. (2006). Bayesian model learning based on a parallel MCMC strategy. Statistics and computing, 16(2), 355-362
Bayesian model learning based on a parallel MCMC strategy
2006 (English). In: Statistics and computing, ISSN 0960-3174, E-ISSN 1573-1375, Vol. 16, no 2, p. 355-362. Article in journal (Refereed). Published
Abstract [en]

Interacting parallel Markov chains are shown to converge to a maximum of the posterior on a set of partitions of a finite set of discrete data. This is demonstrated on an example of population genetics data.
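
A toy version of the scheme, with an arbitrary stand-in score rather than a real posterior over partitions, might look as follows: several Metropolis-style chains move items between blocks, and an interaction step occasionally restarts a chain from the incumbent best partition. This is a generic sketch, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy data: 12 points in three clear groups. The "posterior" score of a
# partition is a stand-in (negative within-block sum of squares), not a
# real marginal likelihood.
data = np.concatenate([rng.normal(m, 0.2, 4) for m in (0.0, 3.0, 6.0)])
n, k = len(data), 3

def score(labels):
    return -sum(((data[labels == b] - data[labels == b].mean()) ** 2).sum()
                for b in range(k) if (labels == b).any())

n_chains, n_steps = 4, 3000
chains = [rng.integers(k, size=n) for _ in range(n_chains)]
best = max(chains, key=score).copy()

for step in range(n_steps):
    for c in range(n_chains):
        prop = chains[c].copy()
        prop[rng.integers(n)] = rng.integers(k)   # move one item
        if np.log(rng.random()) < (score(prop) - score(chains[c])) / 0.1:
            chains[c] = prop                      # tempered acceptance
        if score(chains[c]) > score(best):
            best = chains[c].copy()
    if step % 200 == 0:  # interaction: one chain jumps to the incumbent best
        chains[rng.integers(n_chains)] = best.copy()

print(best)  # items from the same group share a block label
```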

Keywords
classification, learning
National Category
Mathematics
Identifiers
urn:nbn:se:liu:diva-35479 (URN)
10.1007/s11222-006-9391-y (DOI)
27026 (Local ID)
27026 (Archive number)
27026 (OAI)
Available from: 2009-10-10 Created: 2009-10-10 Last updated: 2017-12-13
Ekdahl, M. & Koski, T. (2006). Bounds for the Loss in Probability of Correct Classification Under Model Based Approximation. Journal of Machine Learning Research, 7, 2449-2480
Bounds for the Loss in Probability of Correct Classification Under Model Based Approximation
2006 (English). In: Journal of Machine Learning Research, ISSN 1532-4435, Vol. 7, p. 2449-2480. Article in journal (Refereed). Published
Abstract [en]

In many pattern recognition/classification problems the true class-conditional model and class probabilities are approximated, for reasons of reduced complexity and/or of statistical estimation. The approximated classifier is expected to have worse performance, here measured by the probability of correct classification. We present an analysis valid in general, and easily computable formulas for estimating the degradation in probability of correct classification when compared to the optimal classifier. An example of an approximation is the Naïve Bayes classifier. We show that the performance of the Naïve Bayes depends on the degree of functional dependence between the features and labels. We also provide a sufficient condition for zero loss of performance.
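
The degradation in question can be computed exactly in small discrete cases. The sketch below (our own construction, not the paper's formulas) compares the probability of correct classification of the Bayes classifier under the true class-conditional law with that of the plug-in classifier built from the naive (product of marginals) approximation.

```python
import itertools
import numpy as np

rng = np.random.default_rng(5)

# Small problem: binary class Y, two ternary features X1, X2.
p_y = np.array([0.5, 0.5])
cond = rng.random((2, 3, 3))               # P(x1, x2 | y), to be normalized
cond /= cond.sum(axis=(1, 2), keepdims=True)

# Naive (product) approximation of each class-conditional law.
m1, m2 = cond.sum(axis=2), cond.sum(axis=1)
approx = m1[:, :, None] * m2[:, None, :]

def p_correct(model):
    """P(correct) of the classifier that plugs `model` into Bayes' rule,
    evaluated under the true law `cond`."""
    total = 0.0
    for x1, x2 in itertools.product(range(3), range(3)):
        y_hat = int(np.argmax(p_y * model[:, x1, x2]))
        total += p_y[y_hat] * cond[y_hat, x1, x2]
    return total

print(p_correct(cond), p_correct(approx))  # the gap is the loss
```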

Keywords
Bayesian networks, naïve Bayes, plug-in classifier, Kolmogorov distance of variation, variational learning
National Category
Mathematics
Identifiers
urn:nbn:se:liu:diva-13104 (URN)
Available from: 2008-03-31 Created: 2008-03-31