Accelerating Monte Carlo methods for Bayesian inference in dynamical models
Dahlin, Johan
Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
ORCID iD: 0000-0002-9424-1272
2016 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Making decisions and predictions from noisy observations are two important and challenging problems in many areas of society. Some examples of applications are recommendation systems for online shopping and streaming services, connecting genes with certain diseases and modelling climate change. In this thesis, we make use of Bayesian statistics to construct probabilistic models given prior information and historical data, which can be used for decision support and predictions. The main obstacle with this approach is that it often results in mathematical problems lacking analytical solutions. To cope with this, we make use of statistical simulation algorithms known as Monte Carlo methods to approximate the intractable solution. These methods enjoy well-understood statistical properties but are often computationally prohibitive to employ.

The main contribution of this thesis is the exploration of different strategies for accelerating inference methods based on sequential Monte Carlo (SMC) and Markov chain Monte Carlo (MCMC). That is, strategies for reducing the computational effort while keeping or improving the accuracy. A major part of the thesis is devoted to proposing such strategies for the MCMC method known as the particle Metropolis-Hastings (PMH) algorithm. We investigate two strategies: (i) introducing estimates of the gradient and Hessian of the target to better tailor the algorithm to the problem and (ii) introducing a positive correlation between the point-wise estimates of the target.

Furthermore, we propose an algorithm based on the combination of SMC and Gaussian process optimisation, which can provide reasonable estimates of the posterior but with a significant decrease in computational effort compared with PMH. Moreover, we explore the use of sparseness priors for approximate inference in over-parametrised mixed effects models and autoregressive processes. This can potentially be a practical strategy for inference in the big data era. Finally, we propose a general method for increasing the accuracy of the parameter estimates in non-linear state space models by applying a designed input signal.

Abstract [sv, English translation]

Should the Riksbank raise or lower the repo rate at its next meeting in order to reach the inflation target? Which genes are associated with a certain disease? How can Netflix and Spotify know which films and which music I want to watch or listen to next?

These three problems are examples of questions where statistical models can be useful for providing support and a basis for decisions. Statistical models combine theoretical knowledge about, for example, the Swedish economic system with historical data to produce forecasts of future events. These forecasts can then be used to evaluate, for example, what would happen to inflation in Sweden if unemployment falls, or how the value of my pension savings changes when the Stockholm stock exchange crashes. Applications such as these and many others make statistical models important for many parts of society.

One way of constructing statistical models is based on continuously updating a model as more information is collected. This approach is called Bayesian statistics and is particularly useful when one already has good insights into the model or has access to only a small amount of historical data for building the model. A drawback of Bayesian statistics is that the computations required to update the model with the new information are often very complicated. In such situations, one can instead simulate the outcome from millions of variants of the model and then compare these with the historical observations at hand. One can then average over the variants that gave the best results in order to obtain a final model. It can therefore sometimes take days or weeks to construct a model. The problem becomes particularly severe when using more advanced models that could give better forecasts but take too long to build.

In this thesis, we use a number of different strategies to facilitate or improve these simulations. For example, we propose taking more insights about the system into account and thereby reducing the number of variants of the model that need to be examined. We can thus rule out certain models in advance, since we have a good idea of roughly what a good model should look like. We can also modify the simulation so that it moves more easily between different types of models. In this way, the space of all possible models is explored more efficiently. We propose a number of different combinations and modifications of existing methods to speed up the fitting of the model to the observations. We show that the computation time can in some cases be reduced from a few days to about an hour. Hopefully, this will in the future lead to more advanced models being usable in practice, which in turn results in better forecasts and decisions.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2016.
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 1754
Keyword [en]
Computational statistics, Monte Carlo, Markov chains, Particle filters, Machine learning, Bayesian optimisation, Approximate Bayesian Computations, Gaussian processes, Particle Metropolis-Hastings, Approximate inference, Pseudo-marginal methods
National Category
Probability Theory and Statistics; Control Engineering; Computational Mathematics
Identifiers
URN: urn:nbn:se:liu:diva-125992
DOI: 10.3384/diss.diva-125992
ISBN: 978-91-7685-797-7 (print)
OAI: oai:DiVA.org:liu-125992
DiVA: diva2:911089
Public defence
2016-05-04, Visionen, B-building, Campus Valla, Linköping, 10:15 (English)
Opponent
Supervisors
Funder
Swedish Research Council, 621-2013-5524; Swedish Research Council, 637-2014-466; Swedish Foundation for Strategic Research, IIS11-0081
Available from: 2016-03-22 Created: 2016-03-11 Last updated: 2016-04-01. Bibliographically approved
List of papers
1. Particle Metropolis-Hastings using gradient and Hessian information
2015 (English). In: Statistics and computing, ISSN 0960-3174, E-ISSN 1573-1375, Vol. 25, no 1, p. 81-92. Article in journal (Other academic), Published
Abstract [en]

Particle Metropolis-Hastings (PMH) allows for Bayesian parameter inference in nonlinear state space models by combining MCMC and particle filtering. The latter is used to estimate the intractable likelihood. In its original formulation, PMH makes use of a marginal MCMC proposal for the parameters, typically a Gaussian random walk. However, this can lead to a poor exploration of the parameter space and an inefficient use of the generated particles.
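
As a rough illustration of the original formulation described above, the following Python sketch runs PMH with a Gaussian random walk proposal on a scalar linear Gaussian state space model, using a bootstrap particle filter to estimate the log-likelihood. The model, the flat prior on (-1, 1), the function names and all tuning values are assumptions made for this example and are not taken from the paper.

```python
import math
import random


def simulate_data(phi, sigma_v, sigma_e, n_obs, rng):
    """Simulate y[t] = x[t] + sigma_e*e[t], x[t+1] = phi*x[t] + sigma_v*v[t], x[0] = 0."""
    x, ys = 0.0, []
    for _ in range(n_obs):
        ys.append(x + sigma_e * rng.gauss(0.0, 1.0))
        x = phi * x + sigma_v * rng.gauss(0.0, 1.0)
    return ys


def log_gauss(y, mean, sd):
    """Log-density of N(mean, sd^2) evaluated at y."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((y - mean) / sd) ** 2


def particle_loglik(phi, ys, sigma_v, sigma_e, n_particles, rng):
    """Bootstrap particle filter estimate of the log-likelihood of phi."""
    particles = [0.0] * n_particles  # known initial state x[0] = 0
    loglik = 0.0
    for y in ys:
        log_w = [log_gauss(y, x, sigma_e) for x in particles]
        shift = max(log_w)  # subtract the maximum for numerical stability
        w = [math.exp(lw - shift) for lw in log_w]
        loglik += shift + math.log(sum(w) / n_particles)
        # Multinomial resampling followed by propagation through the dynamics.
        parents = rng.choices(particles, weights=w, k=n_particles)
        particles = [phi * x + sigma_v * rng.gauss(0.0, 1.0) for x in parents]
    return loglik


def pmh(ys, n_iters, step, n_particles, rng, sigma_v=1.0, sigma_e=0.5):
    """Particle Metropolis-Hastings for phi with a flat prior on (-1, 1)."""
    phi = 0.0
    loglik = particle_loglik(phi, ys, sigma_v, sigma_e, n_particles, rng)
    chain, n_accepted = [], 0
    for _ in range(n_iters):
        phi_new = phi + step * rng.gauss(0.0, 1.0)  # Gaussian random walk
        if abs(phi_new) < 1.0:  # proposals outside the prior support are rejected
            loglik_new = particle_loglik(phi_new, ys, sigma_v, sigma_e, n_particles, rng)
            # The stored estimate for the current state is reused, not recomputed.
            if math.log(1.0 - rng.random()) < loglik_new - loglik:
                phi, loglik = phi_new, loglik_new
                n_accepted += 1
        chain.append(phi)
    return chain, n_accepted / n_iters


rng = random.Random(0)
ys = simulate_data(phi=0.7, sigma_v=1.0, sigma_e=0.5, n_obs=50, rng=rng)
chain, acc_rate = pmh(ys, n_iters=200, step=0.1, n_particles=50, rng=rng)
print("acceptance rate:", acc_rate)
print("posterior mean estimate:", sum(chain[100:]) / len(chain[100:]))
```

Each iteration costs one full particle filter run, so a poorly scaled random walk step wastes most of the generated particles on rejected proposals. This is the inefficiency that the abstract refers to.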

We propose two alternative versions of PMH that incorporate gradient and Hessian information about the posterior into the proposal. This information is more or less obtained as a byproduct of the likelihood estimation. Indeed, we show how to estimate the required information using a fixed-lag particle smoother, with a computational cost growing linearly in the number of particles. We conclude that the proposed methods can: (i) decrease the length of the burn-in phase, (ii) increase the mixing of the Markov chain at the stationary phase, and (iii) make the proposal distribution scale invariant which simplifies tuning.
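
To make point (iii) concrete, the sketch below compares a random walk proposal with a proposal that uses the gradient and the negative Hessian of the log-target, on a one-dimensional Gaussian target where both quantities are available exactly. This is an assumption made purely for illustration: in the paper they are estimated with a fixed-lag particle smoother, which is not reproduced here. The function names and step sizes are hypothetical.

```python
import math
import random


def make_gaussian_target(scale):
    """Log-density (up to a constant), gradient and negative Hessian of N(0, scale^2)."""

    def log_target(th):
        return -0.5 * (th / scale) ** 2

    def grad(th):
        return -th / scale ** 2

    def neg_hess(th):
        return 1.0 / scale ** 2  # assumed positive everywhere

    return log_target, grad, neg_hess


def random_walk_mh(log_target, step, n_iters, rng):
    """Metropolis-Hastings with a Gaussian random walk; returns the acceptance rate."""
    theta, n_accepted = 0.0, 0
    for _ in range(n_iters):
        prop = theta + step * rng.gauss(0.0, 1.0)
        if math.log(1.0 - rng.random()) < log_target(prop) - log_target(theta):
            theta, n_accepted = prop, n_accepted + 1
    return n_accepted / n_iters


def second_order_mh(log_target, grad, neg_hess, step, n_iters, rng):
    """MH with proposal N(theta + 0.5*step^2*grad/neg_hess, step^2/neg_hess)."""

    def mean_var(th):
        var = step ** 2 / neg_hess(th)
        return th + 0.5 * var * grad(th), var

    def log_q(to, frm):
        mean, var = mean_var(frm)
        return -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (to - mean) ** 2 / var

    theta, n_accepted = 0.0, 0
    for _ in range(n_iters):
        mean, var = mean_var(theta)
        prop = mean + math.sqrt(var) * rng.gauss(0.0, 1.0)
        # The proposal is not symmetric, so its densities enter the ratio.
        log_ratio = (log_target(prop) - log_target(theta)
                     + log_q(theta, prop) - log_q(prop, theta))
        if math.log(1.0 - rng.random()) < log_ratio:
            theta, n_accepted = prop, n_accepted + 1
    return n_accepted / n_iters


results = {}
for scale in (0.01, 100.0):
    log_target, grad, neg_hess = make_gaussian_target(scale)
    results[scale] = (
        random_walk_mh(log_target, 1.0, 2000, random.Random(1)),
        second_order_mh(log_target, grad, neg_hess, 1.0, 2000, random.Random(1)),
    )
    print("scale", scale, "random walk / second order acceptance:", results[scale])
```

With the same step size, the random walk acceptance rate changes drastically between the two target scales, whereas the second-order proposal rescales itself through the Hessian and behaves the same on both.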

Place, publisher, year, edition, pages
Springer, 2015
National Category
Control Engineering; Signal Processing; Probability Theory and Statistics
Identifiers
urn:nbn:se:liu:diva-106749 (URN)
10.1007/s11222-014-9510-0 (DOI)
000349028500013 ()
Projects
Probabilistic modelling of dynamical systems
Funder
Swedish Research Council, 621-2013-5524
Note

On the date of the defence, the status of this article was Manuscript.

Available from: 2014-05-21 Created: 2014-05-21 Last updated: 2017-12-05. Bibliographically approved
2. Quasi-Newton particle Metropolis-Hastings
2015 (English). In: Proceedings of the 17th IFAC Symposium on System Identification, Elsevier, 2015, Vol. 48, Issue 28, p. 981-986. Conference paper, Published paper (Refereed)
Abstract [en]

Particle Metropolis-Hastings enables Bayesian parameter inference in general nonlinear state space models (SSMs). However, many implementations use a random walk proposal, which can result in poor mixing unless it is tuned correctly using tedious pilot runs. Therefore, we consider a new proposal inspired by quasi-Newton algorithms that may achieve similar (or better) mixing with less tuning. An advantage compared to other Hessian-based proposals is that it only requires estimates of the gradient of the log-posterior. A possible application is parameter inference in the challenging class of SSMs with intractable likelihoods. We exemplify this application and the benefits of the new proposal by modelling log-returns of futures contracts on coffee by a stochastic volatility model with alpha-stable observations.
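
The difficulty with alpha-stable observations is that their density has no closed form for most values of alpha, while sampling from them is straightforward, which is what makes simulation-based approaches such as approximate Bayesian computation applicable. The sketch below draws from a standard symmetric alpha-stable distribution using the Chambers-Mallows-Stuck construction. It covers only the symmetric, unit-scale case as a simplifying assumption; the function name is hypothetical and the code is not the paper's implementation.

```python
import math
import random


def sample_symmetric_stable(alpha, rng):
    """Draw from a standard symmetric alpha-stable law with 0 < alpha <= 2."""
    if not 0.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (0, 2]")
    u = rng.uniform(-0.5 * math.pi, 0.5 * math.pi)  # uniform angle
    w = rng.expovariate(1.0)  # unit-rate exponential
    return (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
            * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))


rng = random.Random(0)
for alpha in (2.0, 1.7, 1.0):
    draws = [sample_symmetric_stable(alpha, rng) for _ in range(10000)]
    print("alpha", alpha, "largest absolute draw:", max(abs(d) for d in draws))
```

For alpha = 2 the draws are Gaussian with variance 2, and for alpha = 1 the formula reduces to tan(u), a Cauchy draw. Smaller values of alpha give heavier tails, which is the property that makes the distribution attractive for modelling log-returns.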

Place, publisher, year, edition, pages
Elsevier, 2015
Keyword
Bayesian parameter inference; state space models; approximate Bayesian computations; particle Markov chain Monte Carlo; α-stable distributions
National Category
Control Engineering; Probability Theory and Statistics
Identifiers
urn:nbn:se:liu:diva-123666 (URN)
10.1016/j.ifacol.2015.12.258 (DOI)
Conference
Proceedings of the 17th IFAC Symposium on System Identification, Beijing, China, October 19-21, 2015.
Projects
CADICS
Funder
Swedish Research Council, 637-2014-466; Swedish Research Council, 621-2013-5524
Available from: 2016-01-07 Created: 2016-01-07 Last updated: 2016-04-01
3. Hierarchical Bayesian approaches for robust inference in ARX models
2012 (English). In: Proceedings from the 16th IFAC Symposium on System Identification, 2012 / [ed] Michel Kinnaert, International Federation of Automatic Control, 2012, Vol. 16, Part 1, p. 131-136. Conference paper, Oral presentation only (Refereed)
Abstract [en]

Gaussian innovations are the typical choice in most ARX models but using other distributions such as the Student's t could be useful. We demonstrate that this choice of distribution for the innovations provides an increased robustness to data anomalies, such as outliers and missing observations. We consider these models in a Bayesian setting and perform inference using numerical procedures based on Markov Chain Monte Carlo methods. These models include automatic order determination by two alternative methods, based on a parametric model order and a sparseness prior, respectively. The methods and the advantage of our choice of innovations are illustrated in three numerical studies using both simulated data and real EEG data.
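
The robustness claim can be illustrated by comparing how strongly each innovation distribution penalises a large one-step prediction error. The sketch below evaluates the Gaussian and Student's t log-densities at a few residual sizes. The unit scale and the choice of 3 degrees of freedom are assumptions made for this example, and the function names are hypothetical.

```python
import math


def gauss_logpdf(resid, sigma):
    """Log-density of a N(0, sigma^2) innovation at the given residual."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * (resid / sigma) ** 2


def student_t_logpdf(resid, sigma, dof):
    """Log-density of a scaled Student's t innovation with dof degrees of freedom."""
    z2 = (resid / sigma) ** 2
    return (math.lgamma(0.5 * (dof + 1.0)) - math.lgamma(0.5 * dof)
            - 0.5 * math.log(dof * math.pi) - math.log(sigma)
            - 0.5 * (dof + 1.0) * math.log1p(z2 / dof))


for resid in (0.0, 1.0, 3.0, 10.0):
    print(resid,
          round(gauss_logpdf(resid, 1.0), 2),
          round(student_t_logpdf(resid, 1.0, 3.0), 2))
```

The Gaussian penalty grows quadratically in the residual, while the Student's t penalty grows only logarithmically, so a single outlier has far less influence on the parameter estimates under the heavier-tailed model.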

Place, publisher, year, edition, pages
International Federation of Automatic Control, 2012
Series
IFAC papers online, ISSN 1474-6670 ; 2012
Keyword
Particle Filtering/Monte Carlo Methods; Bayesian Methods
National Category
Signal Processing
Identifiers
urn:nbn:se:liu:diva-81258 (URN)
10.3182/20120711-3-BE-2027.00318 (DOI)
978-3-902823-06-9 (ISBN)
Conference
The 16th IFAC Symposium on System Identification, July 11-13, Brussels, Belgium
Projects
CADICS; CNDS
Funder
Swedish Research Council
Available from: 2012-09-10 Created: 2012-09-10 Last updated: 2016-05-04. Bibliographically approved

Open Access in DiVA

fulltext (9201 kB), 460 downloads
File information
File name: FULLTEXT01.pdf
File size: 9201 kB
Checksum (SHA-512):
fa57e0174afd95b40d66770f59c6f1dc70c85e7274c641a12b6f1ba66b097f7eabe7dfc1fac72cb7a957c0523dc41c26d80444babaf5530b72f21b9ea8fb3110
Type: fulltext
Mimetype: application/pdf
cover (47 kB), 20 downloads
File information
File name: COVER01.pdf
File size: 47 kB
Checksum (SHA-512):
c115e9516256b177e1952108c27b39fe99013cfe7e3fa258b62ed20b79509b4fd7c54f92015502331914479cafceb85d01825be544c5cfc1cab074eaba83c940
Type: cover
Mimetype: application/pdf

Authority records BETA

Dahlin, Johan
