Existing Bayesian spatial priors for functional magnetic resonance imaging (fMRI) data correspond to stationary isotropic smoothing filters that may oversmooth at anatomical boundaries. We propose two anatomically informed Bayesian spatial models for fMRI data with local smoothing in each voxel based on a tensor field estimated from a T1-weighted anatomical image. We show that our anatomically informed Bayesian spatial models result in posterior probability maps that follow the anatomical structure.
Aerial robots hold great potential for aiding Search and Rescue (SAR) efforts over large areas, such as during natural disasters. Traditional approaches typically search an area exhaustively, thereby ignoring that the density of victims varies based on predictable factors, such as the terrain, population density and the type of disaster. We present a probabilistic model to automate SAR planning, with explicit minimization of the expected time to discovery. The proposed model is a spatial point process with three interacting spatial fields for i) the point patterns of persons in the area, ii) the probability of detecting persons and iii) the probability of injury. This structure allows inclusion of informative priors from e.g. geographic or cell phone traffic data, while falling back to latent Gaussian processes when priors are missing or inaccurate. To solve this problem in real-time, we propose a combination of fast approximate inference using Integrated Nested Laplace Approximation (INLA), and a novel Monte Carlo tree search tailored to the problem. Experiments using data simulated from real world Geographic Information System (GIS) maps show that the framework outperforms competing approaches, finding many more injured in the crucial first hours.
We propose a novel method for MAP parameter inference in nonlinear state space models with intractable likelihoods. The method is based on a combination of Gaussian process optimisation (GPO), sequential Monte Carlo (SMC) and approximate Bayesian computation (ABC). SMC and ABC are used to approximate the intractable likelihood by using the similarity between simulated realisations from the model and the data obtained from the system. The GPO algorithm is used for the MAP parameter estimation given noisy estimates of the log-likelihood. The proposed parameter inference method is evaluated on three problems using both synthetic and real-world data. The results are promising, indicating that the proposed algorithm converges quickly and with reasonable accuracy compared with existing methods.
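To make the GPO component concrete, the following minimal Python sketch optimises a noisy log-likelihood surface with a Gaussian process surrogate and an expected-improvement rule. The noisy estimator here is a toy stand-in for the SMC-ABC likelihood estimate, and the kernel, its hyperparameters and the candidate grid are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of GP optimisation (GPO) on a noisy log-likelihood surface.
# The estimator `noisy_loglik` is a toy stand-in for the SMC-ABC estimate.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def noisy_loglik(theta):
    # Toy stand-in: true optimum at theta = 1.5, plus estimator noise.
    return -(theta - 1.5) ** 2 + 0.1 * rng.standard_normal()

def sq_exp(a, b, ell=0.5):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

def gp_posterior(X, y, Xs, noise=0.1 ** 2):
    K = sq_exp(X, X) + noise * np.eye(len(X))
    Ks = sq_exp(X, Xs)
    mu = Ks.T @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.maximum(var, 1e-12)

X = np.array([0.0, 3.0])                       # initial design
y = np.array([noisy_loglik(t) for t in X])
grid = np.linspace(0.0, 3.0, 200)

for _ in range(15):
    mu, var = gp_posterior(X, y, grid)
    sd = np.sqrt(var)
    z = (mu - y.max()) / sd
    ei = (mu - y.max()) * norm.cdf(z) + sd * norm.pdf(z)  # expected improvement
    t_new = grid[np.argmax(ei)]                           # next evaluation point
    X = np.append(X, t_new)
    y = np.append(y, noisy_loglik(t_new))

print("MAP estimate (approx.):", grid[np.argmax(gp_posterior(X, y, grid)[0])])
```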
Analysis of functional magnetic resonance imaging (fMRI) data is becoming ever more computationally demanding as temporal and spatial resolutions improve, and large, publicly available data sets proliferate. Moreover, methodological improvements in the neuroimaging pipeline, such as non-linear spatial normalization, non-parametric permutation tests and Bayesian Markov chain Monte Carlo approaches, can dramatically increase the computational burden. Despite these challenges, there do not yet exist any fMRI software packages which leverage inexpensive and powerful graphics processing units (GPUs) to perform these analyses. Here, we therefore present BROCCOLI, a free software package written in OpenCL (Open Computing Language) that can be used for parallel analysis of fMRI data on a large variety of hardware configurations. BROCCOLI has, for example, been tested with an Intel CPU, an Nvidia GPU, and an AMD GPU. These tests show that parallel processing of fMRI data can lead to significantly faster analysis pipelines. This speedup can be achieved on relatively standard hardware, but further, dramatic speed improvements require only a modest investment in GPU hardware. BROCCOLI (running on a GPU) can perform non-linear spatial normalization to a 1 mm³ brain template in 4–6 s, and run a second level permutation test with 10,000 permutations in about a minute. These non-parametric tests are generally more robust than their parametric counterparts, and can also enable more sophisticated analyses by estimating complicated null distributions. Additionally, BROCCOLI includes support for Bayesian first-level fMRI analysis using a Gibbs sampler. The new software is freely available under GNU GPL3 and can be downloaded from GitHub (https://github.com/wanderine/BROCCOLI/).
Simple models and algorithms based on restrictive assumptions are often used in the field of neuroimaging for studies involving functional magnetic resonance imaging, voxel based morphometry, and diffusion tensor imaging. Nonparametric statistical methods or flexible Bayesian models can be applied rather easily to yield more trustworthy results. The spatial normalization step required for multisubject studies can also be improved by taking advantage of more robust algorithms for image registration. A common drawback of algorithms based on weaker assumptions, however, is the increase in computational complexity. In this short overview, we will therefore present some examples of how inexpensive PC graphics hardware, normally used for demanding computer games, can be used to enable practical use of more realistic models and accurate algorithms, such that the outcome of neuroimaging studies really can be trusted.
We demonstrate improvements in predictive power when introducing spline functions to take account of highly nonlinear relationships between firm failure and leverage, earnings, and liquidity in a logistic bankruptcy model. Our results show that modeling these nonlinearities yields substantially improved bankruptcy predictions, on the order of 70%–90%, compared with a standard logistic model. The spline model provides several important and surprising insights into nonmonotonic bankruptcy relationships. We find that low-leveraged as well as highly profitable firms are riskier than a standard model would suggest, possibly a manifestation of credit rationing and excess cash-flow volatility.
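As a rough illustration of the modeling idea (not the paper's estimator), the sketch below fits a logistic regression on spline-expanded covariates with scikit-learn; the data, covariate interpretations and knot settings are hypothetical.

```python
# Sketch of a logistic bankruptcy model with spline-expanded covariates.
# Data and settings are hypothetical; scikit-learn >= 1.0 is assumed.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=(n, 3))                   # leverage, earnings, liquidity
# Nonmonotonic failure risk: low AND high leverage are both risky.
logit = 1.5 * (X[:, 0] ** 2 - 1) - X[:, 1] + 0.5 * X[:, 2]
y = rng.random(n) < 1 / (1 + np.exp(-logit))

model = make_pipeline(
    SplineTransformer(n_knots=6, degree=3),   # cubic spline basis per covariate
    LogisticRegression(C=1.0, max_iter=1000),
)
model.fit(X, y)
print("In-sample accuracy:", model.score(X, y))
```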
Spatial regularization is a technique that exploits the dependence between nearby regions to locally pool data, with the effect of reducing noise and implicitly smoothing the data. Most currently proposed methods focus on minimizing a cost function, for which the regularization parameter must be tuned to find the optimal solution. We propose a fast Markov chain Monte Carlo (MCMC) method for diffusion tensor estimation, using both 2D and 3D spatial priors. The regularization parameter is estimated jointly with the tensor field using MCMC. We compare fractional anisotropy (FA) maps for various b-values using three diffusion tensor estimation methods: least-squares, and MCMC with and without spatial priors. The coefficient of variation (CV) is calculated to measure the uncertainty of the FA maps computed from the MCMC samples, and our results show that the MCMC algorithm with spatial priors provides a denoising effect and reduces the uncertainty of the MCMC samples.
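For reference, the FA value summarised in such maps is a deterministic function of the tensor's eigenvalues; the following worked example computes it for a hypothetical tensor.

```python
# Worked example: fractional anisotropy (FA) from a diffusion tensor's
# eigenvalues. FA = sqrt(3/2) * ||lambda - mean(lambda)|| / ||lambda||.
import numpy as np

D = np.array([[1.7, 0.1, 0.0],      # hypothetical 3x3 diffusion tensor
              [0.1, 0.4, 0.0],      # (units: 10^-3 mm^2/s)
              [0.0, 0.0, 0.3]])

lam = np.linalg.eigvalsh(D)          # eigenvalues of the symmetric tensor
fa = np.sqrt(1.5) * np.linalg.norm(lam - lam.mean()) / np.linalg.norm(lam)
print(f"FA = {fa:.3f}")              # 0 = isotropic, 1 = fully anisotropic
```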
Bayesian models often involve a small set of hyperparameters determined by maximizing the marginal likelihood. Bayesian optimization is an iterative method where a Gaussian process posterior of the underlying function is sequentially updated by new function evaluations. We propose a novel Bayesian optimization framework for situations where the user controls the computational effort, and therefore the precision, of the function evaluations. This is common in econometrics, where the marginal likelihood is often computed by Markov chain Monte Carlo or importance sampling methods. The new acquisition strategy gives the optimizer the option to explore the function with cheap noisy evaluations and thereby find the optimum faster. The method is applied to estimating the prior hyperparameters in two popular models for US macroeconomic time series data: the steady-state Bayesian vector autoregression (BVAR) and the time-varying parameter BVAR with stochastic volatility.
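The key ingredient is a GP surrogate in which each evaluation carries its own, user-chosen noise variance. A minimal sketch, assuming a squared-exponential kernel and made-up evaluation points:

```python
# Sketch: a GP posterior where each marginal-likelihood evaluation carries
# its own noise variance, so cheap MCMC/importance-sampling estimates
# (large variance) and expensive ones (small variance) are weighted correctly.
import numpy as np

def sq_exp(a, b, ell=1.0):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

theta = np.array([0.5, 1.0, 2.0, 3.0])       # evaluated hyperparameter values
loglik = np.array([-2.1, -0.9, -1.2, -3.0])  # noisy log marginal likelihoods
noise_var = np.array([0.5, 0.01, 0.5, 0.01]) # chosen per-evaluation precision

grid = np.linspace(0.0, 3.5, 100)
K = sq_exp(theta, theta) + np.diag(noise_var)
Ks = sq_exp(theta, grid)
mu = Ks.T @ np.linalg.solve(K, loglik)
var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
sd = np.sqrt(np.maximum(var, 0.0))

print("Posterior-mean maximiser:", grid[np.argmax(mu)])
print("Most uncertain point (candidate for a cheap evaluation):",
      grid[np.argmax(sd)])
```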
Methods for choosing a fixed set of knot locations in additive spline models are fairly well established in the statistical literature. The curse of dimensionality makes it nontrivial to extend these methods to nonadditive surface models, especially when there are more than a couple of covariates. We propose a multivariate Gaussian surface regression model that combines both additive splines and interaction splines, and a highly efficient Markov chain Monte Carlo algorithm that updates all the knot locations jointly. We use a shrinkage prior to avoid overfitting, with different estimated shrinkage factors for the additive and surface parts of the model, and also different shrinkage parameters for the different response variables. Simulated data and an application to firm leverage data show that the approach is computationally efficient, and that allowing for freely estimated knot locations can offer a substantial improvement in out-of-sample predictive performance.
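A loose sketch of the model structure, with fixed knots and ridge-style shrinkage standing in for the paper's MCMC over knot locations and shrinkage priors; all bases, penalties and data below are illustrative assumptions.

```python
# Sketch: additive spline terms plus a radial "surface" basis at knot
# locations, with separate shrinkage penalties for each block.
# Knots are fixed on a grid here; the paper samples them with MCMC.
import numpy as np

rng = np.random.default_rng(2)
n = 500
X = rng.uniform(-1, 1, size=(n, 2))
y = np.sin(3 * X[:, 0]) + X[:, 0] * X[:, 1] + 0.1 * rng.standard_normal(n)

# Additive part: truncated-power cubic basis per covariate.
add_knots = np.linspace(-0.8, 0.8, 5)
A = np.hstack([np.maximum(X[:, [j]] - add_knots, 0.0) ** 3 for j in range(2)])

# Surface part: Gaussian radial basis at 2-D knots.
g = np.linspace(-0.8, 0.8, 4)
surf_knots = np.array([(a, b) for a in g for b in g])
S = np.exp(-np.sum((X[:, None, :] - surf_knots[None]) ** 2, axis=2) / 0.5)

B = np.hstack([A, S])
# Separate shrinkage factors for the additive and surface blocks.
pen = np.concatenate([np.full(A.shape[1], 0.1), np.full(S.shape[1], 1.0)])
beta = np.linalg.solve(B.T @ B + np.diag(pen), B.T @ y)
print("Fitted residual SD:", np.std(y - B @ beta))
```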
Modern mobile devices provide ultra-high resolutions in their display panels. This imposes an ever-increasing workload on the GPU, leading to high power consumption and shortened battery life. In this paper, we first show that resolution scaling leads to significant power savings. Second, we propose a perception-aware adaptive scheme that sets the resolution during game play. We exploit the fact that game players are often willing to trade quality for longer battery life. Our scheme uses decision theory, where the predicted user perception is combined with a novel asymmetric loss function that encodes variations in users' willingness to save power.
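A toy sketch of the decision-theoretic selection step; the candidate scales, perception probabilities, power numbers and the particular asymmetric loss are all illustrative assumptions, not the paper's calibrated model.

```python
# Toy sketch of the decision-theoretic idea: choose the display resolution
# minimising expected loss, where the loss is asymmetric -- a noticed
# quality drop is penalised differently from power spent on unnoticed
# extra quality. All numbers and the loss form are illustrative.
import numpy as np

scales = np.array([1.0, 0.75, 0.5])            # candidate resolution scales
p_notice = np.array([0.0, 0.2, 0.6])           # predicted prob. user notices
power = np.array([1.0, 0.7, 0.45])             # relative GPU power draw

def expected_loss(p, w, k_quality=3.0, k_power=1.0):
    # Asymmetric weighting: perceived degradation outweighs equal power cost.
    return k_quality * p + k_power * w

losses = expected_loss(p_notice, power)
print("Chosen scale:", scales[np.argmin(losses)])
```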
Generating user interpretable multi-class predictions in data-rich environments with many classes and explanatory covariates is a daunting task. We introduce Diagonal Orthant Latent Dirichlet Allocation (DOLDA), a supervised topic model for multi-class classification that can handle many classes as well as many covariates. To handle many classes we use the recently proposed Diagonal Orthant probit model (Johndrow et al., in: Proceedings of the sixteenth international conference on artificial intelligence and statistics, 2013) together with an efficient Horseshoe prior for variable selection/shrinkage (Carvalho et al. in Biometrika 97:465–480, 2010). We propose a computationally efficient parallel Gibbs sampler for the new model. An important advantage of DOLDA is that learned topics are directly connected to individual classes without the need for a reference class. We evaluate the model’s predictive accuracy and scalability, and demonstrate DOLDA’s advantage in interpreting the generated predictions.
Topic models, and more specifically the class of Latent Dirichlet Allocation (LDA) models, are widely used for probabilistic modeling of text. MCMC sampling from the posterior distribution is typically performed using a collapsed Gibbs sampler. We propose a parallel sparse partially collapsed Gibbs sampler and compare its speed and efficiency to state-of-the-art samplers for topic models on five well-known text corpora of differing sizes and properties. In particular, we propose and compare two different strategies for sampling the parameter block with the latent topic indicators. The experiments show that the increase in statistical inefficiency from only partial collapsing is smaller than commonly assumed, and can be more than compensated for by the speedup from parallelization and sparsity on larger corpora. We also prove that the partially collapsed samplers scale well with the size of the corpus. The proposed algorithm is fast, efficient, exact, and can be used in more modeling situations than the ordinary collapsed sampler.
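A minimal serial, dense version of the partially collapsed scheme (without the paper's sparsity and parallelism) alternates two steps: sample Φ | z from Dirichlet distributions, then sample each topic indicator given Φ with θ integrated out. The corpus below is synthetic.

```python
# Minimal partially collapsed Gibbs sampler for LDA (serial, dense).
# The paper's sampler adds sparsity and samples topic indicators in parallel.
import numpy as np

rng = np.random.default_rng(3)
K, V, alpha, beta = 5, 100, 0.1, 0.01
docs = [rng.integers(V, size=rng.integers(20, 60)) for _ in range(50)]
z = [rng.integers(K, size=len(d)) for d in docs]

for it in range(200):
    # Phi | z: independent Dirichlet rows from topic-word counts.
    nkw = np.full((K, V), beta)
    for d, zd in zip(docs, z):
        np.add.at(nkw, (zd, d), 1.0)
    phi = np.apply_along_axis(rng.dirichlet, 1, nkw)

    # z | Phi, with theta integrated out:
    # p(z_i = k) propto (n_dk^{-i} + alpha) * phi[k, w_i].
    for d, zd in zip(docs, z):
        ndk = np.bincount(zd, minlength=K).astype(float)
        for i, w in enumerate(d):
            ndk[zd[i]] -= 1.0
            p = (ndk + alpha) * phi[:, w]
            zd[i] = rng.choice(K, p=p / p.sum())
            ndk[zd[i]] += 1.0

print("Topic usage:", np.bincount(np.concatenate(z), minlength=K))
```

Because the tokens' indicators are conditionally independent given Φ, the inner loop over documents is what the parallel version distributes across cores.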
A mixture of experts models the conditional density of a response variable using a mixture of regression models with covariate-dependent mixture weights. We extend the finite mixture of experts model by allowing the parameters in both the mixture components and the weights to evolve in time by following random walk processes. Inference for time-varying parameters in richly parameterized mixture of experts models is challenging. We propose a sequential Monte Carlo algorithm for online inference, based on a tailored proposal distribution built on ideas from linear Bayes methods and the EM algorithm. The method gives a unified treatment of mixtures with time-varying parameters, including the special case of static parameters. We assess the properties of the method on simulated data and on industrial data where the aim is to predict software faults in a continuously upgraded large-scale software project.
Regression density estimation is the problem of flexibly estimating a response distribution as a function of covariates. An important approach to regression density estimation uses finite mixture models, and our article considers flexible mixtures of heteroscedastic regression (MHR) models where the response distribution is a normal mixture, with the component means, variances and mixture weights all varying as a function of covariates. Our article develops fast variational approximation methods for inference. Our motivation is that alternative, computationally intensive MCMC methods for fitting mixture models are difficult to apply when it is desired to fit models repeatedly in exploratory analysis and model choice. Our article makes three contributions. First, a variational approximation for MHR models is described where the variational lower bound is in closed form. Second, the basic approximation can be improved by using stochastic approximation methods to perturb the initial solution to attain higher accuracy. Third, the advantages of our approach for model choice and evaluation compared to MCMC-based approaches are illustrated. These advantages are particularly compelling for time series data, where repeated refitting for one-step-ahead prediction in model choice and diagnostics and in rolling-window computations is very common. Supplemental materials for the article are available online.
The complexity of the Metropolis–Hastings (MH) algorithm arises from the requirement of a likelihood evaluation for the full dataset in each iteration. One proposed way to speed up the algorithm is the delayed acceptance approach, where the acceptance decision proceeds in two stages. In the first stage, an estimate of the likelihood based on a random subsample determines whether it is likely that the draw will be accepted and, if so, the second stage uses the full-data likelihood to decide upon final acceptance. Evaluating the full-data likelihood is thus avoided for draws that are unlikely to be accepted. We propose a more precise likelihood estimator that incorporates auxiliary information about the full-data likelihood while only operating on a sparse set of the data. We prove that the resulting delayed acceptance MH is more efficient. The caveat of this approach is that the full dataset still needs to be evaluated in the second stage. We therefore propose to substitute this evaluation with an estimate, and construct a state-dependent approximation thereof to use in the first stage. This results in an algorithm that (i) can use a smaller subsample m by leveraging recent advances in pseudo-marginal MH (PMMH) and (ii) is provably within O(m^-2) of the true posterior.
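A minimal sketch of the two-stage delayed acceptance mechanism on a toy Gaussian model follows; for simplicity the first-stage estimate is a rescaled subsample log-likelihood redrawn each iteration and the prior is flat, whereas the paper's exactness results require a more careful construction.

```python
# Sketch of two-stage delayed acceptance MH on a toy Gaussian model.
# Stage 1 screens proposals with a cheap subsample-based log-likelihood
# estimate; stage 2 corrects with the full-data likelihood.
import numpy as np

rng = np.random.default_rng(4)
data = rng.normal(1.0, 1.0, size=10_000)
m = 100                                        # subsample size for stage 1

def loglik(theta, x):
    return -0.5 * np.sum((x - theta) ** 2)     # unit-variance Gaussian

theta, n_full_evals = 0.0, 0
ll_full = loglik(theta, data)
for _ in range(2000):
    prop = theta + 0.05 * rng.standard_normal()
    sub = rng.choice(data, size=m, replace=False)

    def ll_hat(t, s=sub):                      # rescaled subsample estimate
        return (len(data) / m) * loglik(t, s)

    # Stage 1: cheap screening with the approximate likelihood ratio.
    if np.log(rng.random()) < ll_hat(prop) - ll_hat(theta):
        # Stage 2: full-data ratio, divided by the stage-1 approximation.
        ll_prop = loglik(prop, data)
        n_full_evals += 1
        a2 = (ll_prop - ll_full) - (ll_hat(prop) - ll_hat(theta))
        if np.log(rng.random()) < a2:
            theta, ll_full = prop, ll_prop

print(f"theta = {theta:.3f}, full-data evaluations: {n_full_evals}")
```

The point of the two-stage factorisation is visible in `n_full_evals`: proposals screened out in stage 1 never touch the full dataset.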
The paper explores the potential of multi-output Gaussian processes to tackle network-wide travel time prediction in an urban area. Forecasting in this context is challenging due to the complexity of the traffic network, noisy data and unexpected events. We build on recent methods to develop an online model that can be trained in seconds by relying on prior network dependencies through a coregionalized covariance. The accuracy of the proposed model outperforms historical means and other simpler methods on a network of 47 streets in Stockholm, using probe data from GPS-equipped taxis. Results show how traffic speeds depend on historical correlations, and how prediction accuracy can be improved by relying on prior information while using a very limited amount of current-day observations, which allows for models with low estimation times and high responsiveness.
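The coregionalized covariance can be illustrated with the intrinsic coregionalization model, where a cross-output matrix B is combined with a temporal kernel via a Kronecker product; the two-street example below, including all numbers, is hypothetical.

```python
# Sketch of an intrinsic coregionalization model (ICM) for multi-output GPs:
# a cross-street covariance B is combined with a temporal kernel k via a
# Kronecker product, K = B kron k(T, T).
import numpy as np

def sq_exp(a, b, ell=15.0):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

T = np.arange(0.0, 60.0, 5.0)                  # minutes within the day
W = np.array([[1.0, 0.3], [0.5, 0.8]])         # low-rank coregionalization
B = W @ W.T + 0.05 * np.eye(2)                 # 2 streets: cross-covariance
K = np.kron(B, sq_exp(T, T))                   # joint covariance, 2 outputs

# GP posterior mean for street 2 given observed speeds on street 1 only:
n = len(T)
y1 = 30.0 + 5.0 * np.sin(T / 10.0)             # hypothetical speeds, street 1
mu2 = K[n:, :n] @ np.linalg.solve(K[:n, :n] + np.eye(n), y1 - y1.mean())
print("Predicted street-2 anomalies:", np.round(mu2[:4], 2))
```

Because B is learned from historical data, predictions for a street borrow strength from correlated streets even with few current-day observations.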
Dynamic transportation networks have been analyzed for years by means of static graph-based indicators in order to study the temporal evolution of relevant network components, and to reveal complex dependencies that would not be easily detected by a direct inspection of the data. This paper presents a state-of-the-art probabilistic latent network model to forecast multilayer dynamic graphs that are increasingly common in transportation and proposes a community-based extension to reduce the computational burden. Flexible time series analysis is obtained by modeling the probability of edges between vertices through latent Gaussian processes. The models and Bayesian inference are illustrated on a sample of 10-year data from four major airlines within the US air transportation system. Results show how the estimated latent parameters from the models are related to the airlines’ connectivity dynamics, and their ability to project the multilayer graph into the future for out-of-sample full network forecasts, while stochastic blockmodeling allows for the identification of relevant communities. Reliable network predictions would allow policy-makers to better understand the dynamics of the transport system, and help in their planning on e.g. route development, or the deployment of new regulations.
Bayesian whole-brain functional magnetic resonance imaging (fMRI) analysis with three-dimensional spatial smoothing priors has been shown to produce state-of-the-art activity maps without pre-smoothing the data. The proposed inference algorithms are computationally demanding, however, and the spatial priors used have several less appealing properties, such as being improper and having infinite spatial range. We propose a statistical inference framework for whole-brain fMRI analysis based on the class of Matérn covariance functions. The framework uses the Gaussian Markov random field (GMRF) representation of possibly anisotropic spatial Matérn fields via the stochastic partial differential equation (SPDE) approach of Lindgren et al. (2011). This allows for more flexible and interpretable spatial priors, while maintaining the sparsity required for fast inference in the high-dimensional whole-brain setting. We develop an accelerated stochastic gradient descent (SGD) optimization algorithm for empirical Bayes (EB) inference of the spatial hyperparameters. Conditionally on the inferred hyperparameters, we make a fully Bayesian treatment of the brain activity. The Matérn prior is applied to both simulated and experimental task-fMRI data and clearly demonstrates that it is a more reasonable choice than the previously used priors, using comparisons of activity maps, prior simulation and cross-validation.
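For concreteness, the Matérn covariance family underlying the SPDE representation has the closed form C(r) = σ² 2^{1-ν}/Γ(ν) (κr)^ν K_ν(κr), where ν controls smoothness and κ the inverse spatial range. A small sketch, with illustrative parameter values:

```python
# The Matern covariance family underlying the SPDE approach; nu controls
# smoothness and kappa (inverse range) the spatial correlation length.
import numpy as np
from scipy.special import kv, gamma

def matern(r, nu=1.0, kappa=2.0, sigma2=1.0):
    r = np.maximum(np.asarray(r, dtype=float), 1e-12)  # avoid kv blow-up at 0
    return sigma2 * 2 ** (1 - nu) / gamma(nu) * (kappa * r) ** nu * kv(nu, kappa * r)

dists = np.array([0.0, 0.5, 1.0, 2.0])        # distances in voxel units
print(np.round(matern(dists), 4))             # decaying spatial correlation
```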
We present a framework for the analysis of process variation across semiconductor wafers. The framework is capable of quantifying the primary parameters affected by process variation, e.g., the effective channel length, in contrast with former techniques wherein only secondary parameters were considered, e.g., the leakage current. Instead of taking direct measurements of the quantity of interest, we employ Bayesian inference to draw conclusions based on indirect observations, e.g., on temperature. The proposed approach has low cost, since it may require no deployment of expensive test structures, or only the activation of a small subset of the test equipment already deployed for other purposes. The experimental results assess our framework for a wide range of configurations.
The posterior distribution of the number of lags in a multivariate autoregression is derived under an improper prior for the model parameters. The fractional Bayes approach is used to handle the indeterminacy in the model selection caused by the improper prior. An asymptotic equivalence between the fractional approach and the Schwarz Bayesian Criterion (SBC) is proved. Several priors and three loss functions are entertained in a simulation study which focuses on the choice of lag length. The fractional Bayes approach performs very well compared to the three most widely used information criteria, and it seems to be reasonably robust to changes in the prior distribution for the lag length, especially under the zero-one loss.
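As a concrete point of reference for the SBC equivalence, lag selection by SBC amounts to the following computation (the data-generating process and lag range below are illustrative):

```python
# Sketch: choosing the VAR lag length with the Schwarz Bayesian Criterion,
# SBC(p) = log det(Sigma_hat) + p * k^2 * log(T) / T, shown in the paper to
# be asymptotically equivalent to the fractional Bayes approach.
import numpy as np

rng = np.random.default_rng(5)
k, T = 2, 300
A1 = np.array([[0.5, 0.1], [0.0, 0.4]])       # true model: VAR(1)
y = np.zeros((T, k))
for t in range(1, T):
    y[t] = y[t - 1] @ A1.T + rng.standard_normal(k)

def sbc(y, p):
    Y = y[p:]                                  # responses
    X = np.hstack([y[p - j: len(y) - j] for j in range(1, p + 1)])
    X = np.hstack([np.ones((len(Y), 1)), X])   # intercept + p lags
    B = np.linalg.lstsq(X, Y, rcond=None)[0]   # equation-by-equation OLS
    E = Y - X @ B
    sigma = E.T @ E / len(Y)
    return np.log(np.linalg.det(sigma)) + p * k * k * np.log(len(Y)) / len(Y)

scores = {p: sbc(y, p) for p in range(1, 7)}
print("Selected lag length:", min(scores, key=scores.get))
```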
We propose a general class of models and a unified Bayesian inference methodology for flexibly estimating the density of a response variable conditional on a possibly high-dimensional set of covariates. Our model is a finite mixture of component models with covariate-dependent mixing weights. The component densities can belong to any parametric family, with each model parameter being a deterministic function of covariates through a link function. Our MCMC methodology allows for Bayesian variable selection among the covariates in the mixture components and in the mixing weights. The model's parameterization and variable selection prior are chosen to prevent overfitting. We use simulated and real datasets to illustrate the methodology.
Background: Inference from fMRI data faces the challenge that the hemodynamic system that relates neural activity to the observed BOLD fMRI signal is unknown.
New method: We propose a new Bayesian model for task fMRI data with the following features: (i) joint estimation of brain activity and the underlying hemodynamics, (ii) the hemodynamics is modeled non-parametrically with a Gaussian process (GP) prior guided by physiological information and (iii) the predicted BOLD is not necessarily generated by a linear time-invariant (LTI) system. We place a GP prior directly on the predicted BOLD response, rather than on the hemodynamic response function as in previous literature. This allows us to incorporate physiological information via the GP prior mean in a flexible way, and simultaneously gives us the nonparametric flexibility of the GP (a minimal sketch of such a prior follows this abstract).
Results: Results on simulated data show that the proposed model is able to discriminate between active and non-active voxels even when the GP prior deviates from the true hemodynamics. Our model finds time-varying dynamics when applied to real fMRI data.
Comparison with existing method(s): The proposed model is better at detecting activity in simulated data than standard models, without inflating the false positive rate. When applied to real fMRI data, our GP model in several cases finds brain activity where previously proposed LTI models do not.
Conclusions: We have proposed a new non-linear model for the hemodynamics in task fMRI that is able to detect active voxels and opens up new kinds of questions related to hemodynamics.
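As referenced in the New method paragraph above, here is a minimal sketch of a GP prior placed directly on the predicted BOLD: the prior mean convolves a canonical double-gamma HRF with the stimulus, and a squared-exponential covariance allows deviations from LTI behaviour. The HRF parameters are common defaults, not the paper's values.

```python
# Sketch of a GP prior on the predicted BOLD: prior mean from a canonical
# double-gamma HRF convolved with a block-design stimulus, plus a
# squared-exponential GP that permits non-LTI deviations.
import numpy as np
from scipy.stats import gamma as gamma_dist

TR, n_scans = 2.0, 100
t = np.arange(0, 30, TR)
hrf = gamma_dist.pdf(t, 6) - gamma_dist.pdf(t, 16) / 6.0   # double-gamma HRF

stim = np.zeros(n_scans)
stim[10:15] = stim[40:45] = stim[70:75] = 1.0              # block design
mean_bold = np.convolve(stim, hrf)[:n_scans]               # GP prior mean

# Squared-exponential covariance around the LTI prediction.
times = np.arange(n_scans) * TR
K = 0.05 * np.exp(-0.5 * (times[:, None] - times[None, :]) ** 2 / 8.0 ** 2)
K += 1e-8 * np.eye(n_scans)                                # jitter for PSD

draw = np.random.default_rng(6).multivariate_normal(mean_bold, K)
print("Prior draw range:", draw.min().round(3), draw.max().round(3))
```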