liu.se: Search for publications in DiVA
1 - 50 of 266
  • 1.
    Ahlinder, Jon
    et al.
    Totalförsvarets Forskningsinstitut, FOI, Stockholm, Sweden.
    Nordgaard, Anders
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences. Swedish National Forensic Centre (NFC), Linköping, Sweden.
    Wiklund Lindström, Susanne
    Totalförsvarets Forskningsinstitut, FOI, Stockholm, Sweden.
    Chemometrics comes to court: evidence evaluation of chem–bio threat agent attacks. 2015. In: Journal of Chemometrics, ISSN 0886-9383, E-ISSN 1099-128X, Vol. 29, no. 5, pp. 267-276. Article in journal (Refereed)
    Abstract [en]

    Forensic statistics is a well-established scientific field whose purpose is to statistically analyze evidence in order to support legal decisions. It traditionally relies on methods that assume small numbers of independent variables and multiple samples. Unfortunately, such methods are less applicable when dealing with highly correlated multivariate data sets such as those generated by emerging high throughput analytical technologies. Chemometrics is a field that has a wealth of methods for the analysis of such complex data sets, so it would be desirable to combine the two fields in order to identify best practices for forensic statistics in the future. This paper provides a brief introduction to forensic statistics and describes how chemometrics could be integrated with its established methods to improve the evaluation of evidence in court.

    The paper describes how statistics and chemometrics can be integrated by analyzing a previously known forensic data set composed of bacterial communities from fingerprints. The presented strategy can be applied in cases where chemical or biological threat agents have been illegally disposed of.

  • 2.
    Ahmad, M. Rauf
    et al.
    Linköping University, Department of Mathematics, Mathematical Statistics . Linköping University, The Institute of Technology.
    Ohlson, Martin
    Linköping University, Department of Mathematics, Mathematical Statistics . Linköping University, The Institute of Technology.
    von Rosen, Dietrich
    Linköping University, Department of Mathematics, Mathematical Statistics . Linköping University, The Institute of Technology.
    A U-statistics Based Approach to Mean Testing for High Dimensional Multivariate Data Under Non-normality. 2011. Report (Other academic)
    Abstract [en]

    A test statistic is considered for testing a hypothesis about the mean vector of multivariate data, when the dimension of the vector, p, may exceed the number of vectors, n, and the underlying distribution need not be normal. With n and p large, and under mild assumptions, the statistic is shown to asymptotically follow a normal distribution. A by-product of the paper is the approximate distribution of a quadratic form, based on a reformulation of the well-known Box approximation, in the high-dimensional setup.
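The pairwise idea behind such U-statistics based mean tests can be sketched as follows (a minimal numpy illustration with invented data and dimensions, not the report's exact statistic): removing the diagonal terms of the Gram matrix yields an estimator of the squared mean norm that stays unbiased even when p greatly exceeds n.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 40, 200                        # p >> n: more variables than observations
mu = np.full(p, 0.2)                  # true mean vector, ||mu||^2 = 8
X = rng.standard_normal((n, p)) + mu  # data need not be exactly normal

# U-statistic: average of x_i' x_j over distinct pairs (i != j).
# The diagonal terms x_i' x_i are excluded; they carry the trace of the
# covariance, which would otherwise bias the estimate badly when p >> n.
G = X @ X.T
u_stat = (G.sum() - np.trace(G)) / (n * (n - 1))

true_norm_sq = float(mu @ mu)
```

Under the null hypothesis mu = 0 the statistic concentrates around zero, which is what makes a standardized version of it usable as a test statistic.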

  • 3.
    Ahmad, M. Rauf
    et al.
    Linköping University, Department of Mathematics, Mathematical Statistics . Linköping University, The Institute of Technology.
    Ohlson, Martin
    Linköping University, Department of Mathematics, Mathematical Statistics . Linköping University, The Institute of Technology.
    von Rosen, Dietrich
    Department of Energy and Technology, Swedish University of Agricultural Sciences, SE-750 07 Uppsala, Sweden.
    Some Tests of Covariance Matrices for High Dimensional Multivariate Data. 2011. Report (Other academic)
    Abstract [en]

    Test statistics for sphericity and identity of the covariance matrix are presented, when the data are multivariate normal and the dimension, p, can exceed the sample size, n. Using the asymptotic theory of U-statistics, the test statistics are shown to follow an approximate normal distribution for large p, also when p >> n. The statistics are derived under very general conditions, particularly avoiding any strict assumptions on the traces of the unknown covariance matrix; no relationship between n and p is assumed. The accuracy of the statistics is shown through simulation results, particularly emphasizing the case when p can be much larger than n. The validity of the assumptions commonly made in the high-dimensional setup is also briefly discussed.

  • 4.
    Ahmad, M. Rauf
    et al.
    Swedish University of Agricultural Sciences, Uppsala, Sweden and Department of Statistics, Uppsala University, Sweden.
    von Rosen, Dietrich
    Linköping University, Department of Mathematics, Mathematical Statistics . Linköping University, The Institute of Technology.
    Singull, Martin
    Linköping University, Department of Mathematics, Mathematical Statistics . Linköping University, The Institute of Technology.
    A note on mean testing for high dimensional multivariate data under non-normality. 2013. In: Statistica Neerlandica (Print), ISSN 0039-0402, E-ISSN 1467-9574, Vol. 67, no. 1, pp. 81-99. Article in journal (Refereed)
    Abstract [en]

    A test statistic is considered for testing a hypothesis about the mean vector of multivariate data, when the dimension of the vector, p, may exceed the number of vectors, n, and the underlying distribution need not be normal. With n,p→∞, under mild assumptions, and without assuming any relationship between n and p, the statistic is shown to asymptotically follow a chi-square distribution. A by-product of the paper is the approximate distribution of a quadratic form, based on a reformulation of the well-known Box approximation, in the high-dimensional setup. Using a classical limit theorem, the approximation is further extended to an asymptotic normal limit in the same high-dimensional setup. Simulation results, generated under different parameter settings, show the accuracy of the approximation for moderate n and large p.

  • 5.
    Alnervik, Jonna
    et al.
    Linköping University, Department of Computer and Information Science, Statistics.
    Nord Andersson, Peter
    Linköping University, Department of Computer and Information Science, Statistics.
    En retrospektiv studie av vilka patientgrupper som erhåller insulinpump. 2010. Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
    Abstract [sv]

    Aim

    To investigate differences in access to insulin pumps between different patient groups, and what precedes a switch to an insulin pump.

    Method

    Data from 7,224 individuals with type 1 diabetes at ten care units were analyzed to assess the effects of kidney function, sex, long-term blood glucose, insulin dose, diabetes duration and age. Patient groups were compared cross-sectionally using logistic regression, and Cox regression was used to investigate what precedes a switch to a pump.

    Results

    The logistic regression gives a picture of the present differences between patients who use an insulin pump and those who do not. The Cox regression adds a time perspective and thus answers what precedes a switch to an insulin pump. The two analyses gave similar results for variables that are constant over time. Women use pumps to a greater extent than men, and the proportion of pump users differs between care units. At present, high age lowers the probability of using an insulin pump, which is confirmed by the time-dependent analysis: the probability of switching to a pump is considerably lower at high age. Long-term blood glucose also has a clear effect on the probability of switching to a pump, where a high long-term blood glucose implies a high probability of switching to an insulin pump.

    Conclusions

    There are currently differences in the proportion of insulin pump users between patient groups, and the groups also differ in their propensity to switch from other insulin treatments to an insulin pump. Depending on kidney function, sex, long-term blood glucose, insulin dose, diabetes duration and age, patients thus have different probabilities of switching to an insulin pump.
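The cross-sectional part of such a study can be sketched with a plain Newton-Raphson logistic regression (synthetic data and invented coefficient values; the thesis's time-to-switch analysis would additionally need Cox regression, which is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5000
age = rng.uniform(10, 80, n)
hba1c = rng.normal(60, 10, n)           # long-term blood glucose (mmol/mol)
# Synthetic pump use: more likely at high HbA1c, less likely at high age
logit_true = -1.0 + 0.03 * (hba1c - 60) - 0.04 * (age - 40)
pump = rng.random(n) < 1 / (1 + np.exp(-logit_true))

# Logistic regression fitted by Newton-Raphson (IRLS)
X = np.column_stack([np.ones(n), hba1c - 60, age - 40])
b = np.zeros(3)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ b))
    W = p * (1 - p)                                   # IRLS weights
    b += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (pump - p))

# b[1] > 0: high HbA1c raises the odds of pump use; b[2] < 0: age lowers them
```

The fitted signs mirror the thesis's qualitative findings (high blood glucose raises, high age lowers, the probability of pump use), but the magnitudes here are purely illustrative.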

  • 6.
    Anderskär, Erika
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Thomasson, Frida
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Inkrementell responsanalys av Scandnavian Airlines medlemmar: Vilka kunder ska väljas vid riktad marknadsföring? 2017. Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis
    Abstract [en]

    Scandinavian Airlines has a large database containing their Eurobonus members. In order to analyze which customers they should target with direct marketing, such as emails, uplift models have been used. With a binary response variable that indicates whether the customer has bought or not, and a binary dummy variable that indicates whether the customer has received the campaign or not, conclusions can be drawn about which customers are persuadable, i.e. the customers who buy when they receive a campaign and do not buy otherwise. Analyses have been done with one campaign for Sweden and Scandinavia. The methods used are logistic regression with Lasso and logistic regression with Penalized Net Information Value (PNIV). The best method for predicting purchases is Lasso regression when compared with a confusion matrix. The variable that best describes persuadable customers in logistic regression with PNIV is Flown (customers that have flown with SAS within the last six months). In Lasso regression, the variable that describes a persuadable customer in Sweden is membership level 1 (the first level of membership), and in Scandinavia customers that receive campaigns with delivery code 13, which is a form of dispatch, are persuadable.
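The core uplift idea — estimating the *difference* treatment makes, rather than purchase probability itself — can be sketched on synthetic data (invented segment and effect sizes; the thesis uses Lasso and PNIV regressions rather than this simple two-model comparison):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
flown = rng.integers(0, 2, n)      # assumed covariate: flew recently (0/1)
treated = rng.integers(0, 2, n)    # received the campaign (randomized)
# Synthetic response: only "flown" customers are persuadable (+10 pp)
p_buy = 0.05 + 0.10 * treated * flown
bought = rng.random(n) < p_buy

def uplift(segment):
    """Two-model style uplift: P(buy | treated) - P(buy | control)."""
    t = bought[segment & (treated == 1)].mean()
    c = bought[segment & (treated == 0)].mean()
    return t - c

uplift_flown = uplift(flown == 1)        # persuadable segment
uplift_not_flown = uplift(flown == 0)    # campaign has no effect here
```

A model ranking customers only by purchase probability would target loyal buyers who purchase anyway; ranking by estimated uplift targets the segment where the campaign actually changes behavior.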

  • 7.
    Andersson Hagiwara, Magnus
    et al.
    University of Borås, Sweden.
    Andersson Gare, Boel
    Jönköping University, Sweden.
    Elg, Mattias
    Linköping University, Department of Management and Engineering, Logistics & Quality Management. Linköping University, Faculty of Science & Engineering. Linköping University, HELIX Vinn Excellence Centre.
    Interrupted Time Series Versus Statistical Process Control in Quality Improvement Projects. 2016. In: Journal of Nursing Care Quality, ISSN 1057-3631, E-ISSN 1550-5065, Vol. 31, no. 1, pp. E1-E8. Article in journal (Refereed)
    Abstract [en]

    To measure the effect of quality improvement interventions, it is appropriate to use analysis methods that measure data over time. Examples of such methods include statistical process control analysis and interrupted time series with segmented regression analysis. This article compares the use of statistical process control analysis and interrupted time series with segmented regression analysis for evaluating the longitudinal effects of quality improvement interventions, using an example study on an evaluation of a computerized decision support system.
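The two analysis styles being compared can be sketched side by side on synthetic weekly data (invented means, spread and intervention week; not the article's case study):

```python
import numpy as np

rng = np.random.default_rng(7)
# Synthetic weekly quality indicator: intervention at week 30 shifts the mean
before = rng.normal(50, 3, 30)
after = rng.normal(40, 3, 30)
y = np.concatenate([before, after])

# Statistical process control (Shewhart-style): 3-sigma control limits
# computed from the baseline period; points outside them signal a change
center, sd = before.mean(), before.std(ddof=1)
lcl, ucl = center - 3 * sd, center + 3 * sd
signals = np.where((y < lcl) | (y > ucl))[0]

# Interrupted time series: segmented regression with a level-change term
t = np.arange(60.0)
post = (t >= 30).astype(float)
X = np.column_stack([np.ones(60), t, post])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
level_change = coef[2]   # estimated size of the intervention effect
```

SPC flags *when* the process leaves its baseline behavior; segmented regression additionally quantifies the size of the level change with a confidence interval, which is the trade-off the article discusses.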

  • 8.
    Andersson Naesseth, Christian
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Nowcasting using Microblog Data. 2012. Independent thesis Basic level (degree of Bachelor), 10.5 credits / 16 HE credits. Student thesis
    Abstract [en]

    The explosion of information and user-generated content made publicly available through the internet has made it possible to develop new ways of inferring interesting phenomena automatically. Some interesting examples are the spread of a contagious disease, earthquake occurrences, rainfall rates, box office results, stock market fluctuations and many more. To this end, a mathematical framework based on theory from machine learning has been employed to show how frequencies of relevant keywords in user-generated content can estimate daily rainfall rates in different regions of Sweden using microblog data.

    Microblog data are collected using a microblog crawler. Properties of the data and data collection methods are both discussed extensively. In this thesis three different model types are studied for regression, linear and nonlinear parametric models as well as a nonparametric Gaussian process model. Using cross-validation and optimization the relevant parameters of each model are estimated and the model is evaluated on independent test data. All three models show promising results for nowcasting rainfall rates.
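The linear parametric variant of such a nowcasting model can be sketched as follows (synthetic keyword counts and an invented weight vector; the thesis also evaluates nonlinear and Gaussian process models, which are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(2)
days, n_keywords = 365, 5
# Synthetic daily keyword frequencies (e.g. counts of rain-related words)
K = rng.poisson(5.0, size=(days, n_keywords)).astype(float)
true_w = np.array([0.8, 0.5, 0.0, 0.0, 0.2])      # two keywords are irrelevant
rain = K @ true_w + rng.normal(0, 0.5, days)      # daily rainfall proxy

# Fit a linear model by least squares on a training split and evaluate
# out-of-sample, mimicking the thesis's train/test protocol
train, test = slice(0, 300), slice(300, None)
w, *_ = np.linalg.lstsq(K[train], rain[train], rcond=None)
rmse = np.sqrt(np.mean((K[test] @ w - rain[test]) ** 2))
```

Held-out evaluation is the essential step: in-sample fit alone would reward models that merely memorize keyword noise.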

  • 9.
    Andersson Naesseth, Christian
    et al.
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Lindsten, Fredrik
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Schön, Thomas
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Capacity estimation of two-dimensional channels using Sequential Monte Carlo. 2014. In: 2014 IEEE Information Theory Workshop, 2014, pp. 431-435. Conference paper (Refereed)
    Abstract [en]

    We derive a new Sequential-Monte-Carlo-based algorithm to estimate the capacity of two-dimensional channel models. The focus is on computing the noiseless capacity of the 2-D (1, ∞) run-length limited constrained channel, but the underlying idea is generally applicable. The proposed algorithm is profiled against a state-of-the-art method, yielding more than an order of magnitude improvement in estimation accuracy for a given computation time.
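The SMC counting idea can be illustrated on the much easier 1-D analogue of the paper's constraint (a sketch with invented particle counts; the paper's contribution is the 2-D case, which this does not attempt): count binary strings with no two adjacent ones by propagating particles and averaging incremental weights.

```python
import numpy as np

rng = np.random.default_rng(3)
m, N = 20, 5000   # string length and number of particles

# SMC estimate of the number of binary strings of length m with no "11"
# (the 1-D (1, inf) run-length limited constraint). The incremental weight
# is the number of symbols allowed after the current one; multinomial
# resampling keeps the particle system balanced.
last = np.zeros(N, dtype=int)    # last symbol of each particle (start state 0)
log_Z = 0.0
for _ in range(m):
    w = np.where(last == 1, 1.0, 2.0)           # after a 1, only 0 is allowed
    log_Z += np.log(w.mean())                   # running estimate of the count
    idx = rng.choice(N, size=N, p=w / w.sum())  # multinomial resampling
    last = last[idx]
    # extend each particle uniformly over its allowed next symbols
    last = np.where(last == 1, 0, (rng.random(N) < 0.5).astype(int))

count_est = np.exp(log_Z)   # exact count is Fibonacci(m + 2) = 17711
```

The per-symbol rate log2(count)/m approaches the noiseless capacity of the 1-D constraint, log2 of the golden ratio ≈ 0.694; the 2-D version in the paper replaces symbols with whole rows and needs a genuinely sequential decomposition.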

  • 10.
    Andersson Naesseth, Christian
    et al.
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, Faculty of Science & Engineering.
    Lindsten, Fredrik
    The University of Cambridge, Cambridge, United Kingdom.
    Schön, Thomas
    Uppsala University, Uppsala, Sweden.
    Nested Sequential Monte Carlo Methods. 2015. In: Proceedings of The 32nd International Conference on Machine Learning / [ed] Francis Bach, David Blei, Journal of Machine Learning Research (Online), 2015, Vol. 37, pp. 1292-1301. Conference paper (Refereed)
    Abstract [en]

    We propose nested sequential Monte Carlo (NSMC), a methodology to sample from sequences of probability distributions, even where the random variables are high-dimensional. NSMC generalises the SMC framework by requiring only approximate, properly weighted, samples from the SMC proposal distribution, while still resulting in a correct SMC algorithm. Furthermore, NSMC can in itself be used to produce such properly weighted samples. Consequently, one NSMC sampler can be used to construct an efficient high-dimensional proposal distribution for another NSMC sampler, and this nesting of the algorithm can be done to an arbitrary degree. This allows us to consider complex and high-dimensional models using SMC. We show results that motivate the efficacy of our approach on several filtering problems with dimensions in the order of 100 to 1 000.

  • 11.
    Andersson Naesseth, Christian
    et al.
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Lindsten, Fredrik
    University of Cambridge, Cambridge, UK.
    Schön, Thomas
    Uppsala University, Uppsala, Sweden.
    Sequential Monte Carlo for Graphical Models. 2014. In: Advances in Neural Information Processing Systems, 2014, pp. 1862-1870. Conference paper (Refereed)
    Abstract [en]

    We propose a new framework for how to use sequential Monte Carlo (SMC) algorithms for inference in probabilistic graphical models (PGM). Via a sequential decomposition of the PGM we find a sequence of auxiliary distributions defined on a monotonically increasing sequence of probability spaces. By targeting these auxiliary distributions using SMC we are able to approximate the full joint distribution defined by the PGM. One of the key merits of the SMC sampler is that it provides an unbiased estimate of the partition function of the model. We also show how it can be used within a particle Markov chain Monte Carlo framework in order to construct high-dimensional block-sampling algorithms for general PGMs.

  • 12.
    Andersson, Niklas
    et al.
    Linköping University, Department of Computer and Information Science, Statistics.
    Hansson, Josef
    Linköping University, Department of Computer and Information Science, Statistics.
    Metodik för detektering av vägåtgärder via tillståndsdata. 2010. Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    The Swedish Transport Administration manages a database containing information on the road condition of all paved, state-operated Swedish roads. The purpose of the database is to support the Pavement Management System (PMS). The PMS is used to identify sections of road in need of treatment, to allocate resources and to get a general picture of the state of the road network. All major treatments should be reported, which has not always been done.

    The road condition is measured using a number of indicators of, e.g., the road's unevenness. Rut depth is an indicator of the road's transverse unevenness. When a treatment has been carried out the condition changes drastically, which is reflected in these indicators.

    The purpose of this master's thesis is to use existing indicators to find points in time when a road has been treated.

    We have created a SAS program based on simple linear regression to analyze rut depth changes over time. The program finds level changes in the rut depth trend; a drastic negative change means that a treatment has been carried out.

    The proportion of roads whose recorded date for the latest treatment was earlier than the program's latest detected date was 37 percent. It turned out that the proportion of possible treatments found by the program, relative to actually reported treatments, differs between regions. The regions North and Central have the highest proportion of discrepancies. There are also differences between road groups with different amounts of traffic. The differences between the regions do not depend entirely on the proportion of heavily trafficked roads being greater in some regions.
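The detection principle — a sharp negative level change in an otherwise slowly growing rut-depth trend — can be sketched in a few lines (synthetic measurements with an invented resurfacing year; the thesis's SAS program fits regressions rather than simple differences):

```python
import numpy as np

rng = np.random.default_rng(4)
years = np.arange(2000, 2010, dtype=float)
# Synthetic rut depth (mm): grows ~1 mm/year; resurfacing in 2005 resets it
rut = 2.0 + 1.0 * (years - 2000) + rng.normal(0, 0.3, years.size)
rut[years >= 2005] = 2.0 + 1.0 * (years[years >= 2005] - 2005) + rng.normal(0, 0.3, 5)

# Flag a treatment wherever the rut depth drops sharply between consecutive
# measurements; normal wear only ever increases the rut depth
drops = np.diff(rut)
treatment_years = years[1:][drops < -2.0]
```

The threshold (-2 mm here) trades false alarms from measurement noise against missed minor treatments, which is exactly the calibration problem the thesis evaluates against the reported-treatment records.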

  • 13.
    Ansell, Ricky
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Biology. Linköping University, Faculty of Science & Engineering. Polismyndigheten - Nationellt Forensiskt Centrum.
    Nordgaard, Anders
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Arts and Sciences. Polismyndigheten - Nationellt Forensiskt Centrum.
    Hedell, Ronny
    Polismyndigheten - Nationellt Forensiskt Centrum.
    Interpretation of DNA Evidence: Implications of Thresholds Used in the Forensic Laboratory. 2014. Conference paper (Other academic)
    Abstract [en]

    Evaluation of forensic evidence is a process lined with decisions and balancing, not infrequently with a substantial degree of subjectivity. Already at the crime scene, many decisions have to be made about search strategies, the amount of evidence and traces recovered, later prioritised and sent on to the forensic laboratory, etc. Within the laboratory there must be several criteria (often in terms of numbers) on how much and which parts of the material should be analysed. In addition, there is often a restricted timeframe for delivery of a statement to the commissioner, which in reality might influence the work done. The path of DNA evidence from the recovery of a trace at the crime scene to the interpretation and evaluation made in court involves several decisions based on cut-offs of different kinds. These include quality assurance thresholds such as limits of detection and quantitation, but also less strictly defined thresholds such as upper limits on the prevalence of alleles not observed in DNA databases. In a verbal scale of conclusions there are lower limits on likelihood ratios for DNA evidence above which the evidence can be said to strongly support, very strongly support, etc., a proposition about the source of the evidence. Such thresholds may be arbitrarily chosen or based on logical reasoning with probabilities. However, likelihood ratios for DNA evidence depend strongly on the population of potential donors, and this may not be understood among the end-users of such a verbal scale. Even apparently strong DNA evidence against a suspect may be reported on either side of a threshold in the scale depending on whether a close relative is part of the donor population or not. In this presentation we review the use of thresholds and cut-offs in DNA analysis and interpretation, and investigate the sensitivity of the final evaluation to how such rules are defined. In particular we show the effects of cut-offs when multiple propositions about alternative sources of a trace cannot be avoided, e.g. when close relatives of the suspect have a high propensity to have left the trace. Moreover, we discuss the possibility of including costs (in terms of time or money) in a decision-theoretic approach in which expected values of information could be analysed.
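The sensitivity of a verbal scale to the donor population can be made concrete with a toy likelihood-ratio calculation (all numbers and threshold values are illustrative conventions, not taken from the presentation):

```python
# LR = P(evidence | suspect is source) / P(evidence | alternative source).
rmp_unrelated = 1e-9   # assumed random match probability, unrelated donor
rmp_sibling = 1e-2     # assumed match probability for a full sibling

lr_unrelated = 1.0 / rmp_unrelated   # alternative donor is unrelated
lr_sibling = 1.0 / rmp_sibling       # alternative donor is a sibling

def verbal(lr, thresholds=(1e1, 1e2, 1e3, 1e4)):
    """Map an LR to a verbal conclusion; the cut-offs are conventions."""
    labels = ["limited", "moderate", "strong", "very strong", "extremely strong"]
    return labels[sum(lr >= t for t in thresholds)]
```

The same DNA profile drops several scale steps the moment a close relative enters the pool of potential donors, which is the threshold-sensitivity phenomenon the presentation examines.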

  • 14.
    Odencrants, Arvid
    et al.
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, The Institute of Technology.
    Dahl, Dennis
    Linköping University, Department of Computer and Information Science. Linköping University, The Institute of Technology.
    Utvärdering av Transportstyrelsens flygtrafiksmodeller. 2014. Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis
    Abstract [en]

    The Swedish Transport Agency has for a long time collected monthly data on variables that are used to make both short and long projections, and has used SAS to produce statistical models for air transport. The model with the largest coefficient of determination is the one that has been used for a long time. The Swedish Transport Agency felt it was time for an evaluation of its models and of how projections are estimated, and also wanted to explore the possibility of using completely new models for forecasting air travel. This Bachelor's thesis examines how the Holt-Winters method compares with SARIMA; error measures such as RMSE, MAPE, R2, AIC and BIC are compared between the methods.

    The results show that there may be a risk that the Holt-Winters models adapt a bit too well to a few of the variables to which the method has been fitted, but overall the Holt-Winters method generates the better forecasts.
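The additive Holt-Winters recursions the thesis evaluates can be sketched as follows (a textbook implementation run on a synthetic monthly series; the Agency's actual SAS models and parameter choices are not reproduced):

```python
import numpy as np

def holt_winters_additive(y, season, alpha=0.3, beta=0.1, gamma=0.2, horizon=12):
    """Additive Holt-Winters smoothing of level, trend and seasonality."""
    level = y[:season].mean()
    trend = (y[season:2 * season].mean() - y[:season].mean()) / season
    # initial seasonal indices: first season with the linear trend removed
    seas = list(y[:season] - level - trend * (np.arange(season) - (season - 1) / 2))
    fitted = []
    for t, obs in enumerate(y):
        s = seas[t % season]
        fitted.append(level + trend + s)                 # one-step-ahead forecast
        new_level = alpha * (obs - s) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        seas[t % season] = gamma * (obs - new_level) + (1 - gamma) * s
        level = new_level
    forecast = [level + (h + 1) * trend + seas[(len(y) + h) % season]
                for h in range(horizon)]
    return np.array(fitted), np.array(forecast)

# Synthetic monthly passenger series: linear trend plus annual seasonality
t = np.arange(120)
y = 100 + 0.5 * t + 10 * np.sin(2 * np.pi * t / 12)
fitted, forecast = holt_winters_additive(y, season=12)
rmse = np.sqrt(np.mean((fitted[24:] - y[24:]) ** 2))   # skip the burn-in
```

Comparing this in-sample RMSE against a SARIMA fit on held-out months is the kind of head-to-head evaluation the thesis performs; a model that only wins in-sample is exactly the overfitting risk noted above.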

  • 15.
    Barkhagen, Mathias
    Linköping University, Department of Management and Engineering, Production Economics. Linköping University, The Institute of Technology.
    Risk-Neutral and Physical Estimation of Equity Market Volatility. 2013. Licentiate thesis, comprehensive summary (Other academic)
    Abstract [en]

    The overall purpose of the PhD project is to develop a framework for making optimal decisions on the equity derivatives markets. Making optimal decisions refers, e.g., to how to optimally hedge an options portfolio or how to make optimal investments on the equity derivatives markets. The framework for making optimal decisions will be based on stochastic programming (SP) models, which means that it is necessary to generate high-quality scenarios of market prices at some future date as input to the models. This leads to a situation where the traditional methods described in the literature for modeling market prices do not provide scenarios of sufficiently high quality as input to the SP model. Thus, the main focus of this thesis is to develop methods that improve the estimation of option implied surfaces from a cross-section of observed option prices compared to the traditional methods described in the literature. The estimation is complicated by the fact that observed option prices contain a lot of noise and possibly also arbitrage. This means that in order to be able to estimate option implied surfaces which are free of arbitrage and of high quality, the noise in the input data has to be adequately handled by the estimation method.

    The first two papers of this thesis develop a non-parametric, optimization based framework for the estimation of high-quality arbitrage-free option implied surfaces. The first paper covers the estimation of the risk-neutral density (RND) surface and the second paper the local volatility surface. Both methods provide smooth and realistic surfaces for market data. Estimation of the RND is a convex optimization problem, but the result is sensitive to the parameter choice. When the local volatility is estimated the parameter choice is much easier, but the optimization problem is non-convex, although the algorithm does not seem to get stuck in local optima. The SP models used to make optimal decisions on the equity derivatives markets also need generated scenarios for the underlying stock prices or index levels as input. The third paper of this thesis deals with the estimation and evaluation of existing equity market models. It gives preliminary results which show that, out of the compared models, a GARCH(1,1) model with Poisson jumps provides a better fit than more complex models with stochastic volatility for the Swedish OMXS30 index.
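In the noiseless case, the object the first paper estimates is given by the classical Breeden-Litzenberger relation: the RND is the discounted second strike-derivative of the call price curve. A numpy sketch with assumed Black-Scholes inputs (the thesis's optimization-based handling of noisy, possibly arbitrageable quotes is the hard part and is not reproduced here):

```python
import numpy as np
from math import erf, log, sqrt, exp

def norm_cdf(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

def bs_call(S, K, T, r, sigma):
    """Black-Scholes call price, used here to generate clean option quotes."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

# Breeden-Litzenberger: q(K) = e^{rT} * d^2 C / dK^2
S, T, r, sigma = 100.0, 0.5, 0.01, 0.2        # illustrative market parameters
K = np.linspace(60, 160, 501)
C = np.array([bs_call(S, k, T, r, sigma) for k in K])
dK = K[1] - K[0]
rnd = np.exp(r * T) * np.diff(C, 2) / dK ** 2  # density on the grid K[1:-1]

mass = rnd.sum() * dK                 # should integrate to ~1
mean = (rnd * K[1:-1]).sum() * dK     # risk-neutral mean, ~ S * e^{rT}
```

With real quotes, the same second difference amplifies noise violently and can go negative (an arbitrage), which is why the thesis replaces naive differencing with a constrained optimization.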

    List of papers
    1. Non-parametric estimation of the option implied risk-neutral density surface
    (English) Manuscript (preprint) (Other academic)
    Abstract [en]

    Accurate pricing of exotic or illiquid derivatives which is consistent with noisy market prices presents a major challenge. The pricing accuracy will crucially depend on using arbitrage free inputs to the pricing engine. This paper develops a general optimization based framework for estimation of the option implied risk-neutral density (RND), while satisfying no-arbitrage constraints. Our developed framework is a generalization of the RNDs implied by existing parametric models such as the Heston model. Thus, the method considers all types of realistic surfaces and is hence not constrained to a certain function class. When solving the problem the RND is discretized making it possible to use general purpose optimization algorithms. The approach leads to an optimization model where it is possible to formulate the constraints as linear constraints making the resulting optimization problem convex. We show that our method produces smooth local volatility surfaces that can be used for pricing and hedging of exotic derivatives. By perturbing input data with random errors we demonstrate that our method gives better results than the Heston model in terms of yielding stable RNDs.

    Keyword
    Risk-neutral density surface, Non-parametric estimation, Optimization, No-arbitrage constraints, Implied volatility surface, Local volatility
    National Category
    Economics and Business; Probability Theory and Statistics
    Identifiers
    urn:nbn:se:liu:diva-94357 (URN)
    Available from: 2013-06-25 Created: 2013-06-25 Last updated: 2013-06-26. Bibliographically approved
    2. Non-parametric estimation of local variance surfaces
    (English) Manuscript (preprint) (Other academic)
    Abstract [en]

    In this paper we develop a general optimization based framework for estimation of the option implied local variance surface. Given a specific level of consistency with observed market prices there exist an infinite number of possible surfaces. Instead of assuming shape constraints for the surface, as in many traditional methods, we seek the solution in the subset of realistic surfaces. We select local volatilities as variables in the optimization problem since this makes it easy to ensure absence of arbitrage, and realistic local volatilities imply realistic risk-neutral density (RND), implied volatility and price surfaces. The objective function combines a measure of consistency with market prices and a weighted integral of the squared second derivatives of local volatility in the strike and time-to-maturity directions. Derivatives prices in the optimization model are calculated efficiently with a finite difference scheme on a non-uniform grid. The framework has previously been successfully applied to the estimation of RND surfaces. Compared to modeling the RND, it is much easier to choose the parameters in the model for local volatility. Modeling the RND produces a convex optimization problem, which is not the case when modeling local volatility, but empirical tests indicate that the solution does not get stuck in local optima. We show that our method produces local volatility surfaces of very high quality which are consistent with observed option quotes. Thus, unlike many methods described in the literature, our method does not produce a local volatility surface with irregular shape and many spikes, or a non-smooth and multimodal RND, for input data with a lot of noise.

    Keyword
    Local volatility surface; Non-parametric estimation; Optimization; No-arbitrage conditions
    National Category
    Economics and Business; Probability Theory and Statistics
    Identifiers
    urn:nbn:se:liu:diva-94358 (URN)
    Available from: 2013-06-25 Created: 2013-06-25 Last updated: 2013-06-26. Bibliographically approved
    3. Statistical tests for selected equity market models
    (English) Manuscript (preprint) (Other academic)
    Abstract [en]

    In this paper we evaluate which of four candidate equity market models provides the best fit to observed closing data for the OMXS30 index from 30 September 1986 to 6 May 2013. The candidate models are two GARCH type models and two stochastic volatility models. The stochastic volatility models are estimated with the help of Markov Chain Monte Carlo methods. We provide the full derivations of the posterior distributions for the two stochastic volatility models, which to our knowledge have not been provided in the literature before. With the help of statistical tests we conclude that, out of the four candidate models, a GARCH model which includes jumps in the index level provides the best fit to the observed OMXS30 closing data.

    Keyword
    GARCH models, stochastic volatility models, Markov Chain Monte Carlo methods, statistical tests
    National Category
    Economics and Business Probability Theory and Statistics
    Identifiers
    urn:nbn:se:liu:diva-94359 (URN)
    Available from: 2013-06-25 Created: 2013-06-25 Last updated: 2013-06-26Bibliographically approved
  • 16.
    Barkhagen, Mathias
    Linköping University, Department of Management and Engineering, Production Economics. Linköping University, The Institute of Technology.
    Statistical tests for selected equity market modelsManuscript (preprint) (Other academic)
    Abstract [en]

    In this paper we evaluate which of four candidate equity market models provides the best fit to observed closing data for the OMXS30 index from 30 September 1986 to 6 May 2013. The candidate models are two GARCH-type models and two stochastic volatility models. The stochastic volatility models are estimated with the help of Markov Chain Monte Carlo methods. We provide the full derivations of the posterior distributions for the two stochastic volatility models, which to our knowledge have not been provided in the literature before. With the help of statistical tests we conclude that, out of the four candidate models, a GARCH model which includes jumps in the index level provides the best fit to the observed OMXS30 closing data.
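
As a side note for readers unfamiliar with the model class compared above: a GARCH-type fit rests on a recursive conditional-variance filter. A minimal sketch of a plain GARCH(1,1) recursion, with purely illustrative parameter values (not the paper's estimates, and without the jump component the best-fitting model includes):

```python
import numpy as np

def garch11_variances(returns, omega, alpha, beta):
    """Filter conditional variances h_t of a GARCH(1,1) model:
    h_t = omega + alpha * r_{t-1}^2 + beta * h_{t-1}."""
    h = np.empty(len(returns))
    h[0] = np.var(returns)  # common initialization: unconditional sample variance
    for t in range(1, len(returns)):
        h[t] = omega + alpha * returns[t - 1] ** 2 + beta * h[t - 1]
    return h

# Illustrative use on simulated returns
rng = np.random.default_rng(0)
r = 0.01 * rng.standard_normal(500)
h = garch11_variances(r, omega=1e-6, alpha=0.08, beta=0.90)
```

The filtered variances would then enter a (quasi-)likelihood that is maximized over (omega, alpha, beta).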

  • 17.
    Barkhagen, Mathias
    et al.
    Linköping University, Department of Management and Engineering, Production Economics. Linköping University, Faculty of Science & Engineering.
    Blomvall, Jörgen
    Linköping University, Department of Management and Engineering, Production Economics. Linköping University, Faculty of Science & Engineering.
    Modeling and evaluation of the option book hedging problem using stochastic programming2016In: Quantitative finance (Print), ISSN 1469-7688, E-ISSN 1469-7696, Vol. 16, no 2, 259-273 p.Article in journal (Refereed)
    Abstract [en]

    Hedging of an option book in an incomplete market with transaction costs is an important problem in finance that many banks have to solve on a daily basis. In this paper, we develop a stochastic programming (SP) model for the hedging problem in a realistic setting, where all transactions take place at observed bid and ask prices. The SP model relies on realistic modeling of the important risk factors for the application: the price of the underlying security and the volatility surface. The volatility surface is unobservable and must be estimated from a cross-section of observed option quotes that contain noise and possibly arbitrage. In order to produce arbitrage-free volatility surfaces of high quality as input to the SP model, a novel non-parametric estimation method is used. The dimension of the volatility surface is infinite, and in order to be able to solve the problem numerically, we use discretization and principal component analysis to reduce the dimension of the problem. Testing the model out-of-sample for options on the Swedish OMXS30 index, we show that the SP model is able to produce a hedge that has both lower realized risk and lower cost compared with dynamic delta and delta-vega hedging strategies.
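
The dimension-reduction step mentioned above (discretization plus principal component analysis) can be sketched generically; the data below are synthetic stand-ins for discretized volatility surfaces, not the paper's inputs:

```python
import numpy as np

def pca_reduce(X, k):
    """Project the rows of X onto the first k principal components.
    Returns the low-dimensional scores and the component loadings."""
    Xc = X - X.mean(axis=0)                      # center each column
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:k].T                       # coordinates in the reduced space
    return scores, Vt[:k]

rng = np.random.default_rng(1)
# 200 synthetic "surface" snapshots, each flattened to 50 grid values,
# generated from 5 latent factors so a few components capture most variance
X = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 50))
scores, components = pca_reduce(X, k=3)
```

In an application like the one above, the retained scores, rather than the full grid, would drive scenario generation for the volatility surface.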

  • 18.
    Barkhagen, Mathias
    et al.
    Linköping University, Department of Management and Engineering, Production Economics. Linköping University, The Institute of Technology.
    Blomvall, Jörgen
    Linköping University, Department of Management and Engineering, Production Economics. Linköping University, The Institute of Technology.
    Non-parametric estimation of local variance surfacesManuscript (preprint) (Other academic)
    Abstract [en]

    In this paper we develop a general optimization-based framework for estimation of the option-implied local variance surface. Given a specific level of consistency with observed market prices, there exists an infinite number of possible surfaces. Instead of assuming shape constraints for the surface, as in many traditional methods, we seek the solution in the subset of realistic surfaces. We select local volatilities as variables in the optimization problem since this makes it easy to ensure absence of arbitrage, and realistic local volatilities imply realistic risk-neutral density (RND), implied volatility and price surfaces. The objective function combines a measure of consistency with market prices with a weighted integral of the squared second derivatives of local volatility in the strike and time-to-maturity directions. Derivative prices in the optimization model are calculated efficiently with a finite difference scheme on a non-uniform grid. The framework has previously been successfully applied to the estimation of RND surfaces. Compared to modeling the RND, it is much easier to choose the model parameters when modeling local volatility. Modeling the RND produces a convex optimization problem, which is not the case when modeling local volatility, but empirical tests indicate that the solution does not get stuck in local optima. We show that our method produces local volatility surfaces of very high quality that are consistent with observed option quotes. Thus, unlike many methods described in the literature, our method does not produce a local volatility surface with an irregular shape and many spikes, or a non-smooth and multimodal RND, for input data with a lot of noise.
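
The objective described in the abstract, consistency with market prices plus a weighted integral of squared second derivatives, has a straightforward discrete analogue. A schematic sketch, where the pricing map and all weights are placeholders (the paper itself prices with a finite difference scheme on a non-uniform grid):

```python
import numpy as np

def objective(local_vol, model_prices, market_prices, w_price, w_k, w_t, dk, dt):
    """Price misfit plus weighted squared second differences of the
    local-volatility grid in the strike (axis=0) and maturity (axis=1)
    directions -- a discrete stand-in for the integral penalty."""
    misfit = w_price * np.sum((model_prices - market_prices) ** 2)
    d2_k = np.diff(local_vol, n=2, axis=0) / dk ** 2   # second difference in strike
    d2_t = np.diff(local_vol, n=2, axis=1) / dt ** 2   # second difference in maturity
    roughness = (w_k * np.sum(d2_k ** 2) + w_t * np.sum(d2_t ** 2)) * dk * dt
    return misfit + roughness
```

A perfectly flat surface that reprices the market exactly scores zero; spikes in the surface are penalized through the second differences.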

  • 19.
    Barkhagen, Mathias
    et al.
    Linköping University, Department of Management and Engineering, Production Economics. Linköping University, The Institute of Technology.
    Blomvall, Jörgen
    Linköping University, Department of Management and Engineering, Production Economics. Linköping University, The Institute of Technology.
    Non-parametric estimation of the option implied risk-neutral density surfaceManuscript (preprint) (Other academic)
    Abstract [en]

    Accurate pricing of exotic or illiquid derivatives that is consistent with noisy market prices presents a major challenge. The pricing accuracy crucially depends on using arbitrage-free inputs to the pricing engine. This paper develops a general optimization-based framework for estimation of the option-implied risk-neutral density (RND) while satisfying no-arbitrage constraints. Our framework is a generalization of the RNDs implied by existing parametric models such as the Heston model. Thus, the method considers all types of realistic surfaces and is hence not constrained to a certain function class. When solving the problem, the RND is discretized, making it possible to use general-purpose optimization algorithms. The approach leads to an optimization model where it is possible to formulate the constraints as linear constraints, making the resulting optimization problem convex. We show that our method produces smooth local volatility surfaces that can be used for pricing and hedging of exotic derivatives. By perturbing input data with random errors, we demonstrate that our method gives better results than the Heston model in terms of yielding stable RNDs.
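
Discretizing the RND is what makes the constraints linear: option prices become linear functions of the density values, and nonnegativity and unit mass are linear constraints as well. A toy sketch of pricing a call from a discretized density (the grid and the density itself are illustrative, not the paper's estimates):

```python
import numpy as np

def call_price_from_rnd(s_grid, q, strike, discount):
    """European call price as the discounted expectation of the payoff
    under a discretized risk-neutral density q on a uniform grid s_grid.
    Note the price is linear in the density values q."""
    ds = s_grid[1] - s_grid[0]
    payoff = np.maximum(s_grid - strike, 0.0)
    return discount * np.sum(payoff * q) * ds

s = np.linspace(1.0, 200.0, 2000)
# toy lognormal-shaped density, normalized so it integrates to one
# (unit mass is one of the linear no-arbitrage constraints)
q = np.exp(-0.5 * ((np.log(s) - np.log(100.0)) / 0.2) ** 2) / s
q /= np.sum(q) * (s[1] - s[0])
price = call_price_from_rnd(s, q, strike=100.0, discount=np.exp(-0.01))
```

Monotonicity of the resulting call prices in the strike follows automatically from q being nonnegative.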

  • 20.
    Bartoszek, Krzysztof
    Department of Mathematics, Uppsala University, Uppsala, Sweden.
    Phylogenetic effective sample size2016In: Journal of Theoretical Biology, ISSN 0022-5193, E-ISSN 1095-8541, Vol. 407, 371-386 p.Article in journal (Refereed)
    Abstract [en]

    In this paper I address the question of how large a phylogenetic sample is. I propose a definition of a phylogenetic effective sample size for Brownian motion and Ornstein-Uhlenbeck processes: the regression effective sample size. I discuss how mutual information can be used to define an effective sample size in the non-normal process case and compare these two definitions to an existing concept of effective sample size (the mean effective sample size). Through a simulation study I find that the AICc is robust if one corrects for the number of species or effective number of species. Lastly I discuss how the concept of the phylogenetic effective sample size can be useful for biodiversity quantification, identification of interesting clades and deciding on the importance of phylogenetic correlations.
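
One of the notions compared above, the mean effective sample size for a normally distributed trait, can be computed directly from the between-species correlation matrix. A hedged sketch, using the common formulation 1ᵀR⁻¹1 (which may differ in detail from the paper's definition):

```python
import numpy as np

def mean_ess(R):
    """Mean effective sample size: for unit-variance observations with
    correlation matrix R, the GLS estimate of the common mean has
    variance 1 / (1' R^{-1} 1), so 1' R^{-1} 1 plays the role of the
    number of independent observations."""
    ones = np.ones(R.shape[0])
    return float(ones @ np.linalg.solve(R, ones))

ess_independent = mean_ess(np.eye(5))   # uncorrelated tips: ESS equals n
R = np.full((5, 5), 0.9)
np.fill_diagonal(R, 1.0)                # strongly correlated tips: ESS shrinks
ess_correlated = mean_ess(R)
```

For a phylogeny, R would be the trait correlation matrix implied by shared branch lengths under the assumed (e.g. Brownian motion) process.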

  • 21.
    Bartoszek, Krzysztof
    Mathematical Sciences, Chalmers University of Technology and the University of Gothenburg, Gothenburg, Sweden.
    Quantifying the effects of anagenetic and cladogenetic evolution2014In: Mathematical Biosciences, ISSN 0025-5564, E-ISSN 1879-3134, Vol. 254, 42-57 p.Article in journal (Refereed)
    Abstract [en]

    An ongoing debate in evolutionary biology is whether phenotypic change occurs predominantly around the time of speciation or whether it instead accumulates gradually over time. In this work I propose a general framework incorporating both types of change, quantify the effects of speciational change via the correlation between species and attribute the proportion of change to each type. I discuss results of parameter estimation of Hominoid body size in this light. I derive mathematical formulae related to this problem, the probability generating functions of the number of speciation events along a randomly drawn lineage and from the most recent common ancestor of two randomly chosen tip species for a conditioned Yule tree. Additionally I obtain in closed form the variance of the distance from the root to the most recent common ancestor of two randomly chosen tip species.

  • 22.
    Bartoszek, Krzysztof
    Gdansk University of Technology, Poland.
    The Bootstrap and Other Methods of Testing Phylogenetic Trees2007In: Zeszyty Naukowe Wydzialu ETI Politechniki Gdanskiej, 2007, 103-108 p.Conference paper (Refereed)
    Abstract [en]

    The final step of a phylogenetic analysis is the test of the generated tree. This is not an easy task with an obvious methodology, because we do not know the full probabilistic model of evolution. A number of methods have been proposed, but there is a wide debate concerning the interpretation of the results they produce.

  • 23.
    Bartoszek, Krzysztof
    Department of Mathematical Sciences, Chalmers University of Technology and the University of Gothenburg, Göteborg Sweden.
    The Laplace Motion in Phylogenetic Comparative Methods2012In: Proceedings of the 18th National Conference on Applications of Mathematics in Biology and Medicine, 2012, 25-30 p.Conference paper (Refereed)
    Abstract [en]

    The majority of current phylogenetic comparative methods assume that the stochastic evolutionary process is homogeneous over the phylogeny, or offer relaxations of this in rather limited and usually parameter-expensive ways. Here we make a preliminary investigation, by means of a numerical experiment, of whether the Laplace motion process can offer an alternative approach.

  • 24.
    Bartoszek, Krzysztof
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Trait evolution with jumps: illusionary normality2017In: Proceedings of the XXIII National Conference on Applications of Mathematics in Biology and Medicine, 2017, 23-28 p.Conference paper (Refereed)
    Abstract [en]

    Phylogenetic comparative methods for real-valued traits usually make use of stochastic processes whose trajectories are continuous. This is despite biological intuition that evolution is punctuated rather than gradual. On the other hand, there have been a number of recent proposals of evolutionary models with jump components. However, as we are only beginning to understand the behaviour of branching Ornstein-Uhlenbeck (OU) processes, the asymptotics of branching OU processes with jumps are an even greater unknown. In this work we build on a previous study concerning OU-with-jumps evolution on a pure-birth tree. We introduce an extinction component and explore, via simulations, its effects on the weak convergence of such a process. We furthermore use this work to illustrate the simulation and graphics-generation possibilities of the mvSLOUCH package.

  • 25.
    Bartoszek, Krzysztof
    et al.
    Gdansk University of Technology, Poland.
    Bartoszek, Wojciech
    Gdansk University of Technology, Poland.
    On the Time Behaviour of Okazaki Fragments2006In: Journal of Applied Probability, ISSN 0021-9002, E-ISSN 1475-6072, Vol. 43, no 2, 500-509 p.Article in journal (Refereed)
    Abstract [en]

    We find explicit analytical formulae for the time dependence of the probability of the number of Okazaki fragments produced during the process of DNA replication. This extends a result of Cowan on the asymptotic probability distribution of these fragments.

  • 26.
    Bartoszek, Krzysztof
    et al.
    Gdansk University of Technology.
    Izydorek, Bartosz
    Gdansk University of Technology.
    Ratajczak, Tadeusz
    Gdansk University of Technology, Poland.
    Skokowski, Jaroslaw
    Medical University of Gdansk, Poland.
    Szwaracki, Karol
    Gdansk University of Technology, Poland.
    Tomczak, Wiktor
    Gdansk University of Technology, Poland.
    Neural Network Breast Cancer Relapse Time Prognosis2006In: ASO Summer School 2006 abstract book Ostrzyce 30.06-2.07. 2006 / [ed] J. Skokowski and K. Drucis, 2006, 8-10 p.Conference paper (Other academic)
    Abstract [en]

    This paper is the result of a project at the Faculty of Electronics, Telecommunications and Computer Science (Technical University of Gdansk). The aim of the project was to create a neural network to predict the relapse time of breast cancer. The neural network was to be trained on data collected over the past 20 years by Dr. Jarosław Skokowski. The data include 439 patient records described by about 40 parameters. For our neural network we considered only the 6 medically most significant parameters (the number of nodes showing evidence of cancer, size of tumour in mm, age, Bloom score, estrogen receptors and progesterone receptors) and the relapse time as the outcome. Our neural network was created in the MATLAB environment.

  • 27.
    Bartoszek, Krzysztof
    et al.
    Mathematical Sciences, Chalmers University of Technology and the University of Gothenburg, Gothenburg, Sweden.
    Jones, Graham
    Mathematical Sciences, Chalmers University of Technology and the University of Gothenburg, Gothenburg, Sweden / Department of Biological and Environmental Science, University of Gothenburg, Gothenburg, Sweden.
    Oxelman, Bengt
    Department of Biological and Environmental Science, University of Gothenburg, Gothenburg, Sweden.
    Sagitov, Serik
    Mathematical Sciences, Chalmers University of Technology and the University of Gothenburg, Gothenburg, Sweden.
    Time to a single hybridization event in a group of species with unknown ancestral history2013In: Journal of Theoretical Biology, ISSN 0022-5193, E-ISSN 1095-8541, Vol. 322, 1-6 p.Article in journal (Refereed)
    Abstract [en]

    We consider a stochastic process for the generation of species which combines a Yule process with a simple model for hybridization between pairs of co-existent species. We assume that the origin of the process, when there was one species, occurred at an unknown time in the past, and we condition the process on producing n species via the Yule process and a single hybridization event. We prove results about the distribution of the time of the hybridization event. In particular we calculate a formula for all moments, and show that under various conditions, the distribution tends to an exponential with rate twice that of the birth rate for the Yule process.
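
The Yule component of the model above can be simulated with exponential waiting times: while k species coexist, the time to the next speciation is exponentially distributed with rate k times the birth rate. A minimal forward simulation (not the conditioned process with a hybridization event that the paper analyzes):

```python
import numpy as np

def yule_speciation_times(n, birth_rate, rng):
    """Times of the speciation events of a Yule process started from one
    species and run until n species exist: while k species coexist, the
    waiting time to the next speciation is Exponential(k * birth_rate)."""
    t, times = 0.0, []
    for k in range(1, n):
        t += rng.exponential(1.0 / (k * birth_rate))
        times.append(t)
    return np.array(times)

rng = np.random.default_rng(2)
times = yule_speciation_times(n=10, birth_rate=1.0, rng=rng)
```

The paper's setting additionally conditions on the number of extant species and on a single hybridization event between co-existing lineages, which a plain forward simulation like this does not capture.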

  • 28.
    Bartoszek, Krzysztof
    et al.
    Mathematical Sciences, Chalmers University of Technology and the University of Gothenburg, Sweden.
    Krzeminski, Michal
    Gdansk University of Technology.
    Critical case stochastic phylogenetic tree model via the Laplace transform2014In: Demonstratio Matematicae, ISSN 0420-1213, Vol. 47, no 2, 474-481 p.Article in journal (Refereed)
    Abstract [en]

    Birth-and-death models are now a common mathematical tool to describe branching patterns observed in real-world phylogenetic trees. Liggett and Schinazi (2009) is one such example. The authors propose a simple birth-and-death model that is compatible with phylogenetic trees of both influenza and HIV, depending on the birth rate parameter. An interesting special case of this model is the critical case where the birth rate equals the death rate. This is a non-trivial situation, and to study its asymptotic behaviour we employed the Laplace transform. With this we correct the proof of Liggett and Schinazi (2009) in the critical case.

  • 29.
    Bartoszek, Krzysztof
    et al.
    Mathematical Sciences, Chalmers University of Technology and the University of Gothenburg.
    Krzeminski, Michal
    Gdansk University of Technology.
    Skokowski, Jaroslaw
    Medical University of Gdansk.
    Survival time prognosis under a Markov model of cancer development2010In: Proceedings of the XVI National Conference Applications of Mathematics to Biology and Medicine, Krynica, Poland, September 14–18, 2010 / [ed] M. Ziółko, M. Bodnar and E. Kutafina, 2010, 6-11 p.Conference paper (Refereed)
    Abstract [en]

    In this study we look at a breast cancer data set of women from the Pomerania region collected in the years 1987-1992 at the Medical University of Gdansk. We analyze the clinical risk factors in conjunction with a Markov model of cancer development. We evaluate Artificial Neural Network (ANN) survival time prediction (which was done on this data set in a previous study) via a simulation study.

  • 30.
    Bartoszek, Krzysztof
    et al.
    Mathematical Statistics, Chalmers University of Technology and University of Gothenburg, Gothenburg, Sweden.
    Liò, Pietro
    Computer Laboratory, University of Cambridge Cambridge, United Kingdom.
    Sorathiya, Anil
    Computer Laboratory, University of Cambridge Cambridge, United Kingdom.
    Influenza differentiation and evolution2010In: Acta Physica Polonica B Proceedings Supplement, 2010, Vol. 3, 417-452 p., 2Conference paper (Refereed)
    Abstract [en]

    The aim of the study is to perform a very wide analysis of HA, NA and M influenza gene segments to find short nucleotide regions which differentiate between strains (i.e. H1, H2, etc.), hosts, geographic regions, the time when a sequence was found, and combinations of time and region, using a simple methodology. Finding regions differentiating between strains has as its goal the construction of a Luminex microarray which will allow quick and efficient strain recognition. Discoveries for the other splitting factors could shed light on structures significant for host specificity and on the history of influenza evolution. A large number of places in the HA, NA and M gene segments were found that can differentiate between hosts, regions, time and combinations of time and region. Very good differentiation between different Hx strains can also be seen. We link one of our findings to a proposed stochastic model of the creation of viral phylogenetic trees.

  • 31.
    Bartoszek, Krzysztof
    et al.
    Mathematical Sciences, Chalmers University of Technology and the University of Gothenburg, Gothenburg, Sweden.
    Pienaar, Jason
    Department of Genetics, University of Pretoria, Pretoria 0002, South Africa.
    Mostad, Petter
    Mathematical Sciences, Chalmers University of Technology and the University of Gothenburg, Gothenburg, Sweden.
    Andersson, Staffan
    Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden.
    Hansen, Thomas F.
    CEES, Department of Biology, University of Oslo, Oslo, Norway.
    A phylogenetic comparative method for studying multivariate adaptation2012In: Journal of Theoretical Biology, ISSN 0022-5193, E-ISSN 1095-8541, Vol. 314, 204-215 p.Article in journal (Refereed)
    Abstract [en]

    Phylogenetic comparative methods have been limited in the way they model adaptation. Although some progress has been made, there are still no methods that can fully account for coadaptation between traits. Based on Ornstein-Uhlenbeck (OU) models of adaptive evolution, we present a method, with an R implementation, in which multiple traits evolve both in response to each other and, as in previous OU models, to fixed or randomly evolving predictor variables. We present the interpretation of the model parameters in terms of evolutionary and optimal regressions, enabling the study of allometric and adaptive relationships between traits. To illustrate the method we reanalyze a data set of antler and body-size evolution in deer (Cervidae).
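
The multivariate OU dynamics underlying such methods, dX = -A(X - θ)dt + Σ dW, can be simulated with a simple Euler-Maruyama scheme. The matrices below are illustrative assumptions, not estimates from the deer data:

```python
import numpy as np

def simulate_mvou(x0, A, theta, sigma, dt, n_steps, rng):
    """Euler-Maruyama discretization of the multivariate OU SDE
    dX = -A (X - theta) dt + sigma dW."""
    d = len(x0)
    x = np.empty((n_steps + 1, d))
    x[0] = x0
    for i in range(n_steps):
        dw = rng.standard_normal(d) * np.sqrt(dt)
        x[i + 1] = x[i] - A @ (x[i] - theta) * dt + sigma @ dw
    return x

rng = np.random.default_rng(3)
A = np.array([[1.0, 0.3], [0.0, 0.8]])   # pull-to-optimum / trait-interaction matrix
theta = np.array([2.0, -1.0])            # optimum trait values
sigma = 0.1 * np.eye(2)                  # diffusion matrix
path = simulate_mvou(np.zeros(2), A, theta, sigma, dt=0.01, n_steps=2000, rng=rng)
```

The off-diagonal entry of A is what lets one trait's deviation from its optimum pull on the other trait, the coadaptation idea described above.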

  • 32.
    Bartoszek, Krzysztof
    et al.
    Department of Mathematics, Uppsala University, Uppsala, Sweden.
    Liò, Pietro
    Computer Laboratory, University of Cambridge, Cambridge, United Kingdom.
    A novel algorithm to reconstruct phylogenies using gene sequences and expression data2014In: International Proceedings of Chemical, Biological & Environmental Engineering; Environment, Energy and Biotechnology III, 2014, Vol. 70, 8-12 p.Conference paper (Refereed)
    Abstract [en]

    Phylogenies based on single loci should be viewed with caution, and the best approach for obtaining robust trees is to examine numerous loci across the genome. It often happens that, for the same set of species, trees derived from different genes are in conflict with each other. There are several methods that combine information from different genes in order to infer the species tree. One novel approach is to use information from different -omics. Here we describe a phylogenetic method based on an Ornstein-Uhlenbeck process that combines sequence and gene expression data. We test our method on genes belonging to the histidine biosynthetic operon. We found that the method provides interesting insights into selection pressures and adaptive hypotheses concerning gene expression levels.

  • 33.
    Bartoszek, Krzysztof
    et al.
    Mathematical Sciences, Chalmers University of Technology and the University of Gothenburg, Sweden.
    Pulka, Malgorzata
    Department of Probability and Biomathematics, Gdánsk University of Technology, Gdánsk, Poland.
    Quadratic stochastic operators as a tool in modelling the dynamics of a distribution of a population trait2013In: Proceedings of the 19th National Conference on Applications of Mathematics in Biology and Medicine / [ed] Katarzyna D. Lewandowska and Piotr Bogús, 2013, 19-24 p.Conference paper (Refereed)
    Abstract [en]

    Quadratic stochastic operators can exhibit a wide variety of asymptotic behaviours, and these have been introduced and studied recently. In the present work we discuss biological interpretations that can be attributed to them. We also propose a computer simulation method to illustrate the behaviour of iterates of quadratic stochastic operators.

  • 34.
    Bartoszek, Krzysztof
    et al.
    Department of Mathematics, Uppsala University, Uppsala, Sweden.
    Pulka, Malgorzta
    Department of Probability and Biomathematics, Gdańsk University of Technology, Gdańsk, Poland.
    Asymptotic properties of quadratic stochastic operators acting on the L1 space2015In: Nonlinear Analysis, ISSN 0362-546X, E-ISSN 1873-5215, Vol. 114, 26-39 p.Article in journal (Refereed)
    Abstract [en]

    Quadratic stochastic operators can exhibit a wide variety of asymptotic behaviours, and these have been introduced and studied recently in the l1 space. It turns out that in principle most of the results can be carried over to the L1 space. However, due to topological properties of this space one has to restrict in some situations to kernel quadratic stochastic operators. In this article we study the uniform and strong asymptotic stability of quadratic stochastic operators acting on the L1 space in terms of convergence of the associated (linear) nonhomogeneous Markov chains.

  • 35.
    Bartoszek, Krzysztof
    et al.
    Department of Mathematics, Uppsala University, Uppsala, Sweden.
    Pułka, Małgorzata
    Department of Probability and Biomathematics, Gdańsk University of Technology, Gdańsk, Poland.
    Prevalence Problem in the Set of Quadratic Stochastic Operators Acting on L12015In: Bulletin of the Malaysian Mathematical Sciences Society, ISSN 0126-6705, 1-15 p.Article in journal (Refereed)
    Abstract [en]

    This paper is devoted to the study of the problem of prevalence in the class of quadratic stochastic operators acting on the L1 space for the uniform topology. We obtain that the set of norm quasi-mixing quadratic stochastic operators is a dense and open set in the topology induced by a very natural metric. This shows the typical long-term behaviour of iterates of quadratic stochastic operators.

  • 36.
    Bartoszek, Krzysztof
    et al.
    Uppsala University, Applied Mathematics and Statistics.
    Sagitov, Serik
    A consistent estimator of the evolutionary rate2015In: Journal of Theoretical Biology, ISSN 0022-5193, E-ISSN 1095-8541, Vol. 371, 69-78 p.Article in journal (Refereed)
    Abstract [en]

    We consider a branching particle system where particles reproduce according to the pure birth Yule process with birth rate λ, conditioned on the observed number of particles being equal to n. Particles are assumed to move independently on the real line according to Brownian motion with local variance σ². In this paper we treat the n particles as a sample of related species. The spatial Brownian motion of a particle describes the development of a trait value of interest (e.g. log body size). We propose an unbiased estimator R_n^2 of the evolutionary rate ρ² = σ²/λ. The estimator R_n^2 is proportional to the sample variance S_n^2 computed from the n trait values. We find an approximate formula for the standard error of R_n^2, based on a neat asymptotic relation for the variance of S_n^2.

  • 37.
    Bartoszek, Krzysztof
    et al.
    Department of Mathematics, Uppsala University, Uppsala, Sweden.
    Sagitov, Serik
    Chalmers University of Technology and the Unversity of Gothenburg, Sweden.
    Phylogenetic confidence intervals for the optimal trait value2015In: Journal of Applied Probability, ISSN 0021-9002, E-ISSN 1475-6072, Vol. 52, no 4, 1115-1132 p.Article in journal (Refereed)
    Abstract [en]

    We consider a stochastic evolutionary model for a phenotype developing amongst n related species with unknown phylogeny. The unknown tree is modelled by a Yule process conditioned on n contemporary nodes. The trait value is assumed to evolve along lineages as an Ornstein-Uhlenbeck process. As a result, the trait values of the n species form a sample with dependent observations. We establish three limit theorems for the sample mean corresponding to three domains for the adaptation rate. In the case of fast adaptation, we show that for large n the normalized sample mean is approximately normally distributed. Using these limit theorems, we develop novel confidence interval formulae for the optimal trait value.

  • 38.
    Bartoszek, Krzysztof
    et al.
    Gdansk University of Technology, Poland.
    Signerska, Justyna
    Gdansk University of Technology, Poland.
    Moments of the Distribution of Okazaki Fragments2006In: Rose–Hulman Undergraduate Mathematics Journal, Vol. 7, no 2, 1-5 p.Article in journal (Refereed)
    Abstract [en]

    This paper is a continuation of Bartoszek & Bartoszek (2006), who provide formulae for the probability distributions of the number of Okazaki fragments at time t during the process of DNA replication. Given the expressions for the moments of the probability distribution of the number of Okazaki fragments at time t in recursive form, we evaluated formulae for the third and fourth moments, using Mathematica, and obtained results in explicit form. Having done this, we calculated the distribution's skewness and kurtosis.
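
The last computational step mentioned above, going from moments to skewness and kurtosis, is generic: convert raw moments to central moments, then normalize by powers of the standard deviation. A sketch checked against a Poisson distribution (not the Okazaki-fragment distribution itself):

```python
import math

def skewness_kurtosis(m1, m2, m3, m4):
    """Convert raw moments E[X], E[X^2], E[X^3], E[X^4] to central
    moments, then to skewness mu3/sigma^3 and kurtosis mu4/sigma^4."""
    var = m2 - m1 ** 2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return mu3 / var ** 1.5, mu4 / var ** 2

# Sanity check against Poisson(lam): skewness = 1/sqrt(lam), kurtosis = 3 + 1/lam
lam = 2.0
m1 = lam
m2 = lam + lam ** 2
m3 = lam + 3 * lam ** 2 + lam ** 3
m4 = lam + 7 * lam ** 2 + 6 * lam ** 3 + lam ** 4
skew, kurt = skewness_kurtosis(m1, m2, m3, m4)
```

The same conversion applies to any distribution whose first four raw moments are available in explicit form.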

  • 39.
    Bartoszek, Krzysztof
    et al.
    Mathematical Sciences, Chalmers University of Technology and the University of Gothenburg.
    Stokowska, Anna
    University of Gothenburg.
    Performance of pseudo-likelihood estimator in modelling cells' proliferation with noisy measurements2010In: Conference proceedings from the 12th International Workshop for Young Mathematicians "Probability and Statistics", Krakow, Poland, 20th till 26th September 2009, 2010, 21-42 p.Conference paper (Other academic)
    Abstract [en]

    Branching processes are widely used to describe cell development and proliferation. Currently, parameter estimation is studied in mathematical models describing the dynamics of cell cultures, where we can get very accurate measurements of cell counts. For in vivo samples we will not have this accuracy; here the noise levels can be very significant. We study a newly proposed pseudo-likelihood estimator of a multitype Bellman-Harris process modelling cell development and see how it performs under noisy measurements of cell counts.

  • 40.
    Bendtsen, Marcus
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Gated Bayesian Networks2017Doctoral thesis, monograph (Other academic)
    Abstract [en]

    Bayesian networks have grown to become a dominant type of model within the domain of probabilistic graphical models. Not only do they empower users with a graphical means for describing the relationships among random variables, but they also allow for (potentially) fewer parameters to estimate, and enable more efficient inference. The random variables and the relationships among them decide the structure of the directed acyclic graph that represents the Bayesian network. It is the stasis over time of these two components that we question in this thesis.

    By introducing a new type of probabilistic graphical model, which we call gated Bayesian networks, we allow the variables that we include in our model, and the relationships among them, to change over time. We introduce algorithms that can learn gated Bayesian networks that use different variables at different times, required due to the process which we are modelling going through distinct phases. We evaluate the efficacy of these algorithms within the domain of algorithmic trading, showing how the learnt gated Bayesian networks can improve upon a passive approach to trading. We also introduce algorithms that detect changes in the relationships among the random variables, allowing us to create a model that consists of several Bayesian networks, thereby revealing changes and the structure by which these changes occur. The resulting models can be used to detect the currently most appropriate Bayesian network, and we show their use in real-world examples from both the domain of sports analytics and finance.

  • 41.
    Berglund, Frida
    et al.
    Linköping University, Department of Computer and Information Science, Statistics.
    Oskarsson, Mayumi Setsu
    Linköping University, Department of Computer and Information Science, Statistics.
    Modellering av spårvidd över bandel 119 inom Stambanan genom Övre Norrland: Kandidatuppsats i Statistik och dataanalys2015Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE creditsStudent thesis
    Abstract [en]

    The Swedish Transport Administration (Trafikverket) has been in charge of the maintenance of the railway systems since 2010. The railway requires regular maintenance in order to keep tracks in good condition for the safety of passengers and other transports. To ensure this safety it is important to measure the tracks' geometrical condition. The gauge is one of the most important geometrics and cannot be too wide or too narrow.

    The aim of this report is to create a model that is able to simulate the deviation from normal gauge from track geometrics and properties.

    The deviation from normal gauge is a random quantity that we modeled with a generalized linear model (GLM) or a generalized additive model (GAM). The models can be used to simulate the possible values of the deviation. It was demonstrated in this study that the GAM was able to model most of the variation in the deviation from normal gauge using information from some track geometrics and properties.

  • 42.
    Bergstrand, Frida
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Nguyen, Ngan
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Bakgrundsvariablers påverkan på enkätsvaren i en telefonintervju: En studie om effekt av intervjuarens, respondentens och intervjuns egenskaper2017Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE creditsStudent thesis
    Abstract [en]

    Norstat recurrently performs a survey containing questions about how much the respondent watches different TV channels, how different media devices are used, the ownership of such devices, and the usage of TV channel sites on the internet, social media, internet services, magazine services and streaming services. In this thesis, data from the survey performed during the autumn of 2016 was used. The aim is to examine whether the answers differ depending on characteristics of the interviewers and the respondents.

    The 15 most important questions from the survey were chosen, and principal component analysis was used to further reduce the number of response variables. The resulting component scores served as the reduced response variables, retaining the most important information from the survey questions. Thereafter, multilevel analyses and regression analyses were performed to examine the effects.
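
    The dimension-reduction step described above can be sketched as follows; the latent-factor data generation is purely hypothetical and stands in for answers to the 15 survey questions.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)

# Hypothetical stand-in for 15 correlated survey questions: two latent
# habits (say, TV viewing and internet use) drive all observed answers.
n_resp, n_q = 300, 15
latent = rng.normal(size=(n_resp, 2))
loadings = rng.normal(size=(2, n_q))
answers = latent @ loadings + rng.normal(scale=0.3, size=(n_resp, n_q))

# Reduce the 15 response variables to a few principal-component scores,
# which then serve as the response variables in the regression step.
pca = PCA(n_components=2)
scores = pca.fit_transform(answers)
print(scores.shape, pca.explained_variance_ratio_.sum().round(2))
```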

    The results showed that several characteristics had an effect on different questions in the survey: the age of the interviewer, the length of their employment, and the age, education, sex and native language of the respondent. For some questions there was also an effect of whether the respondent lived in a metropolitan region or not.

  • 43.
    Berntsson, Fredrik
    et al.
    Linköping University, Department of Mathematics, Computational Mathematics. Linköping University, Faculty of Science & Engineering.
    Ohlson, Martin
    Linköping University, Department of Mathematics, Mathematical Statistics. Linköping University, Faculty of Science & Engineering.
    More on Estimation of Banded and Banded Toeplitz Covariance Matrices2017Report (Other academic)
    Abstract [en]

    In this paper we consider two different linear covariance structures, namely banded and banded Toeplitz matrices, and how to estimate them using different methods, for example by minimizing different norms.

    One way to estimate the parameters in a linear covariance structure is to use tapering, which has been shown to be the solution to a universal least squares problem. However, tapering does not always guarantee that the positive definiteness constraint on the estimated covariance matrix is satisfied, and may therefore be an unsuitable method. We propose some new methods which preserve positive definiteness and still give the correct structure.

    More specifically, we consider the problem of estimating the parameters of a multivariate normal p-dimensional random vector for (i) a banded covariance structure reflecting m-dependence, and (ii) a banded Toeplitz covariance structure.
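
    A sketch of the structural issue discussed above, using simple banding (a rectangular taper) of a sample covariance from an assumed m-dependent process; as noted in the report, the banded estimate must still be checked for positive definiteness.

```python
import numpy as np

rng = np.random.default_rng(9)

# Simulate an m-dependent (banded) covariance structure: an MA(1)-style
# process across coordinates gives a tridiagonal (bandwidth m = 1) true
# covariance matrix.
p, n, m = 8, 200, 1
e = rng.normal(size=(n, p + 1))
X = e[:, 1:] + 0.6 * e[:, :-1]

S = np.cov(X, rowvar=False)                   # unstructured sample covariance

# Banding (a rectangular taper): zero out entries beyond the band.
idx = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
S_banded = np.where(idx <= m, S, 0.0)

# Banding gives the right structure but, as the report notes, need not
# preserve positive definiteness - the eigenvalues have to be checked.
print(np.linalg.eigvalsh(S_banded).min() > 0)
```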

  • 44.
    Bondesson, Matilda
    et al.
    Linköping University, Department of Computer and Information Science.
    Svensson, Josefin
    Linköping University, Department of Computer and Information Science.
    Följdinvandring och medborgarskap: en statistisk analys2009Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    During the last years around 100 000 immigrants have arrived in Sweden, people with different reasons and different goals for settling down in the country. The reason for immigrating to Sweden dealt with in this thesis is following immigration, i.e. when someone moves here because they have relatives living in the country.

    Following immigration is interesting to study because it affects how many people will immigrate to Sweden in the coming years, and it can therefore be used to make forecasts based on the number of first-time immigrants. To investigate following immigration, analyses were made with time series, logistic regression and Poisson regression. An ARIMA model was used to estimate the number of following immigrants in the future.
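
    A minimal sketch of the forecasting idea above, assuming an ARIMA(1,1,0) specification fitted by hand (the thesis does not state the model order, and the counts below are invented):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical yearly immigrant counts with an upward drift.
years = 30
counts = 50_000 + np.cumsum(rng.normal(1_000, 500, years))

# ARIMA(1,1,0) by hand: difference once, then fit AR(1) by least squares.
diff = np.diff(counts)
X = np.column_stack([np.ones(years - 2), diff[:-1]])
phi, *_ = np.linalg.lstsq(X, diff[1:], rcond=None)

# One-step-ahead forecast: predicted change added to the last level.
next_diff = phi[0] + phi[1] * diff[-1]
forecast = counts[-1] + next_diff
print(round(forecast))
```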

    The second part of this thesis examines how inclined immigrants are to become Swedish citizens: whether they apply for citizenship at all, and how long it takes from the time they fulfil the conditions for Swedish citizenship until they apply. Here, multiple logistic regression and ordinary regression are used.

    The most common reason for a residence permit in Sweden is following immigration. Following immigration has increased since 1998, with a substantial increase over the last years. It is difficult to predict how immigration will develop in the coming years owing to the growth in the number of immigrants at the end of the study period. Since 1998 about 5% of the persons granted residence permits in Sweden are association persons, i.e. persons to whom following immigrants are tied. The most common association person is an older man who was granted a residence permit as an asylum seeker. The association persons have on average 3.16 following immigrants tied to them.

    To become a Swedish citizen through naturalization, certain conditions must be fulfilled; for example, the immigrant must have been settled in Sweden for a certain time. Among the immigrants who fulfil this time condition, about 79% apply for Swedish citizenship. The probability of applying for citizenship is largest if the person is young, a woman and a following immigrant. Those who apply wait on average 57 days after fulfilling the time condition before applying.

     

  • 45.
    Bonneau, Maxime
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Reinforcement Learning for 5G Handover2017Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    The development of the 5G network is in progress, and one part of the process that needs to be optimised is the handover. This operation, which consists of changing the base station (BS) providing data to a user equipment (UE), needs to be efficient enough to be seamless. From the BS point of view, the operation should be as economical as possible while satisfying the needs of the UE. In this thesis, the problem of 5G handover is addressed with reinforcement learning. A review of the methods offered by reinforcement learning led to the restricted field of model-free, off-policy methods, more specifically the Q-learning algorithm. In its basic form, applied to simulated data, this method provides information on which kinds of reward, action space and state space produce good results. However, despite working on some restricted datasets, the algorithm does not scale well because of lengthy computation times. This means that the trained agent cannot use much data for its learning process, and neither the state space nor the action space can be extended far, restricting the basic Q-learning algorithm to discrete variables. Since the strength of the signal (RSRP), which is of high interest for matching the needs of the UE, is a continuous variable, a continuous form of Q-learning is needed. A function approximation method is therefore investigated, namely artificial neural networks. In addition to the lengthy computation times, the results obtained are not yet convincing. Thus, despite some interesting results from the basic form of the Q-learning algorithm, the extension to the continuous case has not been successful. Moreover, the computation times make reinforcement learning applicable in this domain only with very powerful computers.
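
    A minimal sketch of tabular Q-learning on a toy handover problem; the environment below (five discretised RSRP states, stay/switch actions, and the reward values) is an invented illustration, not the simulator used in the thesis.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy handover environment: 5 discretised RSRP states (0 = weakest) and
# 2 actions (0 = stay on the current BS, 1 = switch to another BS).
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.1

def step(state, action):
    # Switching in the weakest-signal state pays off; switching otherwise
    # costs, reflecting the economic cost of an unnecessary handover.
    if action == 1:
        return n_states - 1, (1.0 if state == 0 else -0.5)
    nxt = max(state - 1, 0)          # signal degrades while staying
    return nxt, (0.5 if nxt > 0 else -1.0)

state = n_states - 1
for _ in range(20_000):
    # Epsilon-greedy behaviour policy; the update itself is off-policy.
    action = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[state]))
    nxt, reward = step(state, action)
    # Standard model-free Q-learning update.
    Q[state, action] += alpha * (reward + gamma * Q[nxt].max() - Q[state, action])
    state = nxt

print(np.argmax(Q, axis=1))   # learnt policy per RSRP state
```

In this toy setting the learnt policy is to hand over only when the signal has become weakest, and to stay otherwise.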

  • 46.
    Borau, Noelia
    Linköping University, Department of Computer and Information Science.
    Kalman Filter with Adaptive Noise Models for Statistical Post-Processing of Weather Forecasts2017Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    We develop Kalman filters with adaptive noise models for statistical post-processing of 2-metre temperature forecasts, with the purpose of reducing the systematic errors from which numerical weather prediction models usually suffer. To this end, we propose time-varying dynamic linear models for the system noise covariance matrix and the measurement noise covariance matrix, and we study how this affects the mean predictions of the underlying state and the observed data. Five Kalman filter models are introduced: a discrete Kalman filter model whose distinctive feature is that the measurement (observation) at time t is the observed forecast error at that time; two Kalman filters with adaptive noise models, where the measurement noise covariance matrix is time-varying; a Kalman filter model where the forecasts of the 10-metre wind components are included as explanatory variables; and a Kalman filter with heavy-tailed noise using the Student's t-distribution under a Bayesian approach. Ten weather stations located in Sweden were selected so as to obtain a heterogeneous sample, and six different issued forecasts were filtered with different sets of initial values.
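
    The first of the five models, where the measurement is the observed forecast error, can be sketched in scalar form as below; the bias dynamics and the noise variances are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical data: the NWP model has a slowly drifting 2 m temperature
# bias; the "measurement" fed to the filter is the observed forecast error.
T = 200
true_bias = np.cumsum(rng.normal(0.0, 0.05, T)) + 1.5
errors = true_bias + rng.normal(0.0, 0.8, T)    # observed forecast errors

# Scalar Kalman filter with a random-walk state:
#   x_t = x_{t-1} + w_t,  y_t = x_t + v_t.
q, r = 0.05**2, 0.8**2        # system / measurement noise variances
x, p = 0.0, 1.0               # initial state estimate and its variance
estimates = np.empty(T)
for t in range(T):
    p += q                                  # predict
    k = p / (p + r)                         # Kalman gain
    x += k * (errors[t] - x)                # update with observed error
    p *= (1.0 - k)
    estimates[t] = x

# The filtered bias can be subtracted from future raw forecasts.
print(round(estimates[-1], 2))
```

The adaptive-noise models in the thesis make q and r themselves time-varying rather than fixed constants as here.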

    The implementation of these methods has been done in Python and R.

  • 47.
    Brommesson, Peter
    et al.
    Linköping University, Department of Physics, Chemistry and Biology, Theoretical Biology. Linköping University, Faculty of Science & Engineering.
    Wennergren, Uno
    Linköping University, Department of Physics, Chemistry and Biology, Theoretical Biology. Linköping University, Faculty of Science & Engineering.
    Lindström, Tom
    Linköping University, Department of Physics, Chemistry and Biology, Theoretical Biology. Linköping University, Faculty of Science & Engineering.
    Spatiotemporal Variation in Distance Dependent Animal Movement Contacts: One Size Doesn't Fit All2016In: PLoS ONE, ISSN 1932-6203, E-ISSN 1932-6203, Vol. 11, no 10, e0164008 p.Article in journal (Refereed)
    Abstract [en]

    The structure of the contacts that mediate transmission has a pronounced effect on the outbreak dynamics of infectious disease, and simulation models are powerful tools to inform policy decisions. Most simulation models of livestock disease spread rely to some degree on predictions of animal movement between holdings. Typically, movements are more common between nearby farms than between farms located far from each other. Here, we assessed spatiotemporal variation in such distance dependence of animal movement contacts from an epidemiological perspective. We evaluated and compared nine statistical models, applied to Swedish movement data from 2008. The models differed in at what level (if at all) they accounted for regional and/or seasonal heterogeneities in the distance dependence of the contacts. Using a kernel approach to describe how the probability of contact between farms changes with distance, we developed a hierarchical Bayesian framework and estimated parameters using Markov chain Monte Carlo techniques. We evaluated the models by three different approaches to model selection. First, we used the Deviance Information Criterion to evaluate their performance relative to each other. Secondly, we estimated the log predictive posterior distribution, which was also used to evaluate their relative performance. Thirdly, we performed posterior predictive checks by simulating movements with each of the parameterized models and evaluating their ability to recapture relevant summary statistics. Independent of selection criteria, we found that accounting for regional heterogeneity improved model accuracy. We also found that accounting for seasonal heterogeneity was beneficial in terms of model accuracy according to two of the three methods used for model selection. Our results have important implications for livestock disease spread models where movement is an important risk factor for between-farm transmission. We argue that modellers should refrain from using methods that simulate animal movements with the same pattern across all regions and seasons without explicitly testing for spatiotemporal variation.

  • 48.
    Brouwers, Jack
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Thellman, Björn
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Klassificering av vinkvalitet2017Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE creditsStudent thesis
    Abstract [en]

    The data used in this paper is open-source data collected in Portugal over a three-year period between 2004 and 2007. It consists of the physicochemical parameters and the quality grades of the wines.

    This study focuses on assessing which variables primarily affect the quality of a wine and how the effects of the variables interact with each other, and also on comparing which of the classification methods works best and has the highest accuracy.

    The data is divided into red and white wine, and the response variable is ordinal, consisting of the quality grades of the wines. Because some quality grades had too few observations, a new response variable was created in which several grades were pooled so that each grade category contained a sufficient number of observations.

    The statistical methods used are Bayesian ordered logistic regression and two data mining techniques, namely neural networks and decision trees.
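
    The decision-tree part of the comparison can be sketched on synthetic stand-in data; the dependence of quality on alcohol and volatile acidity below is assumed purely to mimic the reported findings, not taken from the actual wine data.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(5)

# Synthetic stand-in for the wine data: quality driven mainly by alcohol
# and volatile acidity, with grades pooled into three classes.
n = 2000
alcohol = rng.uniform(8.0, 14.0, n)
volatile_acidity = rng.uniform(0.1, 1.2, n)
score = 1.0 * alcohol - 4.0 * volatile_acidity + rng.normal(0.0, 0.5, n)
quality = np.digitize(score, bins=[9.0, 11.0])   # 3 pooled quality classes

X = np.column_stack([alcohol, volatile_acidity])
X_tr, X_te, y_tr, y_te = train_test_split(X, quality, random_state=0)

# Fit and evaluate a depth-limited classification tree.
tree = DecisionTreeClassifier(max_depth=6, random_state=0)
tree.fit(X_tr, y_tr)
print(round(tree.score(X_te, y_te), 2))
```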

    The results obtained showed that for both types of wine it is primarily the alcohol content and the amount of volatile acidity that are recurring parameters with a great influence on predicting the quality of the wines.

    The results also showed that among the three methods, decision trees were best at classifying the white wines and the neural network was best for the red wines.

  • 49.
    Bruzzone, Andrea
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    P-SGLD: Stochastic Gradient Langevin Dynamics with control variates2017Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Year after year, the amount of data that we continuously generate is increasing. When this situation began, the main challenge was to find a way to store the huge quantity of information. Nowadays, with the increasing availability of storage facilities, this problem is solved, but it leaves a new issue to deal with: finding tools that allow us to learn from these large data sets. In this thesis, a framework for Bayesian learning with the ability to scale to large data sets is studied. We present the Stochastic Gradient Langevin Dynamics (SGLD) framework and show that in some cases its approximation of the posterior distribution is quite poor. A reason for this can be that SGLD estimates the gradient of the log-likelihood with high variability due to naïve subsampling. Our approach combines accurate proxies for the gradient of the log-likelihood with SGLD. We show that it produces better results, in terms of convergence to the correct posterior distribution, than standard SGLD, since the accurate proxies dramatically reduce the variance of the gradient estimator. Moreover, we demonstrate that this approach is more efficient than a standard Markov chain Monte Carlo (MCMC) method and that it outperforms other variance-reduction techniques proposed in the literature, such as the SAGA-LD algorithm. SAGA-LD also uses control variates to improve SGLD, which makes the comparison with our approach straightforward. We apply the method to the logistic regression model.
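
    The control-variate idea can be sketched on a toy conjugate model where the exact posterior is known; for this Gaussian model the gradient proxy anchored at the mode is exact, so the variance reduction is total. All model choices below are illustrative assumptions, not the thesis experiments.

```python
import numpy as np

rng = np.random.default_rng(6)

# Toy model: y_i ~ N(theta, 1) with a N(0, 10) prior, so the exact
# conjugate posterior is available for checking the sampler.
N = 10_000
y = rng.normal(2.0, 1.0, N)
post_var = 1.0 / (1.0 / 10.0 + N)
post_mean = post_var * y.sum()

theta_hat = y.mean()                          # mode proxy for control variates
full_grad_hat = (y - theta_hat).sum()         # full-data gradient at theta_hat

def grad_estimate(theta, batch):
    # Control-variate estimator: full gradient at theta_hat plus a
    # subsampled correction term (zero-variance for this Gaussian model).
    corr = (N / len(batch)) * ((y[batch] - theta) - (y[batch] - theta_hat)).sum()
    return -theta / 10.0 + full_grad_hat + corr

step, n_iter, batch_size = 1e-4, 5_000, 100
theta, samples = 0.0, []
for _ in range(n_iter):
    batch = rng.integers(0, N, batch_size)
    # SGLD update: half-step gradient drift plus injected Gaussian noise.
    theta += 0.5 * step * grad_estimate(theta, batch) + rng.normal(0.0, np.sqrt(step))
    samples.append(theta)

print(round(np.mean(samples[1000:]), 2), round(post_mean, 2))
```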

  • 50.
    Burauskaite-Harju, Agne
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Arts and Sciences.
    Characterizing Temporal Changes and Inter-Site Correlations in Daily and Sub-Daily Precipitation Extremes2011Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Information on weather extremes is essential for risk awareness in the planning of infrastructure and agriculture, and it may also play a key role in our ability to adapt to recurrent or more or less unique extreme events. This thesis reports new statistical methodologies that can aid climate risk assessment under conditions of climate change. Increasing access to data of high temporal resolution is a central factor in developing novel techniques for this purpose. In particular, a procedure is introduced for analysis of long-term changes in daily and sub-daily records of observed or modelled weather extremes. Extreme value theory is employed to enhance the power of the proposed statistical procedure, and inter-site dependence is taken into account to enable regional analyses. Furthermore, new methods are presented to summarize and visualize spatial patterns in the temporal synchrony and dependence of weather events such as heavy precipitation at a network of meteorological stations. The work also demonstrates the significance of accounting for temporal synchrony in the diagnostics of inter-site asymptotic dependence.

    List of papers
    1. A test for network-wide trends in rainfall extremes
    2012 (English)In: International Journal of Climatology, ISSN 0899-8418, E-ISSN 1097-0088, Vol. 32, no 1, 86-94 p.Article in journal (Refereed) Published
    Abstract [en]

    Temporal trends in meteorological extremes are often examined by first reducing daily data to annual index values, such as the 95th or 99th percentiles. Here, we report how this idea can be elaborated to provide an efficient test for trends at a network of stations. The initial step is to make separate estimates of tail probabilities of precipitation amounts for each combination of station and year by fitting a generalised Pareto distribution (GPD) to data above a user-defined threshold. The resulting time series of annual percentile estimates are subsequently fed into a multivariate Mann-Kendall (MK) test for monotonic trends. We performed extensive simulations using artificially generated precipitation data and noted that the power of the tests for temporal trends was substantially enhanced when GPD percentiles were substituted for ordinary percentiles. Furthermore, we found that the trend detection was robust to misspecification of the extreme value distribution. An advantage of the MK test is that it can accommodate non-linear trends, and it can also take into account the dependencies between stations in a network. To illustrate our approach, we used long time series of precipitation data from a network of stations in the Netherlands.
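
    The two-step procedure above (annual GPD percentiles, then a monotonic trend test) can be sketched for one station as follows; a univariate Kendall test stands in for the multivariate MK test, and the synthetic station record is an assumption made for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Synthetic station record: threshold exceedances whose scale grows
# slowly over 40 years, mimicking a trend in heavy rainfall.
years = np.arange(40)
p99 = []
for t in years:
    excess = stats.genpareto.rvs(0.1, scale=5.0 + 0.05 * t, size=400,
                                 random_state=rng)
    # Fit a GPD to the exceedances and estimate the annual 99th percentile.
    c, loc, scale = stats.genpareto.fit(excess, floc=0.0)
    p99.append(stats.genpareto.ppf(0.99, c, loc=loc, scale=scale))

# Monotonic trend test on the annual percentile series via Kendall's tau.
tau, pval = stats.kendalltau(years, p99)
print(round(tau, 2), pval < 0.05)
```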

    Place, publisher, year, edition, pages
    Wiley, 2012
    Keyword
    climate extremes; precipitation; temporal trend; generalised Pareto distribution; climate indices; global warming
    National Category
    Climate Research; Probability Theory and Statistics
    Identifiers
    urn:nbn:se:liu:diva-63099 (URN)10.1002/joc.2263 (DOI)000298733800007 ()
    Note
    Funding agencies: Swedish Environmental Protection Agency. Available from: 2010-12-13 Created: 2010-12-10 Last updated: 2012-02-27
    2. Statistical framework for assessing trends in sub-daily and daily precipitation extremes
    (English)Manuscript (preprint) (Other academic)
    Abstract [en]

    Extreme precipitation events vary with regard to duration, and hence sub-daily data do not necessarily exhibit the same trends as daily data. Here, we present a framework for a comprehensive yet easily undertaken statistical analysis of long-term trends in daily and sub-daily extremes. A parametric peaks-over-threshold model is employed to estimate annual percentiles for data of different temporal resolution. Moreover, a trend-duration-frequency table is used to summarize how the statistical significance of trends in annual percentiles varies with the temporal resolution of the underlying data and the severity of the extremes. The proposed framework also includes nonparametric tests that can integrate information about nonlinear monotonic trends at a network of stations. To illustrate our methodology, we use climate model output data from Kalmar, Sweden, and observational data from Vancouver, Canada. In both cases, the results show different trends for moderate and high extremes, as well as a clear difference in the statistical evidence of trends for daily and sub-daily data.

    Keyword
    Rainfall extremes; precipitation; sub-daily; temporal trend; generalized Pareto distribution; climate indices; global warming
    National Category
    Probability Theory and Statistics
    Identifiers
    urn:nbn:se:liu:diva-71296 (URN)
    Available from: 2011-10-10 Created: 2011-10-10 Last updated: 2011-10-10 Bibliographically approved
    3. Characterizing and visualizing spatio-temporal patterns in hourly precipitation records
    2012 (English)In: Journal of Theoretical and Applied Climatology, ISSN 0177-798X, E-ISSN 1434-4483, Vol. 109, no 3-4, 333-343 p.Article in journal (Refereed) Published
    Abstract [en]

    We develop new techniques to summarize and visualize spatial patterns of coincidence in weather events, such as more or less heavy precipitation, at a network of meteorological stations. The cosine similarity measure, which has a simple probabilistic interpretation for vectors of binary data, is generalized to characterize spatial dependencies of events that may reach different stations with a variable time lag. More specifically, we reduce such patterns to three parameters (dominant time lag, maximum cross-similarity, and window-maximum similarity) that can easily be computed for each pair of stations in a network. Furthermore, we visualize such three-parameter summaries using colour-coded maps of dependencies to a given reference station and distance-decay plots for the entire network. Applications to hourly precipitation data from a network of 93 stations in Sweden illustrate how this method can be used to explore spatial patterns in the temporal synchrony of precipitation events.
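
    The lag-generalized cosine similarity can be sketched for a single pair of hypothetical stations as below; the event rates and the 2-hour lag are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(8)

# Two hypothetical stations: rain events at station B echo station A
# with a 2-hour delay (binary event indicators per hour).
hours = 5_000
a = (rng.random(hours) < 0.1).astype(float)
b = np.roll(a, 2)
b[rng.random(hours) < 0.02] = 1.0             # some unrelated events at B

def cosine(u, v):
    # For 0/1 vectors this relates directly to the probability of a joint
    # event, normalised by the marginal event probabilities.
    return (u @ v) / np.sqrt((u @ u) * (v @ v))

# Cross-similarity over candidate lags; the argmax gives the dominant
# time lag and the maximum gives the maximum cross-similarity.
lags = range(-6, 7)
sims = [cosine(a, np.roll(b, -k)) for k in lags]
dominant_lag = list(lags)[int(np.argmax(sims))]
print(dominant_lag, round(max(sims), 2))
```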

    Place, publisher, year, edition, pages
    Springer, 2012
    Keyword
    precipitation; hourly rainfall records; spatial dependence; time lag; cosine similarity
    National Category
    Probability Theory and Statistics
    Identifiers
    urn:nbn:se:liu:diva-71297 (URN)10.1007/s00704-011-0574-x (DOI)000307243900002 ()
    Note

    Funding agencies: Swedish Research Council (VR); Gothenburg Atmospheric Science Centre (GAC); FORMAS (2007-1048-8700*51)

    Available from: 2011-10-10 Created: 2011-10-10 Last updated: 2012-11-01 Bibliographically approved
    4. Diagnostics for tail dependence in time-lagged random fields of precipitation
    2013 (English)In: Journal of Theoretical and Applied Climatology, ISSN 0177-798X, E-ISSN 1434-4483, Vol. 112, no 3-4, 629-636 p.Article in journal (Refereed) Published
    Abstract [en]

    Weather extremes often occur along fronts passing different sites with some time lag. Here, we show how such temporal patterns can be taken into account when exploring inter-site dependence of extremes. We incorporate time lags into existing models and into measures of extremal associations and their relation to the distance between the investigated sites. Furthermore, we define summarizing parameters that can be used to explore tail dependence for a whole network of stations in the presence of fixed or stochastic time lags. Analysis of hourly precipitation data from Sweden showed that our methods can prevent underestimation of the strength and spatial extent of tail dependencies.

    Keyword
    Precipitation; Sub-daily; Tail dependence; Spatial dependence; Time lag
    National Category
    Probability Theory and Statistics
    Identifiers
    urn:nbn:se:liu:diva-71298 (URN)10.1007/s00704-012-0748-1 (DOI)000318246300022 ()
    Available from: 2011-10-10 Created: 2011-10-10 Last updated: 2013-05-31 Bibliographically approved