Bootstrap confidence intervals for large-scale multivariate monotonic regression problems
Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Science & Engineering.
Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Science & Engineering.
Linköping University, Department of Mathematics, Optimization. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0003-1836-4200
2016 (English). In: Communications in statistics. Simulation and computation, ISSN 0361-0918, E-ISSN 1532-4141, Vol. 45, no 3, 1025-1040 p. Article in journal (Refereed). Published
Abstract [en]

Recently, the methods used to estimate monotonic regression (MR) models have been substantially improved, and some algorithms can now produce high-accuracy monotonic fits to multivariate datasets containing over a million observations. Nevertheless, the computational burden can be prohibitively large for resampling techniques in which numerous datasets are processed independently of each other. Here, we present efficient algorithms for estimation of confidence limits in large-scale settings that take into account the similarity of the bootstrap or jackknifed datasets to which MR models are fitted. In addition, we introduce modifications that substantially improve the accuracy of MR solutions for binary response variables. The performance of our algorithms is illustrated using data on death in coronary heart disease for a large population. This example also illustrates that MR can be a valuable complement to logistic regression.
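
For orientation, the baseline that the abstract describes as computationally burdensome can be sketched in a few lines. The following is a minimal illustration, not the authors' algorithm: it handles only a single predictor with distinct values, uses the classical pool-adjacent-violators (PAV) procedure named in the keywords, and refits every bootstrap dataset independently. The function names `pav` and `bootstrap_ci`, the percentile interval, and the step-function evaluation at `x0` are all assumptions made for this sketch.

```python
import random


def pav(y, w=None):
    """Weighted least-squares non-decreasing fit via pool-adjacent-violators."""
    w = [1.0] * len(y) if w is None else list(w)
    blocks = []  # each block: [weighted mean, total weight, number of points]
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Pool the last two blocks for as long as they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


def bootstrap_ci(x, y, x0, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the monotonic fit at x0.

    Assumes the values in x are distinct. Each bootstrap dataset is
    represented by integer weights (how often each point was drawn),
    and every dataset is refitted from scratch.
    """
    rng = random.Random(seed)
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    n = len(xs)
    estimates = []
    for _ in range(n_boot):
        counts = [0] * n
        for _ in range(n):
            counts[rng.randrange(n)] += 1
        kept = [i for i in range(n) if counts[i] > 0]
        fit = pav([ys[i] for i in kept], [counts[i] for i in kept])
        # Evaluate the fitted step function at x0: take the value at the
        # largest resampled x not exceeding x0 (or the first value if none).
        value = fit[0]
        for i, f in zip(kept, fit):
            if xs[i] > x0:
                break
            value = f
        estimates.append(value)
    estimates.sort()
    lower = estimates[int((alpha / 2) * (n_boot - 1))]
    upper = estimates[int((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper


if __name__ == "__main__":
    gen = random.Random(1)
    x = list(range(50))
    y = [0.1 * xi + gen.gauss(0.0, 0.5) for xi in x]
    print(bootstrap_ci(x, y, x0=25))
```

Representing each resample as a weight vector over the original points makes the overlap between bootstrap datasets explicit, but this sketch does not exploit that overlap; every one of the `n_boot` fits starts from scratch.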

Place, publisher, year, edition, pages
Taylor & Francis, 2016. Vol. 45, no 3, 1025-1040 p.
Keyword [en]
Big data, Bootstrap, Confidence intervals, Monotonic regression, Pool-adjacent-violators algorithm
National Category
Probability Theory and Statistics; Computational Mathematics
Identifiers
URN: urn:nbn:se:liu:diva-85169
DOI: 10.1080/03610918.2014.911899
ISI: 000372527900014
OAI: oai:DiVA.org:liu-85169
DiVA: diva2:565741
Note

At the time of the doctoral defence, this publication existed as a manuscript.

Available from: 2012-11-08 Created: 2012-11-08 Last updated: 2017-12-13
In thesis
1. Monotonic regression for large multivariate datasets
2010 (English). Doctoral thesis, comprehensive summary (Other academic)
Alternative title [sv]
Monoton regression för stora multivariata datamaterial
Abstract [en]

Monotonic regression is a non-parametric statistical method that is designed especially for applications in which the expected value of a response variable increases or decreases in one or more explanatory variables. Such applications can be found in business, physics, biology, medicine, signal processing, and other areas. Inasmuch as many of the collected datasets can contain a very large number of multivariate observations, there is a strong need for efficient numerical algorithms. Here, we present new methods that make it feasible to fit monotonic functions to more than one hundred thousand data points. By simulation, we show that our algorithms have high accuracy and represent considerable improvements with respect to computational time and memory requirements. In particular, we demonstrate how segmentation of a large-scale problem can greatly improve the performance of existing algorithms. Moreover, we show how the uncertainty of a monotonic regression model can be estimated. One of the procedures we developed can be employed to estimate the variance of the random error present in the observed response. Other procedures are based on resampling techniques and can provide confidence intervals for the expected response at given levels of a set of predictors.
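
The fitting problem described above is commonly written as a constrained least-squares problem. The formulation below is the standard textbook one, given here as an assumption about notation since this record does not state it: observations $(x_i, y_i)$ with $x_i \in \mathbb{R}^p$, positive weights $w_i$, fitted values $f_i$, and $\preceq$ denoting the componentwise partial order on the predictors.

```latex
\begin{aligned}
\min_{f \in \mathbb{R}^n} \quad & \sum_{i=1}^{n} w_i \,(f_i - y_i)^2 \\
\text{subject to} \quad & f_i \le f_j \quad \text{whenever } x_i \preceq x_j,
\qquad x_i \preceq x_j \iff x_{ik} \le x_{jk} \ \text{for all } k = 1, \dots, p .
\end{aligned}
```

For a decreasing relationship in some predictor, the sign of that predictor can be flipped so that the same formulation applies. With $p = 1$ the constraints form a chain; with $p > 1$ the order is only partial, so the number of comparable pairs can grow quadratically in $n$.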

Abstract [sv] (English translation)

Monotonic regression is a non-parametric statistical method developed especially for applications in which the expected value of a response variable increases or decreases with one or more explanatory variables. Such applications are found in business administration, physics, biology, medicine, signal processing, and other areas. Because many collected datasets can contain a very large number of multivariate observations, there is a strong need for efficient numerical algorithms. Here we present new methods that make it possible to fit monotonic functions to more than 100,000 data points. Through simulation we show that our algorithms have high accuracy and offer considerable improvements in computation time and memory requirements. In particular, we show how segmentation of a large-scale problem can greatly improve existing algorithms. In addition, we show how the uncertainty of a monotonic regression model can be estimated. One of the methods we developed can be used to estimate the variance of the random components that may be present in the observed response variable. Other methods, based on resampling, can provide confidence intervals for the expected response at given values of a number of predictors.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2010. 75 p.
Series
Linköping Studies in Statistics, ISSN 1651-1700 ; 11
Linköping Studies in Arts and Science, ISSN 0282-9800 ; 514
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:liu:diva-65349 (URN)
978-91-7393-412-1 (ISBN)
Public defence
2010-04-16, Glashuset, Building B, Campus Valla, Linköpings universitet, Linköping, 13:15 (English)
Opponent
Available from: 2011-02-04 Created: 2011-02-04 Last updated: 2012-11-08. Bibliographically approved

Authority records

Sysoev, Oleg; Grimvall, Anders; Burdakov, Oleg
