A segmentation-based algorithm for large-scale partially ordered monotonic regression
Linköping University, Department of Computer and Information Science. Linköping University, The Institute of Technology.
Linköping University, The Institute of Technology. Linköping University, Department of Mathematics, Optimization. ORCID iD: 0000-0003-1836-4200
Linköping University, Department of Computer and Information Science. Linköping University, The Institute of Technology.
2011 (English). In: Computational Statistics & Data Analysis, ISSN 0167-9473, Vol. 55, no 8, 2463-2476 p. Article in journal (Refereed). Published
Abstract [en]

Monotonic regression (MR) is an efficient tool for estimating functions that are monotonic with respect to input variables. A fast and highly accurate approximate algorithm called the GPAV was recently developed for efficiently solving large-scale multivariate MR problems. When such problems are too large, the GPAV becomes too demanding in terms of computational time and memory. An approach that extends the application area of the GPAV to encompass much larger MR problems is presented. It is based on segmentation of a large-scale MR problem into a set of moderate-scale MR problems, each solved by the GPAV. The major contribution is the development of a computationally efficient strategy that produces a monotonic response using the local solutions. A theoretically motivated trend-following technique is introduced to ensure higher accuracy of the solution. The presented results of extensive simulations on very large data sets demonstrate the high efficiency of the new algorithm.
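
For orientation only, the sketch below shows the classical pool-adjacent-violators (PAV) idea in the simplest setting: a one-dimensional, totally ordered sequence. It is not the GPAV and not the segmentation strategy of the article, neither of which is specified in this record. The function name `pav` and its signature are assumptions made for this illustration, and weights are assumed to be strictly positive.

```python
from typing import List, Optional, Sequence


def pav(y: Sequence[float], w: Optional[Sequence[float]] = None) -> List[float]:
    """Weighted least-squares non-decreasing fit to y (classical 1-D PAV).

    Illustrative sketch only: handles a totally ordered sequence, not the
    partially ordered multivariate case addressed by the GPAV.
    Weights are assumed to be strictly positive.
    """
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points].
    blocks: List[list] = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    # Expand each block back to one fitted value per observation.
    fit: List[float] = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


if __name__ == "__main__":
    print(pav([1, 3, 2, 4]))  # the violating pair (3, 2) is pooled to its mean
```

Pooling adjacent violators into their weighted mean is what makes the fitted values non-decreasing while staying as close as possible to the data in the least-squares sense.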

Place, publisher, year, edition, pages
Elsevier Science B.V., Amsterdam, 2011. Vol. 55, no 8, 2463-2476 p.
Keyword [en]
Quadratic programming, Large-scale optimization, Least distance problem, Monotonic regression, Partially ordered data set, Pool-adjacent-violators algorithm
National Category
Social Sciences
URN: urn:nbn:se:liu:diva-69182; DOI: 10.1016/j.csda.2011.03.001; ISI: 000291181000002; OAI: diva2:424299
Available from: 2011-06-17 Created: 2011-06-17 Last updated: 2015-06-02
In thesis
1. Monotonic regression for large multivariate datasets
2010 (English). Doctoral thesis, comprehensive summary (Other academic)
Alternative title[sv]
Monoton regression för stora multivariata datamaterial
Abstract [en]

Monotonic regression is a non-parametric statistical method that is designed especially for applications in which the expected value of a response variable increases or decreases in one or more explanatory variables. Such applications can be found in business, physics, biology, medicine, signal processing, and other areas. Inasmuch as many of the collected datasets can contain a very large number of multivariate observations, there is a strong need for efficient numerical algorithms. Here, we present new methods that make it feasible to fit monotonic functions to more than one hundred thousand data points. By simulation, we show that our algorithms have high accuracy and represent considerable improvements with respect to computational time and memory requirements. In particular, we demonstrate how segmentation of a large-scale problem can greatly improve the performance of existing algorithms. Moreover, we show how the uncertainty of a monotonic regression model can be estimated. One of the procedures we developed can be employed to estimate the variance of the random error present in the observed response. Other procedures are based on resampling techniques and can provide confidence intervals for the expected response at given levels of a set of predictors.

Abstract [sv] (translated to English)

Monotonic regression is a non-parametric statistical method developed especially for applications in which the expected value of a response variable increases or decreases with one or more explanatory variables. Such applications are found in business administration, physics, biology, medicine, signal processing, and other areas. Because many collected datasets can contain a very large number of multivariate observations, there is a strong need for efficient numerical algorithms. Here we present new methods that make it possible to fit monotonic functions to more than 100,000 data points. Through simulation, we show that our algorithms have high accuracy and offer considerable improvements with respect to computational time and memory requirements. In particular, we show how segmentation of a large-scale problem can greatly improve existing algorithms. In addition, we show how the uncertainty of a monotonic regression model can be estimated. One of the methods we developed can be used to estimate the variance of the random components that may be present in the observed response variable. Other methods, based on resampling, can provide confidence intervals for the expected response at given values of a number of predictors.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2010. 75 p.
Linköping Studies in Statistics, ISSN 1651-1700 ; 11
Linköping Studies in Arts and Science, ISSN 0282-9800 ; 514
National Category
Probability Theory and Statistics
urn:nbn:se:liu:diva-65349 (URN); 978-91-7393-412-1 (ISBN)
Public defence
2010-04-16, Glashuset, Building B, Campus Valla, Linköpings universitet, Linköping, 13:15 (English)
Available from: 2011-02-04 Created: 2011-02-04 Last updated: 2012-11-08 Bibliographically approved

Authors
Sysoev, Oleg; Burdakov, Oleg; Grimvall, Anders