liu.se: Search for publications in DiVA
1 - 24 of 24
  • 1.
    Sysoev, Oleg
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Burdakov, Oleg
    Linköping University, Department of Mathematics, Optimization. Linköping University, Faculty of Science & Engineering.
    A smoothed monotonic regression via L2 regularization (2019). In: Knowledge and Information Systems, ISSN 0219-1377, E-ISSN 0219-3116, Vol. 59, no 1, p. 197-218. Article in journal (Refereed)
    Abstract [en]

    Monotonic regression is a standard method for extracting a monotone function from non-monotonic data, and it is used in many applications. However, a known drawback of this method is that its fitted response is a piecewise constant function, while practical response functions are often required to be continuous. The method proposed in this paper achieves monotonicity and smoothness of the regression by introducing an L2 regularization term. To achieve a low computational complexity and at the same time provide a high predictive power, we introduce a probabilistically motivated approach for selecting the regularization parameters. In addition, we present a technique for correcting inconsistencies on the boundary. We show that the complexity of the proposed method is O(n²). Our simulations demonstrate that when the data are large and the expected response is a complicated function (which is typical in machine learning applications), or when there is a change point in the response, the proposed method has a higher predictive power than many of the existing methods.
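    For readers unfamiliar with the baseline, the piecewise-constant behaviour mentioned above comes from the classical pool-adjacent-violators (PAV) algorithm. The following is a minimal sketch of plain PAV for completely ordered data, for context only; it does not include the paper's L2 smoothing, and the function name pav_fit is ours.

        import numpy as np

        def pav_fit(y, w=None):
            """Classical pool-adjacent-violators: least-squares monotone
            (non-decreasing) fit to a completely ordered sequence y."""
            y = np.asarray(y, dtype=float)
            w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
            blocks = []  # each block: [mean value, total weight, point count]
            for yi, wi in zip(y, w):
                blocks.append([yi, wi, 1])
                # Merge backwards while adjacent block means violate monotonicity.
                while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
                    v1, w1, n1 = blocks.pop()
                    v0, w0, n0 = blocks.pop()
                    wm = w0 + w1
                    blocks.append([(w0 * v0 + w1 * v1) / wm, wm, n0 + n1])
            # Expand block means back to one fitted value per observation.
            return np.concatenate([np.full(n, v) for v, _, n in blocks])

        # The fit is a step function, constant on each merged block:
        print(pav_fit([1.0, 3.0, 2.0, 4.0, 3.5]))  # [1.0, 2.5, 2.5, 3.75, 3.75]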

  • 2.
    Svahn, Caroline
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences. Ericsson AB, Sweden.
    Sysoev, Oleg
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Cirkic, Mirsad
    Ericsson AB, Sweden.
    Gustafsson, Fredrik
    Ericsson AB, Sweden.
    Berglund, Joel
    Ericsson AB, Sweden.
    Inter-Frequency Radio Signal Quality Prediction for Handover, Evaluated in 3GPP LTE (2019). Conference paper (Refereed)
    Abstract [en]

    Radio resource management in cellular networks is typically based on device measurements reported to the serving base station. Frequent measuring of signal quality on available frequencies would allow for highly reliable networks and optimal connection at […]

  • 3.
    Sysoev, Oleg
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Bartoszek, Krzysztof
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Ekström, Eva-Charlotte
    Uppsala University, Akademiska Sjukhuset, Uppsala, Sweden.
    Ekström Selling, Katarina
    Uppsala University, Akademiska Sjukhuset, Uppsala, Sweden.
    PSICA: Decision trees for probabilistic subgroup identification with categorical treatments (2019). In: Statistics in Medicine, ISSN 0277-6715, E-ISSN 1097-0258, Vol. 38, no 22, p. 4436-4452. Article in journal (Refereed)
    Abstract [en]

    Personalized medicine aims at identifying the best treatments for a patient with given characteristics. It has been shown in the literature that these methods can lead to great improvements in medicine compared to traditional methods prescribing the same treatment to all patients. Subgroup identification is a branch of personalized medicine which aims at finding subgroups of patients with similar characteristics for which some of the investigated treatments have a better effect than the others. A number of approaches based on decision trees have been proposed to identify such subgroups, but most of them focus on two-arm trials (control/treatment), while a few methods consider quantitative treatments (defined by the dose). However, no subgroup identification method exists that can predict the best treatments in a scenario with a categorical set of treatments. We propose a novel method for subgroup identification in categorical treatment scenarios. This method outputs a decision tree showing the probabilities of a given treatment being the best for a given group of patients, as well as labels showing the possible best treatments. The method is implemented in an R package psica available on CRAN. In addition to a simulation study, we present an analysis of a community-based nutrition intervention trial that justifies the validity of our method.

  • 4.
    Svefors, Pernilla
    et al.
    Uppsala Universitet, Uppsala, Sweden; Center for Epidemiology and Community Medicine, Stockholm, Sweden.
    Sysoev, Oleg
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Ekstrom, Eva-Charlotte
    Uppsala Universitet, Uppsala, Sweden.
    Persson, Lars Ake
    London School of Hygiene and Tropical Medicine, London, UK.
    Arifeen, Shams E
    International Centre for Diarrhoeal Disease Research, Dhaka, Bangladesh.
    Naved, Ruchira T
    International Centre for Diarrhoeal Disease Research, Dhaka, Bangladesh.
    Rahman, Anisur
    International Centre for Diarrhoeal Disease Research, Dhaka, Bangladesh.
    Khan, Ashraful Islam
    International Centre for Diarrhoeal Disease Research, Dhaka, Bangladesh.
    Selling, Katarina
    Uppsala Universitet, Uppsala, Sweden.
    Relative importance of prenatal and postnatal determinants of stunting: data mining approaches to the MINIMat cohort, Bangladesh (2019). In: BMJ Open, ISSN 2044-6055, E-ISSN 2044-6055, Vol. 9, no 8. Article in journal (Refereed)
    Abstract [en]

    Introduction: WHO has set a goal to reduce the prevalence of stunted child growth by 40% by the year 2025. To reach this goal, it is imperative to establish the relative importance of risk factors for stunting to deliver appropriate interventions. Currently, most interventions take place in late infancy and early childhood. This study aimed to identify the most critical prenatal and postnatal determinants of linear growth 0–24 months and the risk factors for stunting at 2 years, and to identify subgroups with different growth trajectories and levels of stunting at 2 years.

    Methods: Conditional inference tree-based methods were applied to the extensive Maternal and Infant Nutrition Interventions in Matlab trial database with 309 variables of 2723 children, their parents and living conditions, including socioeconomic, nutritional and other biological characteristics of the parents; maternal exposure to violence; household food security; breast and complementary feeding; and measurements of morbidity of the mothers during pregnancy and repeatedly of their children up to 24 months of age. Child anthropometry was measured monthly from birth to 12 months, thereafter quarterly to 24 months.

    Results: Birth length and weight were the most critical factors for linear growth 0–24 months and stunting at 2 years, followed by maternal anthropometry and parental education. Conditions after birth, such as feeding practices and morbidity, were less strongly associated with linear growth trajectories and stunting at 2 years.

    Conclusion: The results of this study emphasise the benefit of interventions before conception and during pregnancy to reach a substantial reduction in stunting.

  • 5.
    Burdakov, Oleg
    et al.
    Linköping University, Department of Mathematics, Optimization. Linköping University, Faculty of Science & Engineering.
    Sysoev, Oleg
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Arts and Sciences.
    A Dual Active-Set Algorithm for Regularized Monotonic Regression (2017). In: Journal of Optimization Theory and Applications, ISSN 0022-3239, E-ISSN 1573-2878, Vol. 172, no 3, p. 929-949. Article in journal (Refereed)
    Abstract [en]

    Monotonic (isotonic) regression is a powerful tool used for solving a wide range of important applied problems. One of its features, which poses a limitation on its use in some areas, is that it produces a piecewise constant fitted response. For smoothing the fitted response, we introduce a regularization term in the monotonic regression, formulated as a least distance problem with monotonicity constraints. The resulting smoothed monotonic regression is a convex quadratic optimization problem. We focus on the case where the set of observations is completely (linearly) ordered. Our smoothed pool-adjacent-violators algorithm is designed for solving the regularized problem. It belongs to the class of dual active-set algorithms. We prove that it converges to the optimal solution in a finite number of iterations that does not exceed the problem size. One of its advantages is that the active set is progressively enlarged by including one or, typically, several constraints per iteration. This allowed large-scale test problems to be solved in a few iterations, whereas their size was prohibitively large for conventional quadratic optimization solvers. Although the complexity of our algorithm grows quadratically with the problem size, we found its running time to grow almost linearly in our computational experiments.
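    A plausible way to write the regularized problem described above as the convex quadratic program the abstract refers to, for completely ordered observations (the exact form of the penalty and its weight $\mu$ are our assumption; the paper's parameterization may differ):

    \begin{equation} \min_{x \in R^n} \; \sum_{i=1}^{n} w_i (x_i - a_i)^2 + \mu \sum_{i=1}^{n-1} (x_{i+1} - x_i)^2 \quad \mbox{s.t.} \quad x_1 \le x_2 \le \dots \le x_n, \end{equation}

    where $a$ is the observed response, $w$ are observation weights, and $\mu > 0$ controls smoothness; $\mu = 0$ recovers the classical piecewise-constant monotonic regression.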

  • 6.
    Burdakov, Oleg
    et al.
    Linköping University, Department of Mathematics, Optimization. Linköping University, Faculty of Science & Engineering.
    Sysoev, Oleg
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    A Dual Active-Set Algorithm for Regularized Slope-Constrained Monotonic Regression (2017). In: Iranian Journal of Operations Research, ISSN 2008-1189, Vol. 8, no 2, p. 40-47. Article in journal (Refereed)
    Abstract [en]

    In many problems, it is necessary to take into account monotonic relations. Monotonic (isotonic) Regression (MR) is often involved in solving such problems. The MR solutions are of a step-shaped form with a typical sharp change of values between adjacent steps. This, in some applications, is regarded as a disadvantage. We recently introduced a Smoothed MR (SMR) problem, which is obtained from the MR by adding a regularization penalty term. The SMR is aimed at smoothing the aforementioned sharp change. Moreover, its solution has a far less pronounced step structure, if any. The purpose of this paper is to further improve the SMR solution by getting rid of such a structure. This is achieved by introducing a lower bound on the slope in the SMR. We call the result the Smoothed Slope-Constrained MR (SSCMR) problem. It is shown here how to reduce it to the SMR, which is a convex quadratic optimization problem. The Smoothed Pool-Adjacent-Violators (SPAV) algorithm, developed in our recent publications for solving the SMR problem, is adapted here to solving the SSCMR problem. This algorithm belongs to the class of dual active-set algorithms. Although the complexity of the SPAV algorithm is O(n²), its running time grows almost linearly with n in our computational experiments. We present numerical results which illustrate the predictive performance of our approach. They also show that the SSCMR solution is free of the undesirable features of the MR and SMR solutions.
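    The reduction mentioned in the abstract can be illustrated as follows (our sketch under the assumption of unit spacing between observations; the paper's transformation may be more general). With a required minimum slope $\delta \ge 0$, the change of variables $y_i = x_i - i\delta$ gives

    \begin{equation} x_{i+1} - x_i \ge \delta \quad \Longleftrightarrow \quad y_{i+1} \ge y_i, \end{equation}

    so the slope-constrained problem in $x$ turns into an ordinary monotonicity-constrained problem in $y$, which the SPAV algorithm already handles.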

  • 7.
    Sysoev, Oleg
    et al.
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Arts and Sciences.
    Burdakov, Oleg
    Linköping University, Department of Mathematics, Optimization. Linköping University, Faculty of Science & Engineering.
    A Smoothed Monotonic Regression via L2 Regularization (2016). Report (Other academic)
    Abstract [en]

    Monotonic Regression (MR) is a standard method for extracting a monotone function from non-monotonic data, and it is used in many applications. However, a known drawback of this method is that its fitted response is a piecewise constant function, while practical response functions are often required to be continuous. The method proposed in this paper achieves monotonicity and smoothness of the regression by introducing an L2 regularization term, and it is shown that the complexity of this method is O(n²). In addition, our simulations demonstrate that the proposed method normally has higher predictive power than some commonly used alternative methods, such as monotonic kernel smoothers. In contrast to these methods, our approach is probabilistically motivated and has connections to Bayesian modeling.

  • 8.
    Kalish, Michael L.
    et al.
    Syracuse University, USA.
    Dunn, John C.
    University of Adelaide, Australia.
    Burdakov, Oleg P.
    Linköping University, Department of Mathematics, Optimization. Linköping University, Faculty of Science & Engineering.
    Sysoev, Oleg
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Arts and Sciences.
    A statistical test of the equality of latent orders (2016). In: Journal of Mathematical Psychology (Print), ISSN 0022-2496, E-ISSN 1096-0880, Vol. 70, p. 1-11, article id YJMPS2051. Article in journal (Refereed)
    Abstract [en]

    It is sometimes the case that a theory proposes that the population means on two variables should have the same rank order across a set of experimental conditions. This paper presents a test of this hypothesis. The test statistic is based on the coupled monotonic regression algorithm developed by the authors. The significance of the test statistic is determined by comparison to an empirical distribution specific to each case, obtained via non-parametric or semi-parametric bootstrap. We present an analysis of the power and Type I error control of the test based on numerical simulation. Partial order constraints placed on the variables may sometimes be theoretically justified. These constraints are easily incorporated into the computation of the test statistic and are shown to have substantial effects on power. The test can be applied to any form of data, as long as an appropriate statistical model can be specified.
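    The calibration step described above follows the usual empirical-null pattern. Below is a generic sketch of that pattern only; the placeholders simulate_null and statistic stand in for the authors' coupled-monotonic-regression machinery, which is specified in the paper.

        import numpy as np

        def bootstrap_p_value(observed_stat, simulate_null, statistic, n_boot=2000):
            """Generic bootstrap calibration: compare an observed test statistic
            with its empirical distribution under the null hypothesis.
            simulate_null() draws one dataset under the null (here: equal latent
            orders); statistic(data) maps a dataset to the scalar test statistic."""
            null_stats = np.array([statistic(simulate_null()) for _ in range(n_boot)])
            # One-sided p-value with the usual +1 correction.
            return (np.sum(null_stats >= observed_stat) + 1) / (n_boot + 1)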

  • 9.
    Sysoev, Oleg
    et al.
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Science & Engineering.
    Grimvall, Anders
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Science & Engineering.
    Burdakov, Oleg
    Linköping University, Department of Mathematics, Optimization. Linköping University, Faculty of Science & Engineering.
    Bootstrap confidence intervals for large-scale multivariate monotonic regression problems (2016). In: Communications in Statistics - Simulation and Computation, ISSN 0361-0918, E-ISSN 1532-4141, Vol. 45, no 3, p. 1025-1040. Article in journal (Refereed)
    Abstract [en]

    Recently, the methods used to estimate monotonic regression (MR) models have been substantially improved, and some algorithms can now produce high-accuracy monotonic fits to multivariate datasets containing over a million observations. Nevertheless, the computational burden can be prohibitively large for resampling techniques in which numerous datasets are processed independently of each other. Here, we present efficient algorithms for estimation of confidence limits in large-scale settings that take into account the similarity of the bootstrapped or jackknifed datasets to which MR models are fitted. In addition, we introduce modifications that substantially improve the accuracy of MR solutions for binary response variables. The performance of our algorithms is illustrated using data on death in coronary heart disease for a large population. This example also illustrates that MR can be a valuable complement to logistic regression.
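    As a one-dimensional illustration of the kind of resampling the paper speeds up, here is a naive percentile bootstrap around a monotone fit, using scikit-learn's IsotonicRegression. This sketch refits every resample independently, which is exactly the cost the paper's algorithms avoid; the data-generating function is made up for the example.

        import numpy as np
        from sklearn.isotonic import IsotonicRegression

        rng = np.random.default_rng(1)
        x = np.sort(rng.uniform(0, 10, 300))
        y = np.log1p(x) + rng.normal(scale=0.3, size=x.size)  # toy monotone signal

        grid = np.linspace(0.5, 9.5, 50)
        fits = np.empty((200, grid.size))
        for b in range(200):
            idx = rng.integers(0, x.size, size=x.size)   # resample (x, y) pairs
            iso = IsotonicRegression(out_of_bounds="clip")
            iso.fit(x[idx], y[idx])                      # monotone fit to resample
            fits[b] = iso.predict(grid)

        lower, upper = np.percentile(fits, [2.5, 97.5], axis=0)  # 95% pointwise CI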

  • 10.
    Burdakov, Oleg
    et al.
    Linköping University, Department of Mathematics, Optimization. Linköping University, Faculty of Science & Engineering.
    Sysoev, Oleg
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Arts and Sciences.
    Regularized monotonic regression (2016). Report (Other academic)
    Abstract [en]

    Monotonic (isotonic) Regression (MR) is a powerful tool used for solving a wide range of important applied problems. One of its features, which poses a limitation on its use in some areas, is that it produces a piecewise constant fitted response. For smoothing the fitted response, we introduce a regularization term in the MR formulated as a least distance problem with monotonicity constraints. The resulting Smoothed Monotonic Regression (SMR) is a convex quadratic optimization problem. We focus on the SMR where the set of observations is completely (linearly) ordered. Our Smoothed Pool-Adjacent-Violators (SPAV) algorithm is designed for solving the SMR. It belongs to the class of dual active-set algorithms. We proved its finite convergence to the optimal solution in, at most, n iterations, where n is the problem size. One of its advantages is that the active set is progressively enlarged by including one or, typically, several constraints per iteration. This allowed large-scale SMR test problems to be solved in a few iterations, whereas their size was prohibitively large for conventional quadratic optimization solvers. Although the complexity of the SPAV algorithm is O(n²), its running time was growing in our computational experiments in proportion to n^1.16.

  • 11.
    Sysoev, Oleg
    et al.
    Linköping University, The Institute of Technology. Linköping University, Department of Computer and Information Science, Statistics.
    Grimvall, Anders
    Linköping University, The Institute of Technology. Linköping University, Department of Computer and Information Science, Statistics.
    Burdakov, Oleg
    Linköping University, The Institute of Technology. Linköping University, Department of Mathematics, Optimization.
    Bootstrap estimation of the variance of the error term in monotonic regression models (2013). In: Journal of Statistical Computation and Simulation, ISSN 0094-9655, E-ISSN 1563-5163, Vol. 83, no 4, p. 625-638. Article in journal (Refereed)
    Abstract [en]

    The variance of the error term in ordinary regression models and linear smoothers is usually estimated by adjusting the average squared residual for the trace of the smoothing matrix (the degrees of freedom of the predicted response). However, other types of variance estimators are needed when using monotonic regression (MR) models, which are particularly suitable for estimating response functions with pronounced thresholds. Here, we propose a simple bootstrap estimator to compensate for the over-fitting that occurs when MR models are estimated from empirical data. Furthermore, we show that, in the case of one or two predictors, the performance of this estimator can be enhanced by introducing adjustment factors that take into account the slope of the response function and characteristics of the distribution of the explanatory variables. Extensive simulations show that our estimators perform satisfactorily for a great variety of monotonic functions and error distributions.
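    To make the over-fitting issue concrete: the average squared residual of a monotonic fit underestimates the error variance because the step function partly absorbs the noise. The sketch below applies the standard bootstrap bias correction to that naive estimator; it is our toy stand-in, not the adjusted estimator proposed in the paper.

        import numpy as np
        from sklearn.isotonic import IsotonicRegression

        rng = np.random.default_rng(2)

        def naive_error_variance(x, y):
            """Average squared residual of the monotone fit; biased downward."""
            fit = IsotonicRegression().fit_transform(x, y)
            return np.mean((y - fit) ** 2)

        def corrected_error_variance(x, y, n_boot=200):
            """Standard bootstrap bias correction: 2*s2 - mean(bootstrap s2)."""
            s2 = naive_error_variance(x, y)
            boot = []
            for _ in range(n_boot):
                idx = rng.integers(0, x.size, size=x.size)
                boot.append(naive_error_variance(x[idx], y[idx]))
            return 2 * s2 - np.mean(boot)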

  • 12.
    Sysoev, Oleg
    et al.
    Linköping University, Department of Computer and Information Science. Linköping University, The Institute of Technology.
    Burdakov, Oleg
    Linköping University, The Institute of Technology. Linköping University, Department of Mathematics, Optimization.
    Grimvall, Anders
    Linköping University, Department of Computer and Information Science. Linköping University, The Institute of Technology.
    A segmentation-based algorithm for large-scale partially ordered monotonic regression (2011). In: Computational Statistics & Data Analysis, ISSN 0167-9473, E-ISSN 1872-7352, Vol. 55, no 8, p. 2463-2476. Article in journal (Refereed)
    Abstract [en]

    Monotonic regression (MR) is an efficient tool for estimating functions that are monotonic with respect to input variables. A fast and highly accurate approximate algorithm called the GPAV was recently developed for efficiently solving large-scale multivariate MR problems. When such problems are too large, the GPAV becomes too demanding in terms of computational time and memory. An approach that extends the application area of the GPAV to much larger MR problems is presented. It is based on segmentation of a large-scale MR problem into a set of moderate-scale MR problems, each solved by the GPAV. The major contribution is the development of a computationally efficient strategy that produces a monotonic response using the local solutions. A theoretically motivated trend-following technique is introduced to ensure higher accuracy of the solution. The presented results of extensive simulations on very large data sets demonstrate the high efficiency of the new algorithm.
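    A toy version of the divide-and-merge idea for completely ordered data (our own simplification; the paper targets partially ordered data and uses a theoretically motivated trend-following merge instead of the crude second pass below):

        import numpy as np
        from sklearn.isotonic import IsotonicRegression

        def segmented_monotone_fit(x, y, n_segments=4):
            """Fit each contiguous segment separately, then restore global
            monotonicity with one more isotonic pass over the local fits.
            Assumes x is sorted in ascending order."""
            pieces = [
                IsotonicRegression().fit_transform(xs, ys)
                for xs, ys in zip(np.array_split(x, n_segments),
                                  np.array_split(y, n_segments))
            ]
            local = np.concatenate(pieces)   # monotone within segments only
            return IsotonicRegression().fit_transform(x, local)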

  • 13.
    Sysoev, Oleg
    Linköping University, Department of Computer and Information Science. Linköping University, The Institute of Technology.
    Monotonic regression for large multivariate datasets (2010). Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Monotonic regression is a non-parametric statistical method that is designed especially for applications in which the expected value of a response variable increases or decreases in one or more explanatory variables. Such applications can be found in business, physics, biology, medicine, signal processing, and other areas. Inasmuch as many of the collected datasets can contain a very large number of multivariate observations, there is a strong need for efficient numerical algorithms. Here, we present new methods that make it feasible to fit monotonic functions to more than one hundred thousand data points. By simulation, we show that our algorithms have high accuracy and represent considerable improvements with respect to computational time and memory requirements. In particular, we demonstrate how segmentation of a large-scale problem can greatly improve the performance of existing algorithms. Moreover, we show how the uncertainty of a monotonic regression model can be estimated. One of the procedures we developed can be employed to estimate the variance of the random error present in the observed response. Other procedures are based on resampling techniques and can provide confidence intervals for the expected response at given levels of a set of predictors.

    List of papers
    1. An O(n²) algorithm for isotonic regression problems. In: Large-Scale Nonlinear Optimization / [ed] G. Di Pillo and M. Roma, Springer-Verlag, 2006, p. 25-33. Chapter in book (Refereed)
    2. Data preordering in generalized PAV algorithm for monotonic regression. In: Journal of Computational Mathematics, ISSN 0254-9409, E-ISSN 1991-7139, Vol. 24, no 6, 2006, p. 771-790. Article in journal (Refereed)
    3. Generalized PAV algorithm with block refinement for partially ordered monotonic regression. In: Proceedings of the Workshop on Learning Monotone Models from Data / [ed] A. Feelders and R. Potharst, 2009, p. 23-37. Conference paper (Other academic)
    4. A segmentation-based algorithm for large-scale partially ordered monotonic regression. In: Computational Statistics & Data Analysis, ISSN 0167-9473, E-ISSN 1872-7352, Vol. 55, no 8, 2011, p. 2463-2476. Article in journal (Refereed)
    5. Bootstrap estimation of the variance of the error term in monotonic regression models. In: Journal of Statistical Computation and Simulation, ISSN 0094-9655, E-ISSN 1563-5163, Vol. 83, no 4, 2013, p. 625-638. Article in journal (Refereed)
    6. Bootstrap confidence intervals for large-scale multivariate monotonic regression problems. In: Communications in Statistics - Simulation and Computation, ISSN 0361-0918, E-ISSN 1532-4141, Vol. 45, no 3, 2016, p. 1025-1040. Article in journal (Refereed). (At the time of the doctoral defence, this publication existed as a manuscript.)
  • 14.
    Burdakov, Oleg
    et al.
    Linköping University, Department of Mathematics, Optimization. Linköping University, The Institute of Technology.
    Grimvall, Anders
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Arts and Sciences.
    Sysoev, Oleg
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Arts and Sciences.
    Generalized PAV algorithm with block refinement for partially ordered monotonic regression (2009). In: Proceedings of the Workshop on Learning Monotone Models from Data / [ed] A. Feelders and R. Potharst, 2009, p. 23-37. Conference paper (Other academic)
    Abstract [en]

    In this paper, the monotonic regression problem (MR) is considered. We have recently generalized for MR the well-known Pool-Adjacent-Violators algorithm (PAV) from the case of completely to partially ordered data sets. The new algorithm, called GPAV, combines both high accuracy and low computational complexity, which grows quadratically with the problem size. The actual growth observed in practice is typically far lower than quadratic. The fitted values of the exact MR solution compose blocks of equal values. Its GPAV approximation also has a block structure. We present here a technique for refining blocks produced by the GPAV algorithm to make the new blocks closer to those in the exact solution. This substantially improves the accuracy of the GPAV solution and does not deteriorate its computational complexity. The computational time for the new technique is approximately triple the time of running the GPAV algorithm. Its efficiency is demonstrated by results of our numerical experiments.

  • 15.
    Burdakov, Oleg
    et al.
    Linköping University, The Institute of Technology. Linköping University, Department of Mathematics, Optimization.
    Grimvall, Anders
    Linköping University, Faculty of Arts and Sciences. Linköping University, Department of Computer and Information Science, Statistics.
    Sysoev, Oleg
    Linköping University, Faculty of Arts and Sciences. Linköping University, Department of Computer and Information Science, Statistics.
    Kapyrin, Ivan
    Institute of Numerical Mathematics Russian Academy of Sciences, Moscow, Russia.
    Vassilevski, Yuri
    Institute of Numerical Mathematics Russian Academy of Sciences, Moscow, Russia.
    Monotonic data fitting and interpolation with application to postprocessing of FE solutions (2007). In: CERFACS 20th Anniversary Conference on High-performance Computing, 2007, p. 11-12. Conference paper (Other academic)
    Abstract [en]

    In this talk we consider the isotonic regression (IR) problem, which can be formulated as follows. Given a vector $\bar{x} \in R^n$, find $x_* \in R^n$ which solves the problem: \begin{equation}\label{ir2} \begin{array}{cl} \mbox{min} & \|x-\bar{x}\|^2 \\ \mbox{s.t.} & Mx \ge 0. \end{array} \end{equation} The set of constraints $Mx \ge 0$ represents here the monotonicity relations of the form $x_i \le x_j$ for a given set of pairs of the components of $x$. The corresponding row of the matrix $M$ is composed mainly of zeros, except for its $i$th and $j$th elements, which are equal to $-1$ and $+1$, respectively. The most challenging applications of (\ref{ir2}) are characterized by very large values of $n$.

    We introduce new IR algorithms. Our numerical experiments demonstrate the high efficiency of our algorithms, especially for very large-scale problems, and their robustness. They are able to solve some problems which all existing IR algorithms fail to solve. We also outline our new algorithms for monotonicity-preserving interpolation of scattered multivariate data.

    In this talk we focus on the application of our IR algorithms to postprocessing of FE solutions. Non-monotonicity of the numerical solution is a typical drawback of the conventional methods of approximation, such as finite elements (FE), finite volumes, and mixed finite elements. The problem of monotonicity is particularly important in cases of highly anisotropic diffusion tensors or distorted unstructured meshes. For instance, in nuclear waste transport simulation, the non-monotonicity results in the presence of negative concentrations, which may lead to unacceptable concentrations and failure of the chemistry calculations. Another drawback of the conventional methods is a possible violation of the discrete maximum principle, which establishes lower and upper bounds for the solution.

    We suggest here a least-change correction to the available FE solution $\bar{x} \in R^n$. This postprocessing procedure is aimed at recovering the monotonicity and some other important properties that may not be exhibited by $\bar{x}$. The mathematical formulation of the postprocessing problem is reduced to the following convex quadratic programming problem: \begin{equation}\label{ls2} \begin{array}{cl} \mbox{min} & \|x-\bar{x}\|^2 \\ \mbox{s.t.} & Mx \ge 0, \quad l \le x \le u, \quad e^Tx = m, \end{array} \end{equation} where $e=(1,1, \ldots ,1)^T \in R^n$. The set of constraints $Mx \ge 0$ represents here the monotonicity relations between some of the adjacent mesh cells. The constraints $l \le x \le u$ originate from the discrete maximum principle. The last constraint formulates the conservativity requirement.

    The postprocessing based on (\ref{ls2}) is typically a large-scale problem. We introduce here algorithms for solving this problem. They are based on the observation that, in the presence of the monotonicity constraints only, problem (\ref{ls2}) is the classical monotonic regression problem, which can be solved efficiently by some of the available monotonic regression algorithms. This solution is then used to produce the optimal solution to problem (\ref{ls2}) in the presence of all the constraints. We present results of numerical experiments to illustrate the efficiency of our algorithms.

  • 16.
    Burdakov, Oleg
    et al.
    Linköping University, The Institute of Technology. Linköping University, Department of Mathematics, Optimization.
    Grimvall, Anders
    Linköping University, Faculty of Arts and Sciences. Linköping University, Department of Computer and Information Science, Statistics.
    Sysoev, Oleg
    Linköping University, Faculty of Arts and Sciences. Linköping University, Department of Computer and Information Science, Statistics.
    New optimization algorithms for large-scale isotonic regression in L2-norm (2007). In: EUROPT-OMS Conference on Optimization, 2007, University of Hradec Kralove, Czech Republic: Guadeamus, 2007, p. 44-44. Conference paper (Other academic)
    Abstract [en]

    The isotonic regression (IR) problem has numerous important applications in statistics, operations research, biology, image and signal processing, and other areas. IR in the L2 norm is a minimization problem in which the objective function is the squared Euclidean distance from a given point to a convex set defined by monotonicity constraints of the form: the i-th component of the decision vector is less than or equal to its j-th component. Unfortunately, the conventional optimization methods are unable to solve IR problems originating from large data sets. The existing IR algorithms, such as the minimum lower sets algorithm by Brunk, the min-max algorithm by Lee, the network flow algorithm by Maxwell & Muchstadt and the IBCR algorithm by Block et al., are able to find an exact solution to the IR problem for at most a few thousand variables. The IBCR algorithm, which proved to be the most efficient of them, is not robust enough. An alternative approach is to solve the IR problem approximately. Following this approach, Burdakov et al. developed an algorithm, called GPAV, whose block refinement extension, GPAVR, is able to solve IR problems with very high accuracy in a far shorter time than the exact algorithms. Apart from this, GPAVR is a very robust algorithm, and it allows us to solve IR problems with over a hundred thousand variables. In this talk, we introduce new exact IR algorithms, which can be viewed as active set methods. They use the approximate solution produced by the GPAVR algorithm as a starting point. We present results of our numerical experiments demonstrating the high efficiency of the new algorithms, especially for very large-scale problems, and their robustness. They are able to solve problems which all existing exact IR algorithms fail to solve.

  • 17.
    Sysoev, Oleg
    et al.
    Linköping University, Faculty of Arts and Sciences. Linköping University, Department of Computer and Information Science, Statistics.
    Burdakov, Oleg
    Linköping University, The Institute of Technology. Linköping University, Department of Mathematics, Optimization.
    Grimvall, Anders
    Linköping University, Faculty of Arts and Sciences. Linköping University, Department of Computer and Information Science, Statistics.
    New optimization methods for isotonic regression in L1 norm (2007). In: EUROPT-OMS Conference on Optimization, 2007, University of Hradec Kralove, Czech Republic: Guadeamus, 2007, p. 133-133. Conference paper (Other academic)
    Abstract [en]

    The isotonic regression (IR) problem has numerous important applications in statistics, operations research, biology, image and signal processing, and other areas. IR is a minimization problem with the objective function defined by the distance from a given point to a convex set defined by monotonicity constraints of the form: the i-th component of the decision vector is less than or equal to its j-th component. The distance in IR is usually associated with the Lp norm, where the norms L1 and L2 are of the highest practical interest. The conventional optimization methods are unable to solve large-scale IR problems originating from large data sets. Historically, the major efforts were focused on the IR problem in the L2 norm. Exact algorithms such as the minimum lower sets algorithm by Brunk, the min-max algorithm by Lee, the network flow algorithm by Maxwell & Muchstadt and the IBCR algorithm by Block et al. were developed. Among them, the IBCR algorithm has proved to be the most numerically efficient, but it is not robust enough. An alternative approach is to solve the IR problem approximately. Following this approach, Burdakov et al. developed the GPAV algorithm, whose block refinement extension, GPAVR, is able to solve the IR problem with high accuracy in a far shorter time than the exact algorithms. Apart from this, GPAVR is a very robust algorithm. Unfortunately, for the L1 norm there are no algorithms as efficient as those for the L2 norm. In our talk, we introduce new algorithms, GPAVR1 and IBCR1. They are extensions of the GPAV and IBCR algorithms to the L1 norm. We also present results of numerical experiments, which demonstrate the high efficiency of the new algorithms, especially for very large-scale problems.
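    For reference, the L1 version of the problem discussed above can be written as the following least-distance problem (a standard formulation consistent with the abstract):

    \begin{equation} \min_{x \in R^n} \; \sum_{i=1}^{n} w_i \, |x_i - a_i| \quad \mbox{s.t.} \quad x_i \le x_j \ \ \mbox{for all ordered pairs } (i,j), \end{equation}

    where $a$ is the observed response and $w$ are positive weights. Unlike the L2 case, the optimal value on each block of the solution is a weighted median of the block's observations rather than a weighted mean, which is what makes the L1 case algorithmically different.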

  • 18.
    Burdakov, Oleg
    et al.
    Linköping University, The Institute of Technology. Linköping University, Department of Mathematics, Optimization.
    Sysoev, Oleg
    Linköping University, Faculty of Arts and Sciences. Linköping University, Department of Mathematics, Statistics.
    Grimvall, Anders
    Linköping University, Faculty of Arts and Sciences. Linköping University, Department of Mathematics, Statistics.
    Hussian, Mohamed
    Linköping University, Faculty of Arts and Sciences. Linköping University, Department of Mathematics, Statistics.
    An O(n²) algorithm for isotonic regression (2006). In: Large-Scale Nonlinear Optimization / [ed] Pillo, Gianni; Roma, Massimo, New York: Springer Science+Business Media B.V., 2006, p. 25-33. Conference paper (Other academic)
    Abstract [en]

    We consider the problem of minimizing the distance from a given n-dimensional vector to a set defined by constraints of the form x_i ≤ x_j. Such constraints induce a partial order of the components x_i, which can be illustrated by an acyclic directed graph. This problem is also known as the isotonic regression (IR) problem. IR has important applications in statistics, operations research and signal processing, with most of them characterized by a very large value of n. For such large-scale problems, it is of great practical importance to develop algorithms whose complexity does not rise with n too rapidly. The existing optimization-based algorithms and statistical IR algorithms have either too high computational complexity or too low accuracy of the approximation to the optimal solution they generate. We introduce a new IR algorithm, which can be viewed as a generalization of the Pool-Adjacent-Violators (PAV) algorithm from completely to partially ordered data. Our algorithm combines both low computational complexity O(n²) and high accuracy. This allows us to obtain sufficiently accurate solutions to IR problems with thousands of observations.

  • 19.
    Burdakov, Oleg
    et al.
    Linköping University, Department of Mathematics, Optimization. Linköping University, The Institute of Technology.
    Sysoev, Oleg
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Arts and Sciences.
    Grimvall, Anders
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Arts and Sciences.
    Hussian, Mohammed
    Linköping University, Department of Mathematics, Statistics. Linköping University, Faculty of Arts and Sciences.
    An O(n²) algorithm for isotonic regression problems (2006). In: Large-Scale Nonlinear Optimization / [ed] G. Di Pillo and M. Roma, Springer-Verlag, 2006, p. 25-33. Chapter in book (Refereed)
    Abstract [en]

    Large-Scale Nonlinear Optimization reviews and discusses recent advances in the development of methods and algorithms for nonlinear optimization and its applications, focusing on the large-dimensional case, the current forefront of much research.

    The chapters of the book, authored by some of the most active and well-known researchers in nonlinear optimization, give an updated overview of the field from different and complementary standpoints, including theoretical analysis, algorithmic development, implementation issues and applications.

  • 20.
    Burdakov, Oleg
    et al.
    Linköping University, The Institute of Technology. Linköping University, Department of Mathematics, Optimization.
    Grimvall, Anders
    Linköping University, Faculty of Arts and Sciences. Linköping University, Department of Mathematics, Statistics.
    Sysoev, Oleg
    Linköping University, Department of Mathematics.
    Data preordering in generalized PAV algorithm for monotonic regression (2006). Report (Other academic)
  • 21.
    Burdakov, Oleg
    et al.
    Linköping University, Department of Mathematics. Linköping University, The Institute of Technology.
    Grimvall, Anders
    Linköping University, The Institute of Technology. Linköping University, Department of Mathematics, Optimization.
    Sysoev, Oleg
    Linköping University, Department of Mathematics. Linköping University, The Institute of Technology.
    Data preordering in generalized PAV algorithm for monotonic regression (2006). In: Journal of Computational Mathematics, ISSN 0254-9409, E-ISSN 1991-7139, Vol. 24, no 6, p. 771-790. Article in journal (Refereed)
    Abstract [en]

    Monotonic regression (MR) is a least distance problem with monotonicity constraints induced by a partially ordered data set of observations. In our recent publication [in Ser. Nonconvex Optimization and Its Applications, Springer-Verlag (2006), Vol. 83, pp. 25-33], the Pool-Adjacent-Violators algorithm (PAV) was generalized from completely to partially ordered data sets (posets). The new algorithm, called GPAV, is characterized by a very low computational complexity, which is of second order in the number of observations. It treats the observations in a consecutive order, and it can follow any arbitrarily chosen topological order of the poset of observations. The GPAV algorithm produces a sufficiently accurate solution to the MR problem, but the accuracy depends on the chosen topological order. Here we prove that there exists a topological order for which the resulting GPAV solution is optimal. Furthermore, we present results of extensive numerical experiments, from which we draw conclusions about the most and the least preferable topological orders.

  • 22.
    Burdakov, Oleg
    et al.
    Linköping University, Department of Mathematics, Optimization. Linköping University, The Institute of Technology.
    Grimvall, Anders
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Arts and Sciences.
    Hussian, Mohamed
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Arts and Sciences.
    Sysoev, Oleg
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Arts and Sciences.
    Hasse diagrams and the generalized PAV-algorithm for monotonic regression in several explanatory variables (2005). In: Computational Statistics and Data Analysis, ISSN 0167-9473. Article in journal (Refereed)
    Abstract [en]

    Monotonic regression is a nonparametric method for estimation of models in which the expected value of a response variable y increases or decreases in all coordinates of a vector of explanatory variables x = (x1, …, xp). Here, we examine statistical and computational aspects of our recently proposed generalization of the pool-adjacent-violators (PAV) algorithm from one to several explanatory variables. In particular, we show how the goodness-of-fit and accuracy of obtained solutions can be enhanced by presorting observed data with respect to their level in a Hasse diagram of the partial order of the observed x-vectors, and we also demonstrate how these calculations can be carried out to save computer memory and computational time. Monte Carlo simulations illustrate how rapidly the mean square difference between fitted and expected response values tends to zero, and how quickly the mean square residual approaches the true variance of the random error, as the number of observations increases up to 10^4.
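    The level-based presorting mentioned above is easy to illustrate. In the sketch below, the level of an observation is the length of the longest chain ending at it in the Hasse diagram, one common convention; the paper may define levels differently, and the edge-list encoding is our assumption.

        from collections import defaultdict

        def hasse_levels(n, edges):
            """Level of each node in a DAG, given edges (i, j) meaning i precedes j:
            length of the longest chain ending at the node. Sorting observations
            by level yields a topological order of the partial order."""
            succ = defaultdict(list)
            indeg = [0] * n
            for i, j in edges:
                succ[i].append(j)
                indeg[j] += 1
            level = [0] * n
            ready = [v for v in range(n) if indeg[v] == 0]   # minimal elements
            while ready:
                v = ready.pop()
                for u in succ[v]:
                    level[u] = max(level[u], level[v] + 1)
                    indeg[u] -= 1
                    if indeg[u] == 0:
                        ready.append(u)
            return level

        # Example: four observations with partial order 0<2, 1<2, 2<3.
        print(hasse_levels(4, [(0, 2), (1, 2), (2, 3)]))     # -> [0, 0, 1, 2]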

  • 23.
    Hussian, Mohamed
    et al.
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Arts and Sciences.
    Grimvall, Anders
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Arts and Sciences.
    Burdakov, Oleg
    Linköping University, Department of Mathematics, Optimization. Linköping University, The Institute of Technology.
    Sysoev, Oleg
    Linköping University, Department of Computer and Information Science, Statistics. Linköping University, Faculty of Arts and Sciences.
    Monotonic regression for the detection of temporal trends in environmental quality data (2005). In: Match, ISSN 0340-6253, Vol. 54, no 3, p. 535-550. Article in journal (Refereed)
  • 24.
    Burdakov, Oleg
    et al.
    Linköping University, The Institute of Technology. Linköping University, Department of Mathematics, Optimization.
    Sysoev, Oleg
    Linköping University, Department of Computer and Information Science, Statistics.
    Grimvall, Anders
    Linköping University, Faculty of Arts and Sciences. Linköping University, Department of Mathematics, Statistics.
    Hussian, Mohamed
    Linköping University, Faculty of Arts and Sciences. Linköping University, Department of Mathematics, Statistics.
    An algorithm for isotonic regression problems (2004). In: European Congress on Computational Methods in Applied Sciences and Engineering ECCOMAS / [ed] P. Neittaanmäki, T. Rossi, K. Majava and O. Pironneau, Jyväskylä: University of Jyväskylä, 2004, p. 1-9. Conference paper (Refereed)
    Abstract [en]

    We consider the problem of minimizing the distance from a given n-dimensional vector to a set defined by constraints of the form x_i ≤ x_j. Such constraints induce a partial order of the components x_i, which can be illustrated by an acyclic directed graph. This problem is known as the isotonic regression (IR) problem. It has important applications in statistics, operations research and signal processing. Most of the applied IR problems are characterized by a very large value of n. For such large-scale problems, it is of great practical importance to develop algorithms whose complexity does not rise with n too rapidly. The existing optimization-based algorithms and statistical IR algorithms have either too high computational complexity or too low accuracy of the approximation to the optimal solution they generate. We introduce a new IR algorithm, which can be viewed as a generalization of the Pool-Adjacent-Violators (PAV) algorithm from completely to partially ordered data. Our algorithm combines both low computational complexity O(n²) and high accuracy. This allows us to obtain sufficiently accurate solutions to IR problems with thousands of observations.
