Horseshoe RuleFit: Learning Rule Ensembles via Bayesian Regularization
Linköping University, Department of Computer and Information Science, Statistics.
2016 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

This work proposes Hs-RuleFit, a learning method for regression and classification that combines rule ensemble learning based on the RuleFit algorithm with Bayesian regularization through the horseshoe prior. To this end, the theoretical properties and potential problems of this combination are studied. The second step is the implementation, which uses recent sampling schemes to make Hs-RuleFit computationally feasible. Additionally, changes to the RuleFit algorithm are proposed, such as decision rule post-processing and the use of decision rules generated via Random Forest.
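
As an illustration of what a rule ensemble is, the following minimal Python sketch treats each decision rule as a conjunction of threshold conditions that evaluates to 0 or 1, and predicts with a weighted sum of those indicators. It is not the thesis implementation: the class names, the feature names `age` and `income`, and the weights are hypothetical, and in Hs-RuleFit the weights would be estimated under the horseshoe prior rather than set by hand.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Condition:
    """One threshold test on one feature (hypothetical helper)."""
    feature: str
    op: str  # either "<=" or ">"
    threshold: float

    def holds(self, row: Dict[str, float]) -> bool:
        value = row[self.feature]
        return value <= self.threshold if self.op == "<=" else value > self.threshold


@dataclass(frozen=True)
class Rule:
    """A decision rule: the conjunction of its conditions."""
    conditions: Tuple[Condition, ...]

    def __call__(self, row: Dict[str, float]) -> int:
        # 1 if every condition holds for this row, otherwise 0
        return int(all(c.holds(row) for c in self.conditions))


def rule_features(rules: List[Rule], rows: List[Dict[str, float]]) -> List[List[int]]:
    """Binary design matrix: one row per observation, one column per rule."""
    return [[rule(row) for rule in rules] for row in rows]


def predict(intercept: float, weights: List[float],
            rules: List[Rule], row: Dict[str, float]) -> float:
    """Linear combination of rule indicators."""
    return intercept + sum(w * rule(row) for w, rule in zip(weights, rules))


# Illustrative rules and data (assumed, not taken from the thesis)
rules = [
    Rule((Condition("age", "<=", 30.0),)),
    Rule((Condition("age", ">", 30.0), Condition("income", ">", 50.0))),
]
rows = [
    {"age": 25.0, "income": 40.0},
    {"age": 45.0, "income": 60.0},
    {"age": 45.0, "income": 20.0},
]
features = rule_features(rules, rows)
predictions = [predict(1.0, [2.0, -0.5], rules, row) for row in rows]
```

Because every column of the design matrix is a readable if-then statement, a sparse weight vector translates directly into a short list of rules, which is the sense in which such models are interpretable.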

Hs-RuleFit addresses the problem of finding highly accurate yet interpretable models. The method proves capable of finding compact sets of informative decision rules that give good insight into the data. Through the careful choice of prior distributions, the horseshoe prior proves superior to the Lasso in this context. In an empirical evaluation on 16 real data sets, Hs-RuleFit shows excellent performance in regression and outperforms the popular methods Random Forest, BART and RuleFit in terms of prediction error. Its interpretability is demonstrated on selected data sets. This makes Hs-RuleFit a good choice for scientific domains in which interpretability is desired.
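
For orientation, the horseshoe prior in its standard form can be written as the hierarchy below, applied here to the coefficients of a linear model over rule indicators. This is a sketch under stated assumptions rather than the thesis's exact specification: the symbols $r_j$, $\beta_j$, $\lambda_j$, $\tau$ and $\sigma$ are notation chosen for this illustration, and the "careful choice of prior distributions" mentioned above suggests the thesis modifies the hyperpriors in ways the abstract does not detail.

```latex
\begin{align}
  y_i &= \beta_0 + \sum_{j=1}^{m} \beta_j \, r_j(x_i) + \varepsilon_i,
      & \varepsilon_i &\sim \mathcal{N}(0, \sigma^2), \\
  \beta_j \mid \lambda_j, \tau &\sim \mathcal{N}(0, \lambda_j^2 \tau^2),
      & j &= 1, \dots, m, \\
  \lambda_j &\sim \mathrm{C}^{+}(0, 1),
      & \tau &\sim \mathrm{C}^{+}(0, 1).
\end{align}
```

Here $r_j(x_i) \in \{0, 1\}$ indicates whether rule $j$ applies to observation $i$, and $\mathrm{C}^{+}(0, 1)$ denotes the standard half-Cauchy distribution. The global scale $\tau$ pulls all coefficients toward zero, while the heavy-tailed local scales $\lambda_j$ allow individual rules with strong signal to escape that shrinkage.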

Problems are found in classification, concerning both the use of the horseshoe prior and rule ensemble learning in general. A simulation study is performed to isolate these problems, and potential solutions are discussed.

Arguments are presented that the horseshoe prior could be a good choice in other areas of machine learning, such as artificial neural networks and support vector machines.

Place, publisher, year, edition, pages
2016. 54 p.
Keyword [en]
Bayesian Statistics, Regularization, Ensemble Learning, Decision Rules, Horseshoe prior, Machine Learning, Knowledge Discovery
National Category
Probability Theory and Statistics; Computer Science; Bioinformatics (Computational Biology); Other Computer and Information Science
URN: urn:nbn:se:liu:diva-130249
ISRN: LIU-IDA/STAT-A--16/009--SE
OAI: diva2:950073
Available from: 2016-08-05. Created: 2016-07-27. Last updated: 2016-08-05. Bibliographically approved.
