Ensemble learning with trees and rules: Supervised, semi-supervised, unsupervised
Abstract
In this article, we propose several new approaches for post processing a large ensemble of conjunctive rules for supervised, semi-supervised and unsupervised learning problems. We show with various examples that for high dimensional regression problems the models constructed by post processing the rules with partial least squares regression have significantly better prediction performance than the ones produced by the random forest or the rulefit algorithms which use equal weights or weights estimated from lasso regression. When rule ensembles are used for semi-supervised and unsupervised learning, the internal and external measures of cluster validity point to high quality groupings.