A hybrid algorithm for Bayesian network structure learning with application to multi-label learning
Abstract
We present a novel hybrid algorithm for Bayesian network structure learning, called H2PC. It first reconstructs the skeleton of a Bayesian network and then performs a Bayesian-scoring greedy hill-climbing search to orient the edges. The algorithm is based on divide-and-conquer constraint-based subroutines to learn the local structure around a target variable. We conduct two series of experimental comparisons of H2PC against Max-Min Hill-Climbing (MMHC), which is currently the most powerful state-of-the-art algorithm for Bayesian network structure learning. First, we use eight well-known Bayesian network benchmarks with various data sizes to assess the quality of the learned structure returned by the algorithms. Our extensive experiments show that H2PC outperforms MMHC in terms of goodness of fit to new data and quality of the network structure with respect to the true dependence structure of the data. Second, we investigate H2PC's ability to solve the multi-label learning problem. We provide theoretical results to characterize and identify graphically the so-called minimal label powersets that appear as irreducible factors in the joint distribution under the faithfulness condition. The multi-label learning problem is then decomposed into a series of multi-class classification problems, where each multi-class variable encodes a label powerset. H2PC is shown to compare favorably to MMHC in terms of global classification accuracy over ten multi-label data sets covering different application domains. Overall, our experiments support the conclusions that local structural learning with H2PC in the form of local neighborhood induction is a theoretically well-motivated and empirically effective learning framework that is well suited to multi-label learning. The source code (in R) of H2PC as well as all data sets used for the empirical tests are publicly available.
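As an illustration of the decomposition step described above, the following minimal Python sketch encodes each group of labels as a single multi-class variable, one class per observed label combination. It is not the authors' R implementation: the function name `encode_label_powersets`, the toy label matrix `Y`, and the partition `groups` are all hypothetical, and the partition is simply assumed to be given rather than derived from a learned network structure.

```python
def encode_label_powersets(Y, groups):
    """Encode each group of binary labels as one multi-class variable.

    Y      : list of rows, each a list of 0/1 label values.
    groups : list of tuples of column indices; assumed here to be a
             given partition of the labels (hypothetical input).
    Returns (encoded, codebooks): one list of class ids per group, and
    one dict per group mapping a label combination to its class id.
    """
    encoded, codebooks = [], []
    for cols in groups:
        codebook = {}   # label combination -> class id, in order of first appearance
        column = []
        for row in Y:
            key = tuple(row[c] for c in cols)
            column.append(codebook.setdefault(key, len(codebook)))
        encoded.append(column)
        codebooks.append(codebook)
    return encoded, codebooks


# Toy data: 4 samples, 3 labels. Labels 0 and 1 are assumed to form one
# powerset; label 2 is assumed to stand alone.
Y = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 1],
    [1, 0, 1],
]
groups = [(0, 1), (2,)]
encoded, codebooks = encode_label_powersets(Y, groups)
print(encoded)
print(codebooks)
```

Each list in `encoded` can then serve as the target of an ordinary multi-class classifier, and the matching codebook maps a predicted class id back to the label combination it stands for.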
 Publication:

arXiv e-prints
 Pub Date:
 June 2015
 arXiv:
 arXiv:1506.05692
 Bibcode:
 2015arXiv150605692G
 Keywords:

 Statistics - Machine Learning;
 Computer Science - Artificial Intelligence;
 Computer Science - Machine Learning
 E-Print:
 arXiv admin note: text overlap with arXiv:1101.5184 by other authors