Attention and Self-Attention in Random Forests
Abstract
New models of random forests jointly using the attention and self-attention mechanisms are proposed for solving the regression problem. The models can be regarded as extensions of the attention-based random forest whose idea stems from applying a combination of the Nadaraya-Watson kernel regression and the Huber's contamination model to random forests. The self-attention aims to capture dependencies of the tree predictions and to remove noise or anomalous predictions in the random forest. The self-attention module is trained jointly with the attention module for computing weights. It is shown that the training process of attention weights is reduced to solving a single quadratic or linear optimization problem. Three modifications of the general approach are proposed and compared. A specific multi-head self-attention for the random forest is also considered. Heads of the self-attention are obtained by changing its tuning parameters including the kernel parameters and the contamination parameter of models. Numerical experiments with various datasets illustrate the proposed models and show that the supplement of the self-attention improves the model performance for many datasets.
- Publication:
-
arXiv e-prints
- Pub Date:
- July 2022
- DOI:
- arXiv:
- arXiv:2207.04293
- Bibcode:
- 2022arXiv220704293U
- Keywords:
-
- Computer Science - Machine Learning;
- Statistics - Machine Learning
- E-Print:
- arXiv admin note: text overlap with arXiv:2201.02880