Effects of Sampling Methods on Prediction Quality. The Case of Classifying Land Cover Using Decision Trees
Abstract
Well-chosen sampling methods can make big data easier to handle and more useful. This study concerns remote sensing, specifically airborne laser scanning point clouds representing different classes of ground cover. The aim is to derive a supervised learning model for classifying these points using classification and regression trees (CARTs). To measure the effect of the sampling method on classification accuracy, experiments were designed that vary the type of sampling method, the sample size, and the accuracy metric. Numerical results are shown for a subset of a large surveying project covering the Lower Rhine area in Germany. General conclusions regarding sampling design are drawn.
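The abstract does not name the sampling methods that were compared. As a purely illustrative sketch, the following Python contrasts two common candidates, simple random sampling and proportionally stratified sampling, on a made-up, imbalanced label vector. The class names, class sizes, sample size, and function names are all assumptions for illustration and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def simple_random_sample(labels, n, rng):
    """Draw n distinct indices uniformly at random, ignoring class labels."""
    return rng.sample(range(len(labels)), n)


def stratified_sample(labels, n, rng):
    """Draw roughly n indices, allocating to each class in proportion to its share."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    chosen = []
    for idx in by_class.values():
        # Proportional allocation, with at least one point per class.
        k = max(1, round(n * len(idx) / len(labels)))
        chosen.extend(rng.sample(idx, min(k, len(idx))))
    return chosen


# Hypothetical labels standing in for ground-cover classes of a point cloud.
labels = ["ground"] * 9000 + ["vegetation"] * 900 + ["building"] * 100

rng = random.Random(0)  # fixed seed so the draw is repeatable
srs = simple_random_sample(labels, 200, rng)
strat = stratified_sample(labels, 200, rng)

srs_counts = Counter(labels[i] for i in srs)
strat_counts = Counter(labels[i] for i in strat)

print("simple random:", dict(srs_counts))
print("stratified:   ", dict(strat_counts))
```

With proportional allocation the stratified draw reproduces the class shares by construction (here 180, 18, and 2 points), whereas the rare class in a simple random draw varies from sample to sample and may be missed entirely. A classifier such as a CART could then be trained on each sample and scored on held-out points to compare accuracy, which is the kind of comparison the abstract describes.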
- Publication: arXiv e-prints
- Pub Date: May 2014
- DOI: 10.48550/arXiv.1405.3295
- arXiv: arXiv:1405.3295
- Bibcode: 2014arXiv1405.3295H
- Keywords: Statistics - Machine Learning; Computer Science - Machine Learning; Statistics - Applications
- E-Print: Proceedings of COMPSTAT 2014: 585-592. 2014