Implementation of free and open-source semi-automatic feature engineering tool in landslide susceptibility mapping using the machine-learning algorithms RF, SVM, and XGBoost
Date
2023
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
SPRINGER
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
Various machine learning (ML) techniques have been recommended and used in the literature to produce landslide susceptibility maps (LSMs). Feature engineering (FE) is an important topic in ML studies, yet most research ignores it. In this study, a novel FE framework, including feature selection, feature transformation, feature binning, and feature weighting, is proposed to produce LSMs using eXtreme gradient boosting (XGBoost), random forest (RF), and support vector machine (SVM). For this purpose, first, thirteen landslide conditioning factors from the data preprocessing stage were used to produce LSM models for the study area, the Babadag district of Denizli Province in the Aegean region of Turkey. Second, two irrelevant factors were eliminated from the input feature subset using the feature selection step of the FE framework. Third, features identified as skewed were converted into symmetric form by applying a log transformation. Then, the remaining factors with continuous values were converted into categorical values using the quantile classifier technique. During the feature weighting phase, four feature weighting methods, namely XGBoost, RF, non-negative least squares (NNLS), and frequency ratio (FR), were used to calculate the weight of each subclass of each landslide-related factor. In addition, the proposed feature subsets were compared with the raw data. At the end of the process, the XGBoost model constructed with an FR-selected subset (overall accuracy (Acc) = 0.907 and area under the curve (AUC) = 0.9822) outperformed both the raw data (Acc = 0.874; AUC = 0.960) and the other methods (i.e., RF-FR and SVM-NNLS). Consequently, the results indicate that the proposed FE approach could be a useful framework for increasing the performance of ML techniques by identifying and extracting relevant features to develop highly optimized and enriched models.
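The transformation, binning, and weighting steps named in the abstract can be illustrated with a minimal sketch. This is not the tool described in the paper: the function names, the toy slope values, the landslide labels, and the choice of four bins are all assumptions made for illustration. The frequency ratio uses its common definition, the share of landslide samples in a class divided by the share of all samples in that class.

```python
import bisect
import math
from collections import Counter


def log_transform(values):
    """Reduce right skew with log(1 + x); assumes non-negative inputs."""
    return [math.log1p(v) for v in values]


def quantile_bins(values, n_bins=4):
    """Map each value to a bin index 0..n_bins-1 with roughly equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    # Cut points at evenly spaced ranks; a value equal to a cut point
    # is placed in the upper bin.
    edges = [ordered[(k * n) // n_bins] for k in range(1, n_bins)]
    return [bisect.bisect_right(edges, v) for v in values]


def frequency_ratio(bins, labels):
    """FR per bin: (landslides in bin / all landslides) / (samples in bin / all samples)."""
    total = len(bins)
    total_slides = sum(labels)
    class_count = Counter(bins)
    slide_count = Counter(b for b, y in zip(bins, labels) if y == 1)
    return {
        b: (slide_count[b] / total_slides) / (class_count[b] / total)
        for b in sorted(class_count)
    }


# Hypothetical toy data: one skewed conditioning factor and landslide labels (1 = landslide).
slope = [2, 3, 5, 8, 13, 21, 34, 55]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

transformed = log_transform(slope)
bins = quantile_bins(transformed, n_bins=4)
weights = frequency_ratio(bins, labels)

# Replace each sample's bin with its FR weight to form the weighted feature.
weighted_feature = [weights[b] for b in bins]
print(bins)
print(weights)
```

In this sketch an FR above 1 marks a subclass where landslides are over-represented relative to its share of the area, and the resulting weighted feature is what would be passed to a classifier such as XGBoost, RF, or SVM.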
Description
Keywords
Landslide Susceptibility, Data Preparation, Feature Engineering, Feature Transformation, Feature Weighting, Machine Learning
Source
Stochastic Environmental Research and Risk Assessment
WoS Q Value
Q1
Scopus Q Value
Q1
Volume
37
Issue
3
Citation
Sahin, E. K. (2023). Implementation of free and open-source semi-automatic feature engineering tool in landslide susceptibility mapping using the machine-learning algorithms RF, SVM, and XGBoost. Stochastic Environmental Research and Risk Assessment, 37(3), 1067-1092.