Implementation of free and open-source semi-automatic feature engineering tool in landslide susceptibility mapping using the machine-learning algorithms RF, SVM, and XGBoost

dc.authorid: 0000-0002-9830-8585 [en_US]
dc.contributor.author: Şahin, Emrehan Kutluğ
dc.date.accessioned: 2023-08-29T07:53:05Z
dc.date.available: 2023-08-29T07:53:05Z
dc.date.issued: 2023 [en_US]
dc.department: BAİBÜ, Faculty of Engineering, Department of Civil Engineering [en_US]
dc.description.abstract: Various machine learning (ML) techniques have been recommended and used in the literature to produce landslide susceptibility maps (LSMs). Feature engineering (FE) is an important topic in ML studies, yet most research ignores it. In this study, a novel FE framework, including feature selection, feature transformation, feature binning, and feature weighting, is proposed to produce LSMs using eXtreme gradient boosting (XGBoost), random forest (RF), and support vector machine (SVM). For this purpose, first, thirteen landslide conditioning factors used in data preprocessing were utilized to produce LSM models for the study area, the Babadag district of Denizli Province in the Aegean region of Turkey. Second, two irrelevant factors were eliminated from the input feature subset using feature selection in the FE framework. Third, features identified as skewed were converted into a symmetric form by applying a log transformation. Then, the remaining factors with continuous values were converted into categorical values using the quantile classifier technique. During the feature weighting phase, four different feature weighting methods, namely XGBoost, RF, non-negative least squares (NNLS), and frequency ratio (FR), were used to calculate the weights of each subclass of each landslide-related factor. In addition, the proposed feature subsets were compared with the raw data. At the end of the process, the XGBoost model constructed with the FR-selected subset (overall accuracy (Acc) = 0.907 and area under the curve (AUC) = 0.9822) outperformed both the raw data (Acc = 0.874; AUC = 0.960) and the other methods (i.e., RF-FR and SVM-NNLS). Consequently, the results show that the proposed FE approach could be a useful framework for increasing the performance of ML techniques by identifying and extracting relevant features to develop highly optimized and enriched models. [en_US]
dc.description.sponsorship: The raw data used in this paper was obtained from the project "Development of ArcGIS Interfaces with R programming language for Landslide Susceptibility Mapping" (No. 118Y090) funded by the Scientific and Technological Research Council of Turkey (TUBITAK). [en_US]
dc.identifier.citation: Sahin, E. K. (2023). Implementation of free and open-source semi-automatic feature engineering tool in landslide susceptibility mapping using the machine-learning algorithms RF, SVM, and XGBoost. Stochastic Environmental Research and Risk Assessment, 37(3), 1067-1092. [en_US]
dc.identifier.doi: 10.1007/s00477-022-02330-y
dc.identifier.endpage: 1092 [en_US]
dc.identifier.issn: 1436-3240
dc.identifier.issn: 1436-3259
dc.identifier.issue: 3 [en_US]
dc.identifier.scopus: 2-s2.0-85141386495 [en_US]
dc.identifier.scopusquality: Q1 [en_US]
dc.identifier.startpage: 1067 [en_US]
dc.identifier.uri: http://dx.doi.org/10.1007/s00477-022-02330-y
dc.identifier.uri: https://hdl.handle.net/20.500.12491/11608
dc.identifier.volume: 37 [en_US]
dc.identifier.wos: WOS:000878953800001 [en_US]
dc.identifier.wosquality: Q1 [en_US]
dc.indekslendigikaynak: Web of Science [en_US]
dc.indekslendigikaynak: Scopus [en_US]
dc.institutionauthor: Şahin, Emrehan Kutluğ
dc.language.iso: en [en_US]
dc.publisher: SPRINGER [en_US]
dc.relation.ispartof: Stochastic Environmental Research and Risk Assessment [en_US]
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member [en_US]
dc.relation.tubitak: Scientific and Technological Research Council of Turkey (TUBITAK) [118Y090]
dc.rights: info:eu-repo/semantics/closedAccess [en_US]
dc.subject: Landslide Susceptibility [en_US]
dc.subject: Data Preparation [en_US]
dc.subject: Feature Engineering [en_US]
dc.subject: Feature Transformation [en_US]
dc.subject: Feature Weighting [en_US]
dc.subject: Machine Learning [en_US]
dc.title: Implementation of free and open-source semi-automatic feature engineering tool in landslide susceptibility mapping using the machine-learning algorithms RF, SVM, and XGBoost [en_US]
dc.type: Article [en_US]
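
The abstract names three of the feature engineering steps (log transformation of skewed factors, quantile binning, and frequency ratio weighting) without showing how they chain together. The following is a minimal Python sketch of that chain, not the author's tool: the function names, the toy `distance_to_road` factor, the labels, the use of log(1 + x), and the choice of four bins are all assumptions made for illustration. The frequency ratio is computed in its usual form, the share of landslide samples falling in a class divided by the share of all samples falling in that class.

```python
import math
from bisect import bisect_right
from collections import Counter
from statistics import quantiles


def log_transform(values):
    """Compress a right-skewed, non-negative feature with log(1 + x) (assumed variant)."""
    return [math.log1p(v) for v in values]


def quantile_bins(values, n_bins=4):
    """Assign each value a bin index 0..n_bins-1 using quantile cut points."""
    cuts = quantiles(values, n=n_bins)  # returns n_bins - 1 cut points
    return [bisect_right(cuts, v) for v in values]


def frequency_ratio(bins, labels):
    """FR per bin: (landslides in bin / all landslides) / (samples in bin / all samples)."""
    total = len(bins)
    total_slides = sum(labels)
    in_bin = Counter(bins)
    slides_in_bin = Counter(b for b, y in zip(bins, labels) if y == 1)
    return {
        b: (slides_in_bin[b] / total_slides) / (in_bin[b] / total)
        for b in sorted(in_bin)
    }


# Hypothetical toy data: one skewed factor and a landslide (1) / non-landslide (0) label.
distance_to_road = [5, 8, 12, 20, 35, 60, 110, 200, 450, 900, 1500, 3000]
landslide = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0]

bins = quantile_bins(log_transform(distance_to_road), n_bins=4)
fr_weights = frequency_ratio(bins, landslide)

# Replace each sample's bin with its FR weight to obtain the weighted feature.
weighted_feature = [fr_weights[b] for b in bins]
```

In this toy data the lowest-distance bin holds three of the five landslide samples but only a quarter of all samples, so its weight is above 1, while the farthest bin contains no landslides and receives a weight of 0. A weighted column like `weighted_feature` is what would then be passed to a classifier in place of the raw values.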

Files

Original bundle
Name: emrehan-kutlug-sahin.pdf
Size: 4.46 MB
Format: Adobe Portable Document Format
Description: Full Text