Random forest ≈ Random forest
LightGBM: A Highly Efficient Gradient Boosting Decision Tree
Gradient Boosting Decision Tree (GBDT) is a popular machine learning algorithm, and has quite a few effective implementations such as XGBoost and pGBRT. Although many engineering optimizations have been adopted in these implementations, th…
SoilGrids250m: Global gridded soil information based on machine learning
This paper describes the technical development and accuracy assessment of the most recent and improved version of the SoilGrids system at 250m resolution (June 2016 update). SoilGrids provides global predictions for standard numeric soil p…
Random Erasing Data Augmentation
In this paper, we introduce Random Erasing, a new data augmentation method for training the convolutional neural network (CNN). In training, Random Erasing randomly selects a rectangle region in an image and erases its pixels with random v…
The 30 m annual land cover dataset and its dynamics in China from 1990 to 2019
Land cover (LC) determines the energy exchange, water and carbon cycle between Earth's spheres. Accurate LC information is a fundamental parameter for the environment and climate studies. Considering that the LC in China has been altered d…
A Deep Learning Approach for Intrusion Detection Using Recurrent Neural Networks
Intrusion detection plays an important role in ensuring information security, and the key technology is to accurately identify various attacks in the network. In this paper, we explore how to model an intrusion detection system based on de…
Implementation of machine-learning classification in remote sensing: an applied review
Machine learning offers the potential for effective and efficient classification of remotely sensed imagery. The strengths of machine learning include the capacity to handle data of high dimensionality and to map classes with very complex …
Effective Heart Disease Prediction Using Hybrid Machine Learning Techniques
Heart disease is one of the most significant causes of mortality in the world today. Prediction of cardiovascular disease is a critical challenge in the area of clinical data analysis. Machine learning (ML) has been shown to be effective i…
Hyperparameter Optimization for Machine Learning Models Based on Bayesian Optimization
Hyperparameters are important for machine learning algorithms since they directly control the behaviors of training algorithms and have a significant effect on the performance of machine learning models. Several techniques have been develo…
CatBoost: gradient boosting with categorical features support
In this paper we present CatBoost, a new open-sourced gradient boosting library that successfully handles categorical features and outperforms existing publicly available implementations of gradient boosting in terms of quality on a set of…
Comparison of Random Forest, k-Nearest Neighbor, and Support Vector Machine Classifiers for Land Cover Classification Using Sentinel-2 Imagery
In previous classification studies, three non-parametric classifiers, Random Forest (RF), k-Nearest Neighbor (kNN), and Support Vector Machine (SVM), were reported as the foremost classifiers at producing high accuracies. However, only a f…
SRAMP: prediction of mammalian N⁶-methyladenosine (m⁶A) sites based on sequence-derived features
N⁶-methyladenosine (m⁶A) is a prevalent RNA methylation modification involved in the regulation of degradation, subcellular localization, splicing and local conformation changes of RNA transcripts. High-throughput experiments have demo…
The random forest algorithm for statistical learning
Random forests (Breiman, 2001, Machine Learning 45: 5–32) is a statistical- or machine-learning algorithm for prediction. In this article, we introduce a corresponding new command, rforest. We overview the random forest algorithm and illus…
CatBoost: unbiased boosting with categorical features
This paper presents the key algorithmic techniques behind CatBoost, a new gradient boosting toolkit. Their combination leads to CatBoost outperforming other publicly available boosting implementations in terms of quality on a variety of da…
Land-Use Land-Cover Classification by Machine Learning Classifiers for Satellite Observations—A Review
Rapid and uncontrolled population growth along with economic and industrial development, especially in developing countries during the late twentieth and early twenty-first centuries, have increased the rate of land-use/land-cover (LULC) c…
Radiomics: the facts and the challenges of image analysis
Radiomics is an emerging translational field of research aiming to extract mineable high-dimensional data from clinical images. The radiomic process can be divided into distinct steps with definable inputs and outputs, such as image acquis…
Random forest as a generic framework for predictive modeling of spatial and spatio-temporal variables
Random forest and similar Machine Learning techniques are already used to generate spatial predictions, but spatial location of points (geography) is often ignored in the modeling process. Spatial auto-correlation, especially if still exis…
Support Vector Machine Versus Random Forest for Remote Sensing Image Classification: A Meta-Analysis and Systematic Review
Climate change poses a significant threat to Arctic freshwater biodiversity, but impacts depend upon the strength of organism response to climate-related drivers. Currently, there is insufficient knowledge about Arctic freshwater biodi…
A Survey of Ensemble Learning: Concepts, Algorithms, Applications, and Prospects
Ensemble learning techniques have achieved state-of-the-art performance in diverse machine learning applications by combining the predictions from two or more base models. This paper presents a concise overview of ensemble learning, coveri…
Trees vs Neurons: Comparison between random forest and ANN for high-resolution prediction of building energy consumption
Energy prediction models are used in buildings as a performance evaluation engine in advanced control and optimisation, and in making informed decisions by facility managers and utilities for enhanced energy efficiency. Simplified and data…
Metalearners for estimating heterogeneous treatment effects using machine learning
Significance: Estimating and analyzing heterogeneous treatment effects is timely, yet challenging. We introduce a unifying framework for many conditional average treatment effect estimators, and we propose a metalearner, the X-learner, whic…
First Experience with Sentinel-2 Data for Crop and Tree Species Classifications in Central Europe
The study presents the preliminary results of two classification exercises assessing the capabilities of pre-operational (August 2015) Sentinel-2 (S2) data for mapping crop types and tree species. In the first case study, an S2 image was u…
Predicting Diabetes Mellitus With Machine Learning Techniques
Diabetes mellitus is a chronic disease characterized by hyperglycemia. It may cause many complications. According to the growing morbidity in recent years, in 2040, the world's diabetic patients will reach 642 million, which means that one…
A review of supervised object-based land-cover image classification
Object-based image classification for land-cover mapping purposes using remote-sensing imagery has attracted significant attention in recent years. Numerous studies conducted over the past decade have investigated a broad array of sensors,…
Analysis of Dimensionality Reduction Techniques on Big Data
Due to digitization, a huge volume of data is being generated across several sectors such as healthcare, production, sales, IoT devices, Web, organizations. Machine learning algorithms are used to uncover patterns among the attributes of t…
Attack and anomaly detection in IoT sensors in IoT sites using machine learning approaches
Attack and anomaly detection in the Internet of Things (IoT) infrastructure is a rising concern in the domain of IoT. With the increased use of IoT infrastructure in every domain, threats and attacks in these infrastructures are also growi…
Discovering biomarkers associated and predicting cardiovascular disease with high accuracy using a novel nexus of machine learning techniques for precision medicine
Personalized interventions are deemed vital given the intricate characteristics, advancement, inherent genetic composition, and diversity of cardiovascular diseases (CVDs). The appropriate utilization of artificial intelligence (AI) and ma…
AF Classification from a Short Single Lead ECG Recording: the Physionet Computing in Cardiology Challenge 2017
The PhysioNet/Computing in Cardiology (CinC) Challenge 2017 focused on differentiating AF from noise, normal or other rhythms in short term (from 9-61 s) ECG recordings performed by patients. A total of 12,186 ECGs were used: 8,528 in the …
Extracting spatial effects from machine learning model using local interpretation method: An example of SHAP and XGBoost
Machine learning and artificial intelligence (ML/AI), previously considered black box approaches, are becoming more interpretable, as a result of the recent advances in eXplainable AI (XAI). In particular, local interpretation methods such…
Random Forests for Global and Regional Crop Yield Predictions
Accurate predictions of crop yield are critical for developing effective agricultural and food policies at the regional and global scales. We evaluated a machine-learning method, Random Forests (RF), for its ability to predict crop yield r…