Ensemble of optimised machine learning algorithms for predicting surface soil moisture content at a global scale
2025 · Open Access · DOI: https://doi.org/10.17615/7m6b-2555
Accurate information on surface soil moisture (SSM) content at a global scale under different climatic conditions is important for hydrological and climatological applications. Machine-learning-based systematic integration of in situ hydrological measurements, complex environmental and climate data, and satellite observations facilitates the generation of reliable data products to monitor and analyse the exchange of water, energy, and carbon in the Earth system at an appropriate space–time resolution. This study investigates the estimation of daily SSM using 8 optimised machine learning (ML) algorithms and 10 ensemble models (constructed via model bootstrap aggregating techniques and five-fold cross-validation). The algorithmic implementations were trained and tested using International Soil Moisture Network (ISMN) data collected from 1722 stations distributed across the world. The results showed that the K-neighbours Regressor (KNR) had the lowest root-mean-square error (RMSE; 0.0379 cm³ cm⁻³) on the “test_random” set (for testing performance on randomly split data during training), the Random Forest Regressor (RFR) had the lowest RMSE (0.0599 cm³ cm⁻³) on the “test_temporal” set (for testing performance on a period that was not used in training), and AdaBoost (AB) had the lowest RMSE (0.0786 cm³ cm⁻³) on the “test_independent-stations” set (for testing performance on stations that were not used in training). An independent evaluation was conducted on novel stations across different climate zones. For the optimised ML algorithms, the median RMSE values were below 0.1 cm³ cm⁻³. GradientBoosting (GB), Multi-layer Perceptron Regressor (MLPR), Stochastic Gradient Descent Regressor (SGDR), and RFR achieved a median r score of 0.6 in 12, 11, 9, and 9 climate zones, respectively, out of 15 climate zones. The performance of ensemble models improved significantly, with the median RMSE value below 0.075 cm³ cm⁻³ for all climate zones. All voting regressors achieved r scores above 0.6 in 13 climate zones; BSh (hot semi-arid climate) and BWh (hot desert climate) were the exceptions because of the sparse distribution of training stations. The metric evaluation showed that ensemble models can improve on the performance of single ML algorithms and achieve more stable results. Based on the results computed for the three test sets, the ensemble model combining KNR, RFR, and Extreme Gradient Boosting (XB) performed best. Overall, our investigation shows that ensemble machine learning algorithms are more capable of predicting SSM than the optimised or base ML algorithms; this indicates their large potential for estimating water cycle budgets, managing irrigation, and predicting crop yields.
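The three test sets and the two metrics named in the abstract can be illustrated with a small, standard-library-only sketch. It is not the authors' pipeline: the record fields (`station`, `day`, `ssm`), the held-out station, the cutoff day, the 20% random fraction, and the helper names are all hypothetical, and the averaging in `voting_predict` is an assumption about how a voting regressor combines its base models.

```python
import math
import random
from statistics import mean


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (the 'r score' in the abstract)."""
    mt, mp = mean(y_true), mean(y_pred)
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def split_records(records, held_out_stations, cutoff_day,
                  random_fraction=0.2, seed=0):
    """Split records into a training set and three test sets (assumed scheme)."""
    # Stations never seen in training form the independent-stations set.
    independent = [r for r in records if r["station"] in held_out_stations]
    seen = [r for r in records if r["station"] not in held_out_stations]
    # Days on or after the cutoff form the temporal set.
    temporal = [r for r in seen if r["day"] >= cutoff_day]
    remaining = [r for r in seen if r["day"] < cutoff_day]
    # A random share of what is left forms the random test set.
    shuffled = remaining[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * random_fraction)
    return {
        "train": shuffled[n_test:],
        "test_random": shuffled[:n_test],
        "test_temporal": temporal,
        "test_independent-stations": independent,
    }


def voting_predict(models, x):
    """Average the predictions of several base models (assumed voting rule)."""
    return mean(m(x) for m in models)


# Toy data: 4 hypothetical stations x 10 days with made-up SSM values.
records = [
    {"station": s, "day": d, "ssm": 0.10 + 0.01 * d}
    for s in ("A", "B", "C", "D")
    for d in range(10)
]
splits = split_records(records, held_out_stations={"D"}, cutoff_day=8)
sizes = {name: len(rows) for name, rows in splits.items()}

# Two constant stand-ins for trained regressors, combined by averaging.
ensemble_value = voting_predict([lambda x: 0.20, lambda x: 0.26], x=None)
```

Evaluating a model separately on each of the three sets, as the abstract does, distinguishes interpolation within the training data from extrapolation to unseen periods and unseen stations.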
- Type: article
- Language: en
- Landing Page: https://doi.org/10.17615/7m6b-2555
- OA Status: green
- OpenAlex ID: https://openalex.org/W4416207383
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4416207383 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.17615/7m6b-2555 (Digital Object Identifier)
- Title: Ensemble of optimised machine learning algorithms for predicting surface soil moisture content at a global scale (work title)
- Type: article (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-03-18 (full publication date if available)
- Authors: Qianqian Han, Yijian Zeng, Ting Duan, Chao Wang, Brigitta Szabó, Salvatore Manfreda, Ruodan Zhuang (list of authors in order)
- Landing page: https://doi.org/10.17615/7m6b-2555 (publisher landing page)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://doi.org/10.17615/7m6b-2555 (direct OA link when available)
- Cited by: 0 (total citation count in OpenAlex)
Full payload
| id | https://openalex.org/W4416207383 |
|---|---|
| doi | https://doi.org/10.17615/7m6b-2555 |
| ids.doi | https://doi.org/10.17615/7m6b-2555 |
| ids.openalex | https://openalex.org/W4416207383 |
| fwci | |
| type | article |
| title | Ensemble of optimised machine learning algorithms for predicting surface soil moisture content at a global scale |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | doi:10.17615/7m6b-2555 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S7407051488 |
| locations[0].source.type | repository |
| locations[0].source.is_oa | False |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | UNC Libraries |
| locations[0].source.host_organization | |
| locations[0].source.host_organization_name | |
| locations[0].license | |
| locations[0].pdf_url | |
| locations[0].version | |
| locations[0].raw_type | article-journal |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | https://doi.org/10.17615/7m6b-2555 |
| indexed_in | datacite |
| authorships[0].author.id | https://openalex.org/A5101398355 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Qianqian Han |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Han, Qianqian |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5036791093 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-2166-5314 |
| authorships[1].author.display_name | Yijian Zeng |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Zeng, Yijian |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5085206018 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-6694-4520 |
| authorships[2].author.display_name | Ting Duan |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Duan, Ting |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5115076700 |
| authorships[3].author.orcid | https://orcid.org/0000-0003-4887-923X |
| authorships[3].author.display_name | Chao Wang |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Wang, Chao |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5070013564 |
| authorships[4].author.orcid | https://orcid.org/0000-0003-1485-8908 |
| authorships[4].author.display_name | Brigitta Szabó |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Szabó, Brigitta |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5023933354 |
| authorships[5].author.orcid | https://orcid.org/0000-0002-0225-144X |
| authorships[5].author.display_name | Salvatore Manfreda |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Manfreda, Salvatore |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5044104989 |
| authorships[6].author.orcid | https://orcid.org/0000-0003-0421-8245 |
| authorships[6].author.display_name | Ruodan Zhuang |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Zhuang, Ruodan |
| authorships[6].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://doi.org/10.17615/7m6b-2555 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Ensemble of optimised machine learning algorithms for predicting surface soil moisture content at a global scale |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-12-01T00:03:43.161839 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 1 |
| best_oa_location.id | doi:10.17615/7m6b-2555 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S7407051488 |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | False |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | UNC Libraries |
| best_oa_location.source.host_organization | |
| best_oa_location.source.host_organization_name | |
| best_oa_location.license | |
| best_oa_location.pdf_url | |
| best_oa_location.version | |
| best_oa_location.raw_type | article-journal |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | https://doi.org/10.17615/7m6b-2555 |
| primary_location.id | doi:10.17615/7m6b-2555 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S7407051488 |
| primary_location.source.type | repository |
| primary_location.source.is_oa | False |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | UNC Libraries |
| primary_location.source.host_organization | |
| primary_location.source.host_organization_name | |
| primary_location.license | |
| primary_location.pdf_url | |
| primary_location.version | |
| primary_location.raw_type | article-journal |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | https://doi.org/10.17615/7m6b-2555 |
| publication_date | 2025-03-18 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 7 |
| citation_normalized_percentile | |