A Hybrid Ensemble Method with Focal Loss for Improved Forecasting Accuracy on Imbalanced Datasets
2025 · Open Access · DOI: https://doi.org/10.20944/preprints202504.0831.v1
The inherent complexity and dynamic characteristics of diverse datasets present significant challenges for achieving high predictive accuracy in forecasting tasks. This study tackles these challenges by implementing a hybrid ensemble model aimed at enhancing predictive performance across imbalanced datasets. Using data from a competitive data source, the approach integrates LightGBM, XGBoost, and Logistic Regression models within a weighted ensemble framework to improve overall prediction accuracy. Data preprocessing techniques, including KNN imputation, Z-score normalization, and SMOTE, are employed to handle missing values, outliers, and class imbalances, ensuring a robust input for model training. The ensemble framework incorporates a Focal Loss function to specifically address class imbalances and refine prediction precision. Comparative analyses reveal that the proposed ensemble model consistently outperforms individual models in terms of accuracy, precision, recall, and AUC. This study offers a versatile and reliable solution for forecasting challenges, demonstrating enhanced robustness and broad applicability across domains.
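The abstract names two mechanisms, a Focal Loss and a weighted ensemble of three classifiers, without giving their formulas or settings. The following is a minimal sketch of both under stated assumptions: it uses the standard binary focal loss form, -alpha_t * (1 - p_t)^gamma * log(p_t); the values `gamma=2.0` and `alpha=0.25` are common illustrative defaults, not values taken from the preprint; and the function names, example probabilities, and ensemble weights are hypothetical. How the authors actually combine the loss with LightGBM, XGBoost, and Logistic Regression is not stated in this record.

```python
import math


def focal_loss(y_true, p_pred, gamma=2.0, alpha=0.25, eps=1e-12):
    """Mean binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t).

    gamma and alpha are illustrative defaults (assumptions), not the
    preprint's settings.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)            # clamp to avoid log(0)
        p_t = p if y == 1 else 1.0 - p             # probability of the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(y_true)


def weighted_ensemble(prob_lists, weights):
    """Weighted average of per-model positive-class probabilities."""
    norm = sum(weights)  # normalise so the weights need not sum to 1
    n_samples = len(prob_lists[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / norm
        for i in range(n_samples)
    ]


# Hypothetical positive-class probabilities from three fitted models.
lightgbm_probs = [0.9, 0.2, 0.6]
xgboost_probs = [0.8, 0.1, 0.7]
logreg_probs = [0.7, 0.3, 0.4]
labels = [1, 0, 1]

blended = weighted_ensemble(
    [lightgbm_probs, xgboost_probs, logreg_probs],
    weights=[0.4, 0.4, 0.2],  # hypothetical weights
)
print("blended probabilities:", blended)
print("focal loss of blend:", focal_loss(labels, blended))
```

The factor (1 - p_t)^gamma shrinks the contribution of examples the model already classifies confidently, so the rarer, harder examples account for more of the total loss. With `gamma=0` the expression reduces to class-weighted cross-entropy.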
- Type: preprint
- Language: en
- Landing Page:
  - https://doi.org/10.20944/preprints202504.0831.v1
  - https://www.preprints.org/frontend/manuscript/e9f636e1fe05510dc6a06a52b272e9d3/download_pub
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4409400561
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4409400561 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.20944/preprints202504.0831.v1 (Digital Object Identifier)
- Title: A Hybrid Ensemble Method with Focal Loss for Improved Forecasting Accuracy on Imbalanced Datasets (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-04-10 (full publication date if available)
- Authors: Xiaojun Guo, Wenxiu Cai, Cheng Yu, Jiaqi Chen, Liyang Wang (list of authors in order)
- Landing page: https://doi.org/10.20944/preprints202504.0831.v1 (publisher landing page)
- PDF URL: https://www.preprints.org/frontend/manuscript/e9f636e1fe05510dc6a06a52b272e9d3/download_pub (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://www.preprints.org/frontend/manuscript/e9f636e1fe05510dc6a06a52b272e9d3/download_pub (direct OA link when available)
- Concepts: Computer science, Artificial intelligence, Machine learning, Econometrics, Statistics, Mathematics (top concepts, i.e. fields and topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4409400561 |
| doi | https://doi.org/10.20944/preprints202504.0831.v1 |
| ids.doi | https://doi.org/10.20944/preprints202504.0831.v1 |
| ids.openalex | https://openalex.org/W4409400561 |
| fwci | 0.0 |
| type | preprint |
| title | A Hybrid Ensemble Method with Focal Loss for Improved Forecasting Accuracy on Imbalanced Datasets |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11652 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9621999859809875 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Imbalanced Data Classification Techniques |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.5099700689315796 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C154945302 |
| concepts[1].level | 1 |
| concepts[1].score | 0.4527644217014313 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[1].display_name | Artificial intelligence |
| concepts[2].id | https://openalex.org/C119857082 |
| concepts[2].level | 1 |
| concepts[2].score | 0.3405503034591675 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[2].display_name | Machine learning |
| concepts[3].id | https://openalex.org/C149782125 |
| concepts[3].level | 1 |
| concepts[3].score | 0.3352709412574768 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q160039 |
| concepts[3].display_name | Econometrics |
| concepts[4].id | https://openalex.org/C105795698 |
| concepts[4].level | 1 |
| concepts[4].score | 0.3203771710395813 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q12483 |
| concepts[4].display_name | Statistics |
| concepts[5].id | https://openalex.org/C33923547 |
| concepts[5].level | 0 |
| concepts[5].score | 0.18411952257156372 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[5].display_name | Mathematics |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.5099700689315796 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[1].score | 0.4527644217014313 |
| keywords[1].display_name | Artificial intelligence |
| keywords[2].id | https://openalex.org/keywords/machine-learning |
| keywords[2].score | 0.3405503034591675 |
| keywords[2].display_name | Machine learning |
| keywords[3].id | https://openalex.org/keywords/econometrics |
| keywords[3].score | 0.3352709412574768 |
| keywords[3].display_name | Econometrics |
| keywords[4].id | https://openalex.org/keywords/statistics |
| keywords[4].score | 0.3203771710395813 |
| keywords[4].display_name | Statistics |
| keywords[5].id | https://openalex.org/keywords/mathematics |
| keywords[5].score | 0.18411952257156372 |
| keywords[5].display_name | Mathematics |
| language | en |
| locations[0].id | doi:10.20944/preprints202504.0831.v1 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S6309402219 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | Preprints.org |
| locations[0].source.host_organization | |
| locations[0].source.host_organization_name | |
| locations[0].source.host_organization_lineage | https://openalex.org/P4310310987 |
| locations[0].source.host_organization_lineage_names | Multidisciplinary Digital Publishing Institute |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://www.preprints.org/frontend/manuscript/e9f636e1fe05510dc6a06a52b272e9d3/download_pub |
| locations[0].version | acceptedVersion |
| locations[0].raw_type | posted-content |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | True |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | https://doi.org/10.20944/preprints202504.0831.v1 |
| indexed_in | crossref |
| authorships[0].author.id | https://openalex.org/A5039235939 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-6694-3827 |
| authorships[0].author.display_name | Xiaojun Guo |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Xiaojun Guo |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5111376387 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Wenxiu Cai |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Wenxiu Cai |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5026930299 |
| authorships[2].author.orcid | https://orcid.org/0000-0003-1039-1300 |
| authorships[2].author.display_name | Cheng Yu |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Yu Cheng |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5100434197 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-4069-193X |
| authorships[3].author.display_name | Jiaqi Chen |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Jiaqi Chen |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5089456973 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-9820-3310 |
| authorships[4].author.display_name | Liyang Wang |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Liyang Wang |
| authorships[4].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://www.preprints.org/frontend/manuscript/e9f636e1fe05510dc6a06a52b272e9d3/download_pub |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | A Hybrid Ensemble Method with Focal Loss for Improved Forecasting Accuracy on Imbalanced Datasets |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T11652 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9621999859809875 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Imbalanced Data Classification Techniques |
| related_works | https://openalex.org/W2961085424, https://openalex.org/W4306674287, https://openalex.org/W4387369504, https://openalex.org/W4394896187, https://openalex.org/W3170094116, https://openalex.org/W4386462264, https://openalex.org/W3107602296, https://openalex.org/W4364306694, https://openalex.org/W4312192474, https://openalex.org/W4283697347 |
| cited_by_count | 0 |
| locations_count | 1 |
| best_oa_location.id | doi:10.20944/preprints202504.0831.v1 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S6309402219 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | Preprints.org |
| best_oa_location.source.host_organization | |
| best_oa_location.source.host_organization_name | |
| best_oa_location.source.host_organization_lineage | https://openalex.org/P4310310987 |
| best_oa_location.source.host_organization_lineage_names | Multidisciplinary Digital Publishing Institute |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://www.preprints.org/frontend/manuscript/e9f636e1fe05510dc6a06a52b272e9d3/download_pub |
| best_oa_location.version | acceptedVersion |
| best_oa_location.raw_type | posted-content |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | https://doi.org/10.20944/preprints202504.0831.v1 |
| primary_location.id | doi:10.20944/preprints202504.0831.v1 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S6309402219 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | Preprints.org |
| primary_location.source.host_organization | |
| primary_location.source.host_organization_name | |
| primary_location.source.host_organization_lineage | https://openalex.org/P4310310987 |
| primary_location.source.host_organization_lineage_names | Multidisciplinary Digital Publishing Institute |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://www.preprints.org/frontend/manuscript/e9f636e1fe05510dc6a06a52b272e9d3/download_pub |
| primary_location.version | acceptedVersion |
| primary_location.raw_type | posted-content |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | True |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | https://doi.org/10.20944/preprints202504.0831.v1 |
| publication_date | 2025-04-10 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 27, 42, 56, 86, 96, 132 |
| abstract_inverted_index.at | 32 |
| abstract_inverted_index.by | 25 |
| abstract_inverted_index.in | 17, 121 |
| abstract_inverted_index.of | 6, 123 |
| abstract_inverted_index.to | 60, 77, 100 |
| abstract_inverted_index.KNN | 69 |
| abstract_inverted_index.The | 0, 92 |
| abstract_inverted_index.and | 3, 51, 73, 82, 105, 127, 134, 143 |
| abstract_inverted_index.are | 75 |
| abstract_inverted_index.for | 12, 89, 137 |
| abstract_inverted_index.the | 46, 113 |
| abstract_inverted_index.AUC. | 128 |
| abstract_inverted_index.Data | 65 |
| abstract_inverted_index.Loss | 98 |
| abstract_inverted_index.This | 20, 129 |
| abstract_inverted_index.data | 40, 44 |
| abstract_inverted_index.from | 41 |
| abstract_inverted_index.high | 14 |
| abstract_inverted_index.that | 112 |
| abstract_inverted_index.Focal | 97 |
| abstract_inverted_index.Using | 39 |
| abstract_inverted_index.aimed | 31 |
| abstract_inverted_index.broad | 144 |
| abstract_inverted_index.class | 83, 103 |
| abstract_inverted_index.input | 88 |
| abstract_inverted_index.model | 30, 90, 116 |
| abstract_inverted_index.study | 21, 130 |
| abstract_inverted_index.terms | 122 |
| abstract_inverted_index.these | 23 |
| abstract_inverted_index.SMOTE, | 74 |
| abstract_inverted_index.across | 36, 146 |
| abstract_inverted_index.handle | 78 |
| abstract_inverted_index.hybrid | 28 |
| abstract_inverted_index.models | 54, 120 |
| abstract_inverted_index.offers | 131 |
| abstract_inverted_index.refine | 106 |
| abstract_inverted_index.reveal | 111 |
| abstract_inverted_index.robust | 87 |
| abstract_inverted_index.tasks. | 19 |
| abstract_inverted_index.within | 55 |
| abstract_inverted_index.Z-score | 71 |
| abstract_inverted_index.address | 102 |
| abstract_inverted_index.diverse | 7 |
| abstract_inverted_index.dynamic | 4 |
| abstract_inverted_index.improve | 61 |
| abstract_inverted_index.missing | 79 |
| abstract_inverted_index.overall | 62 |
| abstract_inverted_index.present | 9 |
| abstract_inverted_index.recall, | 126 |
| abstract_inverted_index.source, | 45 |
| abstract_inverted_index.tackles | 22 |
| abstract_inverted_index.values, | 80 |
| abstract_inverted_index.Logistic | 52 |
| abstract_inverted_index.XGBoost, | 50 |
| abstract_inverted_index.accuracy | 16 |
| abstract_inverted_index.analyses | 110 |
| abstract_inverted_index.approach | 47 |
| abstract_inverted_index.datasets | 8 |
| abstract_inverted_index.domains. | 147 |
| abstract_inverted_index.employed | 76 |
| abstract_inverted_index.enhanced | 141 |
| abstract_inverted_index.ensemble | 29, 58, 93, 115 |
| abstract_inverted_index.ensuring | 85 |
| abstract_inverted_index.function | 99 |
| abstract_inverted_index.inherent | 1 |
| abstract_inverted_index.proposed | 114 |
| abstract_inverted_index.reliable | 135 |
| abstract_inverted_index.solution | 136 |
| abstract_inverted_index.weighted | 57 |
| abstract_inverted_index.LightGBM, | 49 |
| abstract_inverted_index.accuracy, | 124 |
| abstract_inverted_index.accuracy. | 64 |
| abstract_inverted_index.achieving | 13 |
| abstract_inverted_index.datasets. | 38 |
| abstract_inverted_index.enhancing | 33 |
| abstract_inverted_index.framework | 59, 94 |
| abstract_inverted_index.including | 68 |
| abstract_inverted_index.outliers, | 81 |
| abstract_inverted_index.training. | 91 |
| abstract_inverted_index.versatile | 133 |
| abstract_inverted_index.Regression | 53 |
| abstract_inverted_index.challenges | 11, 24 |
| abstract_inverted_index.complexity | 2 |
| abstract_inverted_index.imbalanced | 37 |
| abstract_inverted_index.imbalances | 104 |
| abstract_inverted_index.individual | 119 |
| abstract_inverted_index.integrates | 48 |
| abstract_inverted_index.precision, | 125 |
| abstract_inverted_index.precision. | 108 |
| abstract_inverted_index.prediction | 63, 107 |
| abstract_inverted_index.predictive | 15, 34 |
| abstract_inverted_index.robustness | 142 |
| abstract_inverted_index.Comparative | 109 |
| abstract_inverted_index.challenges, | 139 |
| abstract_inverted_index.competitive | 43 |
| abstract_inverted_index.forecasting | 18, 138 |
| abstract_inverted_index.imbalances, | 84 |
| abstract_inverted_index.imputation, | 70 |
| abstract_inverted_index.outperforms | 118 |
| abstract_inverted_index.performance | 35 |
| abstract_inverted_index.significant | 10 |
| abstract_inverted_index.techniques, | 67 |
| abstract_inverted_index.consistently | 117 |
| abstract_inverted_index.implementing | 26 |
| abstract_inverted_index.incorporates | 95 |
| abstract_inverted_index.specifically | 101 |
| abstract_inverted_index.applicability | 145 |
| abstract_inverted_index.demonstrating | 140 |
| abstract_inverted_index.preprocessing | 66 |
| abstract_inverted_index.normalization, | 72 |
| abstract_inverted_index.characteristics | 5 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile.value | 0.04926108 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | True |
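
The `abstract_inverted_index.*` rows in the table above store the abstract as a mapping from each token to the word positions where it occurs. A minimal sketch of turning such a mapping back into running text follows; the function name `reconstruct_abstract` is hypothetical, and the sample dictionary copies only the first nine positions from the table rather than the full index.

```python
def reconstruct_abstract(inverted_index):
    """Rebuild text from a {token: [positions]} inverted index."""
    # Flatten to (position, token) pairs, then order by position.
    placed = [
        (pos, token)
        for token, positions in inverted_index.items()
        for pos in positions
    ]
    placed.sort(key=lambda pair: pair[0])
    return " ".join(token for _, token in placed)


# Positions 0-8 as listed in the table (a subset of the full index).
sample_index = {
    "The": [0],
    "inherent": [1],
    "complexity": [2],
    "and": [3],
    "dynamic": [4],
    "characteristics": [5],
    "of": [6],
    "diverse": [7],
    "datasets": [8],
}

print(reconstruct_abstract(sample_index))
```

Tokens keep their original punctuation and capitalisation (for example `AUC.` and `SMOTE,` are separate keys), so joining on single spaces is enough to recover the abstract's wording.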