An effective up-sampling approach for breast cancer prediction with imbalanced data: A machine learning model-based comparative analysis
2022 · Open Access · DOI: https://doi.org/10.1371/journal.pone.0269135
Early detection of breast cancer plays a critical role in successful treatment and saves thousands of patients' lives every year. Although healthcare organizations have collected and stored massive amounts of clinical data, only a small portion of the data has been used to support treatment decisions. In this study, we proposed an engineered up-sampling method (ENUS) for handling imbalanced data to improve the predictive performance of machine learning models. Our experimental results showed that when the ratio of the minority to the majority class is less than 20%, training models with ENUS improved balanced accuracy by 3.74%, sensitivity by 8.36%, and F1 score by 3.83%. Our study also found that XGBoost Tree (XGBTree) using ENUS achieved the best performance, with an average balanced accuracy of 97.47% (min = 93%, max = 100%), sensitivity of 97.88% (min = 89%, max = 100%), and F1 score of 96.20% (min = 89.5%, max = 100%) on the validation dataset. Furthermore, our ensemble algorithm identified Cell_Shape and Nuclei as the most important attributes in predicting breast cancer. This data-driven finding reaffirms previous knowledge of the relationship between Cell_Shape, Nuclei, and the grades of breast cancer. Finally, our experiment showed that Random Forest and Neural Network models had the shortest training time. Our study provides a comprehensive comparison of a wide range of machine learning methods for predicting breast cancer risk. It can be used as a tool to help healthcare practitioners detect and treat breast cancer effectively.
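To make the terms in the abstract concrete, the following is a minimal sketch of plain random up-sampling of a minority class, together with the three reported metrics (balanced accuracy, sensitivity, F1 score) computed from confusion-matrix counts. It is not the ENUS method, whose details are not given on this page; the function names `random_upsample` and `binary_metrics`, the label encoding (1 = positive), and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def random_upsample(rows, labels, minority_label, seed=0):
    """Duplicate randomly chosen minority rows until the minority class
    matches the size of the largest class.

    Illustrative baseline only; NOT the ENUS method from the paper.
    """
    rng = random.Random(seed)
    majority_count = max(Counter(labels).values())
    minority_idx = [i for i, y in enumerate(labels) if y == minority_label]
    deficit = majority_count - len(minority_idx)
    # range() of a non-positive deficit is empty, so nothing is added
    # when the chosen class is already the largest.
    extra = [rng.choice(minority_idx) for _ in range(deficit)]
    new_rows = list(rows) + [rows[i] for i in extra]
    new_labels = list(labels) + [labels[i] for i in extra]
    return new_rows, new_labels


def binary_metrics(y_true, y_pred, positive=1):
    """Return sensitivity, specificity, balanced accuracy, and F1 score."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (
        2 * precision * sensitivity / (precision + sensitivity)
        if (precision + sensitivity)
        else 0.0
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "f1": f1,
    }


if __name__ == "__main__":
    # Toy data (hypothetical): 8 majority rows (label 0), 2 minority rows (label 1).
    rows = list(range(10))
    labels = [0] * 8 + [1] * 2
    _, balanced_labels = random_upsample(rows, labels, minority_label=1)
    print(Counter(balanced_labels))

    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    print(binary_metrics(y_true, y_pred))
```

Balanced accuracy averages the recall of the two classes, which is why it is preferred over plain accuracy when one class is rare. In practice, up-sampling should be applied only to the training split so that duplicated rows do not leak into the validation data.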
- Type: article
- Language: en
- Landing Page: https://doi.org/10.1371/journal.pone.0269135
- PDF: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0269135&type=printable
- OA Status: gold
- Cited By: 19
- References: 40
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4281986473
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4281986473 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.1371/journal.pone.0269135 (Digital Object Identifier)
- Title: An effective up-sampling approach for breast cancer prediction with imbalanced data: A machine learning model-based comparative analysis (work title)
- Type: article (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2022 (year of publication)
- Publication date: 2022-05-27 (full publication date if available)
- Authors: Tuan Tran, Uyen Le, Yihui Shi (list of authors in order)
- Landing page: https://doi.org/10.1371/journal.pone.0269135 (publisher landing page)
- PDF URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0269135&type=printable (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: gold (open access status per OpenAlex)
- OA URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0269135&type=printable (direct OA link when available)
- Concepts: Breast cancer, Machine learning, Decision tree, Artificial intelligence, Random forest, Computer science, Artificial neural network, Sampling (signal processing), Tree (set theory), Cancer, Predictive modelling, Sensitivity (control systems), Data mining, Medicine, Mathematics, Internal medicine, Computer vision, Filter (signal processing), Mathematical analysis, Engineering, Electronic engineering (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 19 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 3, 2024: 7, 2023: 8, 2022: 1 (per-year citation counts, last 5 years)
- References (count): 40 (number of works referenced by this work)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4281986473 |
|---|---|
| doi | https://doi.org/10.1371/journal.pone.0269135 |
| ids.doi | https://doi.org/10.1371/journal.pone.0269135 |
| ids.pmid | https://pubmed.ncbi.nlm.nih.gov/35622821 |
| ids.openalex | https://openalex.org/W4281986473 |
| fwci | 3.72017759 |
| mesh[0].qualifier_ui | |
| mesh[0].descriptor_ui | D000465 |
| mesh[0].is_major_topic | False |
| mesh[0].qualifier_name | |
| mesh[0].descriptor_name | Algorithms |
| mesh[1].qualifier_ui | Q000175 |
| mesh[1].descriptor_ui | D001943 |
| mesh[1].is_major_topic | True |
| mesh[1].qualifier_name | diagnosis |
| mesh[1].descriptor_name | Breast Neoplasms |
| mesh[2].qualifier_ui | |
| mesh[2].descriptor_ui | D005260 |
| mesh[2].is_major_topic | False |
| mesh[2].qualifier_name | |
| mesh[2].descriptor_name | Female |
| mesh[3].qualifier_ui | |
| mesh[3].descriptor_ui | D006801 |
| mesh[3].is_major_topic | False |
| mesh[3].qualifier_name | |
| mesh[3].descriptor_name | Humans |
| mesh[4].qualifier_ui | |
| mesh[4].descriptor_ui | D000069550 |
| mesh[4].is_major_topic | False |
| mesh[4].qualifier_name | |
| mesh[4].descriptor_name | Machine Learning |
| mesh[5].qualifier_ui | |
| mesh[5].descriptor_ui | D016571 |
| mesh[5].is_major_topic | False |
| mesh[5].qualifier_name | |
| mesh[5].descriptor_name | Neural Networks, Computer |
| type | article |
| title | An effective up-sampling approach for breast cancer prediction with imbalanced data: A machine learning model-based comparative analysis |
| biblio.issue | 5 |
| biblio.volume | 17 |
| biblio.last_page | e0269135 |
| biblio.first_page | e0269135 |
| topics[0].id | https://openalex.org/T10862 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9998999834060669 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | AI in cancer detection |
| topics[1].id | https://openalex.org/T12422 |
| topics[1].field.id | https://openalex.org/fields/27 |
| topics[1].field.display_name | Medicine |
| topics[1].score | 0.9865999817848206 |
| topics[1].domain.id | https://openalex.org/domains/4 |
| topics[1].domain.display_name | Health Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/2741 |
| topics[1].subfield.display_name | Radiology, Nuclear Medicine and Imaging |
| topics[1].display_name | Radiomics and Machine Learning in Medical Imaging |
| topics[2].id | https://openalex.org/T11396 |
| topics[2].field.id | https://openalex.org/fields/36 |
| topics[2].field.display_name | Health Professions |
| topics[2].score | 0.9476000070571899 |
| topics[2].domain.id | https://openalex.org/domains/4 |
| topics[2].domain.display_name | Health Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/3605 |
| topics[2].subfield.display_name | Health Information Management |
| topics[2].display_name | Artificial Intelligence in Healthcare |
| is_xpac | False |
| apc_list.value | 1805 |
| apc_list.currency | USD |
| apc_list.value_usd | 1805 |
| apc_paid.value | 1805 |
| apc_paid.currency | USD |
| apc_paid.value_usd | 1805 |
| concepts[0].id | https://openalex.org/C530470458 |
| concepts[0].level | 3 |
| concepts[0].score | 0.7785950899124146 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q128581 |
| concepts[0].display_name | Breast cancer |
| concepts[1].id | https://openalex.org/C119857082 |
| concepts[1].level | 1 |
| concepts[1].score | 0.7594172954559326 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[1].display_name | Machine learning |
| concepts[2].id | https://openalex.org/C84525736 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6963607668876648 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q831366 |
| concepts[2].display_name | Decision tree |
| concepts[3].id | https://openalex.org/C154945302 |
| concepts[3].level | 1 |
| concepts[3].score | 0.6789534091949463 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[3].display_name | Artificial intelligence |
| concepts[4].id | https://openalex.org/C169258074 |
| concepts[4].level | 2 |
| concepts[4].score | 0.6733863353729248 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q245748 |
| concepts[4].display_name | Random forest |
| concepts[5].id | https://openalex.org/C41008148 |
| concepts[5].level | 0 |
| concepts[5].score | 0.6217214465141296 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[5].display_name | Computer science |
| concepts[6].id | https://openalex.org/C50644808 |
| concepts[6].level | 2 |
| concepts[6].score | 0.5827338695526123 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q192776 |
| concepts[6].display_name | Artificial neural network |
| concepts[7].id | https://openalex.org/C140779682 |
| concepts[7].level | 3 |
| concepts[7].score | 0.4667387902736664 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q210868 |
| concepts[7].display_name | Sampling (signal processing) |
| concepts[8].id | https://openalex.org/C113174947 |
| concepts[8].level | 2 |
| concepts[8].score | 0.43294116854667664 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q2859736 |
| concepts[8].display_name | Tree (set theory) |
| concepts[9].id | https://openalex.org/C121608353 |
| concepts[9].level | 2 |
| concepts[9].score | 0.42554253339767456 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q12078 |
| concepts[9].display_name | Cancer |
| concepts[10].id | https://openalex.org/C45804977 |
| concepts[10].level | 2 |
| concepts[10].score | 0.41805270314216614 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q7239673 |
| concepts[10].display_name | Predictive modelling |
| concepts[11].id | https://openalex.org/C21200559 |
| concepts[11].level | 2 |
| concepts[11].score | 0.411520779132843 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q7451068 |
| concepts[11].display_name | Sensitivity (control systems) |
| concepts[12].id | https://openalex.org/C124101348 |
| concepts[12].level | 1 |
| concepts[12].score | 0.3466758728027344 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q172491 |
| concepts[12].display_name | Data mining |
| concepts[13].id | https://openalex.org/C71924100 |
| concepts[13].level | 0 |
| concepts[13].score | 0.2830055356025696 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q11190 |
| concepts[13].display_name | Medicine |
| concepts[14].id | https://openalex.org/C33923547 |
| concepts[14].level | 0 |
| concepts[14].score | 0.18807131052017212 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[14].display_name | Mathematics |
| concepts[15].id | https://openalex.org/C126322002 |
| concepts[15].level | 1 |
| concepts[15].score | 0.12378951907157898 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q11180 |
| concepts[15].display_name | Internal medicine |
| concepts[16].id | https://openalex.org/C31972630 |
| concepts[16].level | 1 |
| concepts[16].score | 0.0 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q844240 |
| concepts[16].display_name | Computer vision |
| concepts[17].id | https://openalex.org/C106131492 |
| concepts[17].level | 2 |
| concepts[17].score | 0.0 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q3072260 |
| concepts[17].display_name | Filter (signal processing) |
| concepts[18].id | https://openalex.org/C134306372 |
| concepts[18].level | 1 |
| concepts[18].score | 0.0 |
| concepts[18].wikidata | https://www.wikidata.org/wiki/Q7754 |
| concepts[18].display_name | Mathematical analysis |
| concepts[19].id | https://openalex.org/C127413603 |
| concepts[19].level | 0 |
| concepts[19].score | 0.0 |
| concepts[19].wikidata | https://www.wikidata.org/wiki/Q11023 |
| concepts[19].display_name | Engineering |
| concepts[20].id | https://openalex.org/C24326235 |
| concepts[20].level | 1 |
| concepts[20].score | 0.0 |
| concepts[20].wikidata | https://www.wikidata.org/wiki/Q126095 |
| concepts[20].display_name | Electronic engineering |
| keywords[0].id | https://openalex.org/keywords/breast-cancer |
| keywords[0].score | 0.7785950899124146 |
| keywords[0].display_name | Breast cancer |
| keywords[1].id | https://openalex.org/keywords/machine-learning |
| keywords[1].score | 0.7594172954559326 |
| keywords[1].display_name | Machine learning |
| keywords[2].id | https://openalex.org/keywords/decision-tree |
| keywords[2].score | 0.6963607668876648 |
| keywords[2].display_name | Decision tree |
| keywords[3].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[3].score | 0.6789534091949463 |
| keywords[3].display_name | Artificial intelligence |
| keywords[4].id | https://openalex.org/keywords/random-forest |
| keywords[4].score | 0.6733863353729248 |
| keywords[4].display_name | Random forest |
| keywords[5].id | https://openalex.org/keywords/computer-science |
| keywords[5].score | 0.6217214465141296 |
| keywords[5].display_name | Computer science |
| keywords[6].id | https://openalex.org/keywords/artificial-neural-network |
| keywords[6].score | 0.5827338695526123 |
| keywords[6].display_name | Artificial neural network |
| keywords[7].id | https://openalex.org/keywords/sampling |
| keywords[7].score | 0.4667387902736664 |
| keywords[7].display_name | Sampling (signal processing) |
| keywords[8].id | https://openalex.org/keywords/tree |
| keywords[8].score | 0.43294116854667664 |
| keywords[8].display_name | Tree (set theory) |
| keywords[9].id | https://openalex.org/keywords/cancer |
| keywords[9].score | 0.42554253339767456 |
| keywords[9].display_name | Cancer |
| keywords[10].id | https://openalex.org/keywords/predictive-modelling |
| keywords[10].score | 0.41805270314216614 |
| keywords[10].display_name | Predictive modelling |
| keywords[11].id | https://openalex.org/keywords/sensitivity |
| keywords[11].score | 0.411520779132843 |
| keywords[11].display_name | Sensitivity (control systems) |
| keywords[12].id | https://openalex.org/keywords/data-mining |
| keywords[12].score | 0.3466758728027344 |
| keywords[12].display_name | Data mining |
| keywords[13].id | https://openalex.org/keywords/medicine |
| keywords[13].score | 0.2830055356025696 |
| keywords[13].display_name | Medicine |
| keywords[14].id | https://openalex.org/keywords/mathematics |
| keywords[14].score | 0.18807131052017212 |
| keywords[14].display_name | Mathematics |
| keywords[15].id | https://openalex.org/keywords/internal-medicine |
| keywords[15].score | 0.12378951907157898 |
| keywords[15].display_name | Internal medicine |
| language | en |
| locations[0].id | doi:10.1371/journal.pone.0269135 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S202381698 |
| locations[0].source.issn | 1932-6203 |
| locations[0].source.type | journal |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | 1932-6203 |
| locations[0].source.is_core | True |
| locations[0].source.is_in_doaj | True |
| locations[0].source.display_name | PLoS ONE |
| locations[0].source.host_organization | https://openalex.org/P4310315706 |
| locations[0].source.host_organization_name | Public Library of Science |
| locations[0].source.host_organization_lineage | https://openalex.org/P4310315706 |
| locations[0].source.host_organization_lineage_names | Public Library of Science |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0269135&type=printable |
| locations[0].version | publishedVersion |
| locations[0].raw_type | journal-article |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | True |
| locations[0].is_published | True |
| locations[0].raw_source_name | PLOS ONE |
| locations[0].landing_page_url | https://doi.org/10.1371/journal.pone.0269135 |
| locations[1].id | pmid:35622821 |
| locations[1].is_oa | False |
| locations[1].source.id | https://openalex.org/S4306525036 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | False |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | PubMed |
| locations[1].source.host_organization | https://openalex.org/I1299303238 |
| locations[1].source.host_organization_name | National Institutes of Health |
| locations[1].source.host_organization_lineage | https://openalex.org/I1299303238 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | publishedVersion |
| locations[1].raw_type | |
| locations[1].license_id | |
| locations[1].is_accepted | True |
| locations[1].is_published | True |
| locations[1].raw_source_name | PloS one |
| locations[1].landing_page_url | https://pubmed.ncbi.nlm.nih.gov/35622821 |
| locations[2].id | pmh:oai:doaj.org/article:fc8298380514442f90510b7bb9a8ffa2 |
| locations[2].is_oa | True |
| locations[2].source.id | https://openalex.org/S4306401280 |
| locations[2].source.issn | |
| locations[2].source.type | repository |
| locations[2].source.is_oa | False |
| locations[2].source.issn_l | |
| locations[2].source.is_core | False |
| locations[2].source.is_in_doaj | False |
| locations[2].source.display_name | DOAJ (DOAJ: Directory of Open Access Journals) |
| locations[2].source.host_organization | |
| locations[2].source.host_organization_name | |
| locations[2].license | cc-by-sa |
| locations[2].pdf_url | |
| locations[2].version | submittedVersion |
| locations[2].raw_type | article |
| locations[2].license_id | https://openalex.org/licenses/cc-by-sa |
| locations[2].is_accepted | False |
| locations[2].is_published | False |
| locations[2].raw_source_name | PLoS ONE, Vol 17, Iss 5, p e0269135 (2022) |
| locations[2].landing_page_url | https://doaj.org/article/fc8298380514442f90510b7bb9a8ffa2 |
| locations[3].id | pmh:oai:pubmedcentral.nih.gov:9140301 |
| locations[3].is_oa | True |
| locations[3].source.id | https://openalex.org/S2764455111 |
| locations[3].source.issn | |
| locations[3].source.type | repository |
| locations[3].source.is_oa | False |
| locations[3].source.issn_l | |
| locations[3].source.is_core | False |
| locations[3].source.is_in_doaj | False |
| locations[3].source.display_name | PubMed Central |
| locations[3].source.host_organization | https://openalex.org/I1299303238 |
| locations[3].source.host_organization_name | National Institutes of Health |
| locations[3].source.host_organization_lineage | https://openalex.org/I1299303238 |
| locations[3].license | other-oa |
| locations[3].pdf_url | |
| locations[3].version | submittedVersion |
| locations[3].raw_type | Text |
| locations[3].license_id | https://openalex.org/licenses/other-oa |
| locations[3].is_accepted | False |
| locations[3].is_published | False |
| locations[3].raw_source_name | PLoS One |
| locations[3].landing_page_url | https://www.ncbi.nlm.nih.gov/pmc/articles/9140301 |
| indexed_in | crossref, doaj, pubmed |
| authorships[0].author.id | https://openalex.org/A5101990176 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-4735-7253 |
| authorships[0].author.display_name | Tuan Tran |
| authorships[0].countries | US |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I4210147380 |
| authorships[0].affiliations[0].raw_affiliation_string | College of Pharmacy, California Northstate University, Elk Grove, CA, United States of America |
| authorships[0].institutions[0].id | https://openalex.org/I4210147380 |
| authorships[0].institutions[0].ror | https://ror.org/03h0d2228 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I4210147380 |
| authorships[0].institutions[0].country_code | US |
| authorships[0].institutions[0].display_name | California Northstate University |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Tuan Tran |
| authorships[0].is_corresponding | True |
| authorships[0].raw_affiliation_strings | College of Pharmacy, California Northstate University, Elk Grove, CA, United States of America |
| authorships[1].author.id | https://openalex.org/A5103064879 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-9482-2747 |
| authorships[1].author.display_name | Uyen Le |
| authorships[1].countries | US |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I4210147380 |
| authorships[1].affiliations[0].raw_affiliation_string | College of Pharmacy, California Northstate University, Elk Grove, CA, United States of America |
| authorships[1].institutions[0].id | https://openalex.org/I4210147380 |
| authorships[1].institutions[0].ror | https://ror.org/03h0d2228 |
| authorships[1].institutions[0].type | education |
| authorships[1].institutions[0].lineage | https://openalex.org/I4210147380 |
| authorships[1].institutions[0].country_code | US |
| authorships[1].institutions[0].display_name | California Northstate University |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Uyen Le |
| authorships[1].is_corresponding | True |
| authorships[1].raw_affiliation_strings | College of Pharmacy, California Northstate University, Elk Grove, CA, United States of America |
| authorships[2].author.id | https://openalex.org/A5103168394 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-8014-1457 |
| authorships[2].author.display_name | Yihui Shi |
| authorships[2].countries | US |
| authorships[2].affiliations[0].institution_ids | https://openalex.org/I4210147380 |
| authorships[2].affiliations[0].raw_affiliation_string | College of Medicine, California Northstate University, Elk Grove, CA, United States of America |
| authorships[2].institutions[0].id | https://openalex.org/I4210147380 |
| authorships[2].institutions[0].ror | https://ror.org/03h0d2228 |
| authorships[2].institutions[0].type | education |
| authorships[2].institutions[0].lineage | https://openalex.org/I4210147380 |
| authorships[2].institutions[0].country_code | US |
| authorships[2].institutions[0].display_name | California Northstate University |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Yihui Shi |
| authorships[2].is_corresponding | True |
| authorships[2].raw_affiliation_strings | College of Medicine, California Northstate University, Elk Grove, CA, United States of America |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0269135&type=printable |
| open_access.oa_status | gold |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | An effective up-sampling approach for breast cancer prediction with imbalanced data: A machine learning model-based comparative analysis |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T10862 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9998999834060669 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | AI in cancer detection |
| related_works | https://openalex.org/W3193043704, https://openalex.org/W4386259002, https://openalex.org/W1546989560, https://openalex.org/W4366990902, https://openalex.org/W4317732970, https://openalex.org/W4388550696, https://openalex.org/W4321636153, https://openalex.org/W4313289487, https://openalex.org/W3191198889, https://openalex.org/W4399767560 |
| cited_by_count | 19 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 3 |
| counts_by_year[1].year | 2024 |
| counts_by_year[1].cited_by_count | 7 |
| counts_by_year[2].year | 2023 |
| counts_by_year[2].cited_by_count | 8 |
| counts_by_year[3].year | 2022 |
| counts_by_year[3].cited_by_count | 1 |
| locations_count | 4 |
| best_oa_location.id | doi:10.1371/journal.pone.0269135 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S202381698 |
| best_oa_location.source.issn | 1932-6203 |
| best_oa_location.source.type | journal |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | 1932-6203 |
| best_oa_location.source.is_core | True |
| best_oa_location.source.is_in_doaj | True |
| best_oa_location.source.display_name | PLoS ONE |
| best_oa_location.source.host_organization | https://openalex.org/P4310315706 |
| best_oa_location.source.host_organization_name | Public Library of Science |
| best_oa_location.source.host_organization_lineage | https://openalex.org/P4310315706 |
| best_oa_location.source.host_organization_lineage_names | Public Library of Science |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0269135&type=printable |
| best_oa_location.version | publishedVersion |
| best_oa_location.raw_type | journal-article |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | True |
| best_oa_location.raw_source_name | PLOS ONE |
| best_oa_location.landing_page_url | https://doi.org/10.1371/journal.pone.0269135 |
| primary_location.id | doi:10.1371/journal.pone.0269135 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S202381698 |
| primary_location.source.issn | 1932-6203 |
| primary_location.source.type | journal |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | 1932-6203 |
| primary_location.source.is_core | True |
| primary_location.source.is_in_doaj | True |
| primary_location.source.display_name | PLoS ONE |
| primary_location.source.host_organization | https://openalex.org/P4310315706 |
| primary_location.source.host_organization_name | Public Library of Science |
| primary_location.source.host_organization_lineage | https://openalex.org/P4310315706 |
| primary_location.source.host_organization_lineage_names | Public Library of Science |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0269135&type=printable |
| primary_location.version | publishedVersion |
| primary_location.raw_type | journal-article |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | True |
| primary_location.is_published | True |
| primary_location.raw_source_name | PLOS ONE |
| primary_location.landing_page_url | https://doi.org/10.1371/journal.pone.0269135 |
| publication_date | 2022-05-27 |
| publication_year | 2022 |
| referenced_works | https://openalex.org/W765754, https://openalex.org/W2889646458, https://openalex.org/W2142407957, https://openalex.org/W2912581524, https://openalex.org/W2525186130, https://openalex.org/W3032311272, https://openalex.org/W3131828774, https://openalex.org/W4245119036, https://openalex.org/W2998618511, https://openalex.org/W2945574311, https://openalex.org/W2140593657, https://openalex.org/W3011287741, https://openalex.org/W2040951492, https://openalex.org/W3032218674, https://openalex.org/W3184676435, https://openalex.org/W3195075407, https://openalex.org/W3092490874, https://openalex.org/W3208166579, https://openalex.org/W2760611159, https://openalex.org/W3165241807, https://openalex.org/W2185981305, https://openalex.org/W273955616, https://openalex.org/W2585240214, https://openalex.org/W94052953, https://openalex.org/W2774823188, https://openalex.org/W2735749309, https://openalex.org/W3191950921, https://openalex.org/W2912102502, https://openalex.org/W2274933089, https://openalex.org/W2789622809, https://openalex.org/W2889520867, https://openalex.org/W4238530616, https://openalex.org/W4246219036, https://openalex.org/W2727347885, https://openalex.org/W2009167374, https://openalex.org/W259338706, https://openalex.org/W3029078373, https://openalex.org/W2075380047, https://openalex.org/W3128646645, https://openalex.org/W4206841660 |
| referenced_works_count | 40 |
| cited_by_percentile_year.max | 99 |
| cited_by_percentile_year.min | 89 |
| corresponding_author_ids | https://openalex.org/A5101990176, https://openalex.org/A5103168394, https://openalex.org/A5103064879 |
| countries_distinct_count | 1 |
| institutions_distinct_count | 3 |
| corresponding_institution_ids | https://openalex.org/I4210147380 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/3 |
| sustainable_development_goals[0].score | 0.75 |
| sustainable_development_goals[0].display_name | Good health and well-being |
| citation_normalized_percentile.value | 0.9154589 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | True |