A Comparison of Modeling Preprocessing Techniques
2023 · Open Access · DOI: https://doi.org/10.48550/arxiv.2302.12042
This paper compares data processing methods in terms of predictive performance on structured data. It also seeks to identify and recommend preprocessing methodologies for tree-based binary classification models, with a focus on eXtreme Gradient Boosting (XGBoost) models. Three synthetic data sets of varying structure, interactions, and complexity were constructed and supplemented by a real-world data set from the Lending Club. We compare several methods for feature selection, categorical handling, and null imputation. Performance is assessed using relative comparisons among the chosen methodologies, including model prediction variability. The paper is organized by the three groups of preprocessing methodologies, with each section consisting of generalized observations. Each observation is accompanied by a recommendation of one or more preferred methodologies. Among feature selection methods, permutation-based feature importance, regularization, and XGBoost's feature importance by weight are not recommended. Correlation coefficient reduction also shows inferior performance. Instead, XGBoost importance by gain shows the most consistent and highest-caliber performance. Categorical feature encoding methods show greater discrimination in performance among data set structures. While there was no universal "best" method, frequency encoding showed the greatest performance for the most complex data set (Lending Club) but the poorest performance for all synthetic (i.e., simpler) data sets. Finally, missing indicator imputation dominated in terms of performance among imputation methods, whereas tree imputation showed extremely poor and highly variable model performance.
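Two of the techniques named in the abstract, frequency encoding and missing indicator imputation, are simple enough to illustrate directly. The sketch below is a minimal, dependency-free formulation and not the authors' implementation: the abstract does not define either method precisely, so the function names, the relative-frequency definition, the `fill_value` default, and the sample columns are all assumptions made for illustration.

```python
from collections import Counter


def frequency_encode(values):
    """Replace each category with its relative frequency in `values`.

    Assumed formulation: count / total. Some variants use raw counts instead.
    """
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in values]


def missing_indicator_impute(values, fill_value=0.0):
    """Fill missing entries (None) and return a parallel 0/1 missing flag.

    `fill_value` is an arbitrary placeholder chosen for this sketch; the
    indicator column is what lets a tree model split on "was missing".
    """
    filled = [fill_value if v is None else v for v in values]
    indicator = [1 if v is None else 0 for v in values]
    return filled, indicator


# Hypothetical columns, invented for this example only.
grades = ["A", "B", "A", "C", "A", "B"]
income = [52.0, None, 61.5, None, 48.0, 70.0]

encoded = frequency_encode(grades)
filled, was_missing = missing_indicator_impute(income)

print(encoded)
print(filled)
print(was_missing)
```

In a real modeling pipeline the category frequencies would normally be computed on the training split only and then applied to validation and test data, so that no information leaks across splits.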
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2302.12042 (PDF: https://arxiv.org/pdf/2302.12042)
- OA Status: green
- Cited By: 2
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4321854822
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4321854822 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2302.12042 (Digital Object Identifier)
- Title: A Comparison of Modeling Preprocessing Techniques (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2023 (year of publication)
- Publication date: 2023-02-23 (full publication date if available)
- Authors: Tosan Johnson, Alice J. Liu, Syed Ali Raza, Aaron McGuire (list of authors in order)
- Landing page: https://arxiv.org/abs/2302.12042 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2302.12042 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2302.12042 (direct OA link when available)
- Concepts: Categorical variable, Feature selection, Data mining, Imputation (statistics), Computer science, Data pre-processing, Missing data, Preprocessor, Interpretability, Gradient boosting, Artificial intelligence, Pattern recognition (psychology), Machine learning, Random forest (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 2 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 1, 2024: 1 (per-year citation counts, last 5 years)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4321854822 |
| doi | https://doi.org/10.48550/arxiv.2302.12042 |
| ids.doi | https://doi.org/10.48550/arxiv.2302.12042 |
| ids.openalex | https://openalex.org/W4321854822 |
| fwci | |
| type | preprint |
| title | A Comparison of Modeling Preprocessing Techniques |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11303 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9850999712944031 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Bayesian Modeling and Causal Inference |
| topics[1].id | https://openalex.org/T13398 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9682000279426575 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Data Analysis with R |
| topics[2].id | https://openalex.org/T10136 |
| topics[2].field.id | https://openalex.org/fields/26 |
| topics[2].field.display_name | Mathematics |
| topics[2].score | 0.9480999708175659 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/2613 |
| topics[2].subfield.display_name | Statistics and Probability |
| topics[2].display_name | Statistical Methods and Inference |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C5274069 |
| concepts[0].level | 2 |
| concepts[0].score | 0.7411190271377563 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q2285707 |
| concepts[0].display_name | Categorical variable |
| concepts[1].id | https://openalex.org/C148483581 |
| concepts[1].level | 2 |
| concepts[1].score | 0.646523654460907 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q446488 |
| concepts[1].display_name | Feature selection |
| concepts[2].id | https://openalex.org/C124101348 |
| concepts[2].level | 1 |
| concepts[2].score | 0.6342349648475647 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q172491 |
| concepts[2].display_name | Data mining |
| concepts[3].id | https://openalex.org/C58041806 |
| concepts[3].level | 3 |
| concepts[3].score | 0.6302525401115417 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q1660484 |
| concepts[3].display_name | Imputation (statistics) |
| concepts[4].id | https://openalex.org/C41008148 |
| concepts[4].level | 0 |
| concepts[4].score | 0.6293444633483887 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[4].display_name | Computer science |
| concepts[5].id | https://openalex.org/C10551718 |
| concepts[5].level | 2 |
| concepts[5].score | 0.5806174874305725 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q5227332 |
| concepts[5].display_name | Data pre-processing |
| concepts[6].id | https://openalex.org/C9357733 |
| concepts[6].level | 2 |
| concepts[6].score | 0.5394018888473511 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q6878417 |
| concepts[6].display_name | Missing data |
| concepts[7].id | https://openalex.org/C34736171 |
| concepts[7].level | 2 |
| concepts[7].score | 0.5257304906845093 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q918333 |
| concepts[7].display_name | Preprocessor |
| concepts[8].id | https://openalex.org/C2781067378 |
| concepts[8].level | 2 |
| concepts[8].score | 0.4860706627368927 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q17027399 |
| concepts[8].display_name | Interpretability |
| concepts[9].id | https://openalex.org/C70153297 |
| concepts[9].level | 3 |
| concepts[9].score | 0.4146263897418976 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q5591907 |
| concepts[9].display_name | Gradient boosting |
| concepts[10].id | https://openalex.org/C154945302 |
| concepts[10].level | 1 |
| concepts[10].score | 0.4116526246070862 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[10].display_name | Artificial intelligence |
| concepts[11].id | https://openalex.org/C153180895 |
| concepts[11].level | 2 |
| concepts[11].score | 0.35642218589782715 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q7148389 |
| concepts[11].display_name | Pattern recognition (psychology) |
| concepts[12].id | https://openalex.org/C119857082 |
| concepts[12].level | 1 |
| concepts[12].score | 0.3336741030216217 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[12].display_name | Machine learning |
| concepts[13].id | https://openalex.org/C169258074 |
| concepts[13].level | 2 |
| concepts[13].score | 0.22783100605010986 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q245748 |
| concepts[13].display_name | Random forest |
| keywords[0].id | https://openalex.org/keywords/categorical-variable |
| keywords[0].score | 0.7411190271377563 |
| keywords[0].display_name | Categorical variable |
| keywords[1].id | https://openalex.org/keywords/feature-selection |
| keywords[1].score | 0.646523654460907 |
| keywords[1].display_name | Feature selection |
| keywords[2].id | https://openalex.org/keywords/data-mining |
| keywords[2].score | 0.6342349648475647 |
| keywords[2].display_name | Data mining |
| keywords[3].id | https://openalex.org/keywords/imputation |
| keywords[3].score | 0.6302525401115417 |
| keywords[3].display_name | Imputation (statistics) |
| keywords[4].id | https://openalex.org/keywords/computer-science |
| keywords[4].score | 0.6293444633483887 |
| keywords[4].display_name | Computer science |
| keywords[5].id | https://openalex.org/keywords/data-pre-processing |
| keywords[5].score | 0.5806174874305725 |
| keywords[5].display_name | Data pre-processing |
| keywords[6].id | https://openalex.org/keywords/missing-data |
| keywords[6].score | 0.5394018888473511 |
| keywords[6].display_name | Missing data |
| keywords[7].id | https://openalex.org/keywords/preprocessor |
| keywords[7].score | 0.5257304906845093 |
| keywords[7].display_name | Preprocessor |
| keywords[8].id | https://openalex.org/keywords/interpretability |
| keywords[8].score | 0.4860706627368927 |
| keywords[8].display_name | Interpretability |
| keywords[9].id | https://openalex.org/keywords/gradient-boosting |
| keywords[9].score | 0.4146263897418976 |
| keywords[9].display_name | Gradient boosting |
| keywords[10].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[10].score | 0.4116526246070862 |
| keywords[10].display_name | Artificial intelligence |
| keywords[11].id | https://openalex.org/keywords/pattern-recognition |
| keywords[11].score | 0.35642218589782715 |
| keywords[11].display_name | Pattern recognition (psychology) |
| keywords[12].id | https://openalex.org/keywords/machine-learning |
| keywords[12].score | 0.3336741030216217 |
| keywords[12].display_name | Machine learning |
| keywords[13].id | https://openalex.org/keywords/random-forest |
| keywords[13].score | 0.22783100605010986 |
| keywords[13].display_name | Random forest |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2302.12042 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2302.12042 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2302.12042 |
| locations[1].id | doi:10.48550/arxiv.2302.12042 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2302.12042 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5012474037 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Tosan Johnson |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Johnson, Tosan |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5025120527 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-1800-8524 |
| authorships[1].author.display_name | Alice J. Liu |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Liu, Alice J. |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5046407306 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-2455-6922 |
| authorships[2].author.display_name | Syed Ali Raza |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Raza, Syed |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5011122872 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Aaron McGuire |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | McGuire, Aaron |
| authorships[3].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2302.12042 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | A Comparison of Modeling Preprocessing Techniques |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11303 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9850999712944031 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Bayesian Modeling and Causal Inference |
| related_works | https://openalex.org/W2181530120, https://openalex.org/W4211215373, https://openalex.org/W2024529227, https://openalex.org/W2055961818, https://openalex.org/W1574575415, https://openalex.org/W3144172081, https://openalex.org/W3179858851, https://openalex.org/W3028371478, https://openalex.org/W2081476516, https://openalex.org/W2581984549 |
| cited_by_count | 2 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 1 |
| counts_by_year[1].year | 2024 |
| counts_by_year[1].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2302.12042 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2302.12042 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2302.12042 |
| primary_location.id | pmh:oai:arXiv.org:2302.12042 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2302.12042 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2302.12042 |
| publication_date | 2023-02-23 |
| publication_year | 2023 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 34, 57, 114 |
| abstract_inverted_index.We | 65 |
| abstract_inverted_index.by | 56, 95, 113, 134, 150 |
| abstract_inverted_index.in | 10, 168, 212 |
| abstract_inverted_index.is | 78, 93, 111 |
| abstract_inverted_index.no | 177 |
| abstract_inverted_index.of | 5, 12, 45, 99, 106, 116, 159, 214 |
| abstract_inverted_index.on | 36 |
| abstract_inverted_index.or | 118 |
| abstract_inverted_index.to | 22 |
| abstract_inverted_index.The | 139 |
| abstract_inverted_index.all | 201 |
| abstract_inverted_index.and | 24, 49, 74, 130, 156, 225 |
| abstract_inverted_index.are | 136 |
| abstract_inverted_index.but | 195 |
| abstract_inverted_index.for | 15, 28, 69, 187, 200 |
| abstract_inverted_index.had | 196 |
| abstract_inverted_index.not | 137 |
| abstract_inverted_index.one | 117 |
| abstract_inverted_index.set | 60, 172 |
| abstract_inverted_index.the | 3, 62, 84, 96, 153, 184, 188, 197 |
| abstract_inverted_index.was | 176 |
| abstract_inverted_index.Each | 109 |
| abstract_inverted_index.This | 0, 18, 91 |
| abstract_inverted_index.also | 20, 143 |
| abstract_inverted_index.data | 7, 43, 59, 171, 191, 205 |
| abstract_inverted_index.each | 103 |
| abstract_inverted_index.from | 61 |
| abstract_inverted_index.gain | 151 |
| abstract_inverted_index.more | 119 |
| abstract_inverted_index.most | 154, 189 |
| abstract_inverted_index.null | 75 |
| abstract_inverted_index.poor | 224 |
| abstract_inverted_index.sets | 44, 192 |
| abstract_inverted_index.show | 165 |
| abstract_inverted_index.tree | 220 |
| abstract_inverted_index.were | 51, 54 |
| abstract_inverted_index.with | 33, 102 |
| abstract_inverted_index.Among | 122 |
| abstract_inverted_index.Club. | 64 |
| abstract_inverted_index.Three | 42 |
| abstract_inverted_index.While | 174 |
| abstract_inverted_index.among | 83, 170, 216 |
| abstract_inverted_index.data. | 17 |
| abstract_inverted_index.focus | 35 |
| abstract_inverted_index.model | 88, 228 |
| abstract_inverted_index.paper | 1, 19, 92 |
| abstract_inverted_index.seeks | 21 |
| abstract_inverted_index.sets. | 206 |
| abstract_inverted_index.shows | 144, 152 |
| abstract_inverted_index.terms | 11, 213 |
| abstract_inverted_index.there | 175 |
| abstract_inverted_index.three | 97 |
| abstract_inverted_index.using | 80 |
| abstract_inverted_index.which | 53 |
| abstract_inverted_index."best" | 179 |
| abstract_inverted_index.(i.e., | 203 |
| abstract_inverted_index.Club), | 194 |
| abstract_inverted_index.binary | 30 |
| abstract_inverted_index.chosen | 85 |
| abstract_inverted_index.groups | 98 |
| abstract_inverted_index.highly | 226 |
| abstract_inverted_index.showed | 183, 222 |
| abstract_inverted_index.weight | 135 |
| abstract_inverted_index.Lending | 63 |
| abstract_inverted_index.XGBoost | 148 |
| abstract_inverted_index.caliber | 158 |
| abstract_inverted_index.compare | 66 |
| abstract_inverted_index.complex | 190 |
| abstract_inverted_index.eXtreme | 37 |
| abstract_inverted_index.feature | 70, 123, 127, 132 |
| abstract_inverted_index.greater | 166 |
| abstract_inverted_index.highest | 157 |
| abstract_inverted_index.method, | 180 |
| abstract_inverted_index.methods | 9, 68, 164 |
| abstract_inverted_index.missing | 208 |
| abstract_inverted_index.models, | 32 |
| abstract_inverted_index.models. | 41 |
| abstract_inverted_index.poorest | 198 |
| abstract_inverted_index.section | 104 |
| abstract_inverted_index.several | 67 |
| abstract_inverted_index.various | 6, 46 |
| abstract_inverted_index.whereas | 219 |
| abstract_inverted_index.(Lending | 193 |
| abstract_inverted_index.Boosting | 39 |
| abstract_inverted_index.Finally, | 207 |
| abstract_inverted_index.Gradient | 38 |
| abstract_inverted_index.Instead, | 147 |
| abstract_inverted_index.assessed | 79 |
| abstract_inverted_index.compares | 2 |
| abstract_inverted_index.encoding | 163, 182 |
| abstract_inverted_index.greatest | 185 |
| abstract_inverted_index.identify | 23 |
| abstract_inverted_index.inferior | 145 |
| abstract_inverted_index.methods, | 125, 218 |
| abstract_inverted_index.relative | 81 |
| abstract_inverted_index.simpler) | 204 |
| abstract_inverted_index.variable | 227 |
| abstract_inverted_index.(XGBoost) | 40 |
| abstract_inverted_index.XGBoost's | 131 |
| abstract_inverted_index.dominated | 211 |
| abstract_inverted_index.extremely | 223 |
| abstract_inverted_index.featuring | 162 |
| abstract_inverted_index.frequency | 181 |
| abstract_inverted_index.handling, | 73 |
| abstract_inverted_index.including | 87 |
| abstract_inverted_index.indicator | 209 |
| abstract_inverted_index.preferred | 120 |
| abstract_inverted_index.presented | 94 |
| abstract_inverted_index.recommend | 25 |
| abstract_inverted_index.reduction | 142 |
| abstract_inverted_index.selection | 124 |
| abstract_inverted_index.synthetic | 202 |
| abstract_inverted_index.universal | 178 |
| abstract_inverted_index.complexity | 50 |
| abstract_inverted_index.consisting | 105 |
| abstract_inverted_index.importance | 133, 149 |
| abstract_inverted_index.imputation | 210, 217, 221 |
| abstract_inverted_index.prediction | 89 |
| abstract_inverted_index.predictive | 13 |
| abstract_inverted_index.processing | 8 |
| abstract_inverted_index.real-world | 58 |
| abstract_inverted_index.selection, | 71 |
| abstract_inverted_index.structured | 16 |
| abstract_inverted_index.tree-based | 29 |
| abstract_inverted_index.Categorical | 161 |
| abstract_inverted_index.Performance | 77 |
| abstract_inverted_index.accompanied | 112 |
| abstract_inverted_index.categorical | 72 |
| abstract_inverted_index.coefficient | 141 |
| abstract_inverted_index.comparisons | 82 |
| abstract_inverted_index.consistency | 155 |
| abstract_inverted_index.correlation | 140 |
| abstract_inverted_index.generalized | 107 |
| abstract_inverted_index.importance, | 128 |
| abstract_inverted_index.imputation. | 76 |
| abstract_inverted_index.observation | 110 |
| abstract_inverted_index.performance | 4, 14, 169, 186, 199, 215 |
| abstract_inverted_index.structures, | 47 |
| abstract_inverted_index.structures. | 173 |
| abstract_inverted_index.constructed, | 52 |
| abstract_inverted_index.performance. | 146, 160, 229 |
| abstract_inverted_index.recommended. | 138 |
| abstract_inverted_index.supplemented | 55 |
| abstract_inverted_index.variability. | 90 |
| abstract_inverted_index.interactions, | 48 |
| abstract_inverted_index.methodologies | 27 |
| abstract_inverted_index.observations. | 108 |
| abstract_inverted_index.preprocessing | 26, 100 |
| abstract_inverted_index.classification | 31 |
| abstract_inverted_index.discrimination | 167 |
| abstract_inverted_index.methodologies, | 86, 101 |
| abstract_inverted_index.methodologies. | 121 |
| abstract_inverted_index.recommendation | 115 |
| abstract_inverted_index.regularization, | 129 |
| abstract_inverted_index.permutation-based | 126 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/16 |
| sustainable_development_goals[0].score | 0.4399999976158142 |
| sustainable_development_goals[0].display_name | Peace, Justice and strong institutions |
| citation_normalized_percentile |
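
The `abstract_inverted_index.*` rows in the payload store the abstract as a mapping from each token to the word positions where it occurs, rather than as running text. A minimal sketch of how such a mapping can be turned back into readable text follows. The function name is hypothetical, and the `excerpt` dictionary is a hand-copied subset of the rows above, truncated to positions 0 through 9 so that the example stays short.

```python
def rebuild_abstract(inverted_index):
    """Rebuild running text from a token -> list-of-positions mapping."""
    # Flatten to (position, token) pairs, then order by position.
    positioned = [
        (pos, token)
        for token, positions in inverted_index.items()
        for pos in positions
    ]
    return " ".join(token for _, token in sorted(positioned))


# Subset of the payload rows, limited to the first ten word positions.
excerpt = {
    "This": [0],
    "paper": [1],
    "compares": [2],
    "the": [3],
    "performance": [4],
    "of": [5],
    "various": [6],
    "data": [7],
    "processing": [8],
    "methods": [9],
}

print(rebuild_abstract(excerpt))
# This paper compares the performance of various data processing methods
```

Applied to the full set of rows, the same function would reproduce the abstract as indexed, including its original punctuation, since tokens such as `data.` and `Club),` carry their punctuation with them.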