Decorrelated forward regression for high dimensional data analysis
2024 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2408.12272
Forward regression is a crucial methodology for automatically identifying important predictors from a large pool of potential covariates. In contexts with moderate predictor correlation, forward selection techniques can achieve screening consistency. However, this property gradually fails to hold in the presence of substantially correlated variables, especially in high-dimensional datasets where strong correlations exist among predictors. Other model selection methods in the literature encounter this dilemma as well. To address these challenges, we introduce a novel decorrelated forward (DF) selection framework for generalized mean regression models, including prevalent models such as linear, logistic, Poisson, and quasi-likelihood models. The DF selection framework stands out because of its ability to convert generalized mean regression models into linear ones, thus providing a clear interpretation of the forward selection process. It also offers a closed-form expression for the forward iteration, which improves practical applicability and efficiency. Theoretically, we establish the screening consistency of DF selection and determine an upper bound on the size of the selected submodel. To reduce the computational burden, we develop a thresholding DF algorithm that provides a stopping rule for the forward-searching process. Simulations and two real data applications show the outstanding performance of our method compared with some existing model selection methods.
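For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of classical forward regression for a linear model. It is not the authors' DF algorithm, whose decorrelation step and thresholding rule are not described on this page. The function name `forward_selection`, the helper names, the fixed step count `max_steps`, and the simulated data are all assumptions made for illustration.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def center(v):
    """Subtract the mean, which accounts for the intercept."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def forward_selection(columns, y, max_steps):
    """Classical greedy forward regression (illustrative, not the DF method).

    columns: list of p predictor columns, each a list of n floats.
    Returns indices of the selected predictors in selection order.
    """
    resid_y = center(y)
    resid_x = {j: center(col) for j, col in enumerate(columns)}
    selected = []
    for _ in range(min(max_steps, len(columns))):
        # Adding predictor j lowers the residual sum of squares by
        # <x_j, r>^2 / <x_j, x_j>, with x_j and r residualised on the
        # predictors chosen so far.
        best_j, best_gain = None, 0.0
        for j, x in resid_x.items():
            norm2 = dot(x, x)
            if norm2 < 1e-12:  # numerically collinear with current model
                continue
            gain = dot(x, resid_y) ** 2 / norm2
            if gain > best_gain:
                best_j, best_gain = j, gain
        if best_j is None:
            break
        selected.append(best_j)
        q = resid_x.pop(best_j)
        qq = dot(q, q)
        # Project the chosen direction out of y and the remaining predictors.
        b = dot(q, resid_y) / qq
        resid_y = [r - b * a for r, a in zip(resid_y, q)]
        for j, x in resid_x.items():
            c = dot(q, x) / qq
            resid_x[j] = [xi - c * a for xi, a in zip(x, q)]
    return selected


# Hypothetical simulated data: only predictors 2 and 7 carry signal.
rng = random.Random(0)
n, p = 200, 10
cols = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
y = [3 * cols[2][i] - 2 * cols[7][i] + 0.1 * rng.gauss(0, 1) for i in range(n)]
picked = forward_selection(cols, y, max_steps=2)
print(picked)
```

The sketch stops after a fixed number of steps. With weakly correlated predictors, as in the simulated data, the greedy search recovers the active set; the abstract's point is that this guarantee weakens when predictors are strongly correlated.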
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2408.12272, https://arxiv.org/pdf/2408.12272
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4405622412
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4405622412 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2408.12272 (Digital Object Identifier)
- Title: Decorrelated forward regression for high dimensional data analysis (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-08-22 (full publication date if available)
- Authors: Xuejun Jiang, Yue Ma, Haofeng Wang (list of authors in order)
- Landing page: https://arxiv.org/abs/2408.12272 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2408.12272 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2408.12272 (direct OA link when available)
- Concepts: Regression, Regression analysis, Computer science, Statistics, Mathematics (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4405622412 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2408.12272 |
| ids.doi | https://doi.org/10.48550/arxiv.2408.12272 |
| ids.openalex | https://openalex.org/W4405622412 |
| fwci | |
| type | preprint |
| title | Decorrelated forward regression for high dimensional data analysis |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10057 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.6657000184059143 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Face and Expression Recognition |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C83546350 |
| concepts[0].level | 2 |
| concepts[0].score | 0.5972083210945129 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q1139051 |
| concepts[0].display_name | Regression |
| concepts[1].id | https://openalex.org/C152877465 |
| concepts[1].level | 2 |
| concepts[1].score | 0.49939393997192383 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q208042 |
| concepts[1].display_name | Regression analysis |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.40836453437805176 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C105795698 |
| concepts[3].level | 1 |
| concepts[3].score | 0.3336861729621887 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q12483 |
| concepts[3].display_name | Statistics |
| concepts[4].id | https://openalex.org/C33923547 |
| concepts[4].level | 0 |
| concepts[4].score | 0.24102747440338135 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[4].display_name | Mathematics |
| keywords[0].id | https://openalex.org/keywords/regression |
| keywords[0].score | 0.5972083210945129 |
| keywords[0].display_name | Regression |
| keywords[1].id | https://openalex.org/keywords/regression-analysis |
| keywords[1].score | 0.49939393997192383 |
| keywords[1].display_name | Regression analysis |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.40836453437805176 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/statistics |
| keywords[3].score | 0.3336861729621887 |
| keywords[3].display_name | Statistics |
| keywords[4].id | https://openalex.org/keywords/mathematics |
| keywords[4].score | 0.24102747440338135 |
| keywords[4].display_name | Mathematics |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2408.12272 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2408.12272 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2408.12272 |
| locations[1].id | doi:10.48550/arxiv.2408.12272 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2408.12272 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5088187019 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-8966-8864 |
| authorships[0].author.display_name | Xuejun Jiang |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Jiang, Xuejun |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5100959773 |
| authorships[1].author.orcid | https://orcid.org/0009-0005-9452-1513 |
| authorships[1].author.display_name | Yue Ma |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Ma, Yue |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5080023492 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-8175-5621 |
| authorships[2].author.display_name | Haofeng Wang |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Wang, Haofeng |
| authorships[2].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2408.12272 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Decorrelated forward regression for high dimensional data analysis |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10057 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.6657000184059143 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Face and Expression Recognition |
| related_works | https://openalex.org/W4381136829, https://openalex.org/W31220157, https://openalex.org/W2312753042, https://openalex.org/W4289356671, https://openalex.org/W2389155397, https://openalex.org/W2165884543, https://openalex.org/W3186837933, https://openalex.org/W2368989808, https://openalex.org/W1969346022, https://openalex.org/W2034959125 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2408.12272 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2408.12272 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2408.12272 |
| primary_location.id | pmh:oai:arXiv.org:2408.12272 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2408.12272 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2408.12272 |
| publication_date | 2024-08-22 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 3, 12, 73, 117, 128, 165, 171 |
| abstract_inverted_index.DF | 97, 147, 167 |
| abstract_inverted_index.In | 18 |
| abstract_inverted_index.It | 125 |
| abstract_inverted_index.To | 67, 159 |
| abstract_inverted_index.as | 65, 89 |
| abstract_inverted_index.by | 58 |
| abstract_inverted_index.in | 37, 45, 63 |
| abstract_inverted_index.is | 2, 56 |
| abstract_inverted_index.of | 15, 40, 103, 120, 146, 154, 188 |
| abstract_inverted_index.to | 106, 134 |
| abstract_inverted_index.we | 71, 141, 163 |
| abstract_inverted_index.The | 96 |
| abstract_inverted_index.and | 93, 138, 149, 179 |
| abstract_inverted_index.can | 27 |
| abstract_inverted_index.for | 6, 80, 131, 174 |
| abstract_inverted_index.its | 104 |
| abstract_inverted_index.our | 189 |
| abstract_inverted_index.out | 101 |
| abstract_inverted_index.the | 38, 121, 143, 151, 155, 175, 185 |
| abstract_inverted_index.two | 180 |
| abstract_inverted_index.(DF) | 77 |
| abstract_inverted_index.This | 54 |
| abstract_inverted_index.also | 126 |
| abstract_inverted_index.data | 182 |
| abstract_inverted_index.from | 11 |
| abstract_inverted_index.into | 112 |
| abstract_inverted_index.mean | 82, 109 |
| abstract_inverted_index.pool | 14 |
| abstract_inverted_index.real | 181 |
| abstract_inverted_index.rule | 173 |
| abstract_inverted_index.show | 184 |
| abstract_inverted_index.some | 193 |
| abstract_inverted_index.such | 88 |
| abstract_inverted_index.that | 169 |
| abstract_inverted_index.this | 32 |
| abstract_inverted_index.thus | 115 |
| abstract_inverted_index.with | 20, 192 |
| abstract_inverted_index.among | 52 |
| abstract_inverted_index.bound | 153 |
| abstract_inverted_index.clear | 118 |
| abstract_inverted_index.exist | 51 |
| abstract_inverted_index.large | 13 |
| abstract_inverted_index.model | 60, 195 |
| abstract_inverted_index.novel | 74 |
| abstract_inverted_index.ones, | 114 |
| abstract_inverted_index.other | 59 |
| abstract_inverted_index.quasi | 94 |
| abstract_inverted_index.size. | 158 |
| abstract_inverted_index.these | 69 |
| abstract_inverted_index.upper | 152 |
| abstract_inverted_index.well. | 66 |
| abstract_inverted_index.where | 48 |
| abstract_inverted_index.linear | 113 |
| abstract_inverted_index.method | 190 |
| abstract_inverted_index.models | 111 |
| abstract_inverted_index.offers | 127 |
| abstract_inverted_index.reduce | 160 |
| abstract_inverted_index.stands | 100 |
| abstract_inverted_index.strong | 49 |
| abstract_inverted_index.Forward | 0 |
| abstract_inverted_index.ability | 105 |
| abstract_inverted_index.achieve | 28 |
| abstract_inverted_index.address | 68 |
| abstract_inverted_index.because | 102 |
| abstract_inverted_index.becomes | 35 |
| abstract_inverted_index.burden, | 162 |
| abstract_inverted_index.convert | 107 |
| abstract_inverted_index.crucial | 4 |
| abstract_inverted_index.develop | 164 |
| abstract_inverted_index.dilemma | 55 |
| abstract_inverted_index.forward | 24, 76, 122, 132 |
| abstract_inverted_index.improve | 135 |
| abstract_inverted_index.invalid | 36 |
| abstract_inverted_index.linear, | 90 |
| abstract_inverted_index.methods | 62 |
| abstract_inverted_index.models, | 84, 87 |
| abstract_inverted_index.However, | 31 |
| abstract_inverted_index.Poisson, | 92 |
| abstract_inverted_index.compared | 191 |
| abstract_inverted_index.contexts | 19 |
| abstract_inverted_index.datasets | 47 |
| abstract_inverted_index.existing | 194 |
| abstract_inverted_index.methods. | 197 |
| abstract_inverted_index.moderate | 21 |
| abstract_inverted_index.presence | 39 |
| abstract_inverted_index.process. | 124, 177 |
| abstract_inverted_index.property | 33 |
| abstract_inverted_index.provides | 170 |
| abstract_inverted_index.selected | 156 |
| abstract_inverted_index.stopping | 172 |
| abstract_inverted_index.algorithm | 168 |
| abstract_inverted_index.determine | 150 |
| abstract_inverted_index.establish | 142 |
| abstract_inverted_index.framework | 79, 99 |
| abstract_inverted_index.gradually | 34 |
| abstract_inverted_index.important | 9 |
| abstract_inverted_index.including | 85 |
| abstract_inverted_index.introduce | 72 |
| abstract_inverted_index.logistic, | 91 |
| abstract_inverted_index.potential | 16 |
| abstract_inverted_index.practical | 136 |
| abstract_inverted_index.predictor | 22 |
| abstract_inverted_index.prevalent | 86 |
| abstract_inverted_index.providing | 116 |
| abstract_inverted_index.screening | 29, 144 |
| abstract_inverted_index.selection | 25, 61, 78, 98, 123, 148, 196 |
| abstract_inverted_index.correlated | 42 |
| abstract_inverted_index.especially | 44 |
| abstract_inverted_index.expression | 130 |
| abstract_inverted_index.iteration, | 133 |
| abstract_inverted_index.literature | 64 |
| abstract_inverted_index.predictors | 10 |
| abstract_inverted_index.regression | 1, 83, 110 |
| abstract_inverted_index.submodel's | 157 |
| abstract_inverted_index.techniques | 26 |
| abstract_inverted_index.variables, | 43 |
| abstract_inverted_index.Simulations | 178 |
| abstract_inverted_index.challenges, | 70 |
| abstract_inverted_index.closed-form | 129 |
| abstract_inverted_index.consistency | 145 |
| abstract_inverted_index.covariates. | 17 |
| abstract_inverted_index.efficiency. | 139 |
| abstract_inverted_index.encountered | 57 |
| abstract_inverted_index.generalized | 81, 108 |
| abstract_inverted_index.identifying | 8 |
| abstract_inverted_index.likelihood. | 95 |
| abstract_inverted_index.methodology | 5 |
| abstract_inverted_index.outstanding | 186 |
| abstract_inverted_index.performance | 187 |
| abstract_inverted_index.predictors. | 53 |
| abstract_inverted_index.applications | 183 |
| abstract_inverted_index.consistency. | 30 |
| abstract_inverted_index.correlation, | 23 |
| abstract_inverted_index.correlations | 50 |
| abstract_inverted_index.decorrelated | 75 |
| abstract_inverted_index.thresholding | 166 |
| abstract_inverted_index.applicability | 137 |
| abstract_inverted_index.automatically | 7 |
| abstract_inverted_index.computational | 161 |
| abstract_inverted_index.substantially | 41 |
| abstract_inverted_index.Theoretically, | 140 |
| abstract_inverted_index.interpretation | 119 |
| abstract_inverted_index.high-dimensional | 46 |
| abstract_inverted_index.forward-searching | 176 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 3 |
| citation_normalized_percentile | |
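The `abstract_inverted_index.*` rows in the table above store the abstract as a mapping from each token to the word positions where it occurs. A minimal sketch of how such a mapping could be turned back into running text is shown below. The function name `abstract_from_inverted_index` is hypothetical, and the `sample` dictionary holds only the first six positions from the table rather than the full payload.

```python
def abstract_from_inverted_index(inverted_index):
    """Rebuild text from a token -> list-of-positions mapping."""
    by_position = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Emit tokens in ascending position order, separated by single spaces.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# First six positions copied from the table rows above (partial sample).
sample = {
    "Forward": [0],
    "regression": [1],
    "is": [2],
    "a": [3],
    "crucial": [4],
    "methodology": [5],
}
text = abstract_from_inverted_index(sample)
print(text)
```

Because punctuation is attached to tokens in the index (for example `covariates.`), joining on spaces is enough to recover the original sentence boundaries.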