Minimax-Optimal Dimension-Reduced Clustering for High-Dimensional Nonspherical Mixtures Article Swipe
YOU?
·
· 2025
· Open Access
·
· DOI: https://doi.org/10.48550/arxiv.2502.02580
In mixture models, nonspherical (anisotropic) noise within each cluster is widely present in real-world data. We study both the minimax rate and optimal statistical procedure for clustering under high-dimensional nonspherical mixture models. In high-dimensional settings, we first establish the information-theoretic limits for clustering under Gaussian mixtures. The minimax lower bound unveils an intriguing informational dimension-reduction phenomenon: there exists a substantial gap between the minimax rate and the oracle clustering risk, with the former determined solely by the projected centers and projected covariance matrices in a low-dimensional space. Motivated by the lower bound, we propose a novel computationally efficient clustering method: Covariance Projected Spectral Clustering (COPO). Its key step is to project the high-dimensional data onto the low-dimensional space spanned by the cluster centers and then use the projected covariance matrices in this space to enhance clustering. We establish tight algorithmic upper bounds for COPO, both for Gaussian noise with flexible covariance and general noise with local dependence. Our theory indicates the minimax-optimality of COPO in the Gaussian case and highlights its adaptivity to a broad spectrum of dependent noise. Extensive simulation studies under various noise structures and real data analysis demonstrate our method's superior performance.
Related Topics
- Type
- preprint
- Language
- en
- Landing Page
- http://arxiv.org/abs/2502.02580
- https://arxiv.org/pdf/2502.02580
- OA Status
- green
- Related Works
- 10
- OpenAlex ID
- https://openalex.org/W4407186911
Raw OpenAlex JSON
- OpenAlex ID
-
https://openalex.org/W4407186911Canonical identifier for this work in OpenAlex
- DOI
-
https://doi.org/10.48550/arxiv.2502.02580Digital Object Identifier
- Title
-
Minimax-Optimal Dimension-Reduced Clustering for High-Dimensional Nonspherical MixturesWork title
- Type
-
preprintOpenAlex work type
- Language
-
enPrimary language
- Publication year
-
2025Year of publication
- Publication date
-
2025-02-04Full publication date if available
- Authors
-
Chiung‐Yu Huang, Yuqi GuList of authors in order
- Landing page
-
https://arxiv.org/abs/2502.02580Publisher landing page
- PDF URL
-
https://arxiv.org/pdf/2502.02580Direct link to full text PDF
- Open access
-
YesWhether a free full text is available
- OA status
-
greenOpen access status per OpenAlex
- OA URL
-
https://arxiv.org/pdf/2502.02580Direct OA link when available
- Concepts
-
Covariance, Minimax, Cluster analysis, Spectral clustering, Mathematics, Applied mathematics, Mathematical optimization, StatisticsTop concepts (fields/topics) attached by OpenAlex
- Cited by
-
0Total citation count in OpenAlex
- Related works (count)
-
10Other works algorithmically related by OpenAlex
Full payload
| id | https://openalex.org/W4407186911 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2502.02580 |
| ids.doi | https://doi.org/10.48550/arxiv.2502.02580 |
| ids.openalex | https://openalex.org/W4407186911 |
| fwci | |
| type | preprint |
| title | Minimax-Optimal Dimension-Reduced Clustering for High-Dimensional Nonspherical Mixtures |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11901 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9269000291824341 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Bayesian Methods and Mixture Models |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C178650346 |
| concepts[0].level | 2 |
| concepts[0].score | 0.7872279286384583 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q201984 |
| concepts[0].display_name | Covariance |
| concepts[1].id | https://openalex.org/C149728462 |
| concepts[1].level | 2 |
| concepts[1].score | 0.7865548133850098 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q751319 |
| concepts[1].display_name | Minimax |
| concepts[2].id | https://openalex.org/C73555534 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6617928743362427 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q622825 |
| concepts[2].display_name | Cluster analysis |
| concepts[3].id | https://openalex.org/C105611402 |
| concepts[3].level | 3 |
| concepts[3].score | 0.44426101446151733 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q2976589 |
| concepts[3].display_name | Spectral clustering |
| concepts[4].id | https://openalex.org/C33923547 |
| concepts[4].level | 0 |
| concepts[4].score | 0.439573734998703 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[4].display_name | Mathematics |
| concepts[5].id | https://openalex.org/C28826006 |
| concepts[5].level | 1 |
| concepts[5].score | 0.4279651343822479 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q33521 |
| concepts[5].display_name | Applied mathematics |
| concepts[6].id | https://openalex.org/C126255220 |
| concepts[6].level | 1 |
| concepts[6].score | 0.39752423763275146 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q141495 |
| concepts[6].display_name | Mathematical optimization |
| concepts[7].id | https://openalex.org/C105795698 |
| concepts[7].level | 1 |
| concepts[7].score | 0.16978824138641357 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q12483 |
| concepts[7].display_name | Statistics |
| keywords[0].id | https://openalex.org/keywords/covariance |
| keywords[0].score | 0.7872279286384583 |
| keywords[0].display_name | Covariance |
| keywords[1].id | https://openalex.org/keywords/minimax |
| keywords[1].score | 0.7865548133850098 |
| keywords[1].display_name | Minimax |
| keywords[2].id | https://openalex.org/keywords/cluster-analysis |
| keywords[2].score | 0.6617928743362427 |
| keywords[2].display_name | Cluster analysis |
| keywords[3].id | https://openalex.org/keywords/spectral-clustering |
| keywords[3].score | 0.44426101446151733 |
| keywords[3].display_name | Spectral clustering |
| keywords[4].id | https://openalex.org/keywords/mathematics |
| keywords[4].score | 0.439573734998703 |
| keywords[4].display_name | Mathematics |
| keywords[5].id | https://openalex.org/keywords/applied-mathematics |
| keywords[5].score | 0.4279651343822479 |
| keywords[5].display_name | Applied mathematics |
| keywords[6].id | https://openalex.org/keywords/mathematical-optimization |
| keywords[6].score | 0.39752423763275146 |
| keywords[6].display_name | Mathematical optimization |
| keywords[7].id | https://openalex.org/keywords/statistics |
| keywords[7].score | 0.16978824138641357 |
| keywords[7].display_name | Statistics |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2502.02580 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2502.02580 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2502.02580 |
| locations[1].id | doi:10.48550/arxiv.2502.02580 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2502.02580 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5074319772 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-2313-3562 |
| authorships[0].author.display_name | Chiung‐Yu Huang |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Huang, Chengzhu |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5011422415 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-4124-113X |
| authorships[1].author.display_name | Yuqi Gu |
| authorships[1].author_position | last |
| authorships[1].raw_author_name | Gu, Yuqi |
| authorships[1].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2502.02580 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-02-06T00:00:00 |
| display_name | Minimax-Optimal Dimension-Reduced Clustering for High-Dimensional Nonspherical Mixtures |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11901 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9269000291824341 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Bayesian Methods and Mixture Models |
| related_works | https://openalex.org/W2016058626, https://openalex.org/W2474724840, https://openalex.org/W2963760573, https://openalex.org/W185788778, https://openalex.org/W2895916002, https://openalex.org/W1814049089, https://openalex.org/W1977348009, https://openalex.org/W2369683208, https://openalex.org/W3182145356, https://openalex.org/W1482912984 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2502.02580 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2502.02580 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2502.02580 |
| primary_location.id | pmh:oai:arXiv.org:2502.02580 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2502.02580 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2502.02580 |
| publication_date | 2025-02-04 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 58, 84, 94, 173 |
| abstract_inverted_index.In | 0, 32 |
| abstract_inverted_index.We | 15, 136 |
| abstract_inverted_index.an | 51 |
| abstract_inverted_index.by | 75, 88, 119 |
| abstract_inverted_index.in | 12, 83, 130, 164 |
| abstract_inverted_index.is | 9, 108 |
| abstract_inverted_index.of | 162, 176 |
| abstract_inverted_index.to | 109, 133, 172 |
| abstract_inverted_index.we | 35, 92 |
| abstract_inverted_index.Its | 105 |
| abstract_inverted_index.Our | 157 |
| abstract_inverted_index.The | 46 |
| abstract_inverted_index.and | 21, 65, 79, 123, 151, 168, 186 |
| abstract_inverted_index.for | 25, 41, 142, 145 |
| abstract_inverted_index.gap | 60 |
| abstract_inverted_index.its | 170 |
| abstract_inverted_index.key | 106 |
| abstract_inverted_index.our | 191 |
| abstract_inverted_index.the | 18, 38, 62, 66, 71, 76, 89, 111, 115, 120, 126, 160, 165 |
| abstract_inverted_index.use | 125 |
| abstract_inverted_index.COPO | 163 |
| abstract_inverted_index.both | 17, 144 |
| abstract_inverted_index.case | 167 |
| abstract_inverted_index.data | 113, 188 |
| abstract_inverted_index.each | 7 |
| abstract_inverted_index.onto | 114 |
| abstract_inverted_index.rate | 20, 64 |
| abstract_inverted_index.real | 187 |
| abstract_inverted_index.step | 107 |
| abstract_inverted_index.then | 124 |
| abstract_inverted_index.this | 131 |
| abstract_inverted_index.with | 70, 148, 154 |
| abstract_inverted_index.COPO, | 143 |
| abstract_inverted_index.bound | 49 |
| abstract_inverted_index.broad | 174 |
| abstract_inverted_index.data. | 14 |
| abstract_inverted_index.first | 36 |
| abstract_inverted_index.local | 155 |
| abstract_inverted_index.lower | 48, 90 |
| abstract_inverted_index.noise | 5, 147, 153, 184 |
| abstract_inverted_index.novel | 95 |
| abstract_inverted_index.risk, | 69 |
| abstract_inverted_index.space | 117, 132 |
| abstract_inverted_index.study | 16 |
| abstract_inverted_index.there | 56 |
| abstract_inverted_index.tight | 138 |
| abstract_inverted_index.under | 27, 43, 182 |
| abstract_inverted_index.upper | 140 |
| abstract_inverted_index.bound, | 91 |
| abstract_inverted_index.bounds | 141 |
| abstract_inverted_index.exists | 57 |
| abstract_inverted_index.former | 72 |
| abstract_inverted_index.limits | 40 |
| abstract_inverted_index.noise. | 178 |
| abstract_inverted_index.oracle | 67 |
| abstract_inverted_index.solely | 74 |
| abstract_inverted_index.space. | 86 |
| abstract_inverted_index.theory | 158 |
| abstract_inverted_index.widely | 10 |
| abstract_inverted_index.within | 6 |
| abstract_inverted_index.(COPO). | 104 |
| abstract_inverted_index.between | 61 |
| abstract_inverted_index.centers | 78, 122 |
| abstract_inverted_index.cluster | 8, 121 |
| abstract_inverted_index.enhance | 134 |
| abstract_inverted_index.general | 152 |
| abstract_inverted_index.method: | 99 |
| abstract_inverted_index.minimax | 19, 47, 63 |
| abstract_inverted_index.mixture | 1, 30 |
| abstract_inverted_index.models, | 2 |
| abstract_inverted_index.models. | 31 |
| abstract_inverted_index.optimal | 22 |
| abstract_inverted_index.present | 11 |
| abstract_inverted_index.project | 110 |
| abstract_inverted_index.propose | 93 |
| abstract_inverted_index.spanned | 118 |
| abstract_inverted_index.studies | 181 |
| abstract_inverted_index.unveils | 50 |
| abstract_inverted_index.various | 183 |
| abstract_inverted_index.Gaussian | 44, 146, 166 |
| abstract_inverted_index.Spectral | 102 |
| abstract_inverted_index.analysis | 189 |
| abstract_inverted_index.flexible | 149 |
| abstract_inverted_index.matrices | 82, 129 |
| abstract_inverted_index.method's | 192 |
| abstract_inverted_index.spectrum | 175 |
| abstract_inverted_index.superior | 193 |
| abstract_inverted_index.Extensive | 179 |
| abstract_inverted_index.Motivated | 87 |
| abstract_inverted_index.Projected | 101 |
| abstract_inverted_index.dependent | 177 |
| abstract_inverted_index.efficient | 97 |
| abstract_inverted_index.establish | 37, 137 |
| abstract_inverted_index.indicates | 159 |
| abstract_inverted_index.mixtures. | 45 |
| abstract_inverted_index.procedure | 24 |
| abstract_inverted_index.projected | 77, 80, 127 |
| abstract_inverted_index.settings, | 34 |
| abstract_inverted_index.Clustering | 103 |
| abstract_inverted_index.Covariance | 100 |
| abstract_inverted_index.adaptivity | 171 |
| abstract_inverted_index.clustering | 26, 42, 68, 98 |
| abstract_inverted_index.covariance | 81, 128, 150 |
| abstract_inverted_index.determined | 73 |
| abstract_inverted_index.highlights | 169 |
| abstract_inverted_index.intriguing | 52 |
| abstract_inverted_index.real-world | 13 |
| abstract_inverted_index.simulation | 180 |
| abstract_inverted_index.structures | 185 |
| abstract_inverted_index.algorithmic | 139 |
| abstract_inverted_index.clustering. | 135 |
| abstract_inverted_index.demonstrate | 190 |
| abstract_inverted_index.dependence. | 156 |
| abstract_inverted_index.phenomenon: | 55 |
| abstract_inverted_index.statistical | 23 |
| abstract_inverted_index.substantial | 59 |
| abstract_inverted_index.nonspherical | 3, 29 |
| abstract_inverted_index.performance. | 194 |
| abstract_inverted_index.(anisotropic) | 4 |
| abstract_inverted_index.informational | 53 |
| abstract_inverted_index.computationally | 96 |
| abstract_inverted_index.low-dimensional | 85, 116 |
| abstract_inverted_index.high-dimensional | 28, 33, 112 |
| abstract_inverted_index.minimax-optimality | 161 |
| abstract_inverted_index.dimension-reduction | 54 |
| abstract_inverted_index.information-theoretic | 39 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 2 |
| citation_normalized_percentile |