MixMax: Distributional Robustness in Function Space via Optimal Data Mixtures
2024 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2406.01477
Machine learning models are often required to perform well across several pre-defined settings, such as a set of user groups. Worst-case performance is a common metric to capture this requirement, and is the objective of group distributionally robust optimization (group DRO). Unfortunately, these methods struggle when the loss is non-convex in the parameters, or the model class is non-parametric. Here, we make a classical move to address this: we reparameterize group DRO from parameter space to function space, which results in a number of advantages. First, we show that group DRO over the space of bounded functions admits a minimax theorem. Second, for cross-entropy and mean squared error, we show that the minimax optimal mixture distribution is the solution of a simple convex optimization problem. Thus, provided one is working with a model class of universal function approximators, group DRO can be solved by a convex optimization problem followed by a classical risk minimization problem. We call our method MixMax. In our experiments, we found that MixMax matched or outperformed the standard group DRO baselines, and in particular, MixMax improved the performance of XGBoost over the only baseline, data balancing, for variations of the ACSIncome and CelebA annotations datasets.
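The two-stage recipe in the abstract (a convex problem over mixture weights, then ordinary risk minimization on the resulting mixture) can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation. It assumes finite input and label spaces, known group distributions `P[i, x, y]`, cross-entropy loss, and exponentiated-gradient ascent as the solver; the names `group_losses` and `mixmax_weights`, the step size, and the toy numbers are all hypothetical choices made for illustration. Under these assumptions, the best predictor for a mixture with weights `lam` is the mixture's own conditional distribution, and the gradient of the inner minimum with respect to `lam[i]` is that predictor's loss on group `i`.

```python
import numpy as np

def group_losses(P, lam):
    """Cross-entropy of the mixture's Bayes predictor, evaluated on each group.

    P has shape (k, n_x, n_y): P[i] is the joint distribution of group i.
    Assumes every entry of P is strictly positive so the logs are finite.
    """
    mix = np.tensordot(lam, P, axes=1)           # mixture joint p_lam(x, y)
    cond = mix / mix.sum(axis=1, keepdims=True)  # predictor p_lam(y | x)
    losses = -(P * np.log(cond)).sum(axis=(1, 2))
    return losses, cond

def mixmax_weights(P, steps=2000, lr=0.5):
    """Exponentiated-gradient ascent on the concave objective in lam."""
    k = P.shape[0]
    lam = np.full(k, 1.0 / k)
    for _ in range(steps):
        losses, _ = group_losses(P, lam)
        lam = lam * np.exp(lr * losses)  # upweight groups with higher loss
        lam /= lam.sum()                 # stay on the probability simplex
    return lam

# Toy data (hypothetical): two groups, two inputs, two labels.
P = np.array([
    [[0.45, 0.05], [0.05, 0.45]],
    [[0.10, 0.40], [0.30, 0.20]],
])

lam_star = mixmax_weights(P)
uniform = np.full(2, 0.5)
worst_uniform = group_losses(P, uniform)[0].max()  # equal-weight mixture
worst_star = group_losses(P, lam_star)[0].max()    # optimized mixture
print(lam_star, worst_uniform, worst_star)
```

In this toy setting the predictor is available in closed form. With real data, the second stage would instead fit a flexible model to samples drawn from the groups in proportions `lam_star`, which is the classical risk minimization step the abstract refers to.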
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2406.01477, https://arxiv.org/pdf/2406.01477
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4399401446
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4399401446 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2406.01477 (Digital Object Identifier)
- Title: MixMax: Distributional Robustness in Function Space via Optimal Data Mixtures (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-06-03 (full publication date if available)
- Authors: Anvith Thudi, Chris J. Maddison (list of authors in order)
- Landing page: https://arxiv.org/abs/2406.01477 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2406.01477 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2406.01477 (direct OA link when available)
- Concepts: Maximization, Computer science, Concave function, Mathematical optimization, Mathematics, Regular polygon, Geometry (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4399401446 |
| doi | https://doi.org/10.48550/arxiv.2406.01477 |
| ids.doi | https://doi.org/10.48550/arxiv.2406.01477 |
| ids.openalex | https://openalex.org/W4399401446 |
| fwci | |
| type | preprint |
| title | MixMax: Distributional Robustness in Function Space via Optimal Data Mixtures |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11901 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9670000076293945 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Bayesian Methods and Mixture Models |
| topics[1].id | https://openalex.org/T11443 |
| topics[1].field.id | https://openalex.org/fields/18 |
| topics[1].field.display_name | Decision Sciences |
| topics[1].score | 0.9556000232696533 |
| topics[1].domain.id | https://openalex.org/domains/2 |
| topics[1].domain.display_name | Social Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1804 |
| topics[1].subfield.display_name | Statistics, Probability and Uncertainty |
| topics[1].display_name | Advanced Statistical Process Monitoring |
| topics[2].id | https://openalex.org/T10136 |
| topics[2].field.id | https://openalex.org/fields/26 |
| topics[2].field.display_name | Mathematics |
| topics[2].score | 0.942799985408783 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/2613 |
| topics[2].subfield.display_name | Statistics and Probability |
| topics[2].display_name | Statistical Methods and Inference |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C2776330181 |
| concepts[0].level | 2 |
| concepts[0].score | 0.6045883893966675 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q18358244 |
| concepts[0].display_name | Maximization |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.4731144607067108 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C66690126 |
| concepts[2].level | 3 |
| concepts[2].score | 0.4352640211582184 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q2914302 |
| concepts[2].display_name | Concave function |
| concepts[3].id | https://openalex.org/C126255220 |
| concepts[3].level | 1 |
| concepts[3].score | 0.396038293838501 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q141495 |
| concepts[3].display_name | Mathematical optimization |
| concepts[4].id | https://openalex.org/C33923547 |
| concepts[4].level | 0 |
| concepts[4].score | 0.283882200717926 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[4].display_name | Mathematics |
| concepts[5].id | https://openalex.org/C112680207 |
| concepts[5].level | 2 |
| concepts[5].score | 0.10976770520210266 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q714886 |
| concepts[5].display_name | Regular polygon |
| concepts[6].id | https://openalex.org/C2524010 |
| concepts[6].level | 1 |
| concepts[6].score | 0.08000463247299194 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q8087 |
| concepts[6].display_name | Geometry |
| keywords[0].id | https://openalex.org/keywords/maximization |
| keywords[0].score | 0.6045883893966675 |
| keywords[0].display_name | Maximization |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.4731144607067108 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/concave-function |
| keywords[2].score | 0.4352640211582184 |
| keywords[2].display_name | Concave function |
| keywords[3].id | https://openalex.org/keywords/mathematical-optimization |
| keywords[3].score | 0.396038293838501 |
| keywords[3].display_name | Mathematical optimization |
| keywords[4].id | https://openalex.org/keywords/mathematics |
| keywords[4].score | 0.283882200717926 |
| keywords[4].display_name | Mathematics |
| keywords[5].id | https://openalex.org/keywords/regular-polygon |
| keywords[5].score | 0.10976770520210266 |
| keywords[5].display_name | Regular polygon |
| keywords[6].id | https://openalex.org/keywords/geometry |
| keywords[6].score | 0.08000463247299194 |
| keywords[6].display_name | Geometry |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2406.01477 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2406.01477 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2406.01477 |
| locations[1].id | doi:10.48550/arxiv.2406.01477 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2406.01477 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5099038757 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Anvith Thudi |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Thudi, Anvith |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5054711904 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Chris J. Maddison |
| authorships[1].author_position | last |
| authorships[1].raw_author_name | Maddison, Chris J. |
| authorships[1].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2406.01477 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2024-06-07T00:00:00 |
| display_name | MixMax: Distributional Robustness in Function Space via Optimal Data Mixtures |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11901 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9670000076293945 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Bayesian Methods and Mixture Models |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2748952813, https://openalex.org/W2986946279, https://openalex.org/W2390279801, https://openalex.org/W2959833232, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W4319661813 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2406.01477 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2406.01477 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2406.01477 |
| primary_location.id | pmh:oai:arXiv.org:2406.01477 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2406.01477 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2406.01477 |
| publication_date | 2024-06-03 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 15, 23, 62, 81, 98, 120, 131, 144, 150 |
| abstract_inverted_index.In | 160 |
| abstract_inverted_index.We | 155 |
| abstract_inverted_index.as | 14 |
| abstract_inverted_index.be | 141 |
| abstract_inverted_index.by | 143, 149 |
| abstract_inverted_index.in | 50, 80, 177 |
| abstract_inverted_index.is | 22, 31, 48, 57, 116, 128 |
| abstract_inverted_index.of | 17, 34, 83, 94, 119, 134, 183, 193 |
| abstract_inverted_index.or | 53, 169 |
| abstract_inverted_index.to | 6, 26, 65, 75 |
| abstract_inverted_index.we | 60, 68, 86, 108, 164 |
| abstract_inverted_index.DRO | 71, 90, 139, 174 |
| abstract_inverted_index.and | 30, 104, 176, 196 |
| abstract_inverted_index.are | 3 |
| abstract_inverted_index.can | 140 |
| abstract_inverted_index.for | 102, 191 |
| abstract_inverted_index.one | 127 |
| abstract_inverted_index.our | 157, 161 |
| abstract_inverted_index.set | 16 |
| abstract_inverted_index.the | 32, 46, 51, 54, 92, 111, 117, 171, 181, 186, 194 |
| abstract_inverted_index.call | 156 |
| abstract_inverted_index.data | 189 |
| abstract_inverted_index.from | 72 |
| abstract_inverted_index.loss | 47 |
| abstract_inverted_index.make | 61 |
| abstract_inverted_index.mean | 105 |
| abstract_inverted_index.move | 64 |
| abstract_inverted_index.only | 187 |
| abstract_inverted_index.over | 91, 185 |
| abstract_inverted_index.risk | 152 |
| abstract_inverted_index.show | 87, 109 |
| abstract_inverted_index.such | 13 |
| abstract_inverted_index.that | 88, 110, 166 |
| abstract_inverted_index.this | 28 |
| abstract_inverted_index.user | 18 |
| abstract_inverted_index.well | 8 |
| abstract_inverted_index.when | 45 |
| abstract_inverted_index.with | 130 |
| abstract_inverted_index.DRO). | 40 |
| abstract_inverted_index.Here, | 59 |
| abstract_inverted_index.Thus, | 125 |
| abstract_inverted_index.class | 56, 133 |
| abstract_inverted_index.found | 165 |
| abstract_inverted_index.group | 35, 70, 89, 138, 173 |
| abstract_inverted_index.model | 55, 132 |
| abstract_inverted_index.often | 4 |
| abstract_inverted_index.space | 74, 93 |
| abstract_inverted_index.these | 42 |
| abstract_inverted_index.this: | 67 |
| abstract_inverted_index.which | 78 |
| abstract_inverted_index.(group | 39 |
| abstract_inverted_index.CelebA | 197 |
| abstract_inverted_index.First, | 85 |
| abstract_inverted_index.MixMax | 167, 179 |
| abstract_inverted_index.across | 9 |
| abstract_inverted_index.admits | 97 |
| abstract_inverted_index.common | 24 |
| abstract_inverted_index.convex | 122, 145 |
| abstract_inverted_index.error, | 107 |
| abstract_inverted_index.experi | 162 |
| abstract_inverted_index.ments, | 163 |
| abstract_inverted_index.method | 158 |
| abstract_inverted_index.metric | 25 |
| abstract_inverted_index.models | 2 |
| abstract_inverted_index.number | 82 |
| abstract_inverted_index.robust | 37 |
| abstract_inverted_index.simple | 121 |
| abstract_inverted_index.solved | 142 |
| abstract_inverted_index.space, | 77 |
| abstract_inverted_index.Machine | 0 |
| abstract_inverted_index.MixMax. | 159 |
| abstract_inverted_index.Second, | 101 |
| abstract_inverted_index.XGBoost | 184 |
| abstract_inverted_index.address | 66 |
| abstract_inverted_index.bounded | 95 |
| abstract_inverted_index.capture | 27 |
| abstract_inverted_index.groups. | 19 |
| abstract_inverted_index.matched | 168 |
| abstract_inverted_index.methods | 43 |
| abstract_inverted_index.minimax | 99, 112 |
| abstract_inverted_index.mixture | 114 |
| abstract_inverted_index.optimal | 113 |
| abstract_inverted_index.perform | 7 |
| abstract_inverted_index.problem | 147 |
| abstract_inverted_index.results | 79 |
| abstract_inverted_index.several | 10 |
| abstract_inverted_index.squared | 106 |
| abstract_inverted_index.working | 129 |
| abstract_inverted_index.followed | 148 |
| abstract_inverted_index.function | 76, 136 |
| abstract_inverted_index.improved | 180 |
| abstract_inverted_index.learning | 1 |
| abstract_inverted_index.problem. | 124, 154 |
| abstract_inverted_index.provided | 126 |
| abstract_inverted_index.required | 5 |
| abstract_inverted_index.solution | 118 |
| abstract_inverted_index.standard | 172 |
| abstract_inverted_index.struggle | 44 |
| abstract_inverted_index.theorem. | 100 |
| abstract_inverted_index.ACSIncome | 195 |
| abstract_inverted_index.baseline, | 188 |
| abstract_inverted_index.classical | 63, 151 |
| abstract_inverted_index.datasets. | 199 |
| abstract_inverted_index.functions | 96 |
| abstract_inverted_index.objective | 33 |
| abstract_inverted_index.parameter | 73 |
| abstract_inverted_index.settings, | 12 |
| abstract_inverted_index.universal | 135 |
| abstract_inverted_index.Worst-case | 20 |
| abstract_inverted_index.balancing, | 190 |
| abstract_inverted_index.baselines, | 175 |
| abstract_inverted_index.non-convex | 49 |
| abstract_inverted_index.variations | 192 |
| abstract_inverted_index.advantages. | 84 |
| abstract_inverted_index.annotations | 198 |
| abstract_inverted_index.parameters, | 52 |
| abstract_inverted_index.particular, | 178 |
| abstract_inverted_index.performance | 21, 182 |
| abstract_inverted_index.pre-defined | 11 |
| abstract_inverted_index.distribution | 115 |
| abstract_inverted_index.minimization | 153 |
| abstract_inverted_index.optimization | 38, 123, 146 |
| abstract_inverted_index.outperformed | 170 |
| abstract_inverted_index.requirement, | 29 |
| abstract_inverted_index.cross-entropy | 103 |
| abstract_inverted_index.Unfortunately, | 41 |
| abstract_inverted_index.approximators, | 137 |
| abstract_inverted_index.reparameterize | 69 |
| abstract_inverted_index.non-parametric. | 58 |
| abstract_inverted_index.distributionally | 36 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 2 |
| citation_normalized_percentile | |
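
The `abstract_inverted_index` rows above store the abstract as a map from each word to the positions where it occurs. In the OpenAlex JSON this field is an object whose values are lists of integers; the table flattens those lists into comma-separated strings. A minimal sketch of how the plain-text abstract could be rebuilt from such a map follows. The function name `abstract_from_inverted_index` is hypothetical, and the sample dictionary is only an excerpt covering positions 0 to 8 (the full record lists further positions for `to`).

```python
def abstract_from_inverted_index(inverted_index):
    """Rebuild text from a {word: [positions]} map by sorting on position."""
    positions = {}
    for word, idxs in inverted_index.items():
        for i in idxs:
            positions[i] = word
    return " ".join(positions[i] for i in sorted(positions))

# Excerpt of the record above, truncated to the first nine word positions.
sample = {
    "Machine": [0],
    "learning": [1],
    "models": [2],
    "are": [3],
    "often": [4],
    "required": [5],
    "to": [6],
    "perform": [7],
    "well": [8],
}

opening = abstract_from_inverted_index(sample)
print(opening)
```

Because tokens are stored exactly as they were split in the source metadata, the index keeps `experi` and `ments,` as separate entries at positions 162 and 163, so a reconstruction from the full index reproduces that split.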