MixMax: Distributional Robustness in Function Space via Optimal Data Mixtures

arXiv (Cornell University) • June 2024 • Anvith Thudi, Chris J. Maddison

Machine learning models are often required to perform well across several pre-defined settings, such as a set of user groups. Worst-case performance is a common metric to capture this requirement, and is the objective of group distributionally robust optimization (group DRO). Unfortunately, these methods struggle when the loss is non-convex in the parameters, or the model class is non-parametric. Here, we make a classical move to address this: we reparameterize group DRO from parameter space to function space, whi…
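To make the worst-case metric in the abstract concrete, here is a minimal sketch in Python. It is not code from the paper: the function name `worst_group_loss`, the group labels, and the loss values are all hypothetical, chosen only to show how a worst-group loss differs from a plain average over the pooled data.

```python
from collections import defaultdict


def worst_group_loss(losses, groups):
    """Return (worst-case mean loss, per-group mean losses).

    Illustrative helper only (hypothetical name, not from the paper).
    `losses` holds one loss value per example; `groups` holds the
    group label of each example, in the same order.
    """
    by_group = defaultdict(list)
    for loss, group in zip(losses, groups):
        by_group[group].append(loss)
    # Mean loss inside each group.
    per_group = {g: sum(v) / len(v) for g, v in by_group.items()}
    # The worst-case metric is the largest of the per-group means.
    return max(per_group.values()), per_group


# Made-up per-example losses: group 0 is well served, group 1 is not.
losses = [0.1, 0.2, 0.3, 0.9, 1.1]
groups = [0, 0, 0, 1, 1]

worst, per_group = worst_group_loss(losses, groups)
average = sum(losses) / len(losses)

print("per-group mean loss:", per_group)
print("average loss over all examples:", average)
print("worst-group loss:", worst)
```

In this toy data the pooled average hides how poorly the second group is served, and the worst-group loss exposes it. Under this reading, group DRO would choose the model that minimizes the worst-group quantity rather than the average.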

Topics: Expectation–Maximization Algorithm, Computer Science, Mathematics, Geometry