Towards Mitigating Systematics in Large-Scale Surveys via Few-Shot Optimal Transport-Based Feature Alignment
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2511.11787
Systematics contaminate observables, leading to distribution shifts relative to theoretically simulated signals and posing a major challenge for using pre-trained models to label such observables. Since systematics are often poorly understood and difficult to model, removing them directly and entirely may not be feasible. To address this challenge, we propose a novel method that aligns learned features between in-distribution (ID) and out-of-distribution (OOD) samples by optimizing a feature-alignment loss on the representations extracted from a pre-trained ID model. We first validate the method experimentally on the MNIST dataset using several candidate alignment losses, including mean squared error and optimal transport, and subsequently apply it to large-scale maps of neutral hydrogen. Our results show that optimal transport is particularly effective at aligning OOD features when parity between ID and OOD samples is unknown, even with limited data, mimicking real-world conditions in extracting information from large-scale surveys. Our code is available at https://github.com/sultan-hassan/feature-alignment-for-OOD-generalization.
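The contrast the abstract draws between a mean squared error loss and an optimal transport loss can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that). It assumes uniform sample weights, a squared Euclidean ground cost, and an entropy-regularised (Sinkhorn) approximation of the transport cost. The function names, the `eps` and `n_iters` parameters, and the toy feature vectors are all hypothetical, and the code uses only the standard library so that it runs without extra dependencies.

```python
import math

def mse_alignment(id_feats, ood_feats):
    """Paired MSE: assumes row i of both sets comes from the same underlying sample."""
    n, d = len(id_feats), len(id_feats[0])
    total = sum((a - b) ** 2
                for x, y in zip(id_feats, ood_feats)
                for a, b in zip(x, y))
    return total / (n * d)

def sinkhorn_alignment(id_feats, ood_feats, eps=0.1, n_iters=200):
    """Entropy-regularised OT cost between two uniformly weighted feature sets.

    No pairing between the sets is assumed. eps is the (assumed) regularisation
    strength; very small eps relative to the costs can underflow in this naive form.
    """
    n, m = len(id_feats), len(ood_feats)
    # Squared Euclidean ground cost between every ID and every OOD feature vector.
    cost = [[sum((a - b) ** 2 for a, b in zip(x, y)) for y in ood_feats]
            for x in id_feats]
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternate scaling so the plan's row sums approach 1/n and column sums 1/m.
        u = [(1.0 / n) / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [(1.0 / m) / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P_ij = u_i * K_ij * v_j; the loss is the cost it incurs.
    return sum(u[i] * kernel[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(m))

# Toy features: the OOD set is the ID set shifted by 0.1 in each dimension.
id_feats = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
ood_paired = [[0.1, 0.1], [1.1, 0.1], [0.1, 1.1]]
# Same OOD features, but the correspondence to ID samples has been lost.
ood_shuffled = [ood_paired[2], ood_paired[0], ood_paired[1]]

print("MSE, paired:   ", mse_alignment(id_feats, ood_paired))
print("MSE, shuffled: ", mse_alignment(id_feats, ood_shuffled))
print("OT,  paired:   ", sinkhorn_alignment(id_feats, ood_paired))
print("OT,  shuffled: ", sinkhorn_alignment(id_feats, ood_shuffled))
```

In this toy setting the paired MSE grows sharply once the sample correspondence is shuffled, while the transport cost is unchanged up to floating-point rounding because it depends only on the two sets of features. In an actual training loop the loss would be computed on features from the pre-trained ID model inside an automatic-differentiation framework, so that its gradient can drive the alignment.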
- Type: preprint
- Landing Page: http://arxiv.org/abs/2511.11787, https://arxiv.org/pdf/2511.11787
- OA Status: green
- OpenAlex ID: https://openalex.org/W4416350698
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4416350698 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2511.11787 (Digital Object Identifier)
- Title: Towards Mitigating Systematics in Large-Scale Surveys via Few-Shot Optimal Transport-Based Feature Alignment (work title)
- Type: preprint (OpenAlex work type)
- Publication year: 2025 (year of publication)
- Publication date: 2025-11-14 (full publication date if available)
- Authors: Sambatra Andrianomena, B. D. Wandelt (list of authors in order)
- Landing page: https://arxiv.org/abs/2511.11787 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2511.11787 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2511.11787 (direct OA link when available)
- Cited by: 0 (total citation count in OpenAlex)
Full payload
| id | https://openalex.org/W4416350698 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2511.11787 |
| ids.doi | https://doi.org/10.48550/arxiv.2511.11787 |
| ids.openalex | https://openalex.org/W4416350698 |
| fwci | |
| type | preprint |
| title | Towards Mitigating Systematics in Large-Scale Surveys via Few-Shot Optimal Transport-Based Feature Alignment |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | |
| locations[0].id | pmh:oai:arXiv.org:2511.11787 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2511.11787 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2511.11787 |
| locations[1].id | doi:10.48550/arxiv.2511.11787 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2511.11787 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5038445526 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-5957-0719 |
| authorships[0].author.display_name | Sambatra Andrianomena |
| authorships[0].author_position | last |
| authorships[0].raw_author_name | Andrianomena, Sambatra |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5050309898 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-5854-8269 |
| authorships[1].author.display_name | B. D. Wandelt |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Wandelt, Benjamin D. |
| authorships[1].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2511.11787 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-11-19T00:00:00 |
| display_name | Towards Mitigating Systematics in Large-Scale Surveys via Few-Shot Optimal Transport-Based Feature Alignment |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-28T11:55:03.066091 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2511.11787 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2511.11787 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2511.11787 |
| primary_location.id | pmh:oai:arXiv.org:2511.11787 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2511.11787 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2511.11787 |
| publication_date | 2025-11-14 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 2 |
| citation_normalized_percentile |