Optimal Program Synthesis Over Noisy Data
We explore and formalize the task of synthesizing programs over noisy data, i.e., data that may contain corrupted input-output examples. By formalizing the concept of a Noise Source, an Input Source, and a prior distribution over programs, we formalize the probabilistic process which constructs a noisy dataset. This formalism allows us to define the correctness of a synthesis algorithm, in terms of its ability to synthesize the hidden underlying program. The probability of a synthesis algorithm being correct depends upon the match between the Noise Source and the Loss Function used in the synthesis algorithm's optimization process. We formalize the concept of an optimal Loss Function given prior information about the Noise Source. We provide a technique to design optimal Loss Functions given perfect and imperfect information about the Noise Sources. We also formalize the concept and conditions required for convergence, i.e., conditions under which the probability that the synthesis algorithm produces a correct program increases as the size of the noisy data set increases. This paper presents the first formalization of the concept of optimal Loss Functions, the first closed form definition of optimal Loss Functions, and the first conditions that ensure that a noisy synthesis algorithm will have convergence guarantees.
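The interplay the abstract describes between a prior over programs, a Noise Source, and a Loss Function can be illustrated with a small sketch. Everything here is an assumption made for illustration: the candidate programs, their prior weights, the uniform-corruption noise model, and the names `CANDIDATES`, `nll_loss`, and `synthesize` are hypothetical. This record does not reproduce the paper's closed-form definition, so the loss below is simply the negative log-likelihood of the assumed noise model, which turns synthesis into a maximum a posteriori choice over the candidates.

```python
import math
from typing import Callable, Dict, List, Tuple

Program = Callable[[int], int]
Example = Tuple[int, int]  # (input, possibly corrupted output)

# Hypothetical candidate programs, each paired with an assumed prior probability.
CANDIDATES: Dict[str, Tuple[Program, float]] = {
    "x + 1": (lambda x: x + 1, 0.4),
    "2 * x": (lambda x: 2 * x, 0.4),
    "x * x": (lambda x: x * x, 0.2),
}


def nll_loss(predicted: int, observed: int, flip_prob: float, n_outputs: int) -> float:
    """Negative log-likelihood of `observed` given the true output `predicted`.

    Assumed noise source: with probability `flip_prob` the output is replaced
    by one of the other `n_outputs - 1` values, chosen uniformly at random.
    """
    if predicted == observed:
        return -math.log(1.0 - flip_prob)
    return -math.log(flip_prob / (n_outputs - 1))


def synthesize(data: List[Example], flip_prob: float = 0.2, n_outputs: int = 100) -> str:
    """Return the candidate minimizing -log prior + summed per-example loss."""
    best_name, best_score = "", math.inf
    for name, (program, prior) in CANDIDATES.items():
        score = -math.log(prior) + sum(
            nll_loss(program(x), y, flip_prob, n_outputs) for x, y in data
        )
        if score < best_score:
            best_name, best_score = name, score
    return best_name


# Four clean examples of the hidden program 2 * x and one corrupted example (4, 9).
noisy_data = [(1, 2), (2, 4), (3, 6), (4, 9), (5, 10)]
print(synthesize(noisy_data))
```

In this sketch the loss matches the assumed noise model exactly. A loss that charged a different penalty per mismatch would weigh the corrupted example differently, which is the kind of mismatch between Noise Source and Loss Function that the abstract says affects the probability of recovering the hidden program.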
Related Topics
- Type: preprint
- Language: en
- Landing Page: https://arxiv.org/abs/2103.05030v2
- OA Status: green
- References: 14
- Related Works: 20
- OpenAlex ID: https://openalex.org/W3134033197
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W3134033197 (canonical identifier for this work in OpenAlex)
- Title: Optimal Program Synthesis Over Noisy Data (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2021 (year of publication)
- Publication date: 2021-03-08 (full publication date if available)
- Authors: Shivam Handa, Martin Rinard (list of authors in order)
- Landing page: https://arxiv.org/abs/2103.05030v2 (publisher landing page)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/abs/2103.05030v2 (direct OA link when available)
- Concepts: Correctness, Computer science, Probabilistic logic, Algorithm, Noise (video), Convergence (economics), Synthetic data, Formalism (music), Function (biology), Mathematical optimization, Artificial intelligence, Mathematics, Image (mathematics), Visual arts, Musical, Economics, Economic growth, Evolutionary biology, Biology, Art (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- References (count): 14 (number of works referenced by this work)
- Related works (count): 20 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W3134033197 |
|---|---|
| doi | |
| ids.mag | 3134033197 |
| ids.openalex | https://openalex.org/W3134033197 |
| fwci | |
| type | preprint |
| title | Optimal Program Synthesis Over Noisy Data |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T12535 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9987000226974487 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Machine Learning and Data Classification |
| topics[1].id | https://openalex.org/T12423 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9929999709129333 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1712 |
| topics[1].subfield.display_name | Software |
| topics[1].display_name | Software Reliability and Analysis Research |
| topics[2].id | https://openalex.org/T12072 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9914000034332275 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Machine Learning and Algorithms |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C55439883 |
| concepts[0].level | 2 |
| concepts[0].score | 0.8282498121261597 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q360812 |
| concepts[0].display_name | Correctness |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.7589360475540161 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C49937458 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6004832983016968 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q2599292 |
| concepts[2].display_name | Probabilistic logic |
| concepts[3].id | https://openalex.org/C11413529 |
| concepts[3].level | 1 |
| concepts[3].score | 0.5559850335121155 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q8366 |
| concepts[3].display_name | Algorithm |
| concepts[4].id | https://openalex.org/C99498987 |
| concepts[4].level | 3 |
| concepts[4].score | 0.5088839530944824 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q2210247 |
| concepts[4].display_name | Noise (video) |
| concepts[5].id | https://openalex.org/C2777303404 |
| concepts[5].level | 2 |
| concepts[5].score | 0.469525009393692 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q759757 |
| concepts[5].display_name | Convergence (economics) |
| concepts[6].id | https://openalex.org/C160920958 |
| concepts[6].level | 2 |
| concepts[6].score | 0.4556272029876709 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q7662746 |
| concepts[6].display_name | Synthetic data |
| concepts[7].id | https://openalex.org/C73301696 |
| concepts[7].level | 3 |
| concepts[7].score | 0.4553840458393097 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q5469984 |
| concepts[7].display_name | Formalism (music) |
| concepts[8].id | https://openalex.org/C14036430 |
| concepts[8].level | 2 |
| concepts[8].score | 0.4110203683376312 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q3736076 |
| concepts[8].display_name | Function (biology) |
| concepts[9].id | https://openalex.org/C126255220 |
| concepts[9].level | 1 |
| concepts[9].score | 0.3513765037059784 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q141495 |
| concepts[9].display_name | Mathematical optimization |
| concepts[10].id | https://openalex.org/C154945302 |
| concepts[10].level | 1 |
| concepts[10].score | 0.19331717491149902 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[10].display_name | Artificial intelligence |
| concepts[11].id | https://openalex.org/C33923547 |
| concepts[11].level | 0 |
| concepts[11].score | 0.16347619891166687 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[11].display_name | Mathematics |
| concepts[12].id | https://openalex.org/C115961682 |
| concepts[12].level | 2 |
| concepts[12].score | 0.0 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q860623 |
| concepts[12].display_name | Image (mathematics) |
| concepts[13].id | https://openalex.org/C153349607 |
| concepts[13].level | 1 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q36649 |
| concepts[13].display_name | Visual arts |
| concepts[14].id | https://openalex.org/C558565934 |
| concepts[14].level | 2 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q2743 |
| concepts[14].display_name | Musical |
| concepts[15].id | https://openalex.org/C162324750 |
| concepts[15].level | 0 |
| concepts[15].score | 0.0 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q8134 |
| concepts[15].display_name | Economics |
| concepts[16].id | https://openalex.org/C50522688 |
| concepts[16].level | 1 |
| concepts[16].score | 0.0 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q189833 |
| concepts[16].display_name | Economic growth |
| concepts[17].id | https://openalex.org/C78458016 |
| concepts[17].level | 1 |
| concepts[17].score | 0.0 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q840400 |
| concepts[17].display_name | Evolutionary biology |
| concepts[18].id | https://openalex.org/C86803240 |
| concepts[18].level | 0 |
| concepts[18].score | 0.0 |
| concepts[18].wikidata | https://www.wikidata.org/wiki/Q420 |
| concepts[18].display_name | Biology |
| concepts[19].id | https://openalex.org/C142362112 |
| concepts[19].level | 0 |
| concepts[19].score | 0.0 |
| concepts[19].wikidata | https://www.wikidata.org/wiki/Q735 |
| concepts[19].display_name | Art |
| keywords[0].id | https://openalex.org/keywords/correctness |
| keywords[0].score | 0.8282498121261597 |
| keywords[0].display_name | Correctness |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.7589360475540161 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/probabilistic-logic |
| keywords[2].score | 0.6004832983016968 |
| keywords[2].display_name | Probabilistic logic |
| keywords[3].id | https://openalex.org/keywords/algorithm |
| keywords[3].score | 0.5559850335121155 |
| keywords[3].display_name | Algorithm |
| keywords[4].id | https://openalex.org/keywords/noise |
| keywords[4].score | 0.5088839530944824 |
| keywords[4].display_name | Noise (video) |
| keywords[5].id | https://openalex.org/keywords/convergence |
| keywords[5].score | 0.469525009393692 |
| keywords[5].display_name | Convergence (economics) |
| keywords[6].id | https://openalex.org/keywords/synthetic-data |
| keywords[6].score | 0.4556272029876709 |
| keywords[6].display_name | Synthetic data |
| keywords[7].id | https://openalex.org/keywords/formalism |
| keywords[7].score | 0.4553840458393097 |
| keywords[7].display_name | Formalism (music) |
| keywords[8].id | https://openalex.org/keywords/function |
| keywords[8].score | 0.4110203683376312 |
| keywords[8].display_name | Function (biology) |
| keywords[9].id | https://openalex.org/keywords/mathematical-optimization |
| keywords[9].score | 0.3513765037059784 |
| keywords[9].display_name | Mathematical optimization |
| keywords[10].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[10].score | 0.19331717491149902 |
| keywords[10].display_name | Artificial intelligence |
| keywords[11].id | https://openalex.org/keywords/mathematics |
| keywords[11].score | 0.16347619891166687 |
| keywords[11].display_name | Mathematics |
| language | en |
| locations[0].id | mag:3134033197 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | |
| locations[0].version | submittedVersion |
| locations[0].raw_type | |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | arXiv (Cornell University) |
| locations[0].landing_page_url | https://arxiv.org/abs/2103.05030v2 |
| authorships[0].author.id | https://openalex.org/A5087871590 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Shivam Handa |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Shivam Handa |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5045127387 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-8095-8523 |
| authorships[1].author.display_name | Martin Rinard |
| authorships[1].author_position | last |
| authorships[1].raw_author_name | Martin Rinard |
| authorships[1].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/abs/2103.05030v2 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Optimal Program Synthesis Over Noisy Data |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-10-10T17:16:08.811792 |
| primary_topic.id | https://openalex.org/T12535 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9987000226974487 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Machine Learning and Data Classification |
| related_works | https://openalex.org/W3137184948, https://openalex.org/W2999308398, https://openalex.org/W3010497646, https://openalex.org/W2988651262, https://openalex.org/W2725485065, https://openalex.org/W1596179888, https://openalex.org/W3189821996, https://openalex.org/W2950815864, https://openalex.org/W2555732556, https://openalex.org/W3179926615, https://openalex.org/W3154464254, https://openalex.org/W2788128461, https://openalex.org/W2183514596, https://openalex.org/W3192861658, https://openalex.org/W2939976446, https://openalex.org/W2907033687, https://openalex.org/W2963834707, https://openalex.org/W2751302235, https://openalex.org/W2781896072, https://openalex.org/W2911422441 |
| cited_by_count | 0 |
| locations_count | 1 |
| best_oa_location.id | mag:3134033197 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | arXiv (Cornell University) |
| best_oa_location.landing_page_url | https://arxiv.org/abs/2103.05030v2 |
| primary_location.id | mag:3134033197 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | |
| primary_location.version | submittedVersion |
| primary_location.raw_type | |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | arXiv (Cornell University) |
| primary_location.landing_page_url | https://arxiv.org/abs/2103.05030v2 |
| publication_date | 2021-03-08 |
| publication_year | 2021 |
| referenced_works | https://openalex.org/W2416325154, https://openalex.org/W2093535699, https://openalex.org/W3109609376, https://openalex.org/W2066792529, https://openalex.org/W2112660220, https://openalex.org/W2293101314, https://openalex.org/W2013596093, https://openalex.org/W2550471858, https://openalex.org/W2963697299, https://openalex.org/W2045656233, https://openalex.org/W2238673293, https://openalex.org/W3105239039, https://openalex.org/W1510293570, https://openalex.org/W2793194499 |
| referenced_works_count | 14 |
| abstract_inverted_index.a | 25, 32, 44, 56, 73, 115, 152, 194 |
| abstract_inverted_index.By | 20 |
| abstract_inverted_index.We | 0, 97, 113, 131 |
| abstract_inverted_index.an | 28, 102 |
| abstract_inverted_index.as | 156 |
| abstract_inverted_index.in | 59, 91 |
| abstract_inverted_index.of | 6, 24, 55, 61, 72, 101, 159, 171, 174, 183 |
| abstract_inverted_index.to | 51, 64, 117 |
| abstract_inverted_index.us | 50 |
| abstract_inverted_index.we | 37 |
| abstract_inverted_index.The | 70 |
| abstract_inverted_index.and | 2, 31, 86, 124, 136, 187 |
| abstract_inverted_index.for | 139 |
| abstract_inverted_index.its | 62 |
| abstract_inverted_index.may | 15 |
| abstract_inverted_index.set | 163 |
| abstract_inverted_index.the | 4, 22, 39, 53, 66, 80, 83, 87, 92, 99, 110, 128, 134, 145, 148, 157, 160, 168, 172, 178, 188 |
| abstract_inverted_index.Loss | 88, 104, 120, 176, 185 |
| abstract_inverted_index.This | 47, 165 |
| abstract_inverted_index.also | 132 |
| abstract_inverted_index.data | 13, 162 |
| abstract_inverted_index.form | 181 |
| abstract_inverted_index.have | 199 |
| abstract_inverted_index.over | 9, 35 |
| abstract_inverted_index.size | 158 |
| abstract_inverted_index.task | 5 |
| abstract_inverted_index.that | 14, 147, 191, 193 |
| abstract_inverted_index.upon | 79 |
| abstract_inverted_index.used | 90 |
| abstract_inverted_index.will | 198 |
| abstract_inverted_index.Input | 29 |
| abstract_inverted_index.Noise | 26, 84, 111, 129 |
| abstract_inverted_index.about | 109, 127 |
| abstract_inverted_index.being | 76 |
| abstract_inverted_index.data, | 11 |
| abstract_inverted_index.first | 169, 179, 189 |
| abstract_inverted_index.given | 106, 122 |
| abstract_inverted_index.i.e., | 12, 141 |
| abstract_inverted_index.match | 81 |
| abstract_inverted_index.noisy | 10, 45, 161, 195 |
| abstract_inverted_index.paper | 166 |
| abstract_inverted_index.prior | 33, 107 |
| abstract_inverted_index.terms | 60 |
| abstract_inverted_index.under | 143 |
| abstract_inverted_index.which | 42, 144 |
| abstract_inverted_index.Source | 85 |
| abstract_inverted_index.allows | 49 |
| abstract_inverted_index.closed | 180 |
| abstract_inverted_index.define | 52 |
| abstract_inverted_index.design | 118 |
| abstract_inverted_index.ensure | 192 |
| abstract_inverted_index.hidden | 67 |
| abstract_inverted_index.Source, | 27, 30 |
| abstract_inverted_index.Source. | 112 |
| abstract_inverted_index.ability | 63 |
| abstract_inverted_index.between | 82 |
| abstract_inverted_index.concept | 23, 100, 135, 173 |
| abstract_inverted_index.contain | 16 |
| abstract_inverted_index.correct | 77, 153 |
| abstract_inverted_index.depends | 78 |
| abstract_inverted_index.explore | 1 |
| abstract_inverted_index.optimal | 103, 119, 175, 184 |
| abstract_inverted_index.perfect | 123 |
| abstract_inverted_index.process | 41 |
| abstract_inverted_index.program | 154 |
| abstract_inverted_index.provide | 114 |
| abstract_inverted_index.Function | 89, 105 |
| abstract_inverted_index.Sources. | 130 |
| abstract_inverted_index.dataset. | 46 |
| abstract_inverted_index.presents | 167 |
| abstract_inverted_index.process. | 96 |
| abstract_inverted_index.produces | 151 |
| abstract_inverted_index.program. | 69 |
| abstract_inverted_index.programs | 8 |
| abstract_inverted_index.required | 138 |
| abstract_inverted_index.Functions | 121 |
| abstract_inverted_index.algorithm | 75, 150, 197 |
| abstract_inverted_index.corrupted | 17 |
| abstract_inverted_index.examples. | 19 |
| abstract_inverted_index.formalism | 48 |
| abstract_inverted_index.formalize | 3, 38, 98, 133 |
| abstract_inverted_index.imperfect | 125 |
| abstract_inverted_index.increases | 155 |
| abstract_inverted_index.programs, | 36 |
| abstract_inverted_index.synthesis | 57, 74, 93, 149, 196 |
| abstract_inverted_index.technique | 116 |
| abstract_inverted_index.Functions, | 177, 186 |
| abstract_inverted_index.algorithm, | 58 |
| abstract_inverted_index.conditions | 137, 142, 190 |
| abstract_inverted_index.constructs | 43 |
| abstract_inverted_index.definition | 182 |
| abstract_inverted_index.increases. | 164 |
| abstract_inverted_index.synthesize | 65 |
| abstract_inverted_index.underlying | 68 |
| abstract_inverted_index.algorithm's | 94 |
| abstract_inverted_index.convergence | 200 |
| abstract_inverted_index.correctness | 54 |
| abstract_inverted_index.formalizing | 21 |
| abstract_inverted_index.guarantees. | 201 |
| abstract_inverted_index.information | 108, 126 |
| abstract_inverted_index.probability | 71, 146 |
| abstract_inverted_index.convergence, | 140 |
| abstract_inverted_index.distribution | 34 |
| abstract_inverted_index.input-output | 18 |
| abstract_inverted_index.optimization | 95 |
| abstract_inverted_index.synthesizing | 7 |
| abstract_inverted_index.formalization | 170 |
| abstract_inverted_index.probabilistic | 40 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 2 |
| citation_normalized_percentile |
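
The `abstract_inverted_index.*` rows in the payload above store the abstract as a map from each word to the positions where it occurs. A minimal sketch of turning such a map back into running text follows. The function name `decode_inverted_index` is hypothetical, and the sample dictionary is only a fragment, restricted to positions 0 through 6 of the rows above.

```python
from typing import Dict, List


def decode_inverted_index(index: Dict[str, List[int]]) -> str:
    """Rebuild text from a word -> positions map by sorting on position."""
    words_by_position: Dict[int, str] = {}
    for word, positions in index.items():
        for position in positions:
            words_by_position[position] = word
    return " ".join(words_by_position[p] for p in sorted(words_by_position))


# Fragment of the payload: only the occurrences at positions 0..6 are included.
sample = {
    "We": [0],
    "explore": [1],
    "and": [2],
    "formalize": [3],
    "the": [4],
    "task": [5],
    "of": [6],
}
print(decode_inverted_index(sample))  # We explore and formalize the task of
```

Applied to the full set of rows, the same procedure should reproduce the abstract text shown at the top of this record.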