Symbolic-Diffusion: Deep Learning Based Symbolic Regression with D3PM Discrete Token Diffusion
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2510.07570
Symbolic regression refers to the task of finding a closed-form mathematical expression that fits a set of data points. Genetic-programming techniques are the most common algorithms for this problem, but neural-network-based approaches have recently gained popularity. Most leading neural-network-based models for symbolic regression use transformer-based autoregressive models to generate an equation conditioned on encoded input points. However, autoregressive generation is limited to producing tokens left to right, so each token is conditioned only on the tokens generated before it. Motivated by the desire to generate all tokens simultaneously and thereby produce improved closed-form equations, we propose Symbolic Diffusion, a D3PM-based discrete state-space diffusion model that generates all tokens of the equation at once using discrete token diffusion. Using the bivariate dataset developed for SymbolicGPT, we compared our diffusion-based generation approach to an autoregressive model based on SymbolicGPT, with equivalent encoder and transformer architectures. We demonstrate that diffusion-based generation for symbolic regression can offer comparable and, by some metrics, improved performance over autoregressive generation in models with similar underlying architectures, opening new research opportunities in neural-network-based symbolic regression.
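As a rough illustration of the discrete token diffusion the abstract refers to, the sketch below samples a corrupted token sequence under a uniform D3PM-style transition kernel. The vocabulary `VOCAB`, the noise schedule `betas`, the helper names, and the choice of a uniform kernel (rather than, for example, an absorbing-state one) are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import random

# Hypothetical token vocabulary for illustration only (not the paper's vocabulary).
VOCAB = ["x1", "x2", "+", "*", "sin", "(", ")", "C"]


def alpha_bar(t, betas):
    """Probability that a token is still untouched after t uniform-noise steps."""
    keep = 1.0
    for beta in betas[:t]:
        keep *= 1.0 - beta
    return keep


def corrupt(tokens, t, betas, rng):
    """Sample x_t ~ q(x_t | x_0) under a uniform transition kernel.

    Each position is corrupted independently: with probability alpha_bar(t)
    the original token is kept, otherwise it is resampled uniformly from VOCAB.
    """
    keep = alpha_bar(t, betas)
    return [tok if rng.random() < keep else rng.choice(VOCAB) for tok in tokens]


# Assumed constant schedule and a toy equation "sin(x1) + C * x2" as tokens.
betas = [0.1] * 20
x0 = ["sin", "(", "x1", ")", "+", "C", "*", "x2"]
rng = random.Random(0)
for t in (0, 5, 20):
    print(t, corrupt(x0, t, betas, rng))
```

In a D3PM-style model, a denoising network would be trained to reverse this corruption, predicting every position in parallel at each step instead of emitting one token at a time; in the setting described above, that network would also be conditioned on the encoded data points.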
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2510.07570, https://arxiv.org/pdf/2510.07570
- OA Status: green
- OpenAlex ID: https://openalex.org/W4415318471
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4415318471 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2510.07570 (Digital Object Identifier)
- Title: Symbolic-Diffusion: Deep Learning Based Symbolic Regression with D3PM Discrete Token Diffusion (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-10-08 (full publication date if available)
- Authors: Ryan T. Tymkow, Benjamin Schnapp, Mojtaba Valipour, Ali Ghodshi (list of authors in order)
- Landing page: https://arxiv.org/abs/2510.07570 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2510.07570 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2510.07570 (direct OA link when available)
- Cited by: 0 (total citation count in OpenAlex)
Full payload
| id | https://openalex.org/W4415318471 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2510.07570 |
| ids.doi | https://doi.org/10.48550/arxiv.2510.07570 |
| ids.openalex | https://openalex.org/W4415318471 |
| fwci | |
| type | preprint |
| title | Symbolic-Diffusion: Deep Learning Based Symbolic Regression with D3PM Discrete Token Diffusion |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10775 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.5442000031471252 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Generative Adversarial Networks and Image Synthesis |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2510.07570 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2510.07570 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2510.07570 |
| locations[1].id | doi:10.48550/arxiv.2510.07570 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2510.07570 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5120050563 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Ryan T. Tymkow |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Tymkow, Ryan T. |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5091561751 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-5031-8269 |
| authorships[1].author.display_name | Benjamin Schnapp |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Schnapp, Benjamin D. |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5075681233 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-5877-2869 |
| authorships[2].author.display_name | Mojtaba Valipour |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Valipour, Mojtaba |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5120050564 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Ali Ghodshi |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Ghodshi, Ali |
| authorships[3].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2510.07570 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-18T00:00:00 |
| display_name | Symbolic-Diffusion: Deep Learning Based Symbolic Regression with D3PM Discrete Token Diffusion |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10775 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.5442000031471252 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Generative Adversarial Networks and Image Synthesis |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2510.07570 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2510.07570 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2510.07570 |
| primary_location.id | pmh:oai:arXiv.org:2510.07570 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2510.07570 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2510.07570 |
| publication_date | 2025-10-08 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 8, 14, 103 |
| abstract_inverted_index.We | 150 |
| abstract_inverted_index.an | 58, 138 |
| abstract_inverted_index.at | 118 |
| abstract_inverted_index.by | 86, 167 |
| abstract_inverted_index.in | 175, 185 |
| abstract_inverted_index.is | 68 |
| abstract_inverted_index.of | 6, 16, 42, 115, 156 |
| abstract_inverted_index.on | 61, 81, 142 |
| abstract_inverted_index.to | 3, 12, 29, 56, 70, 89, 94, 137 |
| abstract_inverted_index.we | 99, 131 |
| abstract_inverted_index.all | 91, 113 |
| abstract_inverted_index.and | 74, 147 |
| abstract_inverted_index.are | 23, 78 |
| abstract_inverted_index.but | 33 |
| abstract_inverted_index.can | 163 |
| abstract_inverted_index.fit | 13 |
| abstract_inverted_index.for | 49, 129, 160 |
| abstract_inverted_index.new | 182 |
| abstract_inverted_index.our | 133, 153 |
| abstract_inverted_index.set | 15 |
| abstract_inverted_index.the | 4, 24, 43, 87, 116, 125 |
| abstract_inverted_index.D3PM | 104 |
| abstract_inverted_index.Most | 41 |
| abstract_inverted_index.and, | 166 |
| abstract_inverted_index.data | 17 |
| abstract_inverted_index.have | 38 |
| abstract_inverted_index.most | 25 |
| abstract_inverted_index.once | 119 |
| abstract_inverted_index.only | 80 |
| abstract_inverted_index.over | 172 |
| abstract_inverted_index.some | 168 |
| abstract_inverted_index.task | 5 |
| abstract_inverted_index.that | 152 |
| abstract_inverted_index.this | 31 |
| abstract_inverted_index.used | 28, 48 |
| abstract_inverted_index.Using | 124 |
| abstract_inverted_index.based | 21, 36, 46, 105, 141, 187 |
| abstract_inverted_index.input | 63 |
| abstract_inverted_index.model | 109, 140 |
| abstract_inverted_index.novel | 154 |
| abstract_inverted_index.offer | 164 |
| abstract_inverted_index.token | 122 |
| abstract_inverted_index.using | 120, 144, 157, 177 |
| abstract_inverted_index.which | 110 |
| abstract_inverted_index.common | 26 |
| abstract_inverted_index.desire | 88 |
| abstract_inverted_index.future | 75 |
| abstract_inverted_index.gained | 39 |
| abstract_inverted_index.models | 47, 55, 176 |
| abstract_inverted_index.refers | 2 |
| abstract_inverted_index.tackle | 30 |
| abstract_inverted_index.tokens | 72, 77, 92, 114 |
| abstract_inverted_index.Genetic | 19 |
| abstract_inverted_index.dataset | 127 |
| abstract_inverted_index.encoded | 62 |
| abstract_inverted_index.encoder | 146 |
| abstract_inverted_index.finding | 7 |
| abstract_inverted_index.leading | 44 |
| abstract_inverted_index.limited | 69 |
| abstract_inverted_index.opening | 181 |
| abstract_inverted_index.points. | 18, 64 |
| abstract_inverted_index.produce | 95 |
| abstract_inverted_index.propose | 100 |
| abstract_inverted_index.similar | 178 |
| abstract_inverted_index.tokens. | 84 |
| abstract_inverted_index.utilize | 52 |
| abstract_inverted_index.However, | 65 |
| abstract_inverted_index.Symbolic | 0, 101 |
| abstract_inverted_index.approach | 136, 155 |
| abstract_inverted_index.compared | 132 |
| abstract_inverted_index.discrete | 106, 121 |
| abstract_inverted_index.equation | 59, 117 |
| abstract_inverted_index.generate | 57, 90 |
| abstract_inverted_index.improved | 96, 170 |
| abstract_inverted_index.metrics, | 169 |
| abstract_inverted_index.problem, | 32 |
| abstract_inverted_index.research | 183 |
| abstract_inverted_index.symbolic | 50, 161, 188 |
| abstract_inverted_index.Motivated | 85 |
| abstract_inverted_index.bivariate | 126 |
| abstract_inverted_index.developed | 128 |
| abstract_inverted_index.diffusion | 108 |
| abstract_inverted_index.generated | 76, 83 |
| abstract_inverted_index.generates | 112 |
| abstract_inverted_index.recently, | 34 |
| abstract_inverted_index.Diffusion, | 102 |
| abstract_inverted_index.algorithms | 27 |
| abstract_inverted_index.approaches | 37 |
| abstract_inverted_index.comparable | 165 |
| abstract_inverted_index.diffusion. | 123 |
| abstract_inverted_index.equations, | 98 |
| abstract_inverted_index.equivalent | 145 |
| abstract_inverted_index.expression | 11 |
| abstract_inverted_index.generating | 71 |
| abstract_inverted_index.generation | 67, 135, 159, 174 |
| abstract_inverted_index.previously | 82 |
| abstract_inverted_index.regression | 1, 51, 162 |
| abstract_inverted_index.techniques | 22 |
| abstract_inverted_index.underlying | 179 |
| abstract_inverted_index.closed-form | 9, 97 |
| abstract_inverted_index.conditioned | 60, 79 |
| abstract_inverted_index.demonstrate | 151 |
| abstract_inverted_index.performance | 171 |
| abstract_inverted_index.popularity. | 40 |
| abstract_inverted_index.programming | 20 |
| abstract_inverted_index.regression. | 189 |
| abstract_inverted_index.state-space | 107 |
| abstract_inverted_index.transformer | 148 |
| abstract_inverted_index.SymbolicGPT, | 130, 143 |
| abstract_inverted_index.mathematical | 10 |
| abstract_inverted_index.opportunities | 184 |
| abstract_inverted_index.architectures, | 180 |
| abstract_inverted_index.architectures. | 149 |
| abstract_inverted_index.autoregressive | 54, 66, 139, 173 |
| abstract_inverted_index.left-to-right, | 73 |
| abstract_inverted_index.neural-network | 35, 45, 186 |
| abstract_inverted_index.simultaneously | 93, 111 |
| abstract_inverted_index.diffusion-based | 134, 158 |
| abstract_inverted_index.transformer-based | 53 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| citation_normalized_percentile | |
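The `abstract_inverted_index.*` rows in the table above store the abstract as a mapping from each word to the positions where it occurs. A minimal sketch of how such a mapping can be turned back into running text follows; the helper names `parse_positions` and `rebuild_abstract` are hypothetical, and `sample` copies only the first position of a few rows from the table.

```python
def parse_positions(cell):
    """Turn a table cell such as "8, 14, 103" into a list of integer positions."""
    return [int(part) for part in cell.split(",") if part.strip()]


def rebuild_abstract(inverted_index):
    """Rebuild text from a {word: [positions]} mapping by sorting on position."""
    by_position = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = word
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Only positions 0 through 8 are copied here; the full index has many more entries.
sample = {
    "Symbolic": [0],
    "regression": [1],
    "refers": [2],
    "to": [3],
    "the": [4],
    "task": [5],
    "of": [6],
    "finding": [7],
    "a": [8],
}

print(rebuild_abstract(sample))  # prints "Symbolic regression refers to the task of finding a"
```

Feeding every `abstract_inverted_index.*` row through `parse_positions` and then `rebuild_abstract` should yield the full abstract text in word order, since each position appears under exactly one word.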