CoDA: Coding LM via Diffusion Adaptation
Haolin Chen, Shiyu Wang, Can Qin, Bo Pang, Zuxin Liu, Jielin Qiu, Jianguo Zhang, Zhou Yingbo, Zeyuan Chen, Ran Xu, Shelby Heinecke, Silvio Savarese, Caiming Xiong, Huan Wang, Weiran Yao
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2510.03270
Diffusion language models promise bidirectional context and infilling capabilities that autoregressive coders lack, yet practical systems remain heavyweight. We introduce CoDA, a 1.7B-parameter diffusion coder trained on TPU with a fully open-source training pipeline. CoDA pairs large-scale diffusion pre-training with code-centric mid-training and instruction tuning, enabling confidence-guided sampling that keeps inference latency competitive. On HumanEval, MBPP, and EvalPlus, CoDA-1.7B-Instruct matches or surpasses diffusion models up to 7B parameters. Our release includes model checkpoints, evaluation harnesses, and TPU training pipelines to accelerate research on lightweight diffusion-based coding assistants.
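
The abstract names confidence-guided sampling without describing it. As a rough illustration of the general idea only, the sketch below fills a fully masked sequence in parallel rounds, committing the most confident proposals first. It is not CoDA's sampler: `confidence_guided_decode`, `toy_predict`, `tokens_per_step`, and the fixed confidence scores are all hypothetical stand-ins, and a real model would recompute its proposals from the partially decoded sequence at every round.

```python
from typing import Callable, List, Optional, Tuple

MASK = None  # stands in for a position that has not been decoded yet

# A predictor returns one (token, confidence) proposal per sequence position.
Predictor = Callable[[List[Optional[str]]], List[Tuple[str, float]]]


def confidence_guided_decode(
    seq: List[Optional[str]],
    predict: Predictor,
    tokens_per_step: int = 2,
) -> Tuple[List[Optional[str]], int]:
    """Fill masked positions, committing the most confident proposals first."""
    if tokens_per_step < 1:
        raise ValueError("tokens_per_step must be at least 1")
    seq = list(seq)
    steps = 0
    while any(tok is MASK for tok in seq):
        proposals = predict(seq)
        masked = [i for i, tok in enumerate(seq) if tok is MASK]
        # Rank the still-masked positions by the predictor's confidence.
        masked.sort(key=lambda i: proposals[i][1], reverse=True)
        # Commit only the top few this round; the rest stay masked.
        for i in masked[:tokens_per_step]:
            seq[i] = proposals[i][0]
        steps += 1
    return seq, steps


# Hypothetical toy predictor: fixed tokens and made-up confidence scores.
TARGET = ["def", "add", "(", "a", ",", "b", ")", ":"]
CONFIDENCE = [0.9, 0.4, 0.8, 0.3, 0.7, 0.2, 0.6, 0.5]


def toy_predict(seq: List[Optional[str]]) -> List[Tuple[str, float]]:
    return list(zip(TARGET, CONFIDENCE))


decoded, steps = confidence_guided_decode([MASK] * len(TARGET), toy_predict)
print(" ".join(decoded), "| steps:", steps)
```

With two tokens committed per round, the eight masked positions are filled in four rounds instead of eight, which is the kind of latency trade-off that parallel decoding aims for.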
Metadata
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2510.03270 (PDF: https://arxiv.org/pdf/2510.03270)
- OA Status: green
- OpenAlex ID: https://openalex.org/W4414968642
All OpenAlex metadata
- OpenAlex ID: https://openalex.org/W4414968642
- DOI: https://doi.org/10.48550/arxiv.2510.03270
- Title: CoDA: Coding LM via Diffusion Adaptation
- Type: preprint
- Language: en
- Publication year: 2025
- Publication date: 2025-09-27
- Authors: Haolin Chen, Shiyu Wang, Can Qin, Bo Pang, Zuxin Liu, Jielin Qiu, Jianguo Zhang, Zhou Yingbo, Zeyuan Chen, Ran Xu, Shelby Heinecke, Silvio Savarese, Caiming Xiong, Huan Wang, Weiran Yao
- Landing page: https://arxiv.org/abs/2510.03270
- PDF URL: https://arxiv.org/pdf/2510.03270
- Open access: Yes
- OA status: green
- OA URL: https://arxiv.org/pdf/2510.03270
- Cited by: 0
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4414968642 |
| doi | https://doi.org/10.48550/arxiv.2510.03270 |
| ids.doi | https://doi.org/10.48550/arxiv.2510.03270 |
| ids.openalex | https://openalex.org/W4414968642 |
| fwci | |
| type | preprint |
| title | CoDA: Coding LM via Diffusion Adaptation |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10901 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9415000081062317 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Advanced Data Compression Techniques |
| topics[1].id | https://openalex.org/T10320 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9028000235557556 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Neural Networks and Applications |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2510.03270 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2510.03270 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2510.03270 |
| locations[1].id | doi:10.48550/arxiv.2510.03270 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2510.03270 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5101905561 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-1791-6560 |
| authorships[0].author.display_name | Haolin Chen |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Chen, Haolin |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5007461905 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-9996-0420 |
| authorships[1].author.display_name | Shiyu Wang |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Wang, Shiyu |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5021042598 |
| authorships[2].author.orcid | https://orcid.org/0000-0003-0712-5378 |
| authorships[2].author.display_name | Can Qin |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Qin, Can |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5058655480 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-4359-2937 |
| authorships[3].author.display_name | Bo Pang |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Pang, Bo |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5037613999 |
| authorships[4].author.orcid | https://orcid.org/0000-0001-7412-5074 |
| authorships[4].author.display_name | Zuxin Liu |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Liu, Zuxin |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5058194829 |
| authorships[5].author.orcid | https://orcid.org/0000-0002-7384-1324 |
| authorships[5].author.display_name | Jielin Qiu |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Qiu, Jielin |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5077035984 |
| authorships[6].author.orcid | https://orcid.org/0000-0001-7057-2862 |
| authorships[6].author.display_name | Jianguo Zhang |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Zhang, Jianguo |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5052185835 |
| authorships[7].author.orcid | https://orcid.org/0000-0001-5398-0944 |
| authorships[7].author.display_name | Zhou Yingbo |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Zhou, Yingbo |
| authorships[7].is_corresponding | False |
| authorships[8].author.id | https://openalex.org/A5034197490 |
| authorships[8].author.orcid | https://orcid.org/0009-0000-1046-3744 |
| authorships[8].author.display_name | Zeyuan Chen |
| authorships[8].author_position | middle |
| authorships[8].raw_author_name | Chen, Zeyuan |
| authorships[8].is_corresponding | False |
| authorships[9].author.id | https://openalex.org/A5100540192 |
| authorships[9].author.orcid | |
| authorships[9].author.display_name | Ran Xu |
| authorships[9].author_position | middle |
| authorships[9].raw_author_name | Xu, Ran |
| authorships[9].is_corresponding | False |
| authorships[10].author.id | https://openalex.org/A5063103006 |
| authorships[10].author.orcid | https://orcid.org/0000-0002-8831-0753 |
| authorships[10].author.display_name | Shelby Heinecke |
| authorships[10].author_position | middle |
| authorships[10].raw_author_name | Heinecke, Shelby |
| authorships[10].is_corresponding | False |
| authorships[11].author.id | https://openalex.org/A5042646536 |
| authorships[11].author.orcid | |
| authorships[11].author.display_name | Silvio Savarese |
| authorships[11].author_position | middle |
| authorships[11].raw_author_name | Savarese, Silvio |
| authorships[11].is_corresponding | False |
| authorships[12].author.id | https://openalex.org/A5032046813 |
| authorships[12].author.orcid | https://orcid.org/0000-0003-0349-8628 |
| authorships[12].author.display_name | Caiming Xiong |
| authorships[12].author_position | middle |
| authorships[12].raw_author_name | Xiong, Caiming |
| authorships[12].is_corresponding | False |
| authorships[13].author.id | https://openalex.org/A5100332030 |
| authorships[13].author.orcid | https://orcid.org/0000-0003-0113-4425 |
| authorships[13].author.display_name | Huan Wang |
| authorships[13].author_position | middle |
| authorships[13].raw_author_name | Wang, Huan |
| authorships[13].is_corresponding | False |
| authorships[14].author.id | https://openalex.org/A5026898193 |
| authorships[14].author.orcid | https://orcid.org/0000-0002-6570-3888 |
| authorships[14].author.display_name | Weiran Yao |
| authorships[14].author_position | last |
| authorships[14].raw_author_name | Yao, Weiran |
| authorships[14].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2510.03270 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-09T00:00:00 |
| display_name | CoDA: Coding LM via Diffusion Adaptation |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10901 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9415000081062317 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Advanced Data Compression Techniques |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2510.03270 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2510.03270 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2510.03270 |
| primary_location.id | pmh:oai:arXiv.org:2510.03270 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2510.03270 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2510.03270 |
| publication_date | 2025-09-27 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 21, 29 |
| abstract_inverted_index.7B | 66 |
| abstract_inverted_index.On | 53 |
| abstract_inverted_index.We | 18 |
| abstract_inverted_index.on | 26, 82 |
| abstract_inverted_index.or | 60 |
| abstract_inverted_index.to | 65, 79 |
| abstract_inverted_index.up | 64 |
| abstract_inverted_index.Our | 68 |
| abstract_inverted_index.TPU | 27, 76 |
| abstract_inverted_index.and | 6, 42, 56, 75 |
| abstract_inverted_index.yet | 13 |
| abstract_inverted_index.CoDA | 34 |
| abstract_inverted_index.that | 9, 48 |
| abstract_inverted_index.with | 28, 39 |
| abstract_inverted_index.CoDA, | 20 |
| abstract_inverted_index.MBPP, | 55 |
| abstract_inverted_index.coder | 24 |
| abstract_inverted_index.fully | 30 |
| abstract_inverted_index.keeps | 49 |
| abstract_inverted_index.lack, | 12 |
| abstract_inverted_index.model | 71 |
| abstract_inverted_index.pairs | 35 |
| abstract_inverted_index.coders | 11 |
| abstract_inverted_index.coding | 85 |
| abstract_inverted_index.models | 2, 63 |
| abstract_inverted_index.remain | 16 |
| abstract_inverted_index.context | 5 |
| abstract_inverted_index.latency | 51 |
| abstract_inverted_index.matches | 59 |
| abstract_inverted_index.promise | 3 |
| abstract_inverted_index.release | 69 |
| abstract_inverted_index.systems | 15 |
| abstract_inverted_index.trained | 25 |
| abstract_inverted_index.tuning, | 44 |
| abstract_inverted_index.enabling | 45 |
| abstract_inverted_index.includes | 70 |
| abstract_inverted_index.language | 1 |
| abstract_inverted_index.research | 81 |
| abstract_inverted_index.sampling | 47 |
| abstract_inverted_index.training | 32, 77 |
| abstract_inverted_index.Diffusion | 0 |
| abstract_inverted_index.EvalPlus, | 57 |
| abstract_inverted_index.diffusion | 23, 37, 62 |
| abstract_inverted_index.inference | 50 |
| abstract_inverted_index.infilling | 7 |
| abstract_inverted_index.introduce | 19 |
| abstract_inverted_index.pipeline. | 33 |
| abstract_inverted_index.pipelines | 78 |
| abstract_inverted_index.practical | 14 |
| abstract_inverted_index.surpasses | 61 |
| abstract_inverted_index.Humaneval, | 54 |
| abstract_inverted_index.accelerate | 80 |
| abstract_inverted_index.evaluation | 73 |
| abstract_inverted_index.harnesses, | 74 |
| abstract_inverted_index.assistants. | 86 |
| abstract_inverted_index.instruction | 43 |
| abstract_inverted_index.large-scale | 36 |
| abstract_inverted_index.lightweight | 83 |
| abstract_inverted_index.open-source | 31 |
| abstract_inverted_index.parameters. | 67 |
| abstract_inverted_index.capabilities | 8 |
| abstract_inverted_index.checkpoints, | 72 |
| abstract_inverted_index.code-centric | 40 |
| abstract_inverted_index.competitive. | 52 |
| abstract_inverted_index.heavyweight. | 17 |
| abstract_inverted_index.mid-training | 41 |
| abstract_inverted_index.pre-training | 38 |
| abstract_inverted_index.bidirectional | 4 |
| abstract_inverted_index.1.7B-parameter | 22 |
| abstract_inverted_index.autoregressive | 10 |
| abstract_inverted_index.diffusion-based | 84 |
| abstract_inverted_index.confidence-guided | 46 |
| abstract_inverted_index.CoDA-1.7B-Instruct | 58 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 15 |
| citation_normalized_percentile | |
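
The `abstract_inverted_index.*` rows above store the abstract as a map from each word to the positions where it occurs. A minimal sketch of turning such rows back into running text follows; the helper names `parse_rows` and `rebuild_words` are illustrative assumptions, not part of any OpenAlex client, and the sample rows are copied from the table.

```python
from typing import Dict, Iterable, List

PREFIX = "abstract_inverted_index."


def parse_rows(rows: Iterable[str]) -> Dict[str, List[int]]:
    """Collect word -> positions from flattened '| key | value |' table rows."""
    index: Dict[str, List[int]] = {}
    for row in rows:
        cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
        if len(cells) != 2 or not cells[0].startswith(PREFIX):
            continue  # not an inverted-index row
        word = cells[0][len(PREFIX):]
        index[word] = [int(pos) for pos in cells[1].split(",")]
    return index


def rebuild_words(index: Dict[str, List[int]]) -> List[str]:
    """Return the words ordered by their recorded positions."""
    by_position = {
        pos: word for word, positions in index.items() for pos in positions
    }
    return [by_position[pos] for pos in sorted(by_position)]


# Rows for the first nine words of the abstract, as they appear in the table.
rows = [
    "| abstract_inverted_index.and | 6, 42, 56, 75 |",
    "| abstract_inverted_index.models | 2, 63 |",
    "| abstract_inverted_index.context | 5 |",
    "| abstract_inverted_index.promise | 3 |",
    "| abstract_inverted_index.language | 1 |",
    "| abstract_inverted_index.Diffusion | 0 |",
    "| abstract_inverted_index.infilling | 7 |",
    "| abstract_inverted_index.capabilities | 8 |",
    "| abstract_inverted_index.bidirectional | 4 |",
]

words = rebuild_words(parse_rows(rows))
# Only positions 0-8 are contiguous in this sample, so keep the first nine.
print(" ".join(words[:9]))
```

Feeding every `abstract_inverted_index.*` row to `parse_rows` instead of this nine-row sample should yield the full abstract in order, since each position then appears exactly once.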