Efficient and Asymptotically Unbiased Constrained Decoding for Large Language Models Article Swipe
YOU?
·
· 2025
· Open Access
·
· DOI: https://doi.org/10.48550/arxiv.2504.09135
In real-world applications of large language models, outputs are often required to be confined: selecting items from predefined product or document sets, generating phrases that comply with safety standards, or conforming to specialized formatting styles. To control the generation, constrained decoding has been widely adopted. However, existing prefix-tree-based constrained decoding is inefficient under GPU-based model inference paradigms, and it introduces unintended biases into the output distribution. This paper introduces Dynamic Importance Sampling for Constrained Decoding (DISC) with GPU-based Parallel Prefix-Verification (PPV), a novel algorithm that leverages dynamic importance sampling to achieve theoretically guaranteed asymptotic unbiasedness and overcomes the inefficiency of prefix-tree. Extensive experiments demonstrate the superiority of our method over existing methods in both efficiency and output quality. These results highlight the potential of our methods to improve constrained generation in applications where adherence to specific constraints is essential.
Related Topics
- Type
- article
- Language
- en
- Landing Page
- http://arxiv.org/abs/2504.09135
- https://arxiv.org/pdf/2504.09135
- OA Status
- green
- OpenAlex ID
- https://openalex.org/W4415155194
Raw OpenAlex JSON
- OpenAlex ID
-
https://openalex.org/W4415155194Canonical identifier for this work in OpenAlex
- DOI
-
https://doi.org/10.48550/arxiv.2504.09135Digital Object Identifier
- Title
-
Efficient and Asymptotically Unbiased Constrained Decoding for Large Language ModelsWork title
- Type
-
articleOpenAlex work type
- Language
-
enPrimary language
- Publication year
-
2025Year of publication
- Publication date
-
2025-04-12Full publication date if available
- Authors
-
Haotian Ye, Himanshu Jain, Chong You, Ananda Theertha Suresh, Haowei Lin, James Zou, Felix YuList of authors in order
- Landing page
-
https://arxiv.org/abs/2504.09135Publisher landing page
- PDF URL
-
https://arxiv.org/pdf/2504.09135Direct link to full text PDF
- Open access
-
YesWhether a free full text is available
- OA status
-
greenOpen access status per OpenAlex
- OA URL
-
https://arxiv.org/pdf/2504.09135Direct OA link when available
- Cited by
-
0Total citation count in OpenAlex
Full payload
| id | https://openalex.org/W4415155194 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2504.09135 |
| ids.doi | https://doi.org/10.48550/arxiv.2504.09135 |
| ids.openalex | https://openalex.org/W4415155194 |
| fwci | 0.0 |
| type | article |
| title | Efficient and Asymptotically Unbiased Constrained Decoding for Large Language Models |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10181 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9595000147819519 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Natural Language Processing Techniques |
| topics[1].id | https://openalex.org/T10028 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9168000221252441 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Topic Modeling |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2504.09135 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2504.09135 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2504.09135 |
| locations[1].id | doi:10.48550/arxiv.2504.09135 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article-journal |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2504.09135 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5077279992 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Haotian Ye |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Ye, Haotian |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5008620289 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-4382-9460 |
| authorships[1].author.display_name | Himanshu Jain |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Jain, Himanshu |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5103110999 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-7821-2378 |
| authorships[2].author.display_name | Chong You |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | You, Chong |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5103890688 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Ananda Theertha Suresh |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Suresh, Ananda Theertha |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5101191761 |
| authorships[4].author.orcid | |
| authorships[4].author.display_name | Haowei Lin |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Lin, Haowei |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5005779176 |
| authorships[5].author.orcid | https://orcid.org/0000-0001-8880-4764 |
| authorships[5].author.display_name | James Zou |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Zou, James |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5111045439 |
| authorships[6].author.orcid | |
| authorships[6].author.display_name | Felix Yu |
| authorships[6].author_position | last |
| authorships[6].raw_author_name | Yu, Felix |
| authorships[6].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2504.09135 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-14T00:00:00 |
| display_name | Efficient and Asymptotically Unbiased Constrained Decoding for Large Language Models |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10181 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9595000147819519 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Natural Language Processing Techniques |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2504.09135 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2504.09135 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2504.09135 |
| primary_location.id | pmh:oai:arXiv.org:2504.09135 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2504.09135 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2504.09135 |
| publication_date | 2025-04-12 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 81 |
| abstract_inverted_index.In | 0 |
| abstract_inverted_index.To | 35 |
| abstract_inverted_index.be | 12 |
| abstract_inverted_index.in | 112, 130 |
| abstract_inverted_index.is | 50, 137 |
| abstract_inverted_index.it | 58 |
| abstract_inverted_index.of | 3, 99, 106, 123 |
| abstract_inverted_index.or | 19, 29 |
| abstract_inverted_index.to | 11, 31, 89, 126, 134 |
| abstract_inverted_index.and | 57, 95, 115 |
| abstract_inverted_index.are | 8 |
| abstract_inverted_index.for | 72 |
| abstract_inverted_index.has | 41 |
| abstract_inverted_index.our | 107, 124 |
| abstract_inverted_index.the | 37, 63, 97, 104, 121 |
| abstract_inverted_index.This | 66 |
| abstract_inverted_index.been | 42 |
| abstract_inverted_index.both | 113 |
| abstract_inverted_index.from | 16 |
| abstract_inverted_index.into | 62 |
| abstract_inverted_index.over | 109 |
| abstract_inverted_index.that | 24, 84 |
| abstract_inverted_index.with | 26, 76 |
| abstract_inverted_index.These | 118 |
| abstract_inverted_index.items | 15 |
| abstract_inverted_index.large | 4 |
| abstract_inverted_index.model | 54 |
| abstract_inverted_index.novel | 82 |
| abstract_inverted_index.often | 9 |
| abstract_inverted_index.paper | 67 |
| abstract_inverted_index.sets, | 21 |
| abstract_inverted_index.under | 52 |
| abstract_inverted_index.where | 132 |
| abstract_inverted_index.(DISC) | 75 |
| abstract_inverted_index.(PPV), | 80 |
| abstract_inverted_index.biases | 61 |
| abstract_inverted_index.comply | 25 |
| abstract_inverted_index.method | 108 |
| abstract_inverted_index.output | 64, 116 |
| abstract_inverted_index.safety | 27 |
| abstract_inverted_index.widely | 43 |
| abstract_inverted_index.Dynamic | 69 |
| abstract_inverted_index.achieve | 90 |
| abstract_inverted_index.control | 36 |
| abstract_inverted_index.dynamic | 86 |
| abstract_inverted_index.improve | 127 |
| abstract_inverted_index.methods | 111, 125 |
| abstract_inverted_index.models, | 6 |
| abstract_inverted_index.outputs | 7 |
| abstract_inverted_index.phrases | 23 |
| abstract_inverted_index.product | 18 |
| abstract_inverted_index.results | 119 |
| abstract_inverted_index.styles. | 34 |
| abstract_inverted_index.Decoding | 74 |
| abstract_inverted_index.However, | 45 |
| abstract_inverted_index.Parallel | 78 |
| abstract_inverted_index.Sampling | 71 |
| abstract_inverted_index.adopted. | 44 |
| abstract_inverted_index.decoding | 40, 49 |
| abstract_inverted_index.document | 20 |
| abstract_inverted_index.existing | 46, 110 |
| abstract_inverted_index.language | 5 |
| abstract_inverted_index.quality. | 117 |
| abstract_inverted_index.required | 10 |
| abstract_inverted_index.sampling | 88 |
| abstract_inverted_index.specific | 135 |
| abstract_inverted_index.Extensive | 101 |
| abstract_inverted_index.GPU-based | 53, 77 |
| abstract_inverted_index.adherence | 133 |
| abstract_inverted_index.algorithm | 83 |
| abstract_inverted_index.confined: | 13 |
| abstract_inverted_index.highlight | 120 |
| abstract_inverted_index.inference | 55 |
| abstract_inverted_index.leverages | 85 |
| abstract_inverted_index.overcomes | 96 |
| abstract_inverted_index.potential | 122 |
| abstract_inverted_index.selecting | 14 |
| abstract_inverted_index.Importance | 70 |
| abstract_inverted_index.asymptotic | 93 |
| abstract_inverted_index.conforming | 30 |
| abstract_inverted_index.efficiency | 114 |
| abstract_inverted_index.essential. | 138 |
| abstract_inverted_index.formatting | 33 |
| abstract_inverted_index.generating | 22 |
| abstract_inverted_index.generation | 129 |
| abstract_inverted_index.guaranteed | 92 |
| abstract_inverted_index.importance | 87 |
| abstract_inverted_index.introduces | 59, 68 |
| abstract_inverted_index.paradigms, | 56 |
| abstract_inverted_index.predefined | 17 |
| abstract_inverted_index.real-world | 1 |
| abstract_inverted_index.standards, | 28 |
| abstract_inverted_index.unintended | 60 |
| abstract_inverted_index.Constrained | 73 |
| abstract_inverted_index.constrained | 39, 48, 128 |
| abstract_inverted_index.constraints | 136 |
| abstract_inverted_index.demonstrate | 103 |
| abstract_inverted_index.experiments | 102 |
| abstract_inverted_index.generation, | 38 |
| abstract_inverted_index.inefficient | 51 |
| abstract_inverted_index.specialized | 32 |
| abstract_inverted_index.superiority | 105 |
| abstract_inverted_index.applications | 2, 131 |
| abstract_inverted_index.inefficiency | 98 |
| abstract_inverted_index.prefix-tree. | 100 |
| abstract_inverted_index.unbiasedness | 94 |
| abstract_inverted_index.distribution. | 65 |
| abstract_inverted_index.theoretically | 91 |
| abstract_inverted_index.prefix-tree-based | 47 |
| abstract_inverted_index.Prefix-Verification | 79 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 7 |
| citation_normalized_percentile.value | 0.22055554 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | True |