XiYan-SQL: A Novel Multi-Generator Framework For Text-to-SQL
2025 · Open Access
To leverage the advantages of LLMs in addressing the challenges of the Text-to-SQL task, we present XiYan-SQL, an innovative framework that effectively generates and utilizes multiple SQL candidates. It consists of three components: 1) a Schema Filter module that filters and obtains multiple relevant schemas; 2) a multi-generator ensemble approach that generates multiple high-quality and diverse SQL queries; 3) a selection model with a candidate reorganization strategy, used to obtain the optimal SQL query. For the multi-generator ensemble, we employ a multi-task fine-tuning strategy that enhances the SQL generation models' intrinsic alignment between SQL and text, and we construct multiple generation models with distinct generation styles by fine-tuning across different SQL formats. The experimental results and comprehensive analysis demonstrate the effectiveness and robustness of our framework. Overall, XiYan-SQL achieves a new SOTA performance of 75.63% on the BIRD benchmark, surpassing all previous methods. It also attains SOTA performance on the Spider test set, with an accuracy of 89.65%.
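
The three-stage flow described in the abstract (schema filtering, multiple generators, candidate selection) can be sketched as below. This is only an illustrative toy, not the authors' implementation: `filter_schemas`, `execute`, `select_candidate`, `run_pipeline`, `demo`, and the `TABLES` fixture are hypothetical names, the generators are hard-coded lambdas rather than fine-tuned models, and a majority vote over execution results stands in for the paper's learned selection model and candidate reorganization strategy, whose details are not given in this record.

```python
import sqlite3
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical type: a generator maps (question, schema text) to one SQL string.
Generator = Callable[[str, str], str]

# Toy database schema used only for this illustration.
TABLES: Dict[str, str] = {
    "singer": "CREATE TABLE singer (name TEXT, age INTEGER)",
    "concert": "CREATE TABLE concert (title TEXT, year INTEGER)",
}


def filter_schemas(question: str, tables: Dict[str, str]) -> List[str]:
    """Toy schema filter: a narrow schema (tables named in the question)
    followed by the full schema, with duplicates and empties removed."""
    q = question.lower()
    narrow = "\n".join(ddl for name, ddl in tables.items() if name.lower() in q)
    full = "\n".join(tables.values())
    return [s for s in dict.fromkeys([narrow, full]) if s]


def execute(conn: sqlite3.Connection, sql: str) -> Optional[Tuple]:
    """Run a candidate; return an order-insensitive result key, or None on error."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return tuple(sorted(rows, key=repr))


def select_candidate(conn: sqlite3.Connection, candidates: List[str]) -> Optional[str]:
    """Stand-in selector: group candidates by execution result and return
    the first query from the largest group (majority vote)."""
    groups: Dict[Tuple, List[str]] = defaultdict(list)
    for sql in candidates:
        result = execute(conn, sql)
        if result is not None:
            groups[result].append(sql)
    if not groups:
        return None
    return max(groups.values(), key=len)[0]


def run_pipeline(conn: sqlite3.Connection, question: str,
                 tables: Dict[str, str], generators: List[Generator]) -> Optional[str]:
    schemas = filter_schemas(question, tables)
    # Every generator sees every schema variant, yielding a pool of candidates.
    candidates = [gen(question, schema) for schema in schemas for gen in generators]
    return select_candidate(conn, candidates)


def demo() -> Optional[str]:
    conn = sqlite3.connect(":memory:")
    for ddl in TABLES.values():
        conn.execute(ddl)
    conn.executemany("INSERT INTO singer VALUES (?, ?)", [("Ann", 30), ("Bob", 41)])
    generators: List[Generator] = [
        lambda q, s: "SELECT COUNT(*) FROM singer WHERE age > 35",
        lambda q, s: "SELECT COUNT(name) FROM singer WHERE age > 35",
        lambda q, s: "SELECT COUNT(*) FROM singers",  # deliberately invalid table name
    ]
    return run_pipeline(conn, "How many singers are older than 35?", TABLES, generators)
```

Calling `demo()` builds an in-memory SQLite database, collects six candidates (two schema variants times three toy generators), discards the ones that fail to execute, and returns a query from the group whose results agree most often.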
- Type: article
- Language: en
- Landing Page: http://arxiv.org/abs/2507.04701, https://arxiv.org/pdf/2507.04701
- OA Status: green
- OpenAlex ID: https://openalex.org/W4415348225
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4415348225 (Canonical identifier for this work in OpenAlex)
- Title: XiYan-SQL: A Novel Multi-Generator Framework For Text-to-SQL (Work title)
- Type: article (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2025 (Year of publication)
- Publication date: 2025-07-07 (Full publication date if available)
- Authors: Yifu Liu, Yin Zhu, Yingqi Gao, Zhiling Luo, Xiaoxia Li, Xiaorong Shi, Yuntao Hong, Jinyang Gao, Yu Li, Bolin Ding, Jingren Zhou (List of authors in order)
- Landing page: https://arxiv.org/abs/2507.04701 (Publisher landing page)
- PDF URL: https://arxiv.org/pdf/2507.04701 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2507.04701 (Direct OA link when available)
- Cited by: 0 (Total citation count in OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4415348225 |
| doi | |
| ids.openalex | https://openalex.org/W4415348225 |
| fwci | 0.0 |
| type | article |
| title | XiYan-SQL: A Novel Multi-Generator Framework For Text-to-SQL |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T13734 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.5062000155448914 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Advanced Computational Techniques and Applications |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2507.04701 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2507.04701 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2507.04701 |
| indexed_in | arxiv |
| authorships[0].author.id | https://openalex.org/A5075365578 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-2018-2572 |
| authorships[0].author.display_name | Yifu Liu |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Liu, Yifu |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5100343741 |
| authorships[1].author.orcid | https://orcid.org/0009-0002-1063-515X |
| authorships[1].author.display_name | Yin Zhu |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Zhu, Yin |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5020118426 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-3247-0272 |
| authorships[2].author.display_name | Yingqi Gao |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Gao, Yingqi |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5020904548 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-0540-7307 |
| authorships[3].author.display_name | Zhiling Luo |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Luo, Zhiling |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5100348753 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-3510-2942 |
| authorships[4].author.display_name | Xiaoxia Li |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Li, Xiaoxia |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5091299418 |
| authorships[5].author.orcid | https://orcid.org/0000-0001-9910-9345 |
| authorships[5].author.display_name | X. Shi |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Shi, Xiaorong |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5101834055 |
| authorships[6].author.orcid | https://orcid.org/0000-0003-0738-5620 |
| authorships[6].author.display_name | Yú Hónɡ |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Hong, Yuntao |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5101403129 |
| authorships[7].author.orcid | https://orcid.org/0000-0002-2094-3554 |
| authorships[7].author.display_name | Jinyang Gao |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Gao, Jinyang |
| authorships[7].is_corresponding | False |
| authorships[8].author.id | https://openalex.org/A5022118078 |
| authorships[8].author.orcid | https://orcid.org/0000-0002-3398-0880 |
| authorships[8].author.display_name | Haijun Yu |
| authorships[8].author_position | middle |
| authorships[8].raw_author_name | Li, Yu |
| authorships[8].is_corresponding | False |
| authorships[9].author.id | https://openalex.org/A5040297543 |
| authorships[9].author.orcid | https://orcid.org/0000-0003-1535-9692 |
| authorships[9].author.display_name | Bolin Ding |
| authorships[9].author_position | middle |
| authorships[9].raw_author_name | Ding, Bolin |
| authorships[9].is_corresponding | False |
| authorships[10].author.id | https://openalex.org/A5057864403 |
| authorships[10].author.orcid | https://orcid.org/0000-0002-4220-2634 |
| authorships[10].author.display_name | Jingren Zhou |
| authorships[10].author_position | last |
| authorships[10].raw_author_name | Zhou, Jingren |
| authorships[10].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2507.04701 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-20T00:00:00 |
| display_name | XiYan-SQL: A Novel Multi-Generator Framework For Text-to-SQL |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T04:12:42.849631 |
| primary_topic.id | https://openalex.org/T13734 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.5062000155448914 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Advanced Computational Techniques and Applications |
| cited_by_count | 0 |
| locations_count | 1 |
| best_oa_location.id | pmh:oai:arXiv.org:2507.04701 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2507.04701 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2507.04701 |
| primary_location.id | pmh:oai:arXiv.org:2507.04701 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2507.04701 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2507.04701 |
| publication_date | 2025-07-07 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 32, 43, 55, 59, 77, 129 |
| abstract_inverted_index.1) | 31 |
| abstract_inverted_index.2) | 42 |
| abstract_inverted_index.3) | 54 |
| abstract_inverted_index.It | 26, 144 |
| abstract_inverted_index.To | 0 |
| abstract_inverted_index.an | 16, 155 |
| abstract_inverted_index.by | 106 |
| abstract_inverted_index.in | 6, 9 |
| abstract_inverted_index.of | 4, 28, 85, 123, 133, 157 |
| abstract_inverted_index.on | 135, 149 |
| abstract_inverted_index.to | 64, 81 |
| abstract_inverted_index.we | 13, 75 |
| abstract_inverted_index.LLM | 5 |
| abstract_inverted_index.SQL | 24, 52, 68, 86, 94, 110 |
| abstract_inverted_index.The | 112 |
| abstract_inverted_index.all | 141 |
| abstract_inverted_index.and | 21, 37, 50, 95, 97, 115, 121 |
| abstract_inverted_index.for | 71, 89 |
| abstract_inverted_index.new | 130 |
| abstract_inverted_index.our | 124 |
| abstract_inverted_index.set | 153 |
| abstract_inverted_index.the | 2, 10, 66, 72, 83, 90, 119, 136, 150 |
| abstract_inverted_index.BIRD | 138 |
| abstract_inverted_index.SOTA | 131, 147 |
| abstract_inverted_index.also | 145 |
| abstract_inverted_index.test | 152 |
| abstract_inverted_index.with | 58, 102, 154 |
| abstract_inverted_index.model | 57 |
| abstract_inverted_index.task, | 12 |
| abstract_inverted_index.text, | 96 |
| abstract_inverted_index.three | 29 |
| abstract_inverted_index.75.63% | 134 |
| abstract_inverted_index.Filter | 34 |
| abstract_inverted_index.Schema | 33 |
| abstract_inverted_index.Spider | 151 |
| abstract_inverted_index.across | 108 |
| abstract_inverted_index.employ | 76 |
| abstract_inverted_index.models | 88, 101 |
| abstract_inverted_index.module | 35 |
| abstract_inverted_index.obtain | 65 |
| abstract_inverted_index.query. | 69 |
| abstract_inverted_index.styles | 105 |
| abstract_inverted_index.89.65%. | 158 |
| abstract_inverted_index.attains | 146 |
| abstract_inverted_index.between | 93 |
| abstract_inverted_index.diverse | 51 |
| abstract_inverted_index.enhance | 82 |
| abstract_inverted_index.notable | 137 |
| abstract_inverted_index.optimal | 67 |
| abstract_inverted_index.present | 14 |
| abstract_inverted_index.results | 114 |
| abstract_inverted_index.Overall, | 126 |
| abstract_inverted_index.accuracy | 156 |
| abstract_inverted_index.achieves | 128 |
| abstract_inverted_index.analysis | 117 |
| abstract_inverted_index.approach | 46 |
| abstract_inverted_index.consists | 27 |
| abstract_inverted_index.distinct | 103 |
| abstract_inverted_index.ensemble | 45 |
| abstract_inverted_index.formats. | 111 |
| abstract_inverted_index.leverage | 1 |
| abstract_inverted_index.methods. | 143 |
| abstract_inverted_index.multiple | 23, 39, 48, 99 |
| abstract_inverted_index.previous | 142 |
| abstract_inverted_index.queries; | 53 |
| abstract_inverted_index.relevant | 40 |
| abstract_inverted_index.schemas; | 41 |
| abstract_inverted_index.strategy | 62, 80 |
| abstract_inverted_index.XiYan-SQL | 127 |
| abstract_inverted_index.alignment | 92 |
| abstract_inverted_index.candidate | 60 |
| abstract_inverted_index.construct | 98 |
| abstract_inverted_index.different | 109 |
| abstract_inverted_index.ensemble, | 74 |
| abstract_inverted_index.filtering | 36 |
| abstract_inverted_index.framework | 18 |
| abstract_inverted_index.intrinsic | 91 |
| abstract_inverted_index.obtaining | 38 |
| abstract_inverted_index.selection | 56 |
| abstract_inverted_index.utilizing | 22 |
| abstract_inverted_index.XiYan-SQL, | 15 |
| abstract_inverted_index.addressing | 7 |
| abstract_inverted_index.advantages | 3 |
| abstract_inverted_index.benchmark, | 139 |
| abstract_inverted_index.challenges | 8 |
| abstract_inverted_index.framework. | 125 |
| abstract_inverted_index.generating | 20, 47 |
| abstract_inverted_index.generation | 87, 100, 104 |
| abstract_inverted_index.innovative | 17 |
| abstract_inverted_index.multi-task | 78 |
| abstract_inverted_index.robustness | 122 |
| abstract_inverted_index.surpassing | 140 |
| abstract_inverted_index.Text-to-SQL | 11 |
| abstract_inverted_index.candidates. | 25 |
| abstract_inverted_index.components: | 30 |
| abstract_inverted_index.demonstrate | 118 |
| abstract_inverted_index.effectively | 19 |
| abstract_inverted_index.fine-tuning | 79, 107 |
| abstract_inverted_index.highquality | 49 |
| abstract_inverted_index.implemented | 63 |
| abstract_inverted_index.performance | 132, 148 |
| abstract_inverted_index.capabilities | 84 |
| abstract_inverted_index.experimental | 113 |
| abstract_inverted_index.Specifically, | 70 |
| abstract_inverted_index.comprehensive | 116 |
| abstract_inverted_index.effectiveness | 120 |
| abstract_inverted_index.reorganization | 61 |
| abstract_inverted_index.multi-generator | 44, 73 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 11 |
| citation_normalized_percentile.value | 0.22740161 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | True |
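
The `abstract_inverted_index.*` rows in the payload above store the abstract as a map from each token to the word positions where it occurs. A minimal sketch of turning such rows back into running text follows; `parse_rows` and `rebuild_abstract` are hypothetical helper names, and the sample is limited to the rows needed for the first thirteen positions, so only the opening clause is rebuilt.

```python
from typing import Dict, Iterable, List

PREFIX = "abstract_inverted_index."


def parse_rows(lines: Iterable[str]) -> Dict[str, List[int]]:
    """Parse flattened table rows such as
    '| abstract_inverted_index.in | 6, 9 |' into {token: [positions]}."""
    index: Dict[str, List[int]] = {}
    for line in lines:
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 2 or not cells[0].startswith(PREFIX):
            continue  # not an inverted-index row
        # Strip the prefix instead of splitting on '.', since tokens
        # such as 'query.' or '89.65%.' contain dots themselves.
        token = cells[0][len(PREFIX):]
        index[token] = [int(p) for p in cells[1].split(",")]
    return index


def rebuild_abstract(index: Dict[str, List[int]]) -> str:
    """Place each token at its positions and join the tokens in position order."""
    by_position: Dict[int, str] = {}
    for token, positions in index.items():
        for pos in positions:
            by_position[pos] = token
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Sample rows copied from the table, truncated to positions 0 through 12.
SAMPLE_ROWS = [
    "| abstract_inverted_index.To | 0 |",
    "| abstract_inverted_index.leverage | 1 |",
    "| abstract_inverted_index.the | 2, 10 |",
    "| abstract_inverted_index.advantages | 3 |",
    "| abstract_inverted_index.of | 4 |",
    "| abstract_inverted_index.LLM | 5 |",
    "| abstract_inverted_index.in | 6, 9 |",
    "| abstract_inverted_index.addressing | 7 |",
    "| abstract_inverted_index.challenges | 8 |",
    "| abstract_inverted_index.Text-to-SQL | 11 |",
    "| abstract_inverted_index.task, | 12 |",
]

opening = rebuild_abstract(parse_rows(SAMPLE_ROWS))
```

With the sample rows, `opening` holds the first clause of the abstract as stored in the record. Feeding all `abstract_inverted_index.*` rows through the same two functions should yield the full abstract text.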