Towards Solving More Challenging IMO Problems via Decoupled Reasoning and Proving Article Swipe
YOU?
·
· 2025
· Open Access
·
· DOI: https://doi.org/10.48550/arxiv.2507.06804
Automated Theorem Proving (ATP) in formal languages is a foundational challenge for AI. While Large Language Models (LLMs) have driven remarkable progress, a significant gap remains between their powerful informal reasoning capabilities and their weak formal proving performance. Recent studies show that the informal accuracy exceeds 80% while formal success remains below 8% on benchmarks like PutnamBench. We argue this gap persists because current state-of-the-art provers, by tightly coupling reasoning and proving, are trained with paradigms that inadvertently punish deep reasoning in favor of shallow, tactic-based strategies. To bridge this fundamental gap, we propose a novel framework that decouples high-level reasoning from low-level proof generation. Our approach utilizes two distinct, specialized models: a powerful, general-purpose Reasoner to generate diverse, strategic subgoal lemmas, and an efficient Prover to rigorously verify them. This modular design liberates the model's full reasoning potential and bypasses the pitfalls of end-to-end training. We evaluate our method on a challenging set of post-2000 IMO problems, a problem set on which no prior open-source prover has reported success. Our decoupled framework successfully solves 5 of these problems, demonstrating a significant step towards automated reasoning on exceptionally difficult mathematical challenges. To foster future research, we release our full dataset of generated and verified lemmas for a wide range of IMO problems, available at https://tencent-imo.github.io/ .
Related Topics
- Type
- preprint
- Language
- en
- Landing Page
- http://arxiv.org/abs/2507.06804
- https://arxiv.org/pdf/2507.06804
- OA Status
- green
- OpenAlex ID
- https://openalex.org/W4416104197
Raw OpenAlex JSON
- OpenAlex ID
-
https://openalex.org/W4416104197Canonical identifier for this work in OpenAlex
- DOI
-
https://doi.org/10.48550/arxiv.2507.06804Digital Object Identifier
- Title
-
Towards Solving More Challenging IMO Problems via Decoupled Reasoning and ProvingWork title
- Type
-
preprintOpenAlex work type
- Language
-
enPrimary language
- Publication year
-
2025Year of publication
- Publication date
-
2025-07-07Full publication date if available
- Authors
-
Zhenwen Liang, Linfeng Song, Yang Li, Yang Tao, Feng Zhang, Haitao Mi, Dihua YuList of authors in order
- Landing page
-
https://arxiv.org/abs/2507.06804Publisher landing page
- PDF URL
-
https://arxiv.org/pdf/2507.06804Direct link to full text PDF
- Open access
-
YesWhether a free full text is available
- OA status
-
greenOpen access status per OpenAlex
- OA URL
-
https://arxiv.org/pdf/2507.06804Direct OA link when available
- Cited by
-
0Total citation count in OpenAlex
Full payload
| id | https://openalex.org/W4416104197 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2507.06804 |
| ids.doi | https://doi.org/10.48550/arxiv.2507.06804 |
| ids.openalex | https://openalex.org/W4416104197 |
| fwci | |
| type | preprint |
| title | Towards Solving More Challenging IMO Problems via Decoupled Reasoning and Proving |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2507.06804 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2507.06804 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2507.06804 |
| locations[1].id | doi:10.48550/arxiv.2507.06804 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2507.06804 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5101447384 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-1080-3408 |
| authorships[0].author.display_name | Zhenwen Liang |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Liang, Zhenwen |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5016877422 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-3502-3574 |
| authorships[1].author.display_name | Linfeng Song |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Song, Linfeng |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5100421383 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-9612-2704 |
| authorships[2].author.display_name | Yang Li |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Li, Yang |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5029754515 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-8063-9634 |
| authorships[3].author.display_name | Yang Tao |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Yang, Tao |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5043866405 |
| authorships[4].author.orcid | https://orcid.org/0000-0003-0155-9087 |
| authorships[4].author.display_name | Feng Zhang |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Zhang, Feng |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5114418787 |
| authorships[5].author.orcid | |
| authorships[5].author.display_name | Haitao Mi |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Mi, Haitao |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5040450464 |
| authorships[6].author.orcid | https://orcid.org/0000-0001-6231-9381 |
| authorships[6].author.display_name | Dihua Yu |
| authorships[6].author_position | last |
| authorships[6].raw_author_name | Yu, Dong |
| authorships[6].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2507.06804 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Towards Solving More Challenging IMO Problems via Decoupled Reasoning and Proving |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-28T08:54:24.970857 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2507.06804 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2507.06804 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2507.06804 |
| primary_location.id | pmh:oai:arXiv.org:2507.06804 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2507.06804 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2507.06804 |
| publication_date | 2025-07-07 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.. | 215 |
| abstract_inverted_index.5 | 175 |
| abstract_inverted_index.a | 8, 22, 94, 112, 151, 158, 180, 206 |
| abstract_inverted_index.8% | 52 |
| abstract_inverted_index.To | 87, 191 |
| abstract_inverted_index.We | 57, 146 |
| abstract_inverted_index.an | 123 |
| abstract_inverted_index.at | 213 |
| abstract_inverted_index.by | 66 |
| abstract_inverted_index.in | 4, 81 |
| abstract_inverted_index.is | 7 |
| abstract_inverted_index.no | 163 |
| abstract_inverted_index.of | 83, 143, 154, 176, 200, 209 |
| abstract_inverted_index.on | 53, 150, 161, 186 |
| abstract_inverted_index.to | 116, 126 |
| abstract_inverted_index.we | 92, 195 |
| abstract_inverted_index.80% | 46 |
| abstract_inverted_index.AI. | 12 |
| abstract_inverted_index.IMO | 156, 210 |
| abstract_inverted_index.Our | 105, 170 |
| abstract_inverted_index.and | 32, 70, 122, 139, 202 |
| abstract_inverted_index.are | 72 |
| abstract_inverted_index.for | 11, 205 |
| abstract_inverted_index.gap | 24, 60 |
| abstract_inverted_index.has | 167 |
| abstract_inverted_index.our | 148, 197 |
| abstract_inverted_index.set | 153, 160 |
| abstract_inverted_index.the | 42, 134, 141 |
| abstract_inverted_index.two | 108 |
| abstract_inverted_index.This | 130 |
| abstract_inverted_index.deep | 79 |
| abstract_inverted_index.from | 101 |
| abstract_inverted_index.full | 136, 198 |
| abstract_inverted_index.gap, | 91 |
| abstract_inverted_index.have | 18 |
| abstract_inverted_index.like | 55 |
| abstract_inverted_index.show | 40 |
| abstract_inverted_index.step | 182 |
| abstract_inverted_index.that | 41, 76, 97 |
| abstract_inverted_index.this | 59, 89 |
| abstract_inverted_index.weak | 34 |
| abstract_inverted_index.wide | 207 |
| abstract_inverted_index.with | 74 |
| abstract_inverted_index.(ATP) | 3 |
| abstract_inverted_index.Large | 14 |
| abstract_inverted_index.While | 13 |
| abstract_inverted_index.argue | 58 |
| abstract_inverted_index.below | 51 |
| abstract_inverted_index.favor | 82 |
| abstract_inverted_index.novel | 95 |
| abstract_inverted_index.prior | 164 |
| abstract_inverted_index.proof | 103 |
| abstract_inverted_index.range | 208 |
| abstract_inverted_index.their | 27, 33 |
| abstract_inverted_index.them. | 129 |
| abstract_inverted_index.these | 177 |
| abstract_inverted_index.which | 162 |
| abstract_inverted_index.while | 47 |
| abstract_inverted_index.(LLMs) | 17 |
| abstract_inverted_index.Models | 16 |
| abstract_inverted_index.Prover | 125 |
| abstract_inverted_index.Recent | 38 |
| abstract_inverted_index.bridge | 88 |
| abstract_inverted_index.design | 132 |
| abstract_inverted_index.driven | 19 |
| abstract_inverted_index.formal | 5, 35, 48 |
| abstract_inverted_index.foster | 192 |
| abstract_inverted_index.future | 193 |
| abstract_inverted_index.lemmas | 204 |
| abstract_inverted_index.method | 149 |
| abstract_inverted_index.prover | 166 |
| abstract_inverted_index.punish | 78 |
| abstract_inverted_index.solves | 174 |
| abstract_inverted_index.verify | 128 |
| abstract_inverted_index.Proving | 2 |
| abstract_inverted_index.Theorem | 1 |
| abstract_inverted_index.because | 62 |
| abstract_inverted_index.between | 26 |
| abstract_inverted_index.current | 63 |
| abstract_inverted_index.dataset | 199 |
| abstract_inverted_index.exceeds | 45 |
| abstract_inverted_index.lemmas, | 121 |
| abstract_inverted_index.model's | 135 |
| abstract_inverted_index.models: | 111 |
| abstract_inverted_index.modular | 131 |
| abstract_inverted_index.problem | 159 |
| abstract_inverted_index.propose | 93 |
| abstract_inverted_index.proving | 36 |
| abstract_inverted_index.release | 196 |
| abstract_inverted_index.remains | 25, 50 |
| abstract_inverted_index.studies | 39 |
| abstract_inverted_index.subgoal | 120 |
| abstract_inverted_index.success | 49 |
| abstract_inverted_index.tightly | 67 |
| abstract_inverted_index.towards | 183 |
| abstract_inverted_index.trained | 73 |
| abstract_inverted_index.Language | 15 |
| abstract_inverted_index.Reasoner | 115 |
| abstract_inverted_index.accuracy | 44 |
| abstract_inverted_index.approach | 106 |
| abstract_inverted_index.bypasses | 140 |
| abstract_inverted_index.coupling | 68 |
| abstract_inverted_index.diverse, | 118 |
| abstract_inverted_index.evaluate | 147 |
| abstract_inverted_index.generate | 117 |
| abstract_inverted_index.informal | 29, 43 |
| abstract_inverted_index.persists | 61 |
| abstract_inverted_index.pitfalls | 142 |
| abstract_inverted_index.powerful | 28 |
| abstract_inverted_index.provers, | 65 |
| abstract_inverted_index.proving, | 71 |
| abstract_inverted_index.reported | 168 |
| abstract_inverted_index.shallow, | 84 |
| abstract_inverted_index.success. | 169 |
| abstract_inverted_index.utilizes | 107 |
| abstract_inverted_index.verified | 203 |
| abstract_inverted_index.Automated | 0 |
| abstract_inverted_index.automated | 184 |
| abstract_inverted_index.available | 212 |
| abstract_inverted_index.challenge | 10 |
| abstract_inverted_index.decoupled | 171 |
| abstract_inverted_index.decouples | 98 |
| abstract_inverted_index.difficult | 188 |
| abstract_inverted_index.distinct, | 109 |
| abstract_inverted_index.efficient | 124 |
| abstract_inverted_index.framework | 96, 172 |
| abstract_inverted_index.generated | 201 |
| abstract_inverted_index.languages | 6 |
| abstract_inverted_index.liberates | 133 |
| abstract_inverted_index.low-level | 102 |
| abstract_inverted_index.paradigms | 75 |
| abstract_inverted_index.post-2000 | 155 |
| abstract_inverted_index.potential | 138 |
| abstract_inverted_index.powerful, | 113 |
| abstract_inverted_index.problems, | 157, 178, 211 |
| abstract_inverted_index.progress, | 21 |
| abstract_inverted_index.reasoning | 30, 69, 80, 100, 137, 185 |
| abstract_inverted_index.research, | 194 |
| abstract_inverted_index.strategic | 119 |
| abstract_inverted_index.training. | 145 |
| abstract_inverted_index.benchmarks | 54 |
| abstract_inverted_index.end-to-end | 144 |
| abstract_inverted_index.high-level | 99 |
| abstract_inverted_index.remarkable | 20 |
| abstract_inverted_index.rigorously | 127 |
| abstract_inverted_index.challenges. | 190 |
| abstract_inverted_index.challenging | 152 |
| abstract_inverted_index.fundamental | 90 |
| abstract_inverted_index.generation. | 104 |
| abstract_inverted_index.open-source | 165 |
| abstract_inverted_index.significant | 23, 181 |
| abstract_inverted_index.specialized | 110 |
| abstract_inverted_index.strategies. | 86 |
| abstract_inverted_index.PutnamBench. | 56 |
| abstract_inverted_index.capabilities | 31 |
| abstract_inverted_index.foundational | 9 |
| abstract_inverted_index.mathematical | 189 |
| abstract_inverted_index.performance. | 37 |
| abstract_inverted_index.successfully | 173 |
| abstract_inverted_index.tactic-based | 85 |
| abstract_inverted_index.demonstrating | 179 |
| abstract_inverted_index.exceptionally | 187 |
| abstract_inverted_index.inadvertently | 77 |
| abstract_inverted_index.general-purpose | 114 |
| abstract_inverted_index.state-of-the-art | 64 |
| abstract_inverted_index.https://tencent-imo.github.io/ | 214 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 7 |
| citation_normalized_percentile |