Bottom-up Domain-specific Superintelligence: A Reliable Knowledge Graph is What We Need
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2507.13966
Language models traditionally used for cross-domain generalization have recently demonstrated task-specific reasoning. However, their top-down training approach on general corpora is insufficient for acquiring abstractions needed for deep domain expertise. This may require a bottom-up approach that acquires expertise by learning to compose simple domain concepts into more complex ones. A knowledge graph (KG) provides this compositional structure, where domain primitives are represented as head-relation-tail edges and their paths encode higher-level concepts. We present a task generation pipeline that synthesizes tasks directly from KG primitives, enabling models to acquire and compose them for reasoning. We fine-tune language models on the resultant KG-grounded curriculum to demonstrate domain-specific superintelligence. While broadly applicable, we validate our approach in medicine, where reliable KGs exist. Using a medical KG, we curate 24,000 reasoning tasks paired with thinking traces derived from diverse medical primitives. We fine-tune the QwQ-32B model on this curriculum to obtain QwQ-Med-3 that takes a step towards medical superintelligence. We also introduce ICD-Bench, an evaluation suite to quantify reasoning abilities across 15 medical domains. Our experiments demonstrate that QwQ-Med-3 significantly outperforms state-of-the-art reasoning models on ICD-Bench categories. Further analysis reveals that QwQ-Med-3 utilizes acquired primitives to widen the performance gap on the hardest tasks of ICD-Bench. Finally, evaluation on medical question-answer benchmarks shows that QwQ-Med-3 transfers acquired expertise to enhance the base model's performance. While the industry's approach to artificial general intelligence (AGI) emphasizes broad expertise, we envision a future in which AGI emerges from the composable interaction of efficient domain-specific superintelligent agents.
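To make the compositional idea concrete, here is a minimal illustrative sketch, not the authors' actual pipeline: a toy knowledge graph is stored as (head, relation, tail) triples, a random walk samples a multi-hop path, and the path's primitives are composed into a single reasoning-task prompt. The entity names, relation labels, and prompt format are all hypothetical.

```python
# Illustrative sketch (hypothetical names): sample a multi-hop path from a toy
# KG of (head, relation, tail) triples and compose it into a task prompt.
import random

# Toy medical KG: each primitive is one head-relation-tail edge.
edges = [
    ("metformin", "treats", "type 2 diabetes"),
    ("type 2 diabetes", "increases_risk_of", "diabetic nephropathy"),
    ("diabetic nephropathy", "is_a", "chronic kidney disease"),
]

# Adjacency index: head -> list of (relation, tail).
graph = {}
for head, rel, tail in edges:
    graph.setdefault(head, []).append((rel, tail))

def sample_path(start, hops):
    """Random walk of up to `hops` edges starting at `start`."""
    path, node = [], start
    for _ in range(hops):
        if node not in graph:
            break
        rel, nxt = random.choice(graph[node])
        path.append((node, rel, nxt))
        node = nxt
    return path

def path_to_task(path):
    """Compose the path's primitives into one multi-hop question."""
    start, end = path[0][0], path[-1][2]
    facts = "; ".join(f"{h} --{r}--> {t}" for h, r, t in path)
    return f"Given the primitives [{facts}], explain how {start} relates to {end}."

print(path_to_task(sample_path("metformin", hops=3)))
```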
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2507.13966
- PDF: https://arxiv.org/pdf/2507.13966
- OA Status: green
- OpenAlex ID: https://openalex.org/W4416167863
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4416167863 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2507.13966 (Digital Object Identifier)
- Title: Bottom-up Domain-specific Superintelligence: A Reliable Knowledge Graph is What We Need (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-07-18 (full publication date if available)
- Authors: Bhishma Dedhia, Yuval Kansal, Niraj K. Jha (list of authors in order)
- Landing page: https://arxiv.org/abs/2507.13966 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2507.13966 (direct link to full-text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2507.13966 (direct OA link when available)
- Cited by: 0 (total citation count in OpenAlex)
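The raw record shown below can be retrieved directly from the public OpenAlex REST API. A minimal sketch, assuming only the `requests` package, that fetches this work by its OpenAlex ID and prints a few of the fields listed on this page:

```python
# Fetch the OpenAlex JSON record for this work and print selected fields.
import requests

OPENALEX_ID = "W4416167863"
resp = requests.get(f"https://api.openalex.org/works/{OPENALEX_ID}", timeout=30)
resp.raise_for_status()
work = resp.json()

print(work["display_name"])              # work title
print(work["doi"])                       # DOI URL
print(work["open_access"]["oa_status"])  # e.g. "green"
print(work["cited_by_count"])            # citation count
```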
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4416167863 |
| doi | https://doi.org/10.48550/arxiv.2507.13966 |
| ids.doi | https://doi.org/10.48550/arxiv.2507.13966 |
| ids.openalex | https://openalex.org/W4416167863 |
| fwci | |
| type | preprint |
| title | Bottom-up Domain-specific Superintelligence: A Reliable Knowledge Graph is What We Need |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2507.13966 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2507.13966 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2507.13966 |
| locations[1].id | doi:10.48550/arxiv.2507.13966 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2507.13966 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5042238232 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-8260-282X |
| authorships[0].author.display_name | Bhishma Dedhia |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Dedhia, Bhishma |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5120494978 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Yuval Kansal |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Kansal, Yuval |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5086131079 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-1539-0369 |
| authorships[2].author.display_name | Niraj K. Jha |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Jha, Niraj K. |
| authorships[2].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2507.13966 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Bottom-up Domain-specific Superintelligence: A Reliable Knowledge Graph is What We Need |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-28T06:51:13.909132 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2507.13966 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2507.13966 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2507.13966 |
| primary_location.id | pmh:oai:arXiv.org:2507.13966 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2507.13966 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2507.13966 |
| publication_date | 2025-07-18 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index | (word-to-position index of the abstract; the full abstract text appears above) |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 3 |
| citation_normalized_percentile |
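OpenAlex stores the abstract as an inverted index: a mapping from each word to the zero-based positions where it occurs. A minimal sketch of reconstructing plain text from such an index (function and variable names are illustrative; a real record's index yields the abstract shown earlier on this page):

```python
# Rebuild an abstract string from OpenAlex's abstract_inverted_index,
# which maps each word to the list of positions where it occurs.
def reconstruct_abstract(inverted_index: dict[str, list[int]]) -> str:
    positions = {}
    for word, idxs in inverted_index.items():
        for i in idxs:
            positions[i] = word
    return " ".join(positions[i] for i in sorted(positions))

# Tiny example index; applying this to the full field reconstructs the abstract.
sample = {"Language": [0], "models": [1], "reason.": [2]}
print(reconstruct_abstract(sample))  # -> "Language models reason."
```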