Bottom-up Domain-specific Superintelligence: A Reliable Knowledge Graph is What We Need
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2507.13966
Language models traditionally used for cross-domain generalization have recently demonstrated task-specific reasoning. However, their top-down training approach on general corpora is insufficient for acquiring abstractions needed for deep domain expertise. This may require a bottom-up approach that acquires expertise by learning to compose simple domain concepts into more complex ones. A knowledge graph (KG) provides this compositional structure, where domain primitives are represented as head-relation-tail edges and their paths encode higher-level concepts. We present a task generation pipeline that synthesizes tasks directly from KG primitives, enabling models to acquire and compose them for reasoning. We fine-tune language models on the resultant KG-grounded curriculum to demonstrate domain-specific superintelligence. While broadly applicable, we validate our approach in medicine, where reliable KGs exist. Using a medical KG, we curate 24,000 reasoning tasks paired with thinking traces derived from diverse medical primitives. We fine-tune the QwQ-32B model on this curriculum to obtain QwQ-Med-3 that takes a step towards medical superintelligence. We also introduce ICD-Bench, an evaluation suite to quantify reasoning abilities across 15 medical domains. Our experiments demonstrate that QwQ-Med-3 significantly outperforms state-of-the-art reasoning models on ICD-Bench categories. Further analysis reveals that QwQ-Med-3 utilizes acquired primitives to widen the performance gap on the hardest tasks of ICD-Bench. Finally, evaluation on medical question-answer benchmarks shows that QwQ-Med-3 transfers acquired expertise to enhance the base model's performance. While the industry's approach to artificial general intelligence (AGI) emphasizes broad expertise, we envision a future in which AGI emerges from the composable interaction of efficient domain-specific superintelligent agents.
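To make the compositional idea concrete, here is a minimal illustrative sketch, not the authors' actual pipeline: a toy knowledge graph is stored as (head, relation, tail) triples, a random walk samples a multi-hop path, and the path's primitives are composed into a single reasoning-task prompt. The entity names, relation labels, and prompt format are all hypothetical.

```python
# Illustrative sketch (hypothetical names): sample a multi-hop path from a toy
# KG of (head, relation, tail) triples and compose it into a task prompt.
import random

# Toy medical KG: each primitive is one head-relation-tail edge.
edges = [
    ("metformin", "treats", "type 2 diabetes"),
    ("type 2 diabetes", "increases_risk_of", "diabetic nephropathy"),
    ("diabetic nephropathy", "is_a", "chronic kidney disease"),
]

# Adjacency index: head -> list of (relation, tail).
graph = {}
for head, rel, tail in edges:
    graph.setdefault(head, []).append((rel, tail))

def sample_path(start, hops):
    """Random walk of up to `hops` edges starting at `start`."""
    path, node = [], start
    for _ in range(hops):
        if node not in graph:
            break
        rel, nxt = random.choice(graph[node])
        path.append((node, rel, nxt))
        node = nxt
    return path

def path_to_task(path):
    """Compose the path's primitives into one multi-hop question."""
    start, end = path[0][0], path[-1][2]
    facts = "; ".join(f"{h} --{r}--> {t}" for h, r, t in path)
    return f"Given the primitives [{facts}], explain how {start} relates to {end}."

print(path_to_task(sample_path("metformin", hops=3)))
```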
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2507.13966
- PDF: https://arxiv.org/pdf/2507.13966
- OA Status: green
- OpenAlex ID: https://openalex.org/W4416167863
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4416167863 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2507.13966 (Digital Object Identifier)
- Title: Bottom-up Domain-specific Superintelligence: A Reliable Knowledge Graph is What We Need (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-07-18 (full publication date if available)
- Authors: Bhishma Dedhia, Yuval Kansal, Niraj K. Jha (list of authors in order)
- Landing page: https://arxiv.org/abs/2507.13966 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2507.13966 (direct link to full-text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2507.13966 (direct OA link when available)
- Cited by: 0 (total citation count in OpenAlex)
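The raw record shown below can be retrieved directly from the public OpenAlex REST API. A minimal sketch, assuming only the `requests` package, that fetches this work by its OpenAlex ID and prints a few of the fields listed on this page:

```python
# Fetch the OpenAlex JSON record for this work and print selected fields.
import requests

OPENALEX_ID = "W4416167863"
resp = requests.get(f"https://api.openalex.org/works/{OPENALEX_ID}", timeout=30)
resp.raise_for_status()
work = resp.json()

print(work["display_name"])              # work title
print(work["doi"])                       # DOI URL
print(work["open_access"]["oa_status"])  # e.g. "green"
print(work["cited_by_count"])            # citation count
```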
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4416167863 |
| doi | https://doi.org/10.48550/arxiv.2507.13966 |
| ids.doi | https://doi.org/10.48550/arxiv.2507.13966 |
| ids.openalex | https://openalex.org/W4416167863 |
| fwci | |
| type | preprint |
| title | Bottom-up Domain-specific Superintelligence: A Reliable Knowledge Graph is What We Need |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2507.13966 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2507.13966 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2507.13966 |
| locations[1].id | doi:10.48550/arxiv.2507.13966 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2507.13966 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5042238232 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-8260-282X |
| authorships[0].author.display_name | Bhishma Dedhia |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Dedhia, Bhishma |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5120494978 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Yuval Kansal |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Kansal, Yuval |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5086131079 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-1539-0369 |
| authorships[2].author.display_name | Niraj K. Jha |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Jha, Niraj K. |
| authorships[2].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2507.13966 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Bottom-up Domain-specific Superintelligence: A Reliable Knowledge Graph is What We Need |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-28T06:51:13.909132 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2507.13966 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2507.13966 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2507.13966 |
| primary_location.id | pmh:oai:arXiv.org:2507.13966 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2507.13966 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2507.13966 |
| publication_date | 2025-07-18 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index | (word-to-position index of the abstract; the full abstract text appears above) |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 3 |
| citation_normalized_percentile |
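OpenAlex stores the abstract as an inverted index: a mapping from each word to the zero-based positions where it occurs. A minimal sketch of reconstructing plain text from such an index (function and variable names are illustrative; a real record's index yields the abstract shown earlier on this page):

```python
# Rebuild an abstract string from OpenAlex's abstract_inverted_index,
# which maps each word to the list of positions where it occurs.
def reconstruct_abstract(inverted_index: dict[str, list[int]]) -> str:
    positions = {}
    for word, idxs in inverted_index.items():
        for i in idxs:
            positions[i] = word
    return " ".join(positions[i] for i in sorted(positions))

# Tiny example index; applying this to the full field reconstructs the abstract.
sample = {"Language": [0], "models": [1], "reason.": [2]}
print(reconstruct_abstract(sample))  # -> "Language models reason."
```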