Code Digital Twin: Empowering LLMs with Tacit Knowledge for Complex Software Development
2025 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2510.16395
Recent advances in large language models (LLMs) have demonstrated strong capabilities in software engineering tasks, raising expectations of revolutionary productivity gains. However, enterprise software development is largely driven by incremental evolution, where challenges extend far beyond routine coding and depend critically on tacit knowledge, including design decisions at different levels and historical trade-offs. To achieve effective AI-powered support for complex software development, we should align emerging AI capabilities with the practical realities of enterprise development. To this end, we systematically identify challenges from both software and LLM perspectives. Alongside these challenges, we outline opportunities where AI and structured knowledge frameworks can enhance decision-making in tasks such as issue localization and impact analysis. To address these needs, we propose the Code Digital Twin, a living framework that models both the physical and conceptual layers of software, preserves tacit knowledge, and co-evolves with the codebase. By integrating hybrid knowledge representations, multi-stage extraction pipelines, incremental updates, LLM-empowered applications, and human-in-the-loop feedback, the Code Digital Twin transforms fragmented knowledge into explicit and actionable representations. Our vision positions it as a bridge between AI advancements and enterprise software realities, providing a concrete roadmap toward sustainable, intelligent, and resilient development and evolution of ultra-complex systems.
- Type: preprint
- Landing Page: http://arxiv.org/abs/2510.16395, https://arxiv.org/pdf/2510.16395
- OA Status: green
- OpenAlex ID: https://openalex.org/W4415953405
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4415953405 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2510.16395 (Digital Object Identifier)
- Title: Code Digital Twin: Empowering LLMs with Tacit Knowledge for Complex Software Development (Work title)
- Type: preprint (OpenAlex work type)
- Publication year: 2025 (Year of publication)
- Publication date: 2025-10-18 (Full publication date if available)
- Authors: Xin Peng, Chong Wang (List of authors in order)
- Landing page: https://arxiv.org/abs/2510.16395 (Publisher landing page)
- PDF URL: https://arxiv.org/pdf/2510.16395 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2510.16395 (Direct OA link when available)
- Cited by: 0 (Total citation count in OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4415953405 |
| doi | https://doi.org/10.48550/arxiv.2510.16395 |
| ids.doi | https://doi.org/10.48550/arxiv.2510.16395 |
| ids.openalex | https://openalex.org/W4415953405 |
| fwci | |
| type | preprint |
| title | Code Digital Twin: Empowering LLMs with Tacit Knowledge for Complex Software Development |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | |
| locations[0].id | pmh:oai:arXiv.org:2510.16395 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2510.16395 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2510.16395 |
| locations[1].id | doi:10.48550/arxiv.2510.16395 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2510.16395 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5101854992 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-3376-2581 |
| authorships[0].author.display_name | Xin Peng |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Peng, Xin |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5100329466 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-1424-6290 |
| authorships[1].author.display_name | Chong Wang |
| authorships[1].author_position | last |
| authorships[1].raw_author_name | Wang, Chong |
| authorships[1].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2510.16395 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-22T00:00:00 |
| display_name | Code Digital Twin: Empowering LLMs with Tacit Knowledge for Complex Software Development |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-07T23:20:04.922697 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2510.16395 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2510.16395 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2510.16395 |
| primary_location.id | pmh:oai:arXiv.org:2510.16395 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2510.16395 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2510.16395 |
| publication_date | 2025-10-18 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 122, 175, 185 |
| abstract_inverted_index.AI | 66, 95, 178 |
| abstract_inverted_index.By | 143 |
| abstract_inverted_index.To | 53, 75, 112 |
| abstract_inverted_index.as | 106, 174 |
| abstract_inverted_index.at | 47 |
| abstract_inverted_index.by | 28 |
| abstract_inverted_index.in | 2, 11, 103 |
| abstract_inverted_index.is | 25 |
| abstract_inverted_index.it | 173 |
| abstract_inverted_index.of | 17, 72, 133, 196 |
| abstract_inverted_index.on | 41 |
| abstract_inverted_index.we | 62, 78, 91, 116 |
| abstract_inverted_index.LLM | 86 |
| abstract_inverted_index.Our | 170 |
| abstract_inverted_index.and | 38, 50, 85, 96, 109, 130, 138, 155, 167, 180, 191, 194 |
| abstract_inverted_index.can | 100 |
| abstract_inverted_index.far | 34 |
| abstract_inverted_index.for | 58 |
| abstract_inverted_index.the | 69, 118, 128, 141, 158 |
| abstract_inverted_index.Code | 119, 159 |
| abstract_inverted_index.Twin | 161 |
| abstract_inverted_index.both | 83, 127 |
| abstract_inverted_index.end, | 77 |
| abstract_inverted_index.from | 82 |
| abstract_inverted_index.have | 7 |
| abstract_inverted_index.into | 165 |
| abstract_inverted_index.such | 105 |
| abstract_inverted_index.that | 125 |
| abstract_inverted_index.this | 76 |
| abstract_inverted_index.with | 68, 140 |
| abstract_inverted_index.Twin, | 121 |
| abstract_inverted_index.align | 64 |
| abstract_inverted_index.issue | 107 |
| abstract_inverted_index.large | 3 |
| abstract_inverted_index.tacit | 42, 136 |
| abstract_inverted_index.tasks | 104 |
| abstract_inverted_index.these | 89, 114 |
| abstract_inverted_index.where | 31, 94 |
| abstract_inverted_index.(LLMs) | 6 |
| abstract_inverted_index.Recent | 0 |
| abstract_inverted_index.beyond | 35 |
| abstract_inverted_index.bridge | 176 |
| abstract_inverted_index.coding | 37 |
| abstract_inverted_index.depend | 39 |
| abstract_inverted_index.design | 45 |
| abstract_inverted_index.driven | 27 |
| abstract_inverted_index.extend | 33 |
| abstract_inverted_index.gains. | 20 |
| abstract_inverted_index.hybrid | 145 |
| abstract_inverted_index.impact | 110 |
| abstract_inverted_index.layers | 132 |
| abstract_inverted_index.levels | 49 |
| abstract_inverted_index.living | 123 |
| abstract_inverted_index.models | 5, 126 |
| abstract_inverted_index.needs, | 115 |
| abstract_inverted_index.should | 63 |
| abstract_inverted_index.strong | 9 |
| abstract_inverted_index.tasks, | 14 |
| abstract_inverted_index.toward | 188 |
| abstract_inverted_index.vision | 171 |
| abstract_inverted_index.Digital | 120, 160 |
| abstract_inverted_index.achieve | 54 |
| abstract_inverted_index.address | 113 |
| abstract_inverted_index.between | 177 |
| abstract_inverted_index.complex | 59 |
| abstract_inverted_index.enhance | 101 |
| abstract_inverted_index.largely | 26 |
| abstract_inverted_index.outline | 92 |
| abstract_inverted_index.propose | 117 |
| abstract_inverted_index.raising | 15 |
| abstract_inverted_index.roadmap | 187 |
| abstract_inverted_index.routine | 36 |
| abstract_inverted_index.support | 57 |
| abstract_inverted_index.However, | 21 |
| abstract_inverted_index.advances | 1 |
| abstract_inverted_index.concrete | 186 |
| abstract_inverted_index.emerging | 65 |
| abstract_inverted_index.explicit | 166 |
| abstract_inverted_index.identify | 80 |
| abstract_inverted_index.language | 4 |
| abstract_inverted_index.physical | 129 |
| abstract_inverted_index.software | 12, 23, 60, 84, 182 |
| abstract_inverted_index.systems. | 198 |
| abstract_inverted_index.updates, | 152 |
| abstract_inverted_index.Alongside | 88 |
| abstract_inverted_index.analysis. | 111 |
| abstract_inverted_index.codebase. | 142 |
| abstract_inverted_index.decisions | 46 |
| abstract_inverted_index.different | 48 |
| abstract_inverted_index.effective | 55 |
| abstract_inverted_index.evolution | 195 |
| abstract_inverted_index.feedback, | 157 |
| abstract_inverted_index.framework | 124 |
| abstract_inverted_index.including | 44 |
| abstract_inverted_index.knowledge | 98, 146, 164 |
| abstract_inverted_index.positions | 172 |
| abstract_inverted_index.practical | 70 |
| abstract_inverted_index.preserves | 135 |
| abstract_inverted_index.providing | 184 |
| abstract_inverted_index.realities | 71 |
| abstract_inverted_index.resilient | 192 |
| abstract_inverted_index.software, | 134 |
| abstract_inverted_index.AI-powered | 56 |
| abstract_inverted_index.actionable | 168 |
| abstract_inverted_index.challenges | 32, 81 |
| abstract_inverted_index.co-evolves | 139 |
| abstract_inverted_index.conceptual | 131 |
| abstract_inverted_index.critically | 40 |
| abstract_inverted_index.enterprise | 22, 73, 181 |
| abstract_inverted_index.evolution, | 30 |
| abstract_inverted_index.extraction | 149 |
| abstract_inverted_index.fragmented | 163 |
| abstract_inverted_index.frameworks | 99 |
| abstract_inverted_index.historical | 51 |
| abstract_inverted_index.knowledge, | 43, 137 |
| abstract_inverted_index.pipelines, | 150 |
| abstract_inverted_index.realities, | 183 |
| abstract_inverted_index.structured | 97 |
| abstract_inverted_index.transforms | 162 |
| abstract_inverted_index.challenges, | 90 |
| abstract_inverted_index.development | 24, 193 |
| abstract_inverted_index.engineering | 13 |
| abstract_inverted_index.incremental | 29, 151 |
| abstract_inverted_index.integrating | 144 |
| abstract_inverted_index.multi-stage | 148 |
| abstract_inverted_index.trade-offs. | 52 |
| abstract_inverted_index.advancements | 179 |
| abstract_inverted_index.capabilities | 10, 67 |
| abstract_inverted_index.demonstrated | 8 |
| abstract_inverted_index.development, | 61 |
| abstract_inverted_index.development. | 74 |
| abstract_inverted_index.expectations | 16 |
| abstract_inverted_index.intelligent, | 190 |
| abstract_inverted_index.localization | 108 |
| abstract_inverted_index.productivity | 19 |
| abstract_inverted_index.sustainable, | 189 |
| abstract_inverted_index.LLM-empowered | 153 |
| abstract_inverted_index.applications, | 154 |
| abstract_inverted_index.opportunities | 93 |
| abstract_inverted_index.perspectives. | 87 |
| abstract_inverted_index.revolutionary | 18 |
| abstract_inverted_index.ultra-complex | 197 |
| abstract_inverted_index.systematically | 79 |
| abstract_inverted_index.decision-making | 102 |
| abstract_inverted_index.representations, | 147 |
| abstract_inverted_index.representations. | 169 |
| abstract_inverted_index.human-in-the-loop | 156 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 2 |
| citation_normalized_percentile |
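
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each token to the zero-based word positions where it occurs. The following is a minimal sketch of how the running text can be recovered from such a map, assuming the index has already been loaded into a Python dict keyed by token. The function name `rebuild_abstract` and the `sample` dict are illustrative and not part of OpenAlex; the sample covers only positions 0 through 7 of the rows listed in the table.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild running text from a token -> positions mapping."""
    # Flatten the mapping into (position, token) pairs.
    placed = [
        (position, token)
        for token, positions in inverted_index.items()
        for position in positions
    ]
    # Sorting by position restores the original word order.
    placed.sort()
    return " ".join(token for _, token in placed)


# Hypothetical sample: only the tokens at positions 0-7 from the payload table.
sample = {
    "Recent": [0],
    "advances": [1],
    "in": [2],
    "large": [3],
    "language": [4],
    "models": [5],
    "(LLMs)": [6],
    "have": [7],
}

print(rebuild_abstract(sample))
# prints "Recent advances in large language models (LLMs) have"
```

Tokens keep their attached punctuation and capitalization (for example `Twin,` and `Twin` are separate keys), so joining on single spaces is enough to reproduce the abstract as it appears at the top of the record.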