Code Digital Twin: Empowering LLMs with Tacit Knowledge for Complex Software Development
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2503.07967
Recent advances in large language models (LLMs) have demonstrated strong capabilities in software engineering tasks, raising expectations of revolutionary productivity gains. However, enterprise software development is largely driven by incremental evolution, where challenges extend far beyond routine coding and depend critically on tacit knowledge, including design decisions at different levels and historical trade-offs. To achieve effective AI-powered support for complex software development, we should align emerging AI capabilities with the practical realities of enterprise development. To this end, we systematically identify challenges from both software and LLM perspectives. Alongside these challenges, we outline opportunities where AI and structured knowledge frameworks can enhance decision-making in tasks such as issue localization and impact analysis. To address these needs, we propose the Code Digital Twin, a living framework that models both the physical and conceptual layers of software, preserves tacit knowledge, and co-evolves with the codebase. By integrating hybrid knowledge representations, multi-stage extraction pipelines, incremental updates, LLM-empowered applications, and human-in-the-loop feedback, the Code Digital Twin transforms fragmented knowledge into explicit and actionable representations. Our vision positions it as a bridge between AI advancements and enterprise software realities, providing a concrete roadmap toward sustainable, intelligent, and resilient development and evolution of ultra-complex systems.
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2503.07967, https://arxiv.org/pdf/2503.07967
- OA Status: green
- Cited By: 1
- OpenAlex ID: https://openalex.org/W4414566824
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4414566824 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2503.07967 (Digital Object Identifier)
- Title: Code Digital Twin: Empowering LLMs with Tacit Knowledge for Complex Software Development (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-03-11 (full publication date if available)
- Authors: Xin Peng, Chong Wang, Mingwei Liu, Yiling Lou, Yijian Wu (list of authors in order)
- Landing page: https://arxiv.org/abs/2503.07967 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2503.07967 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2503.07967 (direct OA link when available)
- Cited by: 1 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 1 (per-year citation counts, last 5 years)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4414566824 |
| doi | https://doi.org/10.48550/arxiv.2503.07967 |
| ids.doi | https://doi.org/10.48550/arxiv.2503.07967 |
| ids.openalex | https://openalex.org/W4414566824 |
| fwci | |
| type | preprint |
| title | Code Digital Twin: Empowering LLMs with Tacit Knowledge for Complex Software Development |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10538 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.8626999855041504 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1710 |
| topics[0].subfield.display_name | Information Systems |
| topics[0].display_name | Data Mining Algorithms and Applications |
| topics[1].id | https://openalex.org/T11891 |
| topics[1].field.id | https://openalex.org/fields/14 |
| topics[1].field.display_name | Business, Management and Accounting |
| topics[1].score | 0.8345999717712402 |
| topics[1].domain.id | https://openalex.org/domains/2 |
| topics[1].domain.display_name | Social Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1404 |
| topics[1].subfield.display_name | Management Information Systems |
| topics[1].display_name | Big Data and Business Intelligence |
| topics[2].id | https://openalex.org/T11159 |
| topics[2].field.id | https://openalex.org/fields/22 |
| topics[2].field.display_name | Engineering |
| topics[2].score | 0.8019000291824341 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/2209 |
| topics[2].subfield.display_name | Industrial and Manufacturing Engineering |
| topics[2].display_name | Manufacturing Process and Optimization |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2503.07967 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2503.07967 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2503.07967 |
| locations[1].id | doi:10.48550/arxiv.2503.07967 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2503.07967 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5101854992 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-3376-2581 |
| authorships[0].author.display_name | Xin Peng |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Peng, Xin |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5100329466 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-1424-6290 |
| authorships[1].author.display_name | Chong Wang |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Wang, Chong |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5102154339 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Mingwei Liu |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Liu, Mingwei |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5024354460 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-4066-3365 |
| authorships[3].author.display_name | Yiling Lou |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Lou, Yiling |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5006030692 |
| authorships[4].author.orcid | https://orcid.org/0000-0001-9290-2068 |
| authorships[4].author.display_name | Yijian Wu |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Wu, Yijian |
| authorships[4].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2503.07967 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-09-27T00:00:00 |
| display_name | Code Digital Twin: Empowering LLMs with Tacit Knowledge for Complex Software Development |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10538 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.8626999855041504 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1710 |
| primary_topic.subfield.display_name | Information Systems |
| primary_topic.display_name | Data Mining Algorithms and Applications |
| cited_by_count | 1 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2503.07967 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2503.07967 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2503.07967 |
| primary_location.id | pmh:oai:arXiv.org:2503.07967 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2503.07967 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2503.07967 |
| publication_date | 2025-03-11 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 122, 175, 185 |
| abstract_inverted_index.AI | 66, 95, 178 |
| abstract_inverted_index.By | 143 |
| abstract_inverted_index.To | 53, 75, 112 |
| abstract_inverted_index.as | 106, 174 |
| abstract_inverted_index.at | 47 |
| abstract_inverted_index.by | 28 |
| abstract_inverted_index.in | 2, 11, 103 |
| abstract_inverted_index.is | 25 |
| abstract_inverted_index.it | 173 |
| abstract_inverted_index.of | 17, 72, 133, 196 |
| abstract_inverted_index.on | 41 |
| abstract_inverted_index.we | 62, 78, 91, 116 |
| abstract_inverted_index.LLM | 86 |
| abstract_inverted_index.Our | 170 |
| abstract_inverted_index.and | 38, 50, 85, 96, 109, 130, 138, 155, 167, 180, 191, 194 |
| abstract_inverted_index.can | 100 |
| abstract_inverted_index.far | 34 |
| abstract_inverted_index.for | 58 |
| abstract_inverted_index.the | 69, 118, 128, 141, 158 |
| abstract_inverted_index.Code | 119, 159 |
| abstract_inverted_index.Twin | 161 |
| abstract_inverted_index.both | 83, 127 |
| abstract_inverted_index.end, | 77 |
| abstract_inverted_index.from | 82 |
| abstract_inverted_index.have | 7 |
| abstract_inverted_index.into | 165 |
| abstract_inverted_index.such | 105 |
| abstract_inverted_index.that | 125 |
| abstract_inverted_index.this | 76 |
| abstract_inverted_index.with | 68, 140 |
| abstract_inverted_index.Twin, | 121 |
| abstract_inverted_index.align | 64 |
| abstract_inverted_index.issue | 107 |
| abstract_inverted_index.large | 3 |
| abstract_inverted_index.tacit | 42, 136 |
| abstract_inverted_index.tasks | 104 |
| abstract_inverted_index.these | 89, 114 |
| abstract_inverted_index.where | 31, 94 |
| abstract_inverted_index.(LLMs) | 6 |
| abstract_inverted_index.Recent | 0 |
| abstract_inverted_index.beyond | 35 |
| abstract_inverted_index.bridge | 176 |
| abstract_inverted_index.coding | 37 |
| abstract_inverted_index.depend | 39 |
| abstract_inverted_index.design | 45 |
| abstract_inverted_index.driven | 27 |
| abstract_inverted_index.extend | 33 |
| abstract_inverted_index.gains. | 20 |
| abstract_inverted_index.hybrid | 145 |
| abstract_inverted_index.impact | 110 |
| abstract_inverted_index.layers | 132 |
| abstract_inverted_index.levels | 49 |
| abstract_inverted_index.living | 123 |
| abstract_inverted_index.models | 5, 126 |
| abstract_inverted_index.needs, | 115 |
| abstract_inverted_index.should | 63 |
| abstract_inverted_index.strong | 9 |
| abstract_inverted_index.tasks, | 14 |
| abstract_inverted_index.toward | 188 |
| abstract_inverted_index.vision | 171 |
| abstract_inverted_index.Digital | 120, 160 |
| abstract_inverted_index.achieve | 54 |
| abstract_inverted_index.address | 113 |
| abstract_inverted_index.between | 177 |
| abstract_inverted_index.complex | 59 |
| abstract_inverted_index.enhance | 101 |
| abstract_inverted_index.largely | 26 |
| abstract_inverted_index.outline | 92 |
| abstract_inverted_index.propose | 117 |
| abstract_inverted_index.raising | 15 |
| abstract_inverted_index.roadmap | 187 |
| abstract_inverted_index.routine | 36 |
| abstract_inverted_index.support | 57 |
| abstract_inverted_index.However, | 21 |
| abstract_inverted_index.advances | 1 |
| abstract_inverted_index.concrete | 186 |
| abstract_inverted_index.emerging | 65 |
| abstract_inverted_index.explicit | 166 |
| abstract_inverted_index.identify | 80 |
| abstract_inverted_index.language | 4 |
| abstract_inverted_index.physical | 129 |
| abstract_inverted_index.software | 12, 23, 60, 84, 182 |
| abstract_inverted_index.systems. | 198 |
| abstract_inverted_index.updates, | 152 |
| abstract_inverted_index.Alongside | 88 |
| abstract_inverted_index.analysis. | 111 |
| abstract_inverted_index.codebase. | 142 |
| abstract_inverted_index.decisions | 46 |
| abstract_inverted_index.different | 48 |
| abstract_inverted_index.effective | 55 |
| abstract_inverted_index.evolution | 195 |
| abstract_inverted_index.feedback, | 157 |
| abstract_inverted_index.framework | 124 |
| abstract_inverted_index.including | 44 |
| abstract_inverted_index.knowledge | 98, 146, 164 |
| abstract_inverted_index.positions | 172 |
| abstract_inverted_index.practical | 70 |
| abstract_inverted_index.preserves | 135 |
| abstract_inverted_index.providing | 184 |
| abstract_inverted_index.realities | 71 |
| abstract_inverted_index.resilient | 192 |
| abstract_inverted_index.software, | 134 |
| abstract_inverted_index.AI-powered | 56 |
| abstract_inverted_index.actionable | 168 |
| abstract_inverted_index.challenges | 32, 81 |
| abstract_inverted_index.co-evolves | 139 |
| abstract_inverted_index.conceptual | 131 |
| abstract_inverted_index.critically | 40 |
| abstract_inverted_index.enterprise | 22, 73, 181 |
| abstract_inverted_index.evolution, | 30 |
| abstract_inverted_index.extraction | 149 |
| abstract_inverted_index.fragmented | 163 |
| abstract_inverted_index.frameworks | 99 |
| abstract_inverted_index.historical | 51 |
| abstract_inverted_index.knowledge, | 43, 137 |
| abstract_inverted_index.pipelines, | 150 |
| abstract_inverted_index.realities, | 183 |
| abstract_inverted_index.structured | 97 |
| abstract_inverted_index.transforms | 162 |
| abstract_inverted_index.challenges, | 90 |
| abstract_inverted_index.development | 24, 193 |
| abstract_inverted_index.engineering | 13 |
| abstract_inverted_index.incremental | 29, 151 |
| abstract_inverted_index.integrating | 144 |
| abstract_inverted_index.multi-stage | 148 |
| abstract_inverted_index.trade-offs. | 52 |
| abstract_inverted_index.advancements | 179 |
| abstract_inverted_index.capabilities | 10, 67 |
| abstract_inverted_index.demonstrated | 8 |
| abstract_inverted_index.development, | 61 |
| abstract_inverted_index.development. | 74 |
| abstract_inverted_index.expectations | 16 |
| abstract_inverted_index.intelligent, | 190 |
| abstract_inverted_index.localization | 108 |
| abstract_inverted_index.productivity | 19 |
| abstract_inverted_index.sustainable, | 189 |
| abstract_inverted_index.LLM-empowered | 153 |
| abstract_inverted_index.applications, | 154 |
| abstract_inverted_index.opportunities | 93 |
| abstract_inverted_index.perspectives. | 87 |
| abstract_inverted_index.revolutionary | 18 |
| abstract_inverted_index.ultra-complex | 197 |
| abstract_inverted_index.systematically | 79 |
| abstract_inverted_index.decision-making | 102 |
| abstract_inverted_index.representations, | 147 |
| abstract_inverted_index.representations. | 169 |
| abstract_inverted_index.human-in-the-loop | 156 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile |
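
The `abstract_inverted_index.*` rows in the payload store the abstract as a mapping from each token to the word positions where it occurs, so the running text can be recovered by sorting tokens by position. The following is a minimal sketch of that reconstruction, not an official OpenAlex utility: the function name `rebuild_abstract` and the `sample` dictionary are illustrative assumptions, and `sample` holds only the entries for positions 0 to 13 copied from the rows above.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild running text from a token -> word-positions mapping."""
    # Flatten the mapping into (position, token) pairs.
    positioned = [
        (pos, token)
        for token, positions in inverted_index.items()
        for pos in positions
    ]
    # Sorting by position restores the original word order.
    positioned.sort()
    return " ".join(token for _, token in positioned)


# Hypothetical sample: a subset of the payload rows, limited to positions 0-13.
sample = {
    "Recent": [0],
    "advances": [1],
    "in": [2, 11],
    "large": [3],
    "language": [4],
    "models": [5],
    "(LLMs)": [6],
    "have": [7],
    "demonstrated": [8],
    "strong": [9],
    "capabilities": [10],
    "software": [12],
    "engineering": [13],
}

opening = rebuild_abstract(sample)
print(opening)
```

Applied to the full set of `abstract_inverted_index` rows, the same function should yield the abstract shown at the top of this record, provided every position from 0 to 198 is present in the index.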