Framework-Aware Code Generation with API Knowledge Graph-Constructed Data: A Study on HarmonyOS
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2512.00380
In the context of software frameworks with limited resources (such as HarmonyOS), large language models (LLMs) often exhibit poor code generation performance because they lack sufficient exposure to such environments during pre-training. Although LLMs can usually maintain correct logical structures across programming languages, they frequently struggle when dealing with framework-specific APIs or syntax, resulting in errors. This indicates that while pre-training equips LLMs with general algorithmic capabilities, they remain unfamiliar with the distinctive syntax and API usage of underrepresented frameworks. As a result, even advanced commercial models like GPT-4o cannot reliably generate correct code without prior adaptation. To address this issue, we propose APIKG4SYN, a framework designed to exploit API knowledge graphs for the construction of API-oriented question-code pairs, specifically tailored for low-resource frameworks without requiring executable code. APIKG4SYN integrates both single-API and multi-API knowledge, where the latter is derived through uncertainty estimation (UE)-driven Monte Carlo Tree Search (MCTS), enabling the creation of a diverse and informative dataset for fine-tuning LLMs. Using HarmonyOS as a case study, we build the first benchmark for HarmonyOS code generation. Experimental results show that fine-tuning Qwen with APIKG4SYN raises pass@1 accuracy to 25.00%, compared with 17.59% for the baseline GPT model. These results confirm that API-oriented data significantly enhance LLM performance in low-resource software development scenarios.
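The abstract reports results as pass@1 but does not say how the metric is computed. The sketch below assumes the widely used unbiased pass@k estimator (n samples per problem, c of them passing all tests), which reduces to c/n for k = 1. The function names `pass_at_k` and `benchmark_pass_at_k` are illustrative and not taken from the paper. The last lines only restate the gap between the two figures quoted in the abstract.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that pass all tests, k: budget.
    Assumed metric definition; the abstract does not spell it out.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: any draw of k contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int = 1) -> float:
    """Average pass@k over a benchmark given (n, c) per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Gap between the two pass@1 figures quoted in the abstract.
fine_tuned, baseline = 25.00, 17.59
absolute_gain = fine_tuned - baseline      # about 7.41 percentage points
relative_gain = absolute_gain / baseline   # roughly 0.42, i.e. about 42%
```

With one sample per problem (n = 1, k = 1), the benchmark score is simply the fraction of problems whose single generated solution passes.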
- Type: preprint
- Landing Page: http://arxiv.org/abs/2512.00380, https://arxiv.org/pdf/2512.00380
- OA Status: green
- OpenAlex ID: https://openalex.org/W4416936391
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4416936391 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2512.00380 (Digital Object Identifier)
- Title: Framework-Aware Code Generation with API Knowledge Graph-Constructed Data: A Study on HarmonyOS (work title)
- Type: preprint (OpenAlex work type)
- Publication year: 2025 (year of publication)
- Publication date: 2025-11-29 (full publication date if available)
- Authors: Zheng Pei, Xin Peng (list of authors in order)
- Landing page: https://arxiv.org/abs/2512.00380 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2512.00380 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2512.00380 (direct OA link when available)
- Cited by: 0 (total citation count in OpenAlex)
Full payload
| id | https://openalex.org/W4416936391 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2512.00380 |
| ids.doi | https://doi.org/10.48550/arxiv.2512.00380 |
| ids.openalex | https://openalex.org/W4416936391 |
| fwci | |
| type | preprint |
| title | Framework-Aware Code Generation with API Knowledge Graph-Constructed Data: A Study on HarmonyOS |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | |
| locations[0].id | pmh:oai:arXiv.org:2512.00380 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2512.00380 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2512.00380 |
| locations[1].id | doi:10.48550/arxiv.2512.00380 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2512.00380 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5046280013 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-5757-6607 |
| authorships[0].author.display_name | Zheng Pei |
| authorships[0].author_position | last |
| authorships[0].raw_author_name | Pei, Zheng |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5101854992 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-3376-2581 |
| authorships[1].author.display_name | Xin Peng |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Peng, Xin |
| authorships[1].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2512.00380 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-12-03T00:00:00 |
| display_name | Framework-Aware Code Generation with API Knowledge Graph-Constructed Data: A Study on HarmonyOS |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-12-03T23:12:59.920255 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2512.00380 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2512.00380 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2512.00380 |
| primary_location.id | pmh:oai:arXiv.org:2512.00380 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2512.00380 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2512.00380 |
| publication_date | 2025-11-29 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 81, 104, 153, 164 |
| abstract_inverted_index.As | 80 |
| abstract_inverted_index.In | 0 |
| abstract_inverted_index.To | 97 |
| abstract_inverted_index.as | 10, 163 |
| abstract_inverted_index.in | 54, 207 |
| abstract_inverted_index.is | 138 |
| abstract_inverted_index.of | 3, 77, 115, 152 |
| abstract_inverted_index.or | 51 |
| abstract_inverted_index.to | 27, 107, 187 |
| abstract_inverted_index.we | 101, 167 |
| abstract_inverted_index.API | 75, 109 |
| abstract_inverted_index.GPT | 195 |
| abstract_inverted_index.LLM | 205 |
| abstract_inverted_index.and | 74, 132, 155 |
| abstract_inverted_index.can | 34 |
| abstract_inverted_index.for | 112, 121, 158, 172, 192 |
| abstract_inverted_index.the | 1, 71, 113, 136, 150, 169, 193 |
| abstract_inverted_index.APIs | 50 |
| abstract_inverted_index.LLMs | 33, 62 |
| abstract_inverted_index.Qwen | 181 |
| abstract_inverted_index.This | 56 |
| abstract_inverted_index.Tree | 146 |
| abstract_inverted_index.both | 130 |
| abstract_inverted_index.case | 165 |
| abstract_inverted_index.code | 19, 93, 174 |
| abstract_inverted_index.data | 202 |
| abstract_inverted_index.even | 83 |
| abstract_inverted_index.lack | 24 |
| abstract_inverted_index.like | 87 |
| abstract_inverted_index.poor | 18 |
| abstract_inverted_index.show | 178 |
| abstract_inverted_index.such | 28 |
| abstract_inverted_index.that | 58, 179, 200 |
| abstract_inverted_index.they | 23, 43, 67 |
| abstract_inverted_index.this | 99 |
| abstract_inverted_index.when | 46 |
| abstract_inverted_index.with | 6, 48, 63, 70, 182, 190 |
| abstract_inverted_index.(such | 9 |
| abstract_inverted_index.Carlo | 145 |
| abstract_inverted_index.LLMs. | 160 |
| abstract_inverted_index.Monte | 144 |
| abstract_inverted_index.These | 197 |
| abstract_inverted_index.Using | 161 |
| abstract_inverted_index.build | 168 |
| abstract_inverted_index.code. | 127 |
| abstract_inverted_index.first | 170 |
| abstract_inverted_index.large | 12 |
| abstract_inverted_index.often | 16 |
| abstract_inverted_index.prior | 95 |
| abstract_inverted_index.usage | 76 |
| abstract_inverted_index.where | 135 |
| abstract_inverted_index.while | 59 |
| abstract_inverted_index.(LLMs) | 15 |
| abstract_inverted_index.17.59% | 191 |
| abstract_inverted_index.GPT-4o | 88 |
| abstract_inverted_index.Search | 147 |
| abstract_inverted_index.across | 40 |
| abstract_inverted_index.cannot | 89 |
| abstract_inverted_index.during | 30 |
| abstract_inverted_index.equips | 61 |
| abstract_inverted_index.graphs | 111 |
| abstract_inverted_index.issue, | 100 |
| abstract_inverted_index.latter | 137 |
| abstract_inverted_index.model. | 196 |
| abstract_inverted_index.models | 14, 86 |
| abstract_inverted_index.pairs, | 118 |
| abstract_inverted_index.pass@1 | 185 |
| abstract_inverted_index.raises | 184 |
| abstract_inverted_index.remain | 68 |
| abstract_inverted_index.study, | 166 |
| abstract_inverted_index.syntax | 73 |
| abstract_inverted_index.(MCTS), | 148 |
| abstract_inverted_index.25.00%, | 188 |
| abstract_inverted_index.address | 98 |
| abstract_inverted_index.because | 22 |
| abstract_inverted_index.confirm | 199 |
| abstract_inverted_index.context | 2 |
| abstract_inverted_index.correct | 37, 92 |
| abstract_inverted_index.dataset | 157 |
| abstract_inverted_index.dealing | 47 |
| abstract_inverted_index.derived | 139 |
| abstract_inverted_index.diverse | 154 |
| abstract_inverted_index.enhance | 204 |
| abstract_inverted_index.errors. | 55 |
| abstract_inverted_index.exhibit | 17 |
| abstract_inverted_index.exploit | 108 |
| abstract_inverted_index.general | 64 |
| abstract_inverted_index.limited | 7 |
| abstract_inverted_index.logical | 38 |
| abstract_inverted_index.propose | 102 |
| abstract_inverted_index.result, | 82 |
| abstract_inverted_index.results | 177, 198 |
| abstract_inverted_index.syntax, | 52 |
| abstract_inverted_index.through | 140 |
| abstract_inverted_index.usually | 35 |
| abstract_inverted_index.without | 94, 124 |
| abstract_inverted_index.Although | 32 |
| abstract_inverted_index.accuracy | 186 |
| abstract_inverted_index.advanced | 84 |
| abstract_inverted_index.baseline | 194 |
| abstract_inverted_index.compared | 189 |
| abstract_inverted_index.creation | 151 |
| abstract_inverted_index.designed | 106 |
| abstract_inverted_index.enabling | 149 |
| abstract_inverted_index.exposure | 26 |
| abstract_inverted_index.generate | 91 |
| abstract_inverted_index.language | 13 |
| abstract_inverted_index.maintain | 36 |
| abstract_inverted_index.reliably | 90 |
| abstract_inverted_index.software | 4, 209 |
| abstract_inverted_index.struggle | 45 |
| abstract_inverted_index.tailored | 120 |
| abstract_inverted_index.APIKG4SYN | 128, 183 |
| abstract_inverted_index.HarmonyOS | 162, 173 |
| abstract_inverted_index.benchmark | 171 |
| abstract_inverted_index.framework | 105 |
| abstract_inverted_index.indicates | 57 |
| abstract_inverted_index.knowledge | 110 |
| abstract_inverted_index.multi-API | 133 |
| abstract_inverted_index.requiring | 125 |
| abstract_inverted_index.resources | 8 |
| abstract_inverted_index.resulting | 53 |
| abstract_inverted_index.APIKG4SYN, | 103 |
| abstract_inverted_index.commercial | 85 |
| abstract_inverted_index.estimation | 142 |
| abstract_inverted_index.executable | 126 |
| abstract_inverted_index.frameworks | 5, 123 |
| abstract_inverted_index.frequently | 44 |
| abstract_inverted_index.generation | 20 |
| abstract_inverted_index.integrates | 129 |
| abstract_inverted_index.knowledge, | 134 |
| abstract_inverted_index.languages, | 42 |
| abstract_inverted_index.scenarios. | 211 |
| abstract_inverted_index.single-API | 131 |
| abstract_inverted_index.structures | 39 |
| abstract_inverted_index.sufficient | 25 |
| abstract_inverted_index.unfamiliar | 69 |
| abstract_inverted_index.(UE)-driven | 143 |
| abstract_inverted_index.HarmonyOS), | 11 |
| abstract_inverted_index.adaptation. | 96 |
| abstract_inverted_index.algorithmic | 65 |
| abstract_inverted_index.development | 210 |
| abstract_inverted_index.distinctive | 72 |
| abstract_inverted_index.fine-tuning | 159, 180 |
| abstract_inverted_index.frameworks. | 79 |
| abstract_inverted_index.generation. | 175 |
| abstract_inverted_index.informative | 156 |
| abstract_inverted_index.performance | 21, 206 |
| abstract_inverted_index.programming | 41 |
| abstract_inverted_index.uncertainty | 141 |
| abstract_inverted_index.API-oriented | 116, 201 |
| abstract_inverted_index.Experimental | 176 |
| abstract_inverted_index.construction | 114 |
| abstract_inverted_index.environments | 29 |
| abstract_inverted_index.low-resource | 122, 208 |
| abstract_inverted_index.pre-training | 60 |
| abstract_inverted_index.specifically | 119 |
| abstract_inverted_index.capabilities, | 66 |
| abstract_inverted_index.pre-training. | 31 |
| abstract_inverted_index.question-code | 117 |
| abstract_inverted_index.significantly | 203 |
| abstract_inverted_index.underrepresented | 78 |
| abstract_inverted_index.framework-specific | 49 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 2 |
| citation_normalized_percentile | |
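The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each token to the word positions where it occurs. A minimal sketch of how such a map can be turned back into running text is shown here; the function name `rebuild_abstract` is illustrative, and the sample dictionary holds only the first nine positions (0 to 8) from the table, not the full index.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild text from a token -> positions map.

    Each token is placed at every position listed for it, then the
    tokens are joined in position order with single spaces.
    """
    by_position: dict[int, str] = {}
    for token, positions in inverted_index.items():
        for position in positions:
            by_position[position] = token
    return " ".join(by_position[i] for i in sorted(by_position))


# Partial sample: positions 0-8 copied from the payload rows.
sample = {
    "In": [0],
    "the": [1],
    "context": [2],
    "of": [3],
    "software": [4],
    "frameworks": [5],
    "with": [6],
    "limited": [7],
    "resources": [8],
}

opening_words = rebuild_abstract(sample)
print(opening_words)
```

Punctuation stays attached to its token (for example `HarmonyOS),` and `pre-training.` are separate keys from `HarmonyOS` and `pre-training`), so joining on spaces is enough to recover the original wording.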