MLDebugging: Towards Benchmarking Code Debugging Across Multi-Library Scenarios
2025 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2506.13824
Code debugging is a crucial task in software engineering that has attracted increasing attention. While remarkable progress has been made in the era of large language models (LLMs), current research still focuses on the simple no-library or single-library setting and ignores the complex multi-library scenarios found in real-world applications. To address this limitation, we make the first attempt to introduce MLDebugging (Multi-Library Debugging), a comprehensive benchmark designed to assess debugging challenges in multi-library Python code. Specifically, MLDebugging encompasses 126 distinct Python libraries and covers a wide range of multi-library code issues, categorized into seven distinct types. Furthermore, we conduct a thorough evaluation of MLDebugging using both mainstream open-source and closed-source LLMs and highlight that current LLMs still struggle to debug code correctly in multi-library scenarios. We hope this work can uncover the potential of LLMs in multi-library debugging scenarios and offer insights for future research.
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2506.13824 (PDF: https://arxiv.org/pdf/2506.13824)
- OA Status: green
- OpenAlex ID: https://openalex.org/W4415311024
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4415311024 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2506.13824 (Digital Object Identifier)
- Title: MLDebugging: Towards Benchmarking Code Debugging Across Multi-Library Scenarios (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-06-15 (full publication date if available)
- Authors: Jinyang Huang, Xiachong Feng, Qiguang Chen, Hanjie Zhao, Zihui Cheng, Jiesong Bai, Jingxuan Zhou, Min Li, Libo Qin (list of authors in order, as given in the raw author names)
- Landing page: https://arxiv.org/abs/2506.13824 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2506.13824 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2506.13824 (direct OA link when available)
- Cited by: 0 (total citation count in OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4415311024 |
| doi | https://doi.org/10.48550/arxiv.2506.13824 |
| ids.doi | https://doi.org/10.48550/arxiv.2506.13824 |
| ids.openalex | https://openalex.org/W4415311024 |
| fwci | |
| type | preprint |
| title | MLDebugging: Towards Benchmarking Code Debugging Across Multi-Library Scenarios |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11986 |
| topics[0].field.id | https://openalex.org/fields/18 |
| topics[0].field.display_name | Decision Sciences |
| topics[0].score | 0.9902999997138977 |
| topics[0].domain.id | https://openalex.org/domains/2 |
| topics[0].domain.display_name | Social Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1802 |
| topics[0].subfield.display_name | Information Systems and Management |
| topics[0].display_name | Scientific Computing and Data Management |
| topics[1].id | https://openalex.org/T10054 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9900000095367432 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1708 |
| topics[1].subfield.display_name | Hardware and Architecture |
| topics[1].display_name | Parallel Computing and Optimization Techniques |
| topics[2].id | https://openalex.org/T10743 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9842000007629395 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1712 |
| topics[2].subfield.display_name | Software |
| topics[2].display_name | Software Testing and Debugging Techniques |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2506.13824 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2506.13824 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2506.13824 |
| locations[1].id | doi:10.48550/arxiv.2506.13824 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2506.13824 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5100928377 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-5483-2812 |
| authorships[0].author.display_name | Jinyang Huang |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Huang, Jinyang |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5066977101 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-4761-7484 |
| authorships[1].author.display_name | Xiachong Feng |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Feng, Xiachong |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5103207823 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-9154-7858 |
| authorships[2].author.display_name | Qiguang Chen |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Chen, Qiguang |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5108771122 |
| authorships[3].author.orcid | https://orcid.org/0009-0003-8492-6803 |
| authorships[3].author.display_name | Hanzhang Zhao |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Zhao, Hanjie |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5015124710 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-6361-1103 |
| authorships[4].author.display_name | Zheng Cheng |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Cheng, Zihui |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5090878267 |
| authorships[5].author.orcid | https://orcid.org/0000-0002-6953-4698 |
| authorships[5].author.display_name | Jie Bai |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Bai, Jiesong |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5079408382 |
| authorships[6].author.orcid | https://orcid.org/0000-0002-8120-1394 |
| authorships[6].author.display_name | Jingxuan Zhou |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Zhou, Jingxuan |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5100400762 |
| authorships[7].author.orcid | https://orcid.org/0000-0002-9366-2390 |
| authorships[7].author.display_name | Min Li |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Li, Min |
| authorships[7].is_corresponding | False |
| authorships[8].author.id | https://openalex.org/A5092102582 |
| authorships[8].author.orcid | https://orcid.org/0009-0000-6141-3931 |
| authorships[8].author.display_name | L. Q. Qin |
| authorships[8].author_position | last |
| authorships[8].raw_author_name | Qin, Libo |
| authorships[8].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2506.13824 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-18T00:00:00 |
| display_name | MLDebugging: Towards Benchmarking Code Debugging Across Multi-Library Scenarios |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11986 |
| primary_topic.field.id | https://openalex.org/fields/18 |
| primary_topic.field.display_name | Decision Sciences |
| primary_topic.score | 0.9902999997138977 |
| primary_topic.domain.id | https://openalex.org/domains/2 |
| primary_topic.domain.display_name | Social Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1802 |
| primary_topic.subfield.display_name | Information Systems and Management |
| primary_topic.display_name | Scientific Computing and Data Management |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2506.13824 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2506.13824 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2506.13824 |
| primary_location.id | pmh:oai:arXiv.org:2506.13824 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2506.13824 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2506.13824 |
| publication_date | 2025-06-15 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 3, 60, 80, 95 |
| abstract_inverted_index.To | 46 |
| abstract_inverted_index.We | 122 |
| abstract_inverted_index.in | 6, 19, 43, 132 |
| abstract_inverted_index.is | 2 |
| abstract_inverted_index.of | 22, 83, 98, 130 |
| abstract_inverted_index.on | 31 |
| abstract_inverted_index.or | 35 |
| abstract_inverted_index.to | 55, 64, 114 |
| abstract_inverted_index.we | 50, 93 |
| abstract_inverted_index.126 | 75 |
| abstract_inverted_index.and | 104, 107, 136 |
| abstract_inverted_index.can | 126 |
| abstract_inverted_index.era | 21 |
| abstract_inverted_index.for | 139 |
| abstract_inverted_index.has | 16 |
| abstract_inverted_index.the | 20, 32, 39, 52, 128 |
| abstract_inverted_index.Code | 0 |
| abstract_inverted_index.LLMs | 106, 111, 131 |
| abstract_inverted_index.been | 17 |
| abstract_inverted_index.both | 101 |
| abstract_inverted_index.code | 85, 117 |
| abstract_inverted_index.hope | 123 |
| abstract_inverted_index.into | 88 |
| abstract_inverted_index.made | 18 |
| abstract_inverted_index.make | 51 |
| abstract_inverted_index.task | 5 |
| abstract_inverted_index.that | 109 |
| abstract_inverted_index.this | 48, 124 |
| abstract_inverted_index.wide | 81 |
| abstract_inverted_index.work | 125 |
| abstract_inverted_index.While | 13 |
| abstract_inverted_index.code. | 71 |
| abstract_inverted_index.first | 53 |
| abstract_inverted_index.large | 23 |
| abstract_inverted_index.offer | 137 |
| abstract_inverted_index.range | 82 |
| abstract_inverted_index.seven | 89 |
| abstract_inverted_index.still | 29, 112 |
| abstract_inverted_index.using | 100 |
| abstract_inverted_index.which | 9 |
| abstract_inverted_index.Python | 70, 77 |
| abstract_inverted_index.across | 119 |
| abstract_inverted_index.assess | 65 |
| abstract_inverted_index.future | 140 |
| abstract_inverted_index.models | 25 |
| abstract_inverted_index.simple | 33 |
| abstract_inverted_index.types. | 91 |
| abstract_inverted_index.within | 68 |
| abstract_inverted_index.(LLMs), | 26 |
| abstract_inverted_index.address | 47 |
| abstract_inverted_index.attempt | 54 |
| abstract_inverted_index.complex | 40 |
| abstract_inverted_index.conduct | 94 |
| abstract_inverted_index.crucial | 4 |
| abstract_inverted_index.current | 27, 110 |
| abstract_inverted_index.focuses | 30 |
| abstract_inverted_index.issues, | 86 |
| abstract_inverted_index.perform | 116 |
| abstract_inverted_index.success | 15 |
| abstract_inverted_index.uncover | 127 |
| abstract_inverted_index.attracts | 10 |
| abstract_inverted_index.covering | 79 |
| abstract_inverted_index.designed | 63 |
| abstract_inverted_index.distinct | 76, 90 |
| abstract_inverted_index.ignoring | 38 |
| abstract_inverted_index.insights | 138 |
| abstract_inverted_index.language | 24 |
| abstract_inverted_index.research | 28 |
| abstract_inverted_index.scenario | 42, 135 |
| abstract_inverted_index.setting, | 37 |
| abstract_inverted_index.software | 7 |
| abstract_inverted_index.struggle | 113 |
| abstract_inverted_index.thorough | 96 |
| abstract_inverted_index.benchmark | 62 |
| abstract_inverted_index.correctly | 115 |
| abstract_inverted_index.debugging | 1, 66, 118, 134 |
| abstract_inverted_index.highlight | 108 |
| abstract_inverted_index.introduce | 56 |
| abstract_inverted_index.potential | 129 |
| abstract_inverted_index.research. | 141 |
| abstract_inverted_index.attention. | 12 |
| abstract_inverted_index.challenges | 67 |
| abstract_inverted_index.evaluation | 97 |
| abstract_inverted_index.increasing | 11 |
| abstract_inverted_index.libraries, | 78 |
| abstract_inverted_index.mainstream | 102 |
| abstract_inverted_index.no-library | 34 |
| abstract_inverted_index.real-world | 44 |
| abstract_inverted_index.remarkable | 14 |
| abstract_inverted_index.scenarios. | 121 |
| abstract_inverted_index.Debugging), | 59 |
| abstract_inverted_index.MLDebugging | 57, 73, 99 |
| abstract_inverted_index.categorized | 87 |
| abstract_inverted_index.encompasses | 74 |
| abstract_inverted_index.limitation, | 49 |
| abstract_inverted_index.open-source | 103 |
| abstract_inverted_index.Furthermore, | 92 |
| abstract_inverted_index.engineering, | 8 |
| abstract_inverted_index.Specifically, | 72 |
| abstract_inverted_index.applications. | 45 |
| abstract_inverted_index.closed-source | 105 |
| abstract_inverted_index.comprehensive | 61 |
| abstract_inverted_index.multi-library | 41, 69, 84, 120, 133 |
| abstract_inverted_index.(Multi-Library | 58 |
| abstract_inverted_index.single-library | 36 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 9 |
| citation_normalized_percentile | |
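
The `abstract_inverted_index.*` rows above store the abstract as a mapping from each token to the positions where it occurs. The following is a minimal sketch of how such a mapping could be turned back into running text. It assumes the payload has already been parsed into a Python dictionary; the function name `rebuild_abstract` and the `sample` dictionary are illustrative and not part of OpenAlex. The sample contains only the tokens at positions 0 to 8 from the rows above, with later occurrences of `a`, `in`, and `debugging` left out.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a token -> list-of-positions mapping."""
    # Invert the mapping: each position holds exactly one token.
    token_at = {}
    for token, positions in inverted_index.items():
        for position in positions:
            token_at[position] = token
    # Join the tokens in ascending position order.
    return " ".join(token_at[i] for i in sorted(token_at))


# Hypothetical, truncated sample: positions 0-8 taken from the table rows.
sample = {
    "Code": [0],
    "debugging": [1],
    "is": [2],
    "a": [3],
    "crucial": [4],
    "task": [5],
    "in": [6],
    "software": [7],
    "engineering,": [8],
}

print(rebuild_abstract(sample))
# prints: Code debugging is a crucial task in software engineering,
```

Because punctuation stays attached to its token (for example `engineering,`), joining on single spaces is enough to recover the original wording.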