Unit Test Generation using Generative AI : A Comparative Performance Analysis of Autogeneration Tools
2023 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2312.10622
Generating unit tests is a crucial task in software development, demanding substantial time and effort from programmers. The advent of Large Language Models (LLMs) introduces a novel avenue for unit test script generation. This research aims to experimentally investigate the effectiveness of LLMs, specifically exemplified by ChatGPT, for generating unit test scripts for Python programs, and how the generated test cases compare with those generated by an existing unit test generator (Pynguin). For experiments, we consider three types of code units: 1) Procedural scripts, 2) Function-based modular code, and 3) Class-based code. The generated test cases are evaluated based on criteria such as coverage, correctness, and readability. Our results show that ChatGPT's performance is comparable with Pynguin in terms of coverage, though for some cases its performance is superior to Pynguin. We also find that about a third of assertions generated by ChatGPT for some categories were incorrect. Our results also show that there is minimal overlap in missed statements between ChatGPT and Pynguin, thus, suggesting that a combination of both tools may enhance unit test generation performance. Finally, in our experiments, prompt engineering improved ChatGPT's performance, achieving a much higher coverage.
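The abstract's point about minimal overlap in missed statements can be made concrete with a small set computation. The sketch below is illustrative only: the function name `combined_coverage` and the line numbers are hypothetical and are not taken from the paper or from either tool. It assumes each tool's report can be reduced to the set of statement line numbers its tests failed to execute.

```python
def combined_coverage(total_statements, missed_a, missed_b):
    """Compare statement coverage of two test suites and of their union.

    total_statements: number of executable statements in the unit under test.
    missed_a, missed_b: line numbers each suite failed to execute
    (hypothetical inputs, e.g. extracted from a coverage report).
    """
    all_lines = set(range(1, total_statements + 1))
    missed_a = set(missed_a) & all_lines
    missed_b = set(missed_b) & all_lines

    # A statement stays uncovered by the combined suite only if both suites miss it.
    missed_both = missed_a & missed_b

    def pct(missed):
        return 100.0 * (total_statements - len(missed)) / total_statements

    return {
        "tool_a": pct(missed_a),
        "tool_b": pct(missed_b),
        "combined": pct(missed_both),
        "overlap": sorted(missed_both),
    }


# Made-up numbers: 20 statements, the two suites share one missed line.
report = combined_coverage(20, missed_a={3, 7, 12, 18}, missed_b={5, 12, 16})
print(report)
```

With these made-up inputs the individual suites cover 80% and 85% of statements, while running both leaves only line 12 uncovered (95%). The smaller the overlap, the more a combined suite gains over either tool alone.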
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2312.10622, https://arxiv.org/pdf/2312.10622
- OA Status: green
- Cited By: 4
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4389983379
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4389983379 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2312.10622 (Digital Object Identifier)
- Title: Unit Test Generation using Generative AI : A Comparative Performance Analysis of Autogeneration Tools (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2023 (year of publication)
- Publication date: 2023-12-17 (full publication date if available)
- Authors: Shreya Bhatia, Tarushi Gandhi, Dhruv Kumar, Pankaj Jalote (list of authors in order)
- Landing page: https://arxiv.org/abs/2312.10622 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2312.10622 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2312.10622 (direct OA link when available)
- Concepts: Generative grammar, Test (biology), Computer science, Unit testing, Unit (ring theory), Artificial intelligence, Psychology, Mathematics education, Programming language, Software, Paleontology, Biology (top concepts (fields/topics) attached by OpenAlex)
- Cited by: 4 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 3, 2024: 1 (per-year citation counts, last 5 years)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4389983379 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2312.10622 |
| ids.doi | https://doi.org/10.48550/arxiv.2312.10622 |
| ids.openalex | https://openalex.org/W4389983379 |
| fwci | |
| type | preprint |
| title | Unit Test Generation using Generative AI : A Comparative Performance Analysis of Autogeneration Tools |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11052 |
| topics[0].field.id | https://openalex.org/fields/22 |
| topics[0].field.display_name | Engineering |
| topics[0].score | 0.5486999750137329 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/2208 |
| topics[0].subfield.display_name | Electrical and Electronic Engineering |
| topics[0].display_name | Energy Load and Power Forecasting |
| topics[1].id | https://openalex.org/T11122 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.5070000290870667 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1706 |
| topics[1].subfield.display_name | Computer Science Applications |
| topics[1].display_name | Online Learning and Analytics |
| topics[2].id | https://openalex.org/T11276 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.46320000290870667 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Solar Radiation and Photovoltaics |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C39890363 |
| concepts[0].level | 2 |
| concepts[0].score | 0.6848151087760925 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q36108 |
| concepts[0].display_name | Generative grammar |
| concepts[1].id | https://openalex.org/C2777267654 |
| concepts[1].level | 2 |
| concepts[1].score | 0.5154234170913696 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q3519023 |
| concepts[1].display_name | Test (biology) |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.4913291037082672 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C148027188 |
| concepts[3].level | 3 |
| concepts[3].score | 0.4904765784740448 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q907375 |
| concepts[3].display_name | Unit testing |
| concepts[4].id | https://openalex.org/C122637931 |
| concepts[4].level | 2 |
| concepts[4].score | 0.41578418016433716 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q118084 |
| concepts[4].display_name | Unit (ring theory) |
| concepts[5].id | https://openalex.org/C154945302 |
| concepts[5].level | 1 |
| concepts[5].score | 0.3604050874710083 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[5].display_name | Artificial intelligence |
| concepts[6].id | https://openalex.org/C15744967 |
| concepts[6].level | 0 |
| concepts[6].score | 0.18267515301704407 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q9418 |
| concepts[6].display_name | Psychology |
| concepts[7].id | https://openalex.org/C145420912 |
| concepts[7].level | 1 |
| concepts[7].score | 0.09652206301689148 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q853077 |
| concepts[7].display_name | Mathematics education |
| concepts[8].id | https://openalex.org/C199360897 |
| concepts[8].level | 1 |
| concepts[8].score | 0.09011882543563843 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q9143 |
| concepts[8].display_name | Programming language |
| concepts[9].id | https://openalex.org/C2777904410 |
| concepts[9].level | 2 |
| concepts[9].score | 0.0 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q7397 |
| concepts[9].display_name | Software |
| concepts[10].id | https://openalex.org/C151730666 |
| concepts[10].level | 1 |
| concepts[10].score | 0.0 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q7205 |
| concepts[10].display_name | Paleontology |
| concepts[11].id | https://openalex.org/C86803240 |
| concepts[11].level | 0 |
| concepts[11].score | 0.0 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q420 |
| concepts[11].display_name | Biology |
| keywords[0].id | https://openalex.org/keywords/generative-grammar |
| keywords[0].score | 0.6848151087760925 |
| keywords[0].display_name | Generative grammar |
| keywords[1].id | https://openalex.org/keywords/test |
| keywords[1].score | 0.5154234170913696 |
| keywords[1].display_name | Test (biology) |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.4913291037082672 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/unit-testing |
| keywords[3].score | 0.4904765784740448 |
| keywords[3].display_name | Unit testing |
| keywords[4].id | https://openalex.org/keywords/unit |
| keywords[4].score | 0.41578418016433716 |
| keywords[4].display_name | Unit (ring theory) |
| keywords[5].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[5].score | 0.3604050874710083 |
| keywords[5].display_name | Artificial intelligence |
| keywords[6].id | https://openalex.org/keywords/psychology |
| keywords[6].score | 0.18267515301704407 |
| keywords[6].display_name | Psychology |
| keywords[7].id | https://openalex.org/keywords/mathematics-education |
| keywords[7].score | 0.09652206301689148 |
| keywords[7].display_name | Mathematics education |
| keywords[8].id | https://openalex.org/keywords/programming-language |
| keywords[8].score | 0.09011882543563843 |
| keywords[8].display_name | Programming language |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2312.10622 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2312.10622 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2312.10622 |
| locations[1].id | doi:10.48550/arxiv.2312.10622 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2312.10622 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5103052275 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-3661-7438 |
| authorships[0].author.display_name | Shreya Bhatia |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Bhatia, Shreya |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5114110822 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Tarushi Gandhi |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Gandhi, Tarushi |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5027859418 |
| authorships[2].author.orcid | https://orcid.org/0000-0003-2586-9397 |
| authorships[2].author.display_name | Dhruv Kumar |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Kumar, Dhruv |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5057154526 |
| authorships[3].author.orcid | https://orcid.org/0009-0001-8552-8394 |
| authorships[3].author.display_name | Pankaj Jalote |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Jalote, Pankaj |
| authorships[3].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2312.10622 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2023-12-20T00:00:00 |
| display_name | Unit Test Generation using Generative AI : A Comparative Performance Analysis of Autogeneration Tools |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11052 |
| primary_topic.field.id | https://openalex.org/fields/22 |
| primary_topic.field.display_name | Engineering |
| primary_topic.score | 0.5486999750137329 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/2208 |
| primary_topic.subfield.display_name | Electrical and Electronic Engineering |
| primary_topic.display_name | Energy Load and Power Forecasting |
| related_works | https://openalex.org/W2380075625, https://openalex.org/W2615173508, https://openalex.org/W611386996, https://openalex.org/W2593332592, https://openalex.org/W4390718435, https://openalex.org/W4390549206, https://openalex.org/W4380354325, https://openalex.org/W3137171911, https://openalex.org/W2205285032, https://openalex.org/W4237784285 |
| cited_by_count | 4 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 3 |
| counts_by_year[1].year | 2024 |
| counts_by_year[1].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2312.10622 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2312.10622 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2312.10622 |
| primary_location.id | pmh:oai:arXiv.org:2312.10622 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2312.10622 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2312.10622 |
| publication_date | 2023-12-17 |
| publication_year | 2023 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 4, 25, 136, 167, 188 |
| abstract_inverted_index.1) | 81 |
| abstract_inverted_index.2) | 84 |
| abstract_inverted_index.3) | 89 |
| abstract_inverted_index.We | 131 |
| abstract_inverted_index.an | 66 |
| abstract_inverted_index.as | 102 |
| abstract_inverted_index.by | 45, 65, 141 |
| abstract_inverted_index.in | 7, 117, 157, 179 |
| abstract_inverted_index.is | 3, 113, 127, 154 |
| abstract_inverted_index.of | 19, 41, 78, 119, 138, 169 |
| abstract_inverted_index.on | 99 |
| abstract_inverted_index.to | 36, 129 |
| abstract_inverted_index.we | 74 |
| abstract_inverted_index.For | 72 |
| abstract_inverted_index.Our | 107, 148 |
| abstract_inverted_index.The | 17, 92 |
| abstract_inverted_index.and | 13, 55, 88, 105, 162 |
| abstract_inverted_index.are | 96 |
| abstract_inverted_index.for | 28, 47, 52, 122, 143 |
| abstract_inverted_index.how | 56 |
| abstract_inverted_index.its | 125 |
| abstract_inverted_index.may | 172 |
| abstract_inverted_index.our | 180 |
| abstract_inverted_index.the | 39, 57 |
| abstract_inverted_index.This | 33 |
| abstract_inverted_index.aims | 35 |
| abstract_inverted_index.also | 132, 150 |
| abstract_inverted_index.both | 170 |
| abstract_inverted_index.code | 79 |
| abstract_inverted_index.find | 133 |
| abstract_inverted_index.from | 15 |
| abstract_inverted_index.much | 189 |
| abstract_inverted_index.show | 109, 151 |
| abstract_inverted_index.some | 123, 144 |
| abstract_inverted_index.such | 101 |
| abstract_inverted_index.task | 6 |
| abstract_inverted_index.test | 30, 50, 59, 69, 94, 175 |
| abstract_inverted_index.that | 110, 134, 152, 166 |
| abstract_inverted_index.time | 12 |
| abstract_inverted_index.unit | 1, 29, 49, 68, 174 |
| abstract_inverted_index.were | 146 |
| abstract_inverted_index.with | 62, 115 |
| abstract_inverted_index.LLMs, | 42 |
| abstract_inverted_index.Large | 20 |
| abstract_inverted_index.about | 135 |
| abstract_inverted_index.based | 98 |
| abstract_inverted_index.cases | 60, 95, 124 |
| abstract_inverted_index.code, | 87 |
| abstract_inverted_index.code. | 91 |
| abstract_inverted_index.novel | 26 |
| abstract_inverted_index.terms | 118 |
| abstract_inverted_index.tests | 2 |
| abstract_inverted_index.there | 153 |
| abstract_inverted_index.third | 137 |
| abstract_inverted_index.those | 63 |
| abstract_inverted_index.three | 76 |
| abstract_inverted_index.thus, | 164 |
| abstract_inverted_index.tools | 171 |
| abstract_inverted_index.types | 77 |
| abstract_inverted_index.(LLMs) | 23 |
| abstract_inverted_index.Models | 22 |
| abstract_inverted_index.Python | 53 |
| abstract_inverted_index.advent | 18 |
| abstract_inverted_index.avenue | 27 |
| abstract_inverted_index.effort | 14 |
| abstract_inverted_index.higher | 190 |
| abstract_inverted_index.missed | 158 |
| abstract_inverted_index.prompt | 182 |
| abstract_inverted_index.script | 31 |
| abstract_inverted_index.though | 121 |
| abstract_inverted_index.units: | 80 |
| abstract_inverted_index.ChatGPT | 142, 161 |
| abstract_inverted_index.Pynguin | 116 |
| abstract_inverted_index.between | 160 |
| abstract_inverted_index.compare | 61 |
| abstract_inverted_index.crucial | 5 |
| abstract_inverted_index.enhance | 173 |
| abstract_inverted_index.minimal | 155 |
| abstract_inverted_index.modular | 86 |
| abstract_inverted_index.overlap | 156 |
| abstract_inverted_index.results | 108, 149 |
| abstract_inverted_index.scripts | 51 |
| abstract_inverted_index.ChatGPT, | 46 |
| abstract_inverted_index.Finally, | 178 |
| abstract_inverted_index.Language | 21 |
| abstract_inverted_index.Pynguin, | 163 |
| abstract_inverted_index.Pynguin. | 130 |
| abstract_inverted_index.consider | 75 |
| abstract_inverted_index.criteria | 100 |
| abstract_inverted_index.existing | 67 |
| abstract_inverted_index.improved | 184 |
| abstract_inverted_index.research | 34 |
| abstract_inverted_index.scripts, | 83 |
| abstract_inverted_index.software | 8 |
| abstract_inverted_index.superior | 128 |
| abstract_inverted_index.ChatGPT's | 111, 185 |
| abstract_inverted_index.achieving | 187 |
| abstract_inverted_index.coverage, | 103, 120 |
| abstract_inverted_index.coverage. | 191 |
| abstract_inverted_index.demanding | 10 |
| abstract_inverted_index.evaluated | 97 |
| abstract_inverted_index.generated | 58, 64, 93, 140 |
| abstract_inverted_index.generator | 70 |
| abstract_inverted_index.programs, | 54 |
| abstract_inverted_index.(Pynguin). | 71 |
| abstract_inverted_index.Generating | 0 |
| abstract_inverted_index.Procedural | 82 |
| abstract_inverted_index.assertions | 139 |
| abstract_inverted_index.categories | 145 |
| abstract_inverted_index.comparable | 114 |
| abstract_inverted_index.generating | 48 |
| abstract_inverted_index.generation | 176 |
| abstract_inverted_index.incorrect. | 147 |
| abstract_inverted_index.introduces | 24 |
| abstract_inverted_index.statements | 159 |
| abstract_inverted_index.suggesting | 165 |
| abstract_inverted_index.Class-based | 90 |
| abstract_inverted_index.combination | 168 |
| abstract_inverted_index.engineering | 183 |
| abstract_inverted_index.exemplified | 44 |
| abstract_inverted_index.generation. | 32 |
| abstract_inverted_index.investigate | 38 |
| abstract_inverted_index.performance | 112, 126 |
| abstract_inverted_index.substantial | 11 |
| abstract_inverted_index.correctness, | 104 |
| abstract_inverted_index.development, | 9 |
| abstract_inverted_index.experiments, | 73, 181 |
| abstract_inverted_index.performance, | 186 |
| abstract_inverted_index.performance. | 177 |
| abstract_inverted_index.programmers. | 16 |
| abstract_inverted_index.readability. | 106 |
| abstract_inverted_index.specifically | 43 |
| abstract_inverted_index.effectiveness | 40 |
| abstract_inverted_index.Function-based | 85 |
| abstract_inverted_index.experimentally | 37 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/4 |
| sustainable_development_goals[0].score | 0.5699999928474426 |
| sustainable_development_goals[0].display_name | Quality Education |
| citation_normalized_percentile | |
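
The `abstract_inverted_index.*` rows above store the abstract as a mapping from each token to the positions where it occurs. As a minimal sketch, assuming that payload has been loaded into a plain Python dict (the helper name `rebuild_abstract` is hypothetical, not part of any OpenAlex client), the text can be recovered by inverting the mapping and joining the tokens in position order. The sample below uses only the first seven positions from the table.

```python
def rebuild_abstract(inverted_index):
    """Rebuild plain text from a {token: [positions]} inverted index."""
    by_position = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Join tokens in ascending position order to restore the word sequence.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Subset of the rows above, restricted to positions 0 through 6.
sample_index = {
    "Generating": [0],
    "unit": [1],
    "tests": [2],
    "is": [3],
    "a": [4],
    "crucial": [5],
    "task": [6],
}
text = rebuild_abstract(sample_index)
print(text)  # Generating unit tests is a crucial task
```

Punctuation is stored as part of each token (for example `LLMs,` and `Pynguin.` appear as separate keys), so joining on single spaces reproduces the original sentence punctuation.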