Mind the Prompt: Prompting Strategies in Audio Generations for Improving Sound Classification
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2504.03329
This paper investigates the design of effective prompt strategies for generating realistic datasets using Text-To-Audio (TTA) models. We also analyze different techniques for efficiently combining these datasets to enhance their utility in sound classification tasks. By evaluating two sound classification datasets with two TTA models, we apply a range of prompt strategies. Our findings reveal that task-specific prompt strategies significantly outperform basic prompt approaches in data generation. Furthermore, merging datasets generated using different TTA models proves to enhance classification performance more effectively than merely increasing the training dataset size. Overall, our results underscore the advantages of these methods as effective data augmentation techniques using synthetic data.
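The payload table on this page also stores the abstract as an `abstract_inverted_index`, which maps each word to the positions where it occurs. A minimal sketch of rebuilding readable text from such an index is given here, assuming the raw JSON holds each entry as a list of integer positions (the table renders them as comma-separated numbers). The function name `rebuild_abstract` and the truncated `sample` are illustrative, not part of OpenAlex.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Place each word at every position listed for it, then join in order."""
    words_by_position: dict[int, str] = {}
    for word, positions in inverted_index.items():
        for position in positions:
            words_by_position[position] = word
    # Sorting the positions restores the original word order.
    return " ".join(words_by_position[p] for p in sorted(words_by_position))


# Hypothetical sample: only positions 0-5 of this record's index are kept.
sample = {
    "This": [0],
    "paper": [1],
    "investigates": [2],
    "the": [3],
    "design": [4],
    "of": [5],
}
print(rebuild_abstract(sample))  # prints "This paper investigates the design of"
```

A word that occurs more than once simply lists several positions, so the same loop handles repeats without special cases.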
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2504.03329, https://arxiv.org/pdf/2504.03329
- OA Status: green
- OpenAlex ID: https://openalex.org/W4415979406
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4415979406
- DOI: https://doi.org/10.48550/arxiv.2504.03329
- Title: Mind the Prompt: Prompting Strategies in Audio Generations for Improving Sound Classification
- Type: preprint
- Language: en
- Publication year: 2025
- Publication date: 2025-04-04
- Authors: Francesca Ronchini, Ho-Hsiang Wu, Wei-Cheng Lin, Fabio Antonacci
- Landing page: https://arxiv.org/abs/2504.03329
- PDF URL: https://arxiv.org/pdf/2504.03329
- Open access: Yes
- OA status: green
- OA URL: https://arxiv.org/pdf/2504.03329
- Cited by: 0
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4415979406 |
| doi | https://doi.org/10.48550/arxiv.2504.03329 |
| ids.doi | https://doi.org/10.48550/arxiv.2504.03329 |
| ids.openalex | https://openalex.org/W4415979406 |
| fwci | |
| type | preprint |
| title | Mind the Prompt: Prompting Strategies in Audio Generations for Improving Sound Classification |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2504.03329 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2504.03329 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2504.03329 |
| locations[1].id | doi:10.48550/arxiv.2504.03329 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2504.03329 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5056089196 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-6897-1645 |
| authorships[0].author.display_name | Francesca Ronchini |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Ronchini, Francesca |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5035643647 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-1102-074X |
| authorships[1].author.display_name | Ho-Hsiang Wu |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Wu, Ho-Hsiang |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5070819601 |
| authorships[2].author.orcid | https://orcid.org/0000-0003-1933-1590 |
| authorships[2].author.display_name | Wei-Cheng Lin |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Lin, Wei-Cheng |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5082156387 |
| authorships[3].author.orcid | https://orcid.org/0000-0003-4545-0315 |
| authorships[3].author.display_name | Fabio Antonacci |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Antonacci, Fabio |
| authorships[3].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2504.03329 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Mind the Prompt: Prompting Strategies in Audio Generations for Improving Sound Classification |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-08T23:21:52.890332 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2504.03329 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2504.03329 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2504.03329 |
| primary_location.id | pmh:oai:arXiv.org:2504.03329 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2504.03329 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2504.03329 |
| publication_date | 2025-04-04 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 47 |
| abstract_inverted_index.By | 35 |
| abstract_inverted_index.We | 17 |
| abstract_inverted_index.as | 98 |
| abstract_inverted_index.in | 31, 64 |
| abstract_inverted_index.of | 5, 49, 95 |
| abstract_inverted_index.to | 27, 76 |
| abstract_inverted_index.we | 45 |
| abstract_inverted_index.Our | 52 |
| abstract_inverted_index.TTA | 43, 73 |
| abstract_inverted_index.for | 9, 22 |
| abstract_inverted_index.our | 90 |
| abstract_inverted_index.the | 3, 85, 93 |
| abstract_inverted_index.two | 37, 42 |
| abstract_inverted_index.This | 0 |
| abstract_inverted_index.also | 18 |
| abstract_inverted_index.data | 65, 100 |
| abstract_inverted_index.more | 80 |
| abstract_inverted_index.than | 82 |
| abstract_inverted_index.that | 55 |
| abstract_inverted_index.with | 41 |
| abstract_inverted_index.(TTA) | 15 |
| abstract_inverted_index.apply | 46 |
| abstract_inverted_index.basic | 61 |
| abstract_inverted_index.data. | 105 |
| abstract_inverted_index.paper | 1 |
| abstract_inverted_index.range | 48 |
| abstract_inverted_index.size. | 88 |
| abstract_inverted_index.sound | 32, 38 |
| abstract_inverted_index.their | 29 |
| abstract_inverted_index.these | 25, 96 |
| abstract_inverted_index.using | 13, 71, 103 |
| abstract_inverted_index.design | 4 |
| abstract_inverted_index.merely | 83 |
| abstract_inverted_index.models | 74 |
| abstract_inverted_index.prompt | 7, 50, 57, 62 |
| abstract_inverted_index.proves | 75 |
| abstract_inverted_index.reveal | 54 |
| abstract_inverted_index.tasks. | 34 |
| abstract_inverted_index.analyze | 19 |
| abstract_inverted_index.dataset | 87 |
| abstract_inverted_index.enhance | 28, 77 |
| abstract_inverted_index.merging | 68 |
| abstract_inverted_index.methods | 97 |
| abstract_inverted_index.models, | 44 |
| abstract_inverted_index.models. | 16 |
| abstract_inverted_index.results | 91 |
| abstract_inverted_index.utility | 30 |
| abstract_inverted_index.Overall, | 89 |
| abstract_inverted_index.datasets | 12, 26, 40, 69 |
| abstract_inverted_index.findings | 53 |
| abstract_inverted_index.training | 86 |
| abstract_inverted_index.combining | 24 |
| abstract_inverted_index.different | 20, 72 |
| abstract_inverted_index.effective | 6, 99 |
| abstract_inverted_index.generated | 70 |
| abstract_inverted_index.realistic | 11 |
| abstract_inverted_index.synthetic | 104 |
| abstract_inverted_index.advantages | 94 |
| abstract_inverted_index.approaches | 63 |
| abstract_inverted_index.evaluating | 36 |
| abstract_inverted_index.generating | 10 |
| abstract_inverted_index.increasing | 84 |
| abstract_inverted_index.outperform | 60 |
| abstract_inverted_index.strategies | 8, 58 |
| abstract_inverted_index.techniques | 21, 102 |
| abstract_inverted_index.underscore | 92 |
| abstract_inverted_index.effectively | 81 |
| abstract_inverted_index.efficiently | 23 |
| abstract_inverted_index.generation. | 66 |
| abstract_inverted_index.performance | 79 |
| abstract_inverted_index.strategies. | 51 |
| abstract_inverted_index.Furthermore, | 67 |
| abstract_inverted_index.augmentation | 101 |
| abstract_inverted_index.investigates | 2 |
| abstract_inverted_index.Text-To-Audio | 14 |
| abstract_inverted_index.significantly | 59 |
| abstract_inverted_index.task-specific | 56 |
| abstract_inverted_index.classification | 33, 39, 78 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| citation_normalized_percentile |