FuseGen: PLM Fusion for Data-generation based Zero-shot Learning
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2406.12527
Data generation-based zero-shot learning trains Small Task-specific Models (STMs) on synthetic datasets generated by Pre-trained Language Models (PLMs). Although effective, it is often limited by the low quality of those synthetic datasets. Previous solutions have primarily focused on single-PLM settings, where synthetic datasets are typically restricted to specific sub-spaces and often deviate from real-world distributions, leading to severe distribution bias. To mitigate such bias, we propose FuseGen, a novel data generation-based zero-shot learning framework that introduces a new criterion for selecting subsets of synthetic datasets using multiple PLMs and trained STMs. The chosen subset provides in-context feedback to each PLM, enhancing dataset quality through iterative data generation. The trained STMs are also used for sample re-weighting, further improving data quality. Extensive experiments across diverse tasks demonstrate that FuseGen substantially outperforms existing methods and is highly effective at boosting STM performance in a PLM-agnostic way. Code is available at https://github.com/LindaLydia/FuseGen.
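
The loop described in the abstract (several PLMs generate data, STMs are trained on it, a subset is selected and fed back as in-context examples, and samples are finally re-weighted) can be pictured with a toy control-flow sketch. Everything concrete here is an assumption made for illustration: the abstract does not state the selection criterion or the re-weighting rule, so `select_subset` and `reweight` use simple placeholders, and `make_toy_generator` and `toy_train` are hypothetical stand-ins for real PLMs and STM training. It is not the authors' implementation, which is in the linked repository.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[str, int]                      # (text, label)
Scorer = Callable[[Sample], float]            # stand-in for a trained STM
Generator = Callable[[Sequence[Sample], int], List[Sample]]  # stand-in for a PLM
Trainer = Callable[[Sequence[Sample]], Scorer]


def select_subset(pool: Sequence[Sample], scorers: Sequence[Scorer], k: int) -> List[Sample]:
    # Placeholder criterion (an assumption, not the paper's): keep the k
    # samples on which the STM scores differ the most.
    def spread(sample: Sample) -> float:
        scores = [score(sample) for score in scorers]
        return max(scores) - min(scores)
    return sorted(pool, key=spread, reverse=True)[:k]


def reweight(pool: Sequence[Sample], scorers: Sequence[Scorer]) -> List[float]:
    # Placeholder re-weighting (an assumption): mean STM score per sample,
    # normalized so the weights sum to 1.
    raw = [mean(score(sample) for score in scorers) for sample in pool]
    total = sum(raw) or 1.0
    return [r / total for r in raw]


def run_rounds(generators: Sequence[Generator], train: Trainer,
               rounds: int = 2, per_round: int = 4, k: int = 2):
    datasets: List[List[Sample]] = [[] for _ in generators]
    feedback: List[Sample] = []
    pool: List[Sample] = []
    scorers: List[Scorer] = []
    for _ in range(rounds):
        # Each "PLM" extends its own dataset, conditioned on the shared feedback.
        for data, generate in zip(datasets, generators):
            data.extend(generate(feedback, per_round))
        # One "STM" is trained per dataset; the pooled data is then scored.
        scorers = [train(data) for data in datasets]
        pool = [sample for data in datasets for sample in data]
        feedback = select_subset(pool, scorers, k)
    return pool, reweight(pool, scorers)


def make_toy_generator(tag: str) -> Generator:
    # Hypothetical PLM stub: emits numbered strings and records how many
    # feedback samples it was shown.
    counter = 0

    def generate(feedback: Sequence[Sample], n: int) -> List[Sample]:
        nonlocal counter
        out = []
        for _ in range(n):
            out.append((f"{tag}-{counter}-fb{len(feedback)}", counter % 2))
            counter += 1
        return out
    return generate


def toy_train(data: Sequence[Sample]) -> Scorer:
    # Hypothetical STM stub: fully confident on samples it was trained on,
    # undecided on everything else.
    seen = set(data)
    return lambda sample: 1.0 if sample in seen else 0.5


pool, weights = run_rounds([make_toy_generator("a"), make_toy_generator("b")], toy_train)
print(len(pool), round(sum(weights), 6))
```

In a real setting the generators would call language models with the feedback samples placed in the prompt, and the scorers would be classifiers fine-tuned on each synthetic dataset; only the shape of the loop is meant to carry over.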
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2406.12527, https://arxiv.org/pdf/2406.12527
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4399837450
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4399837450 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2406.12527 (Digital Object Identifier)
- Title: FuseGen: PLM Fusion for Data-generation based Zero-shot Learning (Work title)
- Type: preprint (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2024 (Year of publication)
- Publication date: 2024-06-18 (Full publication date if available)
- Authors: Tianyuan Zou, Yang Liu, Peng Li, Jianqing Zhang, Jingjing Liu, Ya-Qin Zhang (List of authors in order)
- Landing page: https://arxiv.org/abs/2406.12527 (Publisher landing page)
- PDF URL: https://arxiv.org/pdf/2406.12527 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2406.12527 (Direct OA link when available)
- Concepts: Shot (pellet), Zero (linguistics), Fusion, Computer science, Artificial intelligence, Materials science, Philosophy, Metallurgy, Linguistics (Top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (Total citation count in OpenAlex)
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4399837450 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2406.12527 |
| ids.doi | https://doi.org/10.48550/arxiv.2406.12527 |
| ids.openalex | https://openalex.org/W4399837450 |
| fwci | |
| type | preprint |
| title | FuseGen: PLM Fusion for Data-generation based Zero-shot Learning |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11307 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9787999987602234 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Domain Adaptation and Few-Shot Learning |
| topics[1].id | https://openalex.org/T11775 |
| topics[1].field.id | https://openalex.org/fields/27 |
| topics[1].field.display_name | Medicine |
| topics[1].score | 0.9692999720573425 |
| topics[1].domain.id | https://openalex.org/domains/4 |
| topics[1].domain.display_name | Health Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/2741 |
| topics[1].subfield.display_name | Radiology, Nuclear Medicine and Imaging |
| topics[1].display_name | COVID-19 diagnosis using AI |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C2778344882 |
| concepts[0].level | 2 |
| concepts[0].score | 0.6303427219390869 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q278938 |
| concepts[0].display_name | Shot (pellet) |
| concepts[1].id | https://openalex.org/C2780813799 |
| concepts[1].level | 2 |
| concepts[1].score | 0.602383017539978 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q3274237 |
| concepts[1].display_name | Zero (linguistics) |
| concepts[2].id | https://openalex.org/C158525013 |
| concepts[2].level | 2 |
| concepts[2].score | 0.5998764038085938 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q2593739 |
| concepts[2].display_name | Fusion |
| concepts[3].id | https://openalex.org/C41008148 |
| concepts[3].level | 0 |
| concepts[3].score | 0.4413849413394928 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[3].display_name | Computer science |
| concepts[4].id | https://openalex.org/C154945302 |
| concepts[4].level | 1 |
| concepts[4].score | 0.3857696056365967 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[4].display_name | Artificial intelligence |
| concepts[5].id | https://openalex.org/C192562407 |
| concepts[5].level | 0 |
| concepts[5].score | 0.11888560652732849 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q228736 |
| concepts[5].display_name | Materials science |
| concepts[6].id | https://openalex.org/C138885662 |
| concepts[6].level | 0 |
| concepts[6].score | 0.081624835729599 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q5891 |
| concepts[6].display_name | Philosophy |
| concepts[7].id | https://openalex.org/C191897082 |
| concepts[7].level | 1 |
| concepts[7].score | 0.0 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q11467 |
| concepts[7].display_name | Metallurgy |
| concepts[8].id | https://openalex.org/C41895202 |
| concepts[8].level | 1 |
| concepts[8].score | 0.0 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q8162 |
| concepts[8].display_name | Linguistics |
| keywords[0].id | https://openalex.org/keywords/shot |
| keywords[0].score | 0.6303427219390869 |
| keywords[0].display_name | Shot (pellet) |
| keywords[1].id | https://openalex.org/keywords/zero |
| keywords[1].score | 0.602383017539978 |
| keywords[1].display_name | Zero (linguistics) |
| keywords[2].id | https://openalex.org/keywords/fusion |
| keywords[2].score | 0.5998764038085938 |
| keywords[2].display_name | Fusion |
| keywords[3].id | https://openalex.org/keywords/computer-science |
| keywords[3].score | 0.4413849413394928 |
| keywords[3].display_name | Computer science |
| keywords[4].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[4].score | 0.3857696056365967 |
| keywords[4].display_name | Artificial intelligence |
| keywords[5].id | https://openalex.org/keywords/materials-science |
| keywords[5].score | 0.11888560652732849 |
| keywords[5].display_name | Materials science |
| keywords[6].id | https://openalex.org/keywords/philosophy |
| keywords[6].score | 0.081624835729599 |
| keywords[6].display_name | Philosophy |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2406.12527 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | cc-by-nc-sa |
| locations[0].pdf_url | https://arxiv.org/pdf/2406.12527 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | https://openalex.org/licenses/cc-by-nc-sa |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2406.12527 |
| locations[1].id | doi:10.48550/arxiv.2406.12527 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2406.12527 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5056834945 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Tianyuan Zou |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Zou, Tianyuan |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5100355692 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-7300-9215 |
| authorships[1].author.display_name | Yang Liu |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Liu, Yang |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5100457855 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-4684-4909 |
| authorships[2].author.display_name | Peng Li |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Li, Peng |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5056834769 |
| authorships[3].author.orcid | https://orcid.org/0009-0003-4990-8466 |
| authorships[3].author.display_name | Jianqing Zhang |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Zhang, Jianqing |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5100442576 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-5331-7015 |
| authorships[4].author.display_name | Jingjing Liu |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Liu, Jingjing |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5107228559 |
| authorships[5].author.orcid | https://orcid.org/0000-0003-4515-6212 |
| authorships[5].author.display_name | Ya-Qin Zhang |
| authorships[5].author_position | last |
| authorships[5].raw_author_name | Zhang, Ya-Qin |
| authorships[5].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2406.12527 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | FuseGen: PLM Fusion for Data-generation based Zero-shot Learning |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11307 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9787999987602234 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Domain Adaptation and Few-Shot Learning |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2748952813, https://openalex.org/W2074502265, https://openalex.org/W4214877189, https://openalex.org/W2773965352, https://openalex.org/W2381179799, https://openalex.org/W2980279061, https://openalex.org/W2334685461, https://openalex.org/W2366718574, https://openalex.org/W2359774528 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2406.12527 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | cc-by-nc-sa |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2406.12527 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by-nc-sa |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2406.12527 |
| primary_location.id | pmh:oai:arXiv.org:2406.12527 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | cc-by-nc-sa |
| primary_location.pdf_url | https://arxiv.org/pdf/2406.12527 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | https://openalex.org/licenses/cc-by-nc-sa |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2406.12527 |
| publication_date | 2024-06-18 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 68, 77, 142 |
| abstract_inverted_index.To | 61 |
| abstract_inverted_index.as | 117 |
| abstract_inverted_index.by | 16, 24 |
| abstract_inverted_index.in | 6, 137, 141, 148 |
| abstract_inverted_index.is | 21, 146 |
| abstract_inverted_index.of | 28 |
| abstract_inverted_index.on | 37 |
| abstract_inverted_index.to | 47, 57, 99 |
| abstract_inverted_index.we | 65 |
| abstract_inverted_index.PLM | 39 |
| abstract_inverted_index.STM | 139 |
| abstract_inverted_index.The | 93 |
| abstract_inverted_index.and | 50, 90 |
| abstract_inverted_index.are | 44, 111 |
| abstract_inverted_index.for | 80, 114 |
| abstract_inverted_index.low | 26 |
| abstract_inverted_index.new | 78 |
| abstract_inverted_index.the | 25 |
| abstract_inverted_index.via | 12, 86 |
| abstract_inverted_index.Code | 145 |
| abstract_inverted_index.Data | 0 |
| abstract_inverted_index.PLM, | 101 |
| abstract_inverted_index.PLMs | 89 |
| abstract_inverted_index.STMs | 110 |
| abstract_inverted_index.data | 70, 107, 121 |
| abstract_inverted_index.each | 100 |
| abstract_inverted_index.from | 53, 83 |
| abstract_inverted_index.have | 34 |
| abstract_inverted_index.such | 29, 63 |
| abstract_inverted_index.that | 75, 129 |
| abstract_inverted_index.then | 112 |
| abstract_inverted_index.used | 113 |
| abstract_inverted_index.way. | 144 |
| abstract_inverted_index.STMs. | 92 |
| abstract_inverted_index.Small | 8 |
| abstract_inverted_index.bias, | 64 |
| abstract_inverted_index.bias. | 60 |
| abstract_inverted_index.novel | 69 |
| abstract_inverted_index.often | 22, 51 |
| abstract_inverted_index.tasks | 127 |
| abstract_inverted_index.well, | 118 |
| abstract_inverted_index.where | 41 |
| abstract_inverted_index.(STMs) | 11 |
| abstract_inverted_index.Models | 10, 19 |
| abstract_inverted_index.across | 125 |
| abstract_inverted_index.chosen | 94 |
| abstract_inverted_index.highly | 135 |
| abstract_inverted_index.sample | 115 |
| abstract_inverted_index.severe | 58 |
| abstract_inverted_index.single | 38 |
| abstract_inverted_index.subset | 81, 95 |
| abstract_inverted_index.(PLMs), | 20 |
| abstract_inverted_index.FuseGen | 130 |
| abstract_inverted_index.Trained | 109 |
| abstract_inverted_index.dataset | 103 |
| abstract_inverted_index.deviate | 52 |
| abstract_inverted_index.diverse | 126 |
| abstract_inverted_index.focused | 36 |
| abstract_inverted_index.further | 119 |
| abstract_inverted_index.leading | 56 |
| abstract_inverted_index.limited | 23 |
| abstract_inverted_index.propose | 66 |
| abstract_inverted_index.quality | 27, 104 |
| abstract_inverted_index.through | 105 |
| abstract_inverted_index.trained | 91 |
| abstract_inverted_index.FuseGen, | 67 |
| abstract_inverted_index.Language | 18 |
| abstract_inverted_index.Previous | 32 |
| abstract_inverted_index.although | 4 |
| abstract_inverted_index.boosting | 138 |
| abstract_inverted_index.criteria | 79 |
| abstract_inverted_index.datasets | 14, 43, 85 |
| abstract_inverted_index.existing | 133 |
| abstract_inverted_index.feedback | 98 |
| abstract_inverted_index.learning | 73 |
| abstract_inverted_index.methods, | 134 |
| abstract_inverted_index.mitigate | 62 |
| abstract_inverted_index.multiple | 88 |
| abstract_inverted_index.provided | 147 |
| abstract_inverted_index.provides | 96 |
| abstract_inverted_index.quality. | 122 |
| abstract_inverted_index.specific | 48 |
| abstract_inverted_index.training | 7 |
| abstract_inverted_index.Extensive | 123 |
| abstract_inverted_index.datasets. | 31 |
| abstract_inverted_index.effective | 5, 136 |
| abstract_inverted_index.enhancing | 102 |
| abstract_inverted_index.framework | 74 |
| abstract_inverted_index.generated | 15 |
| abstract_inverted_index.improving | 120 |
| abstract_inverted_index.iterative | 106 |
| abstract_inverted_index.learning, | 3 |
| abstract_inverted_index.primarily | 35 |
| abstract_inverted_index.selection | 82 |
| abstract_inverted_index.settings, | 40 |
| abstract_inverted_index.solutions | 33 |
| abstract_inverted_index.synthetic | 13, 30, 42, 84 |
| abstract_inverted_index.typically | 45 |
| abstract_inverted_index.utilizing | 87 |
| abstract_inverted_index.zero-shot | 2, 72 |
| abstract_inverted_index.in-context | 97 |
| abstract_inverted_index.introduces | 76 |
| abstract_inverted_index.real-world | 54 |
| abstract_inverted_index.restricted | 46 |
| abstract_inverted_index.sub-spaces | 49 |
| abstract_inverted_index.Pre-trained | 17 |
| abstract_inverted_index.demonstrate | 128 |
| abstract_inverted_index.experiments | 124 |
| abstract_inverted_index.generation. | 108 |
| abstract_inverted_index.outperforms | 132 |
| abstract_inverted_index.performance | 140 |
| abstract_inverted_index.PLM-agnostic | 143 |
| abstract_inverted_index.distribution | 59 |
| abstract_inverted_index.re-weighting | 116 |
| abstract_inverted_index.Task-specific | 9 |
| abstract_inverted_index.substantially | 131 |
| abstract_inverted_index.distributions, | 55 |
| abstract_inverted_index.generation-based | 1, 71 |
| abstract_inverted_index.https://github.com/LindaLydia/FuseGen. | 149 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 6 |
| citation_normalized_percentile |
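
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each token to the word positions where it occurs. A minimal sketch of turning such rows back into running text follows; the helper names `parse_rows` and `rebuild_abstract` are made up for this illustration, and the parser assumes the two-cell `| key | value |` row layout used in the table above.

```python
from typing import Dict, Iterable, List

PREFIX = "abstract_inverted_index."


def parse_rows(rows: Iterable[str]) -> Dict[str, List[int]]:
    """Collect token -> positions from '| abstract_inverted_index.X | 1, 2 |' rows."""
    index: Dict[str, List[int]] = {}
    for row in rows:
        cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
        if len(cells) != 2 or not cells[0].startswith(PREFIX):
            continue  # not an inverted-index row
        token = cells[0][len(PREFIX):]
        index[token] = [int(pos) for pos in cells[1].split(",")]
    return index


def rebuild_abstract(index: Dict[str, List[int]]) -> str:
    """Place every token at each of its positions, then join in position order."""
    by_position = {pos: token for token, positions in index.items() for pos in positions}
    return " ".join(by_position[pos] for pos in sorted(by_position))


rows = [
    "| abstract_inverted_index.existing | 133 |",
    "| abstract_inverted_index.methods, | 134 |",
    "| abstract_inverted_index.FuseGen | 130 |",
    "| abstract_inverted_index.outperforms | 132 |",
    "| abstract_inverted_index.substantially | 131 |",
    "| language | en |",
]
print(rebuild_abstract(parse_rows(rows)))
# prints: FuseGen substantially outperforms existing methods,
```

Positions missing from the input are simply skipped, so feeding in only some rows yields a fragment rather than an error; feeding in all rows recovers the abstract with its original token order and punctuation.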