Vision-Language Matching for Text-to-Image Synthesis via Generative Adversarial Networks
2022 · Open Access · DOI: https://doi.org/10.48550/arxiv.2208.09596
Text-to-image synthesis aims to generate a photo-realistic and semantically consistent image from a specific text description. Images synthesized by off-the-shelf models usually contain limited components compared with the corresponding image and text description, which degrades image quality and textual-visual consistency. To address this issue, we propose a novel Vision-Language Matching strategy for text-to-image synthesis, named VLMGAN*, which introduces a dual vision-language matching mechanism to strengthen image quality and semantic consistency. The dual vision-language matching mechanism considers textual-visual matching between the generated image and the corresponding text description, and visual-visual consistency constraints between the synthesized image and the real image. Given a specific text description, VLMGAN* first encodes it into textual features and then feeds them to a dual vision-language matching-based generative model to synthesize a photo-realistic and semantically consistent image. In addition, the popular evaluation metrics for text-to-image synthesis are borrowed from simple image generation and mainly evaluate the realism and diversity of the synthesized images. Therefore, we introduce a metric named Vision-Language Matching Score (VLMS) to evaluate text-to-image synthesis, which considers both image quality and the semantic consistency between the synthesized image and the description. The proposed dual multi-level vision-language matching strategy can be applied to other text-to-image synthesis methods. We implement this strategy on two popular baselines, denoted ${\text{VLMGAN}_{+\text{AttnGAN}}}$ and ${\text{VLMGAN}_{+\text{DFGAN}}}$. Experimental results on two widely used datasets show that the model achieves significant improvements over other state-of-the-art methods.
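The dual matching idea in the abstract can be illustrated with a small sketch. The abstract does not give the actual loss, so everything here is an assumption made for illustration: the function names `cosine` and `dual_matching_loss`, the use of cosine similarity, the plain feature vectors, and the two weights are hypothetical and are not taken from the paper. The sketch only shows the shape of the idea: one term compares the generated image with the text, the other compares the generated image with the real image.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def dual_matching_loss(fake_img, real_img, text, weight_tv=1.0, weight_vv=1.0):
    """Hypothetical dual matching penalty (not the paper's formulation).

    fake_img, real_img, text: feature vectors assumed to live in one shared
    embedding space; how they are produced is outside this sketch.
    """
    # Textual-visual term: generated image should match the description.
    textual_visual = 1.0 - cosine(fake_img, text)
    # Visual-visual term: generated image should stay close to the real image.
    visual_visual = 1.0 - cosine(fake_img, real_img)
    return weight_tv * textual_visual + weight_vv * visual_visual


if __name__ == "__main__":
    # Toy vectors: the generated image matches the text but not the real image.
    print(dual_matching_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0]))
```

In this toy case only the visual-visual term contributes, since the generated features align with the text features but are orthogonal to the real-image features.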
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2208.09596, https://arxiv.org/pdf/2208.09596
- OA Status: green
- Cited By: 4
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4292957667
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4292957667 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2208.09596 (Digital Object Identifier)
- Title: Vision-Language Matching for Text-to-Image Synthesis via Generative Adversarial Networks (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2022 (year of publication)
- Publication date: 2022-08-20 (full publication date if available)
- Authors: Qingrong Cheng, Keyu Wen, Xiaodong Gu (list of authors in order)
- Landing page: https://arxiv.org/abs/2208.09596 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2208.09596 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2208.09596 (direct OA link when available)
- Concepts: Computer science, Consistency (knowledge bases), Image (mathematics), Artificial intelligence, Matching (statistics), Generative grammar, Image synthesis, Dual (grammatical number), Semantics (computer science), Metric (unit), Natural language processing, Computer vision, Information retrieval, Pattern recognition (psychology), Linguistics, Mathematics, Programming language, Statistics, Economics, Philosophy, Operations management (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 4 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 3, 2024: 1 (per-year citation counts, last 5 years)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4292957667 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2208.09596 |
| ids.doi | https://doi.org/10.48550/arxiv.2208.09596 |
| ids.openalex | https://openalex.org/W4292957667 |
| fwci | |
| type | preprint |
| title | Vision-Language Matching for Text-to-Image Synthesis via Generative Adversarial Networks |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11714 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9835000038146973 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Multimodal Machine Learning Applications |
| topics[1].id | https://openalex.org/T10775 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.982200026512146 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1707 |
| topics[1].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[1].display_name | Generative Adversarial Networks and Image Synthesis |
| topics[2].id | https://openalex.org/T10627 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9761000275611877 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1707 |
| topics[2].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[2].display_name | Advanced Image and Video Retrieval Techniques |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.7517380714416504 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C2776436953 |
| concepts[1].level | 2 |
| concepts[1].score | 0.7331121563911438 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q5163215 |
| concepts[1].display_name | Consistency (knowledge bases) |
| concepts[2].id | https://openalex.org/C115961682 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6106201410293579 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q860623 |
| concepts[2].display_name | Image (mathematics) |
| concepts[3].id | https://openalex.org/C154945302 |
| concepts[3].level | 1 |
| concepts[3].score | 0.6010429859161377 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[3].display_name | Artificial intelligence |
| concepts[4].id | https://openalex.org/C165064840 |
| concepts[4].level | 2 |
| concepts[4].score | 0.5813676118850708 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q1321061 |
| concepts[4].display_name | Matching (statistics) |
| concepts[5].id | https://openalex.org/C39890363 |
| concepts[5].level | 2 |
| concepts[5].score | 0.5097548365592957 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q36108 |
| concepts[5].display_name | Generative grammar |
| concepts[6].id | https://openalex.org/C2989087649 |
| concepts[6].level | 3 |
| concepts[6].score | 0.43131938576698303 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q176953 |
| concepts[6].display_name | Image synthesis |
| concepts[7].id | https://openalex.org/C2780980858 |
| concepts[7].level | 2 |
| concepts[7].score | 0.42436468601226807 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q110022 |
| concepts[7].display_name | Dual (grammatical number) |
| concepts[8].id | https://openalex.org/C184337299 |
| concepts[8].level | 2 |
| concepts[8].score | 0.4216313660144806 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q1437428 |
| concepts[8].display_name | Semantics (computer science) |
| concepts[9].id | https://openalex.org/C176217482 |
| concepts[9].level | 2 |
| concepts[9].score | 0.41704171895980835 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q860554 |
| concepts[9].display_name | Metric (unit) |
| concepts[10].id | https://openalex.org/C204321447 |
| concepts[10].level | 1 |
| concepts[10].score | 0.399132639169693 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[10].display_name | Natural language processing |
| concepts[11].id | https://openalex.org/C31972630 |
| concepts[11].level | 1 |
| concepts[11].score | 0.3591722846031189 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q844240 |
| concepts[11].display_name | Computer vision |
| concepts[12].id | https://openalex.org/C23123220 |
| concepts[12].level | 1 |
| concepts[12].score | 0.3397189676761627 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q816826 |
| concepts[12].display_name | Information retrieval |
| concepts[13].id | https://openalex.org/C153180895 |
| concepts[13].level | 2 |
| concepts[13].score | 0.32047832012176514 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q7148389 |
| concepts[13].display_name | Pattern recognition (psychology) |
| concepts[14].id | https://openalex.org/C41895202 |
| concepts[14].level | 1 |
| concepts[14].score | 0.11056378483772278 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q8162 |
| concepts[14].display_name | Linguistics |
| concepts[15].id | https://openalex.org/C33923547 |
| concepts[15].level | 0 |
| concepts[15].score | 0.10984760522842407 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[15].display_name | Mathematics |
| concepts[16].id | https://openalex.org/C199360897 |
| concepts[16].level | 1 |
| concepts[16].score | 0.0 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q9143 |
| concepts[16].display_name | Programming language |
| concepts[17].id | https://openalex.org/C105795698 |
| concepts[17].level | 1 |
| concepts[17].score | 0.0 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q12483 |
| concepts[17].display_name | Statistics |
| concepts[18].id | https://openalex.org/C162324750 |
| concepts[18].level | 0 |
| concepts[18].score | 0.0 |
| concepts[18].wikidata | https://www.wikidata.org/wiki/Q8134 |
| concepts[18].display_name | Economics |
| concepts[19].id | https://openalex.org/C138885662 |
| concepts[19].level | 0 |
| concepts[19].score | 0.0 |
| concepts[19].wikidata | https://www.wikidata.org/wiki/Q5891 |
| concepts[19].display_name | Philosophy |
| concepts[20].id | https://openalex.org/C21547014 |
| concepts[20].level | 1 |
| concepts[20].score | 0.0 |
| concepts[20].wikidata | https://www.wikidata.org/wiki/Q1423657 |
| concepts[20].display_name | Operations management |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.7517380714416504 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/consistency |
| keywords[1].score | 0.7331121563911438 |
| keywords[1].display_name | Consistency (knowledge bases) |
| keywords[2].id | https://openalex.org/keywords/image |
| keywords[2].score | 0.6106201410293579 |
| keywords[2].display_name | Image (mathematics) |
| keywords[3].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[3].score | 0.6010429859161377 |
| keywords[3].display_name | Artificial intelligence |
| keywords[4].id | https://openalex.org/keywords/matching |
| keywords[4].score | 0.5813676118850708 |
| keywords[4].display_name | Matching (statistics) |
| keywords[5].id | https://openalex.org/keywords/generative-grammar |
| keywords[5].score | 0.5097548365592957 |
| keywords[5].display_name | Generative grammar |
| keywords[6].id | https://openalex.org/keywords/image-synthesis |
| keywords[6].score | 0.43131938576698303 |
| keywords[6].display_name | Image synthesis |
| keywords[7].id | https://openalex.org/keywords/dual |
| keywords[7].score | 0.42436468601226807 |
| keywords[7].display_name | Dual (grammatical number) |
| keywords[8].id | https://openalex.org/keywords/semantics |
| keywords[8].score | 0.4216313660144806 |
| keywords[8].display_name | Semantics (computer science) |
| keywords[9].id | https://openalex.org/keywords/metric |
| keywords[9].score | 0.41704171895980835 |
| keywords[9].display_name | Metric (unit) |
| keywords[10].id | https://openalex.org/keywords/natural-language-processing |
| keywords[10].score | 0.399132639169693 |
| keywords[10].display_name | Natural language processing |
| keywords[11].id | https://openalex.org/keywords/computer-vision |
| keywords[11].score | 0.3591722846031189 |
| keywords[11].display_name | Computer vision |
| keywords[12].id | https://openalex.org/keywords/information-retrieval |
| keywords[12].score | 0.3397189676761627 |
| keywords[12].display_name | Information retrieval |
| keywords[13].id | https://openalex.org/keywords/pattern-recognition |
| keywords[13].score | 0.32047832012176514 |
| keywords[13].display_name | Pattern recognition (psychology) |
| keywords[14].id | https://openalex.org/keywords/linguistics |
| keywords[14].score | 0.11056378483772278 |
| keywords[14].display_name | Linguistics |
| keywords[15].id | https://openalex.org/keywords/mathematics |
| keywords[15].score | 0.10984760522842407 |
| keywords[15].display_name | Mathematics |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2208.09596 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2208.09596 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2208.09596 |
| locations[1].id | doi:10.48550/arxiv.2208.09596 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2208.09596 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5101178676 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Qingrong Cheng |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Cheng, Qingrong |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5004061050 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-5048-9014 |
| authorships[1].author.display_name | Keyu Wen |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Wen, Keyu |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5101294804 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Xiaodong Gu |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Gu, Xiaodong |
| authorships[2].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2208.09596 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2022-08-24T00:00:00 |
| display_name | Vision-Language Matching for Text-to-Image Synthesis via Generative Adversarial Networks |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11714 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9835000038146973 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Multimodal Machine Learning Applications |
| related_works | https://openalex.org/W2380075625, https://openalex.org/W4390718435, https://openalex.org/W4390549206, https://openalex.org/W3137171911, https://openalex.org/W4237784285, https://openalex.org/W2374712251, https://openalex.org/W4383031710, https://openalex.org/W3165231707, https://openalex.org/W4297411772, https://openalex.org/W4366834432 |
| cited_by_count | 4 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 3 |
| counts_by_year[1].year | 2024 |
| counts_by_year[1].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2208.09596 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2208.09596 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2208.09596 |
| primary_location.id | pmh:oai:arXiv.org:2208.09596 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2208.09596 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2208.09596 |
| publication_date | 2022-08-20 |
| publication_year | 2022 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 5, 12, 49, 61, 104, 120, 128, 163 |
| abstract_inverted_index.To | 43 |
| abstract_inverted_index.We | 209 |
| abstract_inverted_index.be | 202 |
| abstract_inverted_index.by | 19 |
| abstract_inverted_index.it | 111 |
| abstract_inverted_index.of | 156, 174 |
| abstract_inverted_index.on | 213, 227 |
| abstract_inverted_index.to | 3, 66, 119, 126, 170, 204 |
| abstract_inverted_index.we | 47, 161 |
| abstract_inverted_index.The | 16, 74, 194, 224 |
| abstract_inverted_index.and | 7, 31, 39, 71, 86, 91, 99, 115, 130, 154, 184, 191, 222 |
| abstract_inverted_index.are | 143, 218 |
| abstract_inverted_index.can | 178, 201 |
| abstract_inverted_index.for | 54, 140 |
| abstract_inverted_index.the | 28, 36, 40, 68, 83, 87, 96, 100, 136, 152, 157, 172, 181, 185, 192, 233 |
| abstract_inverted_index.two | 214, 228 |
| abstract_inverted_index.aims | 2 |
| abstract_inverted_index.both | 180 |
| abstract_inverted_index.dual | 62, 75, 121, 196 |
| abstract_inverted_index.from | 11, 145 |
| abstract_inverted_index.into | 112 |
| abstract_inverted_index.over | 238 |
| abstract_inverted_index.real | 101 |
| abstract_inverted_index.show | 231 |
| abstract_inverted_index.text | 14, 32, 89, 106 |
| abstract_inverted_index.that | 232 |
| abstract_inverted_index.them | 118 |
| abstract_inverted_index.then | 116 |
| abstract_inverted_index.this | 45, 211 |
| abstract_inverted_index.with | 27, 220 |
| abstract_inverted_index.Given | 103 |
| abstract_inverted_index.Score | 168 |
| abstract_inverted_index.feeds | 117 |
| abstract_inverted_index.image | 10, 30, 37, 69, 85, 98, 147, 182, 190 |
| abstract_inverted_index.model | 125, 234 |
| abstract_inverted_index.named | 57, 165 |
| abstract_inverted_index.novel | 50 |
| abstract_inverted_index.other | 205, 239 |
| abstract_inverted_index.which | 34, 59, 149, 177, 217 |
| abstract_inverted_index.(VLMS) | 169 |
| abstract_inverted_index.image. | 102, 134 |
| abstract_inverted_index.images | 17 |
| abstract_inverted_index.issue, | 46 |
| abstract_inverted_index.mainly | 150 |
| abstract_inverted_index.marked | 219 |
| abstract_inverted_index.metric | 164 |
| abstract_inverted_index.models | 21 |
| abstract_inverted_index.simple | 146 |
| abstract_inverted_index.VLMGAN* | 108 |
| abstract_inverted_index.address | 44 |
| abstract_inverted_index.applied | 203 |
| abstract_inverted_index.between | 82, 95, 188 |
| abstract_inverted_index.contain | 23 |
| abstract_inverted_index.encodes | 110 |
| abstract_inverted_index.firstly | 109 |
| abstract_inverted_index.images. | 159 |
| abstract_inverted_index.limited | 24 |
| abstract_inverted_index.metrics | 139 |
| abstract_inverted_index.popular | 137, 215 |
| abstract_inverted_index.propose | 48 |
| abstract_inverted_index.quality | 38, 70, 183 |
| abstract_inverted_index.reality | 153 |
| abstract_inverted_index.results | 226 |
| abstract_inverted_index.textual | 113, 131 |
| abstract_inverted_index.usually | 22 |
| abstract_inverted_index.Besides, | 135 |
| abstract_inverted_index.Matching | 52, 167 |
| abstract_inverted_index.VLMGAN*, | 58 |
| abstract_inverted_index.achieves | 235 |
| abstract_inverted_index.borrowed | 144 |
| abstract_inverted_index.compared | 26 |
| abstract_inverted_index.consider | 179 |
| abstract_inverted_index.datasets | 230 |
| abstract_inverted_index.evaluate | 171 |
| abstract_inverted_index.features | 114 |
| abstract_inverted_index.generate | 4 |
| abstract_inverted_index.matching | 64, 77, 81, 199 |
| abstract_inverted_index.methods. | 208, 241 |
| abstract_inverted_index.proposed | 195 |
| abstract_inverted_index.semantic | 8, 72, 132, 186 |
| abstract_inverted_index.specific | 13, 105 |
| abstract_inverted_index.strategy | 53, 200, 212 |
| abstract_inverted_index.considers | 79 |
| abstract_inverted_index.decreases | 35 |
| abstract_inverted_index.diversity | 155 |
| abstract_inverted_index.evaluates | 151 |
| abstract_inverted_index.generated | 84 |
| abstract_inverted_index.implement | 210 |
| abstract_inverted_index.introduce | 162 |
| abstract_inverted_index.mechanism | 65, 78 |
| abstract_inverted_index.synthesis | 1, 142, 176, 207 |
| abstract_inverted_index.Therefore, | 160 |
| abstract_inverted_index.baselines, | 216 |
| abstract_inverted_index.components | 25 |
| abstract_inverted_index.consistent | 9, 93, 133 |
| abstract_inverted_index.evaluation | 138 |
| abstract_inverted_index.generative | 124 |
| abstract_inverted_index.introduces | 60 |
| abstract_inverted_index.strengthen | 67 |
| abstract_inverted_index.synthesis, | 56 |
| abstract_inverted_index.synthesize | 127 |
| abstract_inverted_index.consistency | 187 |
| abstract_inverted_index.constraints | 94 |
| abstract_inverted_index.generation, | 148 |
| abstract_inverted_index.multi-level | 197 |
| abstract_inverted_index.performance | 173 |
| abstract_inverted_index.significant | 236 |
| abstract_inverted_index.synthesized | 18, 97, 158, 189 |
| abstract_inverted_index.widely-used | 229 |
| abstract_inverted_index.consistency. | 42, 73 |
| abstract_inverted_index.description, | 33, 90, 107 |
| abstract_inverted_index.description. | 15, 193 |
| abstract_inverted_index.experimental | 225 |
| abstract_inverted_index.improvements | 237 |
| abstract_inverted_index.Text-to-image | 0 |
| abstract_inverted_index.corresponding | 29, 88 |
| abstract_inverted_index.off-the-shelf | 20 |
| abstract_inverted_index.text-to-image | 55, 141, 175, 206 |
| abstract_inverted_index.visual-visual | 92 |
| abstract_inverted_index.matching-based | 123 |
| abstract_inverted_index.textual-visual | 41, 80 |
| abstract_inverted_index.Vision-Language | 51, 166 |
| abstract_inverted_index.photo-realistic | 6, 129 |
| abstract_inverted_index.vision-language | 63, 76, 122, 198 |
| abstract_inverted_index.state-of-the-art | 240 |
| abstract_inverted_index.${\text{VLMGAN}_{+\text{DFGAN}}}$. | 223 |
| abstract_inverted_index.${\text{VLMGAN}_{+\text{AttnGAN}}}$ | 221 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 3 |
| citation_normalized_percentile |
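
The `abstract_inverted_index.*` rows in the payload above store the abstract as a map from each token to the word positions where it occurs. A minimal sketch of turning such a map back into running text follows; the function name `rebuild_abstract` is hypothetical, and the sample dictionary is a hand-copied excerpt limited to positions 0 through 6 of the rows above, not the full index.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {token: [positions]} inverted index."""
    # Invert the mapping: each position holds exactly one token.
    token_at = {}
    for token, positions in inverted_index.items():
        for position in positions:
            token_at[position] = token
    # Emit tokens in position order, separated by single spaces.
    return " ".join(token_at[i] for i in sorted(token_at))


# Excerpt of the index above, truncated to the first seven word positions.
sample_index = {
    "Text-to-image": [0],
    "synthesis": [1],
    "aims": [2],
    "to": [3],
    "generate": [4],
    "a": [5],
    "photo-realistic": [6],
}

if __name__ == "__main__":
    print(rebuild_abstract(sample_index))
    # prints: Text-to-image synthesis aims to generate a photo-realistic
```

Sorting the positions rather than assuming a dense range keeps the function usable on partial excerpts like the one here, where some positions are missing.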