ArGue: Attribute-Guided Prompt Tuning for Vision-Language Models
2023 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2311.16494
Although soft prompt tuning is effective in efficiently adapting Vision-Language (V&L) models for downstream tasks, it shows limitations in dealing with distribution shifts. We address this issue with Attribute-Guided Prompt Tuning (ArGue), making three key contributions. 1) In contrast to the conventional approach of directly appending soft prompts preceding class names, we align the model with primitive visual attributes generated by Large Language Models (LLMs). We posit that a model's ability to express high confidence in these attributes signifies its capacity to discern the correct class rationales. 2) We introduce attribute sampling to eliminate disadvantageous attributes, thus only semantically meaningful attributes are preserved. 3) We propose negative prompting, explicitly enumerating class-agnostic attributes to activate spurious correlations and encourage the model to generate highly orthogonal probability distributions in relation to these negative features. In experiments, our method significantly outperforms current state-of-the-art prompt tuning methods on both novel class prediction and out-of-distribution generalization tasks.
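To make the first contribution concrete, here is a minimal sketch of attribute-based class scoring. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how attribute scores are aggregated, so averaging cosine similarities per class is an assumption, as are the function names, the `temperature` parameter, and the toy 2-D feature vectors (real features would come from a V&L model's image and text encoders).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(xs):
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def class_probs(image_feat, attr_feats_per_class, temperature=0.01):
    """Score each class by the mean similarity between the image feature
    and that class's attribute text features (ASSUMED aggregation)."""
    logits = []
    for attr_feats in attr_feats_per_class:
        sims = [cosine(image_feat, a) for a in attr_feats]
        logits.append(sum(sims) / len(sims) / temperature)
    return softmax(logits)


# Toy, hand-made features (hypothetical): two classes, two attributes each.
image = [1.0, 0.0]
attrs = [
    [[1.0, 0.0], [0.9, 0.1]],  # attributes of class 0
    [[0.0, 1.0], [0.1, 0.9]],  # attributes of class 1
]
probs = class_probs(image, attrs)
print(probs)
```

In this toy setup the image feature aligns with the attributes of class 0, so class 0 receives nearly all of the probability mass. Attribute sampling and negative prompting are not modeled here.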
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2311.16494, https://arxiv.org/pdf/2311.16494
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4389156639
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4389156639 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2311.16494 (Digital Object Identifier)
- Title: ArGue: Attribute-Guided Prompt Tuning for Vision-Language Models (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2023 (year of publication)
- Publication date: 2023-11-27 (full publication date if available)
- Authors: Xinyu Tian, Shu Zou, Zhaoyuan Yang, Jing Zhang (list of authors in order)
- Landing page: https://arxiv.org/abs/2311.16494 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2311.16494 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2311.16494 (direct OA link when available)
- Concepts: Spurious relationship, Computer science, Generalization, Class (philosophy), Contrast (vision), Language model, Artificial intelligence, Key (lock), Machine learning, Relation (database), Data mining, Mathematics, Computer security, Mathematical analysis (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4389156639 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2311.16494 |
| ids.doi | https://doi.org/10.48550/arxiv.2311.16494 |
| ids.openalex | https://openalex.org/W4389156639 |
| fwci | |
| type | preprint |
| title | ArGue: Attribute-Guided Prompt Tuning for Vision-Language Models |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11714 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9980999827384949 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Multimodal Machine Learning Applications |
| topics[1].id | https://openalex.org/T11307 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9911999702453613 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Domain Adaptation and Few-Shot Learning |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C97256817 |
| concepts[0].level | 2 |
| concepts[0].score | 0.7874338626861572 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q1462316 |
| concepts[0].display_name | Spurious relationship |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.7746772766113281 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C177148314 |
| concepts[2].level | 2 |
| concepts[2].score | 0.7599666118621826 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q170084 |
| concepts[2].display_name | Generalization |
| concepts[3].id | https://openalex.org/C2777212361 |
| concepts[3].level | 2 |
| concepts[3].score | 0.6446007490158081 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q5127848 |
| concepts[3].display_name | Class (philosophy) |
| concepts[4].id | https://openalex.org/C2776502983 |
| concepts[4].level | 2 |
| concepts[4].score | 0.57625812292099 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q690182 |
| concepts[4].display_name | Contrast (vision) |
| concepts[5].id | https://openalex.org/C137293760 |
| concepts[5].level | 2 |
| concepts[5].score | 0.5646227598190308 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q3621696 |
| concepts[5].display_name | Language model |
| concepts[6].id | https://openalex.org/C154945302 |
| concepts[6].level | 1 |
| concepts[6].score | 0.550142228603363 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[6].display_name | Artificial intelligence |
| concepts[7].id | https://openalex.org/C26517878 |
| concepts[7].level | 2 |
| concepts[7].score | 0.49235451221466064 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q228039 |
| concepts[7].display_name | Key (lock) |
| concepts[8].id | https://openalex.org/C119857082 |
| concepts[8].level | 1 |
| concepts[8].score | 0.4618946313858032 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[8].display_name | Machine learning |
| concepts[9].id | https://openalex.org/C25343380 |
| concepts[9].level | 2 |
| concepts[9].score | 0.4121745228767395 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q277521 |
| concepts[9].display_name | Relation (database) |
| concepts[10].id | https://openalex.org/C124101348 |
| concepts[10].level | 1 |
| concepts[10].score | 0.1986425518989563 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q172491 |
| concepts[10].display_name | Data mining |
| concepts[11].id | https://openalex.org/C33923547 |
| concepts[11].level | 0 |
| concepts[11].score | 0.10394129157066345 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[11].display_name | Mathematics |
| concepts[12].id | https://openalex.org/C38652104 |
| concepts[12].level | 1 |
| concepts[12].score | 0.0 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q3510521 |
| concepts[12].display_name | Computer security |
| concepts[13].id | https://openalex.org/C134306372 |
| concepts[13].level | 1 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q7754 |
| concepts[13].display_name | Mathematical analysis |
| keywords[0].id | https://openalex.org/keywords/spurious-relationship |
| keywords[0].score | 0.7874338626861572 |
| keywords[0].display_name | Spurious relationship |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.7746772766113281 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/generalization |
| keywords[2].score | 0.7599666118621826 |
| keywords[2].display_name | Generalization |
| keywords[3].id | https://openalex.org/keywords/class |
| keywords[3].score | 0.6446007490158081 |
| keywords[3].display_name | Class (philosophy) |
| keywords[4].id | https://openalex.org/keywords/contrast |
| keywords[4].score | 0.57625812292099 |
| keywords[4].display_name | Contrast (vision) |
| keywords[5].id | https://openalex.org/keywords/language-model |
| keywords[5].score | 0.5646227598190308 |
| keywords[5].display_name | Language model |
| keywords[6].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[6].score | 0.550142228603363 |
| keywords[6].display_name | Artificial intelligence |
| keywords[7].id | https://openalex.org/keywords/key |
| keywords[7].score | 0.49235451221466064 |
| keywords[7].display_name | Key (lock) |
| keywords[8].id | https://openalex.org/keywords/machine-learning |
| keywords[8].score | 0.4618946313858032 |
| keywords[8].display_name | Machine learning |
| keywords[9].id | https://openalex.org/keywords/relation |
| keywords[9].score | 0.4121745228767395 |
| keywords[9].display_name | Relation (database) |
| keywords[10].id | https://openalex.org/keywords/data-mining |
| keywords[10].score | 0.1986425518989563 |
| keywords[10].display_name | Data mining |
| keywords[11].id | https://openalex.org/keywords/mathematics |
| keywords[11].score | 0.10394129157066345 |
| keywords[11].display_name | Mathematics |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2311.16494 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://arxiv.org/pdf/2311.16494 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2311.16494 |
| locations[1].id | doi:10.48550/arxiv.2311.16494 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2311.16494 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5029944382 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-1247-6076 |
| authorships[0].author.display_name | Xinyu Tian |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Tian, Xinyu |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5101292475 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Shu Zou |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Zou, Shu |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5056835845 |
| authorships[2].author.orcid | https://orcid.org/0000-0003-0011-0093 |
| authorships[2].author.display_name | Zhaoyuan Yang |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Yang, Zhaoyuan |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5014482509 |
| authorships[3].author.orcid | https://orcid.org/0000-0003-2541-4923 |
| authorships[3].author.display_name | Jing Zhang |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Zhang, Jing |
| authorships[3].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2311.16494 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | ArGue: Attribute-Guided Prompt Tuning for Vision-Language Models |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11714 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9980999827384949 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Multimodal Machine Learning Applications |
| related_works | https://openalex.org/W2162899405, https://openalex.org/W3113091479, https://openalex.org/W941090075, https://openalex.org/W2044987316, https://openalex.org/W3134374554, https://openalex.org/W2237480245, https://openalex.org/W2075065631, https://openalex.org/W2519167559, https://openalex.org/W4288358396, https://openalex.org/W4311248832 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2311.16494 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2311.16494 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2311.16494 |
| primary_location.id | pmh:oai:arXiv.org:2311.16494 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://arxiv.org/pdf/2311.16494 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2311.16494 |
| publication_date | 2023-11-27 |
| publication_year | 2023 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 68 |
| abstract_inverted_index.1) | 36 |
| abstract_inverted_index.2) | 87 |
| abstract_inverted_index.3) | 103 |
| abstract_inverted_index.In | 37, 132 |
| abstract_inverted_index.We | 23, 65, 88, 104 |
| abstract_inverted_index.by | 60 |
| abstract_inverted_index.in | 6, 18, 75, 126 |
| abstract_inverted_index.is | 4 |
| abstract_inverted_index.it | 15 |
| abstract_inverted_index.of | 43 |
| abstract_inverted_index.on | 143 |
| abstract_inverted_index.to | 39, 71, 81, 92, 112, 120, 128 |
| abstract_inverted_index.we | 51 |
| abstract_inverted_index.and | 116, 148 |
| abstract_inverted_index.are | 101 |
| abstract_inverted_index.for | 12 |
| abstract_inverted_index.its | 79 |
| abstract_inverted_index.key | 34 |
| abstract_inverted_index.our | 134 |
| abstract_inverted_index.the | 40, 53, 83, 118 |
| abstract_inverted_index.both | 144 |
| abstract_inverted_index.high | 73 |
| abstract_inverted_index.only | 97 |
| abstract_inverted_index.soft | 1, 46 |
| abstract_inverted_index.that | 67 |
| abstract_inverted_index.this | 25 |
| abstract_inverted_index.thus | 96 |
| abstract_inverted_index.with | 20, 27, 55 |
| abstract_inverted_index.Large | 61 |
| abstract_inverted_index.align | 52 |
| abstract_inverted_index.class | 49, 85, 146 |
| abstract_inverted_index.issue | 26 |
| abstract_inverted_index.model | 54, 119 |
| abstract_inverted_index.novel | 145 |
| abstract_inverted_index.posit | 66 |
| abstract_inverted_index.shows | 16 |
| abstract_inverted_index.these | 76, 129 |
| abstract_inverted_index.three | 33 |
| abstract_inverted_index.Models | 63 |
| abstract_inverted_index.Prompt | 29 |
| abstract_inverted_index.Tuning | 30 |
| abstract_inverted_index.highly | 122 |
| abstract_inverted_index.making | 32 |
| abstract_inverted_index.method | 135 |
| abstract_inverted_index.models | 11 |
| abstract_inverted_index.names, | 50 |
| abstract_inverted_index.prompt | 2, 140 |
| abstract_inverted_index.tasks, | 14 |
| abstract_inverted_index.tasks. | 151 |
| abstract_inverted_index.tuning | 3, 141 |
| abstract_inverted_index.visual | 57 |
| abstract_inverted_index.(LLMs). | 64 |
| abstract_inverted_index.ability | 70 |
| abstract_inverted_index.address | 24 |
| abstract_inverted_index.correct | 84 |
| abstract_inverted_index.current | 138 |
| abstract_inverted_index.dealing | 19 |
| abstract_inverted_index.discern | 82 |
| abstract_inverted_index.express | 72 |
| abstract_inverted_index.methods | 142 |
| abstract_inverted_index.model's | 69 |
| abstract_inverted_index.prompts | 47 |
| abstract_inverted_index.propose | 105 |
| abstract_inverted_index.shifts. | 22 |
| abstract_inverted_index.(ArGue), | 31 |
| abstract_inverted_index.Although | 0 |
| abstract_inverted_index.Language | 62 |
| abstract_inverted_index.activate | 113 |
| abstract_inverted_index.adapting | 8 |
| abstract_inverted_index.approach | 42 |
| abstract_inverted_index.capacity | 80 |
| abstract_inverted_index.contrast | 38 |
| abstract_inverted_index.directly | 44 |
| abstract_inverted_index.generate | 121 |
| abstract_inverted_index.negative | 106, 130 |
| abstract_inverted_index.relation | 127 |
| abstract_inverted_index.sampling | 91 |
| abstract_inverted_index.spurious | 114 |
| abstract_inverted_index.(V&L) | 10 |
| abstract_inverted_index.appending | 45 |
| abstract_inverted_index.attribute | 90 |
| abstract_inverted_index.effective | 5 |
| abstract_inverted_index.eliminate | 93 |
| abstract_inverted_index.encourage | 117 |
| abstract_inverted_index.features. | 131 |
| abstract_inverted_index.generated | 59 |
| abstract_inverted_index.introduce | 89 |
| abstract_inverted_index.preceding | 48 |
| abstract_inverted_index.primitive | 56 |
| abstract_inverted_index.signifies | 78 |
| abstract_inverted_index.attributes | 58, 77, 100, 111 |
| abstract_inverted_index.confidence | 74 |
| abstract_inverted_index.downstream | 13 |
| abstract_inverted_index.explicitly | 108 |
| abstract_inverted_index.meaningful | 99 |
| abstract_inverted_index.orthogonal | 123 |
| abstract_inverted_index.prediction | 147 |
| abstract_inverted_index.preserved. | 102 |
| abstract_inverted_index.prompting, | 107 |
| abstract_inverted_index.attributes, | 95 |
| abstract_inverted_index.efficiently | 7 |
| abstract_inverted_index.enumerating | 109 |
| abstract_inverted_index.limitations | 17 |
| abstract_inverted_index.outperforms | 137 |
| abstract_inverted_index.probability | 124 |
| abstract_inverted_index.rationales. | 86 |
| abstract_inverted_index.conventional | 41 |
| abstract_inverted_index.correlations | 115 |
| abstract_inverted_index.distribution | 21 |
| abstract_inverted_index.experiments, | 133 |
| abstract_inverted_index.semantically | 98 |
| abstract_inverted_index.distributions | 125 |
| abstract_inverted_index.significantly | 136 |
| abstract_inverted_index.class-agnostic | 110 |
| abstract_inverted_index.contributions. | 35 |
| abstract_inverted_index.generalization | 150 |
| abstract_inverted_index.Vision-Language | 9 |
| abstract_inverted_index.disadvantageous | 94 |
| abstract_inverted_index.Attribute-Guided | 28 |
| abstract_inverted_index.state-of-the-art | 139 |
| abstract_inverted_index.out-of-distribution | 149 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| citation_normalized_percentile |
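
The `abstract_inverted_index.*` rows in the payload map each word of the abstract to the zero-based positions where it occurs. A minimal sketch of turning such a mapping back into text follows. The function name `rebuild_abstract` is hypothetical, and the sample dictionary is deliberately trimmed to the first six positions from the table, so it reproduces only the opening words of the abstract.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {word: [positions]} inverted index."""
    by_position = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = word
    # Emit the words in ascending position order, separated by spaces.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Trimmed sample: positions 0 to 5 only, taken from the rows above.
sample = {
    "Although": [0],
    "soft": [1],
    "prompt": [2],
    "tuning": [3],
    "is": [4],
    "effective": [5],
}
text = rebuild_abstract(sample)
print(text)
```

Applied to the full set of rows, the same loop would place words that occur more than once (for example `to` or `attributes`) at each of their listed positions.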