Stacked Hybrid-Attention and Group Collaborative Learning for Unbiased Scene Graph Generation
2022 · Open Access · DOI: https://doi.org/10.48550/arxiv.2203.09811
Scene Graph Generation (SGG), which generally follows a regular encoder-decoder pipeline, aims to first encode the visual contents of a given image and then parse them into a compact summary graph. Existing SGG approaches generally overlook the insufficient modality fusion between vision and language, and they also fail to provide informative predicates because of biased relationship predictions, leaving SGG far from practical. To this end, we first present a novel Stacked Hybrid-Attention network, which facilitates both intra-modal refinement and inter-modal interaction, to serve as the encoder. We then devise an innovative Group Collaborative Learning strategy to optimize the decoder. In particular, based on the observation that the recognition capability of a single classifier is limited on an extremely unbalanced dataset, we first deploy a group of classifiers that are expert in distinguishing different subsets of classes, and then cooperatively optimize them from two aspects to promote unbiased SGG. Experiments conducted on the VG and GQA datasets demonstrate that we not only establish a new state of the art on the unbiased metric, but also nearly double the performance compared with two baselines.
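The abstract does not say how the class subsets for the group of classifiers are chosen. As a rough illustration only, the sketch below assumes one plausible scheme: sort predicate classes by training frequency, greedily cut them into groups whose internal imbalance stays under a threshold, and let classifier k cover the union of the first k groups. The function names, the `max_ratio` parameter, and the predicate counts are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def group_classes_by_frequency(
    counts: Dict[str, int], max_ratio: float = 4.0
) -> List[List[str]]:
    """Greedily split classes, sorted by descending count, into groups.

    Within a group, the most frequent class has at most `max_ratio` times
    as many samples as the least frequent one (an assumed balancing rule).
    """
    ordered = sorted(counts, key=lambda name: counts[name], reverse=True)
    groups: List[List[str]] = []
    for name in ordered:
        # groups[-1][0] is the most frequent class of the current group.
        if groups and counts[groups[-1][0]] <= max_ratio * counts[name]:
            groups[-1].append(name)
        else:
            groups.append([name])
    return groups


def classifier_label_spaces(groups: List[List[str]]) -> List[List[str]]:
    """Give classifier k the union of groups 0..k (a hypothetical choice)."""
    spaces: List[List[str]] = []
    seen: List[str] = []
    for group in groups:
        seen = seen + group
        spaces.append(list(seen))
    return spaces


# Made-up predicate counts, for illustration only.
counts = {
    "on": 1000,
    "has": 600,
    "wearing": 300,
    "riding": 90,
    "parked on": 40,
    "flying in": 5,
}
groups = group_classes_by_frequency(counts)
spaces = classifier_label_spaces(groups)
for k, space in enumerate(spaces):
    print(f"classifier {k}: {space}")
```

With these made-up counts, the head classes land in the first group and the rarest predicate gets a group of its own, so later classifiers see progressively larger label spaces. How the classifiers are then optimized cooperatively is left out here, since the abstract only states that this is done "from two aspects".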
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2203.09811, https://arxiv.org/pdf/2203.09811
- OA Status: green
- Cited By: 6
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4221163079
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4221163079 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2203.09811 (Digital Object Identifier)
- Title: Stacked Hybrid-Attention and Group Collaborative Learning for Unbiased Scene Graph Generation (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2022 (year of publication)
- Publication date: 2022-03-18 (full publication date if available)
- Authors: Xingning Dong, Tian Gan, Xuemeng Song, Jianlong Wu, Yuan Cheng, Liqiang Nie (list of authors in order)
- Landing page: https://arxiv.org/abs/2203.09811 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2203.09811 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2203.09811 (direct OA link when available)
- Concepts: Computer science, Scene graph, ENCODE, Classifier (UML), Parsing, Encoder, Graph, Artificial intelligence, Modal, Machine learning, Theoretical computer science, Pattern recognition (psychology), Chemistry, Rendering (computer graphics), Operating system, Polymer chemistry, Gene, Biochemistry (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 6 (total citation count in OpenAlex)
- Citations by year (recent): 2024: 3, 2023: 3 (per-year citation counts, last 5 years)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4221163079 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2203.09811 |
| ids.doi | https://doi.org/10.48550/arxiv.2203.09811 |
| ids.openalex | https://openalex.org/W4221163079 |
| fwci | |
| type | preprint |
| title | Stacked Hybrid-Attention and Group Collaborative Learning for Unbiased Scene Graph Generation |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11714 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 1.0 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Multimodal Machine Learning Applications |
| topics[1].id | https://openalex.org/T10627 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9994000196456909 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1707 |
| topics[1].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[1].display_name | Advanced Image and Video Retrieval Techniques |
| topics[2].id | https://openalex.org/T11307 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9927999973297119 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Domain Adaptation and Few-Shot Learning |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.7756763696670532 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C179372163 |
| concepts[1].level | 3 |
| concepts[1].score | 0.7266556024551392 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q1406181 |
| concepts[1].display_name | Scene graph |
| concepts[2].id | https://openalex.org/C66746571 |
| concepts[2].level | 3 |
| concepts[2].score | 0.633080244064331 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q1134833 |
| concepts[2].display_name | ENCODE |
| concepts[3].id | https://openalex.org/C95623464 |
| concepts[3].level | 2 |
| concepts[3].score | 0.612680196762085 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q1096149 |
| concepts[3].display_name | Classifier (UML) |
| concepts[4].id | https://openalex.org/C186644900 |
| concepts[4].level | 2 |
| concepts[4].score | 0.5740151405334473 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q194152 |
| concepts[4].display_name | Parsing |
| concepts[5].id | https://openalex.org/C118505674 |
| concepts[5].level | 2 |
| concepts[5].score | 0.548311173915863 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q42586063 |
| concepts[5].display_name | Encoder |
| concepts[6].id | https://openalex.org/C132525143 |
| concepts[6].level | 2 |
| concepts[6].score | 0.5278534889221191 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q141488 |
| concepts[6].display_name | Graph |
| concepts[7].id | https://openalex.org/C154945302 |
| concepts[7].level | 1 |
| concepts[7].score | 0.495976060628891 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[7].display_name | Artificial intelligence |
| concepts[8].id | https://openalex.org/C71139939 |
| concepts[8].level | 2 |
| concepts[8].score | 0.4611987769603729 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q910194 |
| concepts[8].display_name | Modal |
| concepts[9].id | https://openalex.org/C119857082 |
| concepts[9].level | 1 |
| concepts[9].score | 0.4608418941497803 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[9].display_name | Machine learning |
| concepts[10].id | https://openalex.org/C80444323 |
| concepts[10].level | 1 |
| concepts[10].score | 0.347426176071167 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q2878974 |
| concepts[10].display_name | Theoretical computer science |
| concepts[11].id | https://openalex.org/C153180895 |
| concepts[11].level | 2 |
| concepts[11].score | 0.33721762895584106 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q7148389 |
| concepts[11].display_name | Pattern recognition (psychology) |
| concepts[12].id | https://openalex.org/C185592680 |
| concepts[12].level | 0 |
| concepts[12].score | 0.0 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q2329 |
| concepts[12].display_name | Chemistry |
| concepts[13].id | https://openalex.org/C205711294 |
| concepts[13].level | 2 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q176953 |
| concepts[13].display_name | Rendering (computer graphics) |
| concepts[14].id | https://openalex.org/C111919701 |
| concepts[14].level | 1 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q9135 |
| concepts[14].display_name | Operating system |
| concepts[15].id | https://openalex.org/C188027245 |
| concepts[15].level | 1 |
| concepts[15].score | 0.0 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q750446 |
| concepts[15].display_name | Polymer chemistry |
| concepts[16].id | https://openalex.org/C104317684 |
| concepts[16].level | 2 |
| concepts[16].score | 0.0 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q7187 |
| concepts[16].display_name | Gene |
| concepts[17].id | https://openalex.org/C55493867 |
| concepts[17].level | 1 |
| concepts[17].score | 0.0 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q7094 |
| concepts[17].display_name | Biochemistry |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.7756763696670532 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/scene-graph |
| keywords[1].score | 0.7266556024551392 |
| keywords[1].display_name | Scene graph |
| keywords[2].id | https://openalex.org/keywords/encode |
| keywords[2].score | 0.633080244064331 |
| keywords[2].display_name | ENCODE |
| keywords[3].id | https://openalex.org/keywords/classifier |
| keywords[3].score | 0.612680196762085 |
| keywords[3].display_name | Classifier (UML) |
| keywords[4].id | https://openalex.org/keywords/parsing |
| keywords[4].score | 0.5740151405334473 |
| keywords[4].display_name | Parsing |
| keywords[5].id | https://openalex.org/keywords/encoder |
| keywords[5].score | 0.548311173915863 |
| keywords[5].display_name | Encoder |
| keywords[6].id | https://openalex.org/keywords/graph |
| keywords[6].score | 0.5278534889221191 |
| keywords[6].display_name | Graph |
| keywords[7].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[7].score | 0.495976060628891 |
| keywords[7].display_name | Artificial intelligence |
| keywords[8].id | https://openalex.org/keywords/modal |
| keywords[8].score | 0.4611987769603729 |
| keywords[8].display_name | Modal |
| keywords[9].id | https://openalex.org/keywords/machine-learning |
| keywords[9].score | 0.4608418941497803 |
| keywords[9].display_name | Machine learning |
| keywords[10].id | https://openalex.org/keywords/theoretical-computer-science |
| keywords[10].score | 0.347426176071167 |
| keywords[10].display_name | Theoretical computer science |
| keywords[11].id | https://openalex.org/keywords/pattern-recognition |
| keywords[11].score | 0.33721762895584106 |
| keywords[11].display_name | Pattern recognition (psychology) |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2203.09811 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://arxiv.org/pdf/2203.09811 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2203.09811 |
| locations[1].id | doi:10.48550/arxiv.2203.09811 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2203.09811 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5036171993 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-0245-9064 |
| authorships[0].author.display_name | Xingning Dong |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Dong, Xingning |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5100654958 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-3197-5698 |
| authorships[1].author.display_name | Tian Gan |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Gan, Tian |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5072768866 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-5274-4197 |
| authorships[2].author.display_name | Xuemeng Song |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Song, Xuemeng |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5100654190 |
| authorships[3].author.orcid | https://orcid.org/0000-0003-0247-5221 |
| authorships[3].author.display_name | Jianlong Wu |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Wu, Jianlong |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5100582162 |
| authorships[4].author.orcid | https://orcid.org/0000-0003-1830-7951 |
| authorships[4].author.display_name | Yuan Cheng |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Cheng, Yuan |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5038612499 |
| authorships[5].author.orcid | https://orcid.org/0000-0003-1476-0273 |
| authorships[5].author.display_name | Liqiang Nie |
| authorships[5].author_position | last |
| authorships[5].raw_author_name | Nie, Liqiang |
| authorships[5].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2203.09811 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2022-04-03T00:00:00 |
| display_name | Stacked Hybrid-Attention and Group Collaborative Learning for Unbiased Scene Graph Generation |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11714 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 1.0 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Multimodal Machine Learning Applications |
| related_works | https://openalex.org/W2468279273, https://openalex.org/W2354198838, https://openalex.org/W1989130879, https://openalex.org/W2103419012, https://openalex.org/W2988126442, https://openalex.org/W2275988210, https://openalex.org/W2754155766, https://openalex.org/W2963192850, https://openalex.org/W4287854977, https://openalex.org/W2769151336 |
| cited_by_count | 6 |
| counts_by_year[0].year | 2024 |
| counts_by_year[0].cited_by_count | 3 |
| counts_by_year[1].year | 2023 |
| counts_by_year[1].cited_by_count | 3 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2203.09811 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2203.09811 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2203.09811 |
| primary_location.id | pmh:oai:arXiv.org:2203.09811 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://arxiv.org/pdf/2203.09811 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2203.09811 |
| publication_date | 2022-03-18 |
| publication_year | 2022 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 6, 26, 72, 128, 167 |
| abstract_inverted_index.VG | 157 |
| abstract_inverted_index.We | 93 |
| abstract_inverted_index.an | 96, 121 |
| abstract_inverted_index.as | 82, 84, 90 |
| abstract_inverted_index.in | 66, 135, 170 |
| abstract_inverted_index.is | 118 |
| abstract_inverted_index.of | 115, 130, 139 |
| abstract_inverted_index.on | 156 |
| abstract_inverted_index.to | 11, 48, 53, 88, 102, 149 |
| abstract_inverted_index.we | 69, 125, 163 |
| abstract_inverted_index.GQA | 159 |
| abstract_inverted_index.SGG | 31, 59 |
| abstract_inverted_index.and | 21, 43, 141, 158 |
| abstract_inverted_index.are | 133 |
| abstract_inverted_index.but | 45, 174 |
| abstract_inverted_index.due | 52 |
| abstract_inverted_index.far | 60 |
| abstract_inverted_index.new | 168 |
| abstract_inverted_index.not | 34, 164 |
| abstract_inverted_index.one | 116 |
| abstract_inverted_index.the | 14, 18, 37, 54, 79, 85, 91, 104, 109, 112, 151, 171, 178 |
| abstract_inverted_index.two | 147, 182 |
| abstract_inverted_index.SGG. | 153 |
| abstract_inverted_index.aims | 10 |
| abstract_inverted_index.also | 46, 175 |
| abstract_inverted_index.end, | 65 |
| abstract_inverted_index.fail | 47 |
| abstract_inverted_index.from | 61, 146 |
| abstract_inverted_index.into | 25 |
| abstract_inverted_index.only | 35, 165 |
| abstract_inverted_index.that | 111, 132 |
| abstract_inverted_index.them | 24, 145 |
| abstract_inverted_index.then | 22, 94, 142 |
| abstract_inverted_index.this | 64, 67 |
| abstract_inverted_index.upon | 108 |
| abstract_inverted_index.well | 83 |
| abstract_inverted_index.with | 181 |
| abstract_inverted_index.Graph | 1 |
| abstract_inverted_index.Group | 98 |
| abstract_inverted_index.Scene | 0 |
| abstract_inverted_index.based | 107 |
| abstract_inverted_index.first | 12, 70, 126 |
| abstract_inverted_index.given | 19 |
| abstract_inverted_index.group | 129 |
| abstract_inverted_index.image | 20 |
| abstract_inverted_index.novel | 73 |
| abstract_inverted_index.parse | 23 |
| abstract_inverted_index.serve | 89 |
| abstract_inverted_index.that, | 162 |
| abstract_inverted_index.which | 3, 77 |
| abstract_inverted_index.biased | 55 |
| abstract_inverted_index.deploy | 127 |
| abstract_inverted_index.devise | 95 |
| abstract_inverted_index.double | 177 |
| abstract_inverted_index.encode | 13 |
| abstract_inverted_index.expert | 134 |
| abstract_inverted_index.fusion | 40 |
| abstract_inverted_index.graph. | 29 |
| abstract_inverted_index.nearly | 176 |
| abstract_inverted_index.paper, | 68 |
| abstract_inverted_index.vision | 42 |
| abstract_inverted_index.visual | 15 |
| abstract_inverted_index.within | 17 |
| abstract_inverted_index.Stacked | 74 |
| abstract_inverted_index.Towards | 63 |
| abstract_inverted_index.aspects | 148 |
| abstract_inverted_index.between | 41 |
| abstract_inverted_index.compact | 27 |
| abstract_inverted_index.follows | 5 |
| abstract_inverted_index.leading | 58 |
| abstract_inverted_index.limited | 119 |
| abstract_inverted_index.metric, | 173 |
| abstract_inverted_index.neglect | 36 |
| abstract_inverted_index.present | 71 |
| abstract_inverted_index.promote | 150 |
| abstract_inverted_index.provide | 49 |
| abstract_inverted_index.regular | 7 |
| abstract_inverted_index.subsets | 138 |
| abstract_inverted_index.summary | 28 |
| abstract_inverted_index.towards | 120 |
| abstract_inverted_index.Existing | 30 |
| abstract_inverted_index.Learning | 100 |
| abstract_inverted_index.classes, | 140 |
| abstract_inverted_index.compared | 180 |
| abstract_inverted_index.contents | 16 |
| abstract_inverted_index.dataset, | 124 |
| abstract_inverted_index.datasets | 160 |
| abstract_inverted_index.decoder. | 105 |
| abstract_inverted_index.encoder. | 92 |
| abstract_inverted_index.modality | 39 |
| abstract_inverted_index.network, | 76 |
| abstract_inverted_index.optimize | 103, 144 |
| abstract_inverted_index.strategy | 101 |
| abstract_inverted_index.unbiased | 152, 172 |
| abstract_inverted_index.conducted | 155 |
| abstract_inverted_index.different | 137 |
| abstract_inverted_index.establish | 166 |
| abstract_inverted_index.extremely | 122 |
| abstract_inverted_index.generally | 4, 33 |
| abstract_inverted_index.language, | 44 |
| abstract_inverted_index.pipeline, | 9 |
| abstract_inverted_index.approaches | 32 |
| abstract_inverted_index.baselines. | 183 |
| abstract_inverted_index.capability | 114 |
| abstract_inverted_index.classifier | 117 |
| abstract_inverted_index.innovative | 97 |
| abstract_inverted_index.practical. | 62 |
| abstract_inverted_index.predicates | 51 |
| abstract_inverted_index.refinement | 81 |
| abstract_inverted_index.unbalanced | 123 |
| abstract_inverted_index.Experiments | 154 |
| abstract_inverted_index.Generation, | 2 |
| abstract_inverted_index.classifiers | 131 |
| abstract_inverted_index.demonstrate | 161 |
| abstract_inverted_index.facilitates | 78 |
| abstract_inverted_index.informative | 50 |
| abstract_inverted_index.inter-modal | 86 |
| abstract_inverted_index.intra-modal | 80 |
| abstract_inverted_index.observation | 110 |
| abstract_inverted_index.performance | 179 |
| abstract_inverted_index.recognition | 113 |
| abstract_inverted_index.insufficient | 38 |
| abstract_inverted_index.interaction, | 87 |
| abstract_inverted_index.predictions, | 57 |
| abstract_inverted_index.relationship | 56 |
| abstract_inverted_index.Collaborative | 99 |
| abstract_inverted_index.Particularly, | 106 |
| abstract_inverted_index.cooperatively | 143 |
| abstract_inverted_index.distinguishing | 136 |
| abstract_inverted_index.encoder-decoder | 8 |
| abstract_inverted_index.Hybrid-Attention | 75 |
| abstract_inverted_index.state-of-the-art | 169 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 6 |
| citation_normalized_percentile |
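
The `abstract_inverted_index.*` rows in the table above store the abstract as a map from each token to the word positions where it occurs. A minimal sketch of turning such a map back into running text is shown below. The function name `rebuild_abstract` is an assumption for illustration, and the sample is trimmed to positions 0 through 20 of the rows above, so it only recovers the opening words of the abstract.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild text from a token -> positions map by ordering tokens by position."""
    by_position: Dict[int, str] = {}
    for token, positions in inverted_index.items():
        for position in positions:
            by_position[position] = token
    return " ".join(by_position[i] for i in sorted(by_position))


# Trimmed sample: only positions 0..20 from the payload rows are kept.
sample = {
    "Scene": [0],
    "Graph": [1],
    "Generation,": [2],
    "which": [3],
    "generally": [4],
    "follows": [5],
    "a": [6],
    "regular": [7],
    "encoder-decoder": [8],
    "pipeline,": [9],
    "aims": [10],
    "to": [11],
    "first": [12],
    "encode": [13],
    "the": [14, 18],
    "visual": [15],
    "contents": [16],
    "within": [17],
    "given": [19],
    "image": [20],
}

print(rebuild_abstract(sample))
```

Because positions are used as dictionary keys, a token that appears several times (such as "the" at positions 14 and 18) is placed at each of its positions, and any positions missing from the input are simply skipped rather than padded.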