Visio-Linguistic Brain Encoding
2022 · Open Access · DOI: https://doi.org/10.48550/arxiv.2204.08261
Enabling effective brain-computer interfaces requires understanding how the human brain encodes stimuli across modalities such as vision and language (text). Brain encoding aims to construct fMRI brain activity given a stimulus. A plethora of neural encoding models study brain encoding for single-mode stimuli: visual (pretrained CNNs) or text (pretrained language models). A few recent papers have also obtained separate visual and text representation models and performed late fusion using simple heuristics. However, previous work has not explored: (a) the effectiveness of image Transformer models for encoding visual stimuli, and (b) co-attentive multi-modal modeling for visual and text reasoning. In this paper, we systematically explore the efficacy of image Transformers (ViT, DEiT, and BEiT) and multi-modal Transformers (VisualBERT, LXMERT, and CLIP) for brain encoding. Extensive experiments on two popular datasets, BOLD5000 and Pereira, provide the following insights. (1) To the best of our knowledge, we are the first to investigate the effectiveness of image and multi-modal Transformers for brain encoding. (2) We find that VisualBERT, a multi-modal Transformer, significantly outperforms previously proposed single-mode CNNs, image Transformers, and other previously proposed multi-modal models, thereby establishing a new state of the art. The superiority of visio-linguistic models raises the question of whether the responses elicited in the visual regions are implicitly affected by linguistic processing even during passive viewing of images. Future fMRI tasks can verify this computational insight in an appropriate experimental setting.
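The encoding setup described in the abstract (stimulus representation in, fMRI activity out) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not name the regression method or the evaluation metric, so closed-form ridge regression and voxel-wise Pearson correlation are stand-ins commonly seen in encoding work, and the synthetic arrays replace real Transformer features and fMRI recordings. The helper names `fit_ridge`, `predict`, and `voxelwise_pearson` are hypothetical.

```python
import numpy as np


def fit_ridge(features, voxels, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = features.shape[1]
    gram = features.T @ features + alpha * np.eye(n_features)
    return np.linalg.solve(gram, features.T @ voxels)


def predict(features, weights):
    """Map stimulus features to predicted voxel responses."""
    return features @ weights


def voxelwise_pearson(y_true, y_pred):
    """Pearson correlation between true and predicted responses, per voxel."""
    yt = y_true - y_true.mean(axis=0)
    yp = y_pred - y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom


# Synthetic stand-ins (assumption): rows are stimuli, columns are
# feature dimensions (X) or voxels (Y).
rng = np.random.default_rng(0)
n_train, n_test, n_features, n_voxels = 200, 50, 16, 5
true_w = rng.normal(size=(n_features, n_voxels))
X_train = rng.normal(size=(n_train, n_features))
X_test = rng.normal(size=(n_test, n_features))
Y_train = X_train @ true_w + 0.1 * rng.normal(size=(n_train, n_voxels))
Y_test = X_test @ true_w + 0.1 * rng.normal(size=(n_test, n_voxels))

W = fit_ridge(X_train, Y_train, alpha=1.0)
scores = voxelwise_pearson(Y_test, predict(X_test, W))
print(scores.round(3))
```

In a real pipeline, `X_train` would hold features extracted from a pretrained model for each stimulus and `Y_train` the measured voxel responses; comparing held-out scores across feature extractors is one way to rank models of the kind the abstract lists.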
Related Topics
- Type: preprint
- Language: en
- Landing Page: https://doi.org/10.48550/arxiv.2204.08261
- OA Status: green
- Cited By: 9
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4226399507
Raw OpenAlex JSON
- OpenAlex ID (canonical identifier for this work in OpenAlex): https://openalex.org/W4226399507
- DOI (Digital Object Identifier): https://doi.org/10.48550/arxiv.2204.08261
- Title (work title): Visio-Linguistic Brain Encoding
- Type (OpenAlex work type): preprint
- Language (primary language): en
- Publication year (year of publication): 2022
- Publication date (full publication date if available): 2022-04-18
- Authors (list of authors in order): Subba Reddy Oota, Jashn Arora, Vijay Rowtula, Manish Gupta, Raju S. Bapi
- Landing page (publisher landing page): https://doi.org/10.48550/arxiv.2204.08261
- Open access (whether a free full text is available): Yes
- OA status (open access status per OpenAlex): green
- OA URL (direct OA link when available): https://doi.org/10.48550/arxiv.2204.08261
- Concepts (top concepts, i.e. fields/topics, attached by OpenAlex): Computer science, Heuristics, Transformer, Modal, Encoding (memory), Artificial intelligence, Natural language processing, Speech recognition, Pattern recognition (psychology), Polymer chemistry, Chemistry, Quantum mechanics, Operating system, Physics, Voltage
- Cited by (total citation count in OpenAlex): 9
- Citations by year, recent (per-year citation counts, last 5 years): 2025: 1, 2024: 1, 2023: 5, 2022: 2
- Related works, count (other works algorithmically related by OpenAlex): 10
Full payload
| id | https://openalex.org/W4226399507 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2204.08261 |
| ids.doi | https://doi.org/10.48550/arxiv.2204.08261 |
| ids.openalex | https://openalex.org/W4226399507 |
| fwci | |
| type | preprint |
| title | Visio-Linguistic Brain Encoding |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11714 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9751999974250793 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Multimodal Machine Learning Applications |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.7105696201324463 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C127705205 |
| concepts[1].level | 2 |
| concepts[1].score | 0.665917158126831 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q5748245 |
| concepts[1].display_name | Heuristics |
| concepts[2].id | https://openalex.org/C66322947 |
| concepts[2].level | 3 |
| concepts[2].score | 0.620612621307373 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q11658 |
| concepts[2].display_name | Transformer |
| concepts[3].id | https://openalex.org/C71139939 |
| concepts[3].level | 2 |
| concepts[3].score | 0.5153328776359558 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q910194 |
| concepts[3].display_name | Modal |
| concepts[4].id | https://openalex.org/C125411270 |
| concepts[4].level | 2 |
| concepts[4].score | 0.5008759498596191 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q18653 |
| concepts[4].display_name | Encoding (memory) |
| concepts[5].id | https://openalex.org/C154945302 |
| concepts[5].level | 1 |
| concepts[5].score | 0.4841778576374054 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[5].display_name | Artificial intelligence |
| concepts[6].id | https://openalex.org/C204321447 |
| concepts[6].level | 1 |
| concepts[6].score | 0.41500788927078247 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[6].display_name | Natural language processing |
| concepts[7].id | https://openalex.org/C28490314 |
| concepts[7].level | 1 |
| concepts[7].score | 0.3403284549713135 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q189436 |
| concepts[7].display_name | Speech recognition |
| concepts[8].id | https://openalex.org/C153180895 |
| concepts[8].level | 2 |
| concepts[8].score | 0.32122230529785156 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q7148389 |
| concepts[8].display_name | Pattern recognition (psychology) |
| concepts[9].id | https://openalex.org/C188027245 |
| concepts[9].level | 1 |
| concepts[9].score | 0.0 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q750446 |
| concepts[9].display_name | Polymer chemistry |
| concepts[10].id | https://openalex.org/C185592680 |
| concepts[10].level | 0 |
| concepts[10].score | 0.0 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q2329 |
| concepts[10].display_name | Chemistry |
| concepts[11].id | https://openalex.org/C62520636 |
| concepts[11].level | 1 |
| concepts[11].score | 0.0 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q944 |
| concepts[11].display_name | Quantum mechanics |
| concepts[12].id | https://openalex.org/C111919701 |
| concepts[12].level | 1 |
| concepts[12].score | 0.0 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q9135 |
| concepts[12].display_name | Operating system |
| concepts[13].id | https://openalex.org/C121332964 |
| concepts[13].level | 0 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q413 |
| concepts[13].display_name | Physics |
| concepts[14].id | https://openalex.org/C165801399 |
| concepts[14].level | 2 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q25428 |
| concepts[14].display_name | Voltage |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.7105696201324463 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/heuristics |
| keywords[1].score | 0.665917158126831 |
| keywords[1].display_name | Heuristics |
| keywords[2].id | https://openalex.org/keywords/transformer |
| keywords[2].score | 0.620612621307373 |
| keywords[2].display_name | Transformer |
| keywords[3].id | https://openalex.org/keywords/modal |
| keywords[3].score | 0.5153328776359558 |
| keywords[3].display_name | Modal |
| keywords[4].id | https://openalex.org/keywords/encoding |
| keywords[4].score | 0.5008759498596191 |
| keywords[4].display_name | Encoding (memory) |
| keywords[5].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[5].score | 0.4841778576374054 |
| keywords[5].display_name | Artificial intelligence |
| keywords[6].id | https://openalex.org/keywords/natural-language-processing |
| keywords[6].score | 0.41500788927078247 |
| keywords[6].display_name | Natural language processing |
| keywords[7].id | https://openalex.org/keywords/speech-recognition |
| keywords[7].score | 0.3403284549713135 |
| keywords[7].display_name | Speech recognition |
| keywords[8].id | https://openalex.org/keywords/pattern-recognition |
| keywords[8].score | 0.32122230529785156 |
| keywords[8].display_name | Pattern recognition (psychology) |
| language | en |
| locations[0].id | doi:10.48550/arxiv.2204.08261 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | cc-by |
| locations[0].pdf_url | |
| locations[0].version | |
| locations[0].raw_type | article |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | False |
| locations[0].is_published | |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | https://doi.org/10.48550/arxiv.2204.08261 |
| indexed_in | datacite |
| authorships[0].author.id | https://openalex.org/A5029606497 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-5975-622X |
| authorships[0].author.display_name | Subba Reddy Oota |
| authorships[0].countries | FR, RU |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I4210142254 |
| authorships[0].affiliations[0].raw_affiliation_string | Laboratoire Bordelais de Recherche en Informatique |
| authorships[0].affiliations[1].institution_ids | https://openalex.org/I4210115303 |
| authorships[0].affiliations[1].raw_affiliation_string | Mnemonic Synergy |
| authorships[0].institutions[0].id | https://openalex.org/I4210142254 |
| authorships[0].institutions[0].ror | https://ror.org/03adqg323 |
| authorships[0].institutions[0].type | facility |
| authorships[0].institutions[0].lineage | https://openalex.org/I1294671590, https://openalex.org/I1294671590, https://openalex.org/I15057530, https://openalex.org/I4210142254, https://openalex.org/I4210159245, https://openalex.org/I4210160189 |
| authorships[0].institutions[0].country_code | FR |
| authorships[0].institutions[0].display_name | Laboratoire Bordelais de Recherche en Informatique |
| authorships[0].institutions[1].id | https://openalex.org/I4210115303 |
| authorships[0].institutions[1].ror | https://ror.org/028mtfb17 |
| authorships[0].institutions[1].type | education |
| authorships[0].institutions[1].lineage | https://openalex.org/I4210115303 |
| authorships[0].institutions[1].country_code | RU |
| authorships[0].institutions[1].display_name | Moscow University «Synergy» |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Oota, Subba Reddy |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | Laboratoire Bordelais de Recherche en Informatique, Mnemonic Synergy |
| authorships[1].author.id | https://openalex.org/A5005968671 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Jashn Arora |
| authorships[1].countries | IN |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I64189192 |
| authorships[1].affiliations[0].raw_affiliation_string | International Institute of Information Technology, Hyderabad [Hyderabad] |
| authorships[1].institutions[0].id | https://openalex.org/I64189192 |
| authorships[1].institutions[0].ror | https://ror.org/05f11g639 |
| authorships[1].institutions[0].type | education |
| authorships[1].institutions[0].lineage | https://openalex.org/I64189192 |
| authorships[1].institutions[0].country_code | IN |
| authorships[1].institutions[0].display_name | International Institute of Information Technology, Hyderabad |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Arora, Jashn |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | International Institute of Information Technology, Hyderabad [Hyderabad] |
| authorships[2].author.id | https://openalex.org/A5084991192 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Vijay Rowtula |
| authorships[2].countries | IN |
| authorships[2].affiliations[0].institution_ids | https://openalex.org/I64189192 |
| authorships[2].affiliations[0].raw_affiliation_string | International Institute of Information Technology, Hyderabad [Hyderabad] |
| authorships[2].institutions[0].id | https://openalex.org/I64189192 |
| authorships[2].institutions[0].ror | https://ror.org/05f11g639 |
| authorships[2].institutions[0].type | education |
| authorships[2].institutions[0].lineage | https://openalex.org/I64189192 |
| authorships[2].institutions[0].country_code | IN |
| authorships[2].institutions[0].display_name | International Institute of Information Technology, Hyderabad |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Rowtula, Vijay |
| authorships[2].is_corresponding | False |
| authorships[2].raw_affiliation_strings | International Institute of Information Technology, Hyderabad [Hyderabad] |
| authorships[3].author.id | https://openalex.org/A5101454729 |
| authorships[3].author.orcid | https://orcid.org/0000-0003-0848-6132 |
| authorships[3].author.display_name | Manish Gupta |
| authorships[3].countries | GB, IN |
| authorships[3].affiliations[0].institution_ids | https://openalex.org/I4210164937 |
| authorships[3].affiliations[0].raw_affiliation_string | Microsoft Research |
| authorships[3].affiliations[1].institution_ids | https://openalex.org/I64189192 |
| authorships[3].affiliations[1].raw_affiliation_string | International Institute of Information Technology, Hyderabad [Hyderabad] |
| authorships[3].institutions[0].id | https://openalex.org/I4210164937 |
| authorships[3].institutions[0].ror | https://ror.org/05k87vq12 |
| authorships[3].institutions[0].type | company |
| authorships[3].institutions[0].lineage | https://openalex.org/I1290206253, https://openalex.org/I4210164937 |
| authorships[3].institutions[0].country_code | GB |
| authorships[3].institutions[0].display_name | Microsoft Research (United Kingdom) |
| authorships[3].institutions[1].id | https://openalex.org/I64189192 |
| authorships[3].institutions[1].ror | https://ror.org/05f11g639 |
| authorships[3].institutions[1].type | education |
| authorships[3].institutions[1].lineage | https://openalex.org/I64189192 |
| authorships[3].institutions[1].country_code | IN |
| authorships[3].institutions[1].display_name | International Institute of Information Technology, Hyderabad |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Gupta, Manish |
| authorships[3].is_corresponding | False |
| authorships[3].raw_affiliation_strings | International Institute of Information Technology, Hyderabad [Hyderabad], Microsoft Research |
| authorships[4].author.id | https://openalex.org/A5049423985 |
| authorships[4].author.orcid | https://orcid.org/0000-0003-2204-0890 |
| authorships[4].author.display_name | Raju S. Bapi |
| authorships[4].countries | IN |
| authorships[4].affiliations[0].institution_ids | https://openalex.org/I64189192 |
| authorships[4].affiliations[0].raw_affiliation_string | International Institute of Information Technology, Hyderabad [Hyderabad] |
| authorships[4].institutions[0].id | https://openalex.org/I64189192 |
| authorships[4].institutions[0].ror | https://ror.org/05f11g639 |
| authorships[4].institutions[0].type | education |
| authorships[4].institutions[0].lineage | https://openalex.org/I64189192 |
| authorships[4].institutions[0].country_code | IN |
| authorships[4].institutions[0].display_name | International Institute of Information Technology, Hyderabad |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Bapi, Raju S. |
| authorships[4].is_corresponding | False |
| authorships[4].raw_affiliation_strings | International Institute of Information Technology, Hyderabad [Hyderabad] |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://doi.org/10.48550/arxiv.2204.08261 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Visio-Linguistic Brain Encoding |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11714 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9751999974250793 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Multimodal Machine Learning Applications |
| related_works | https://openalex.org/W2280422768, https://openalex.org/W3143197806, https://openalex.org/W4252555497, https://openalex.org/W3121175838, https://openalex.org/W3016293053, https://openalex.org/W1690653314, https://openalex.org/W2401723157, https://openalex.org/W2065055572, https://openalex.org/W2784269775, https://openalex.org/W2952904874 |
| cited_by_count | 9 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 1 |
| counts_by_year[1].year | 2024 |
| counts_by_year[1].cited_by_count | 1 |
| counts_by_year[2].year | 2023 |
| counts_by_year[2].cited_by_count | 5 |
| counts_by_year[3].year | 2022 |
| counts_by_year[3].cited_by_count | 2 |
| locations_count | 1 |
| best_oa_location.id | doi:10.48550/arxiv.2204.08261 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | |
| best_oa_location.version | |
| best_oa_location.raw_type | article |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | https://doi.org/10.48550/arxiv.2204.08261 |
| primary_location.id | doi:10.48550/arxiv.2204.08261 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | cc-by |
| primary_location.pdf_url | |
| primary_location.version | |
| primary_location.raw_type | article |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | https://doi.org/10.48550/arxiv.2204.08261 |
| publication_date | 2022-04-18 |
| publication_year | 2022 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 30, 34, 168 |
| abstract_inverted_index.In | 102 |
| abstract_inverted_index.To | 141 |
| abstract_inverted_index.We | 164 |
| abstract_inverted_index.an | 228 |
| abstract_inverted_index.as | 15, 179, 181 |
| abstract_inverted_index.at | 24 |
| abstract_inverted_index.by | 211 |
| abstract_inverted_index.in | 204, 227 |
| abstract_inverted_index.of | 36, 84, 110, 144, 155, 193, 199 |
| abstract_inverted_index.on | 129 |
| abstract_inverted_index.or | 51 |
| abstract_inverted_index.to | 79, 151 |
| abstract_inverted_index.we | 105, 147 |
| abstract_inverted_index.(1) | 140 |
| abstract_inverted_index.(2) | 163 |
| abstract_inverted_index.(a) | 81 |
| abstract_inverted_index.(b) | 93 |
| abstract_inverted_index.(or | 18 |
| abstract_inverted_index.Few | 56 |
| abstract_inverted_index.The | 191 |
| abstract_inverted_index.and | 64, 68, 92, 99, 115, 117, 122, 134, 157 |
| abstract_inverted_index.are | 148, 208 |
| abstract_inverted_index.can | 222 |
| abstract_inverted_index.for | 44, 88, 97, 124, 160 |
| abstract_inverted_index.has | 77 |
| abstract_inverted_index.how | 6 |
| abstract_inverted_index.new | 189 |
| abstract_inverted_index.our | 145 |
| abstract_inverted_index.the | 7, 82, 108, 137, 142, 149, 153, 197, 201, 205 |
| abstract_inverted_index.two | 130 |
| abstract_inverted_index.aims | 23 |
| abstract_inverted_index.also | 60 |
| abstract_inverted_index.best | 143 |
| abstract_inverted_index.etc. | 20 |
| abstract_inverted_index.even | 214 |
| abstract_inverted_index.fMRI | 26, 220 |
| abstract_inverted_index.find | 165 |
| abstract_inverted_index.have | 59 |
| abstract_inverted_index.mode | 46 |
| abstract_inverted_index.such | 14 |
| abstract_inverted_index.text | 52, 65, 100 |
| abstract_inverted_index.that | 166 |
| abstract_inverted_index.this | 103, 224 |
| abstract_inverted_index.well | 180 |
| abstract_inverted_index.when | 215 |
| abstract_inverted_index.work | 76 |
| abstract_inverted_index.(ViT, | 113 |
| abstract_inverted_index.BEiT) | 116 |
| abstract_inverted_index.Brain | 21 |
| abstract_inverted_index.CLIP) | 123 |
| abstract_inverted_index.CNNs) | 50 |
| abstract_inverted_index.CNNs, | 176 |
| abstract_inverted_index.DEiT, | 114 |
| abstract_inverted_index.There | 32 |
| abstract_inverted_index.brain | 9, 27, 42, 125, 161 |
| abstract_inverted_index.first | 150 |
| abstract_inverted_index.given | 29 |
| abstract_inverted_index.human | 8 |
| abstract_inverted_index.image | 85, 111, 156, 177 |
| abstract_inverted_index.other | 182 |
| abstract_inverted_index.study | 41 |
| abstract_inverted_index.tasks | 221 |
| abstract_inverted_index.using | 71 |
| abstract_inverted_index.which | 40 |
| abstract_inverted_index.Future | 219 |
| abstract_inverted_index.across | 12 |
| abstract_inverted_index.exists | 33 |
| abstract_inverted_index.failed | 78 |
| abstract_inverted_index.models | 39, 67, 87, 195 |
| abstract_inverted_index.neural | 37 |
| abstract_inverted_index.paper, | 104 |
| abstract_inverted_index.papers | 58 |
| abstract_inverted_index.raises | 196 |
| abstract_inverted_index.recent | 57 |
| abstract_inverted_index.simple | 72 |
| abstract_inverted_index.single | 45 |
| abstract_inverted_index.text), | 19 |
| abstract_inverted_index.verify | 223 |
| abstract_inverted_index.visual | 48, 63, 90, 98, 206 |
| abstract_inverted_index.LXMERT, | 121 |
| abstract_inverted_index.encodes | 10 |
| abstract_inverted_index.explore | 107 |
| abstract_inverted_index.images. | 218 |
| abstract_inverted_index.insight | 226 |
| abstract_inverted_index.models, | 186 |
| abstract_inverted_index.popular | 131 |
| abstract_inverted_index.provide | 136 |
| abstract_inverted_index.regions | 207 |
| abstract_inverted_index.stimuli | 11 |
| abstract_inverted_index.thereby | 187 |
| abstract_inverted_index.viewing | 217 |
| abstract_inverted_index.visual, | 16 |
| abstract_inverted_index.whether | 200 |
| abstract_inverted_index.BOLD5000 | 133 |
| abstract_inverted_index.Enabling | 0 |
| abstract_inverted_index.However, | 74 |
| abstract_inverted_index.Pereira, | 135 |
| abstract_inverted_index.activity | 28 |
| abstract_inverted_index.affected | 209 |
| abstract_inverted_index.efficacy | 109 |
| abstract_inverted_index.elicited | 203 |
| abstract_inverted_index.encoding | 22, 38, 43, 89 |
| abstract_inverted_index.explore: | 80 |
| abstract_inverted_index.language | 17, 54 |
| abstract_inverted_index.modeling | 96 |
| abstract_inverted_index.models). | 55 |
| abstract_inverted_index.obtained | 61 |
| abstract_inverted_index.plethora | 35 |
| abstract_inverted_index.previous | 75 |
| abstract_inverted_index.proposed | 174, 184 |
| abstract_inverted_index.question | 198 |
| abstract_inverted_index.requires | 4 |
| abstract_inverted_index.separate | 62 |
| abstract_inverted_index.setting. | 231 |
| abstract_inverted_index.stimuli, | 91 |
| abstract_inverted_index.stimuli: | 47 |
| abstract_inverted_index.Extensive | 127 |
| abstract_inverted_index.datasets, | 132 |
| abstract_inverted_index.effective | 1 |
| abstract_inverted_index.encoding. | 126, 162 |
| abstract_inverted_index.following | 138 |
| abstract_inverted_index.insights. | 139 |
| abstract_inverted_index.passively | 216 |
| abstract_inverted_index.performed | 69 |
| abstract_inverted_index.responses | 202 |
| abstract_inverted_index.stimulus. | 31 |
| abstract_inverted_index.supremacy | 192 |
| abstract_inverted_index.implicitly | 210 |
| abstract_inverted_index.interfaces | 3 |
| abstract_inverted_index.knowledge, | 146 |
| abstract_inverted_index.linguistic | 212 |
| abstract_inverted_index.modalities | 13 |
| abstract_inverted_index.previously | 173, 183 |
| abstract_inverted_index.processing | 213 |
| abstract_inverted_index.reasoning. | 101 |
| abstract_inverted_index.(pretrained | 49, 53 |
| abstract_inverted_index.Transformer | 86 |
| abstract_inverted_index.VisualBERT, | 167 |
| abstract_inverted_index.appropriate | 229 |
| abstract_inverted_index.experiments | 128 |
| abstract_inverted_index.heuristics. | 73 |
| abstract_inverted_index.investigate | 152 |
| abstract_inverted_index.late-fusion | 70 |
| abstract_inverted_index.multi-modal | 95, 118, 158, 169, 185 |
| abstract_inverted_index.outperforms | 172 |
| abstract_inverted_index.single-mode | 175 |
| abstract_inverted_index.(VisualBERT, | 120 |
| abstract_inverted_index.Transformer, | 170 |
| abstract_inverted_index.Transformers | 112, 119, 159, 178 |
| abstract_inverted_index.co-attentive | 94 |
| abstract_inverted_index.constructing | 25 |
| abstract_inverted_index.establishing | 188 |
| abstract_inverted_index.experimental | 230 |
| abstract_inverted_index.computational | 225 |
| abstract_inverted_index.effectiveness | 83, 154 |
| abstract_inverted_index.significantly | 171 |
| abstract_inverted_index.understanding | 5 |
| abstract_inverted_index.brain-computer | 2 |
| abstract_inverted_index.representation | 66 |
| abstract_inverted_index.systematically | 106 |
| abstract_inverted_index.visio-linguistic | 194 |
| abstract_inverted_index.state-of-the-art. | 190 |
| cited_by_percentile_year | |
| countries_distinct_count | 4 |
| institutions_distinct_count | 5 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/4 |
| sustainable_development_goals[0].score | 0.8199999928474426 |
| sustainable_development_goals[0].display_name | Quality Education |
| citation_normalized_percentile |
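
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each token to the positions where it occurs. A minimal sketch of how such a map can be turned back into running text follows; the function name `rebuild_abstract` is hypothetical, and `sample` is a hand-copied subset of the rows above covering only positions 0 through 11, not the full index.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {token: [positions]} inverted index."""
    positions = {}
    for word, indexes in inverted_index.items():
        for i in indexes:
            positions[i] = word
    # Emit tokens in position order, separated by single spaces.
    return " ".join(positions[i] for i in sorted(positions))


# Subset of the payload rows (positions 0-11 only), for illustration.
sample = {
    "Enabling": [0],
    "effective": [1],
    "brain-computer": [2],
    "interfaces": [3],
    "requires": [4],
    "understanding": [5],
    "how": [6],
    "the": [7],
    "human": [8],
    "brain": [9],
    "encodes": [10],
    "stimuli": [11],
}

text = rebuild_abstract(sample)
print(text)
```

Applied to the full set of rows, the same loop should recover the abstract token by token, since every position from 0 to 231 appears under exactly one token.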