AITtrack: Attention-Based Image-Text Alignment for Visual Tracking
2025 · Open Access · DOI: https://doi.org/10.1109/access.2025.3555816
Vision-Language Models (VLMs) have recently advanced Visual Object Tracking (VOT) performance. In VLMs, a vision encoder is employed to obtain visual representations, and a text encoder estimates textual embeddings from natural language descriptions. By aligning the visual and textual representations, VLMs achieve robust performance in complex and diverse tracking scenarios, efficiently handling dynamic target-appearance changes such as motion blur, occlusion, fast motion, and similar-object distractors. However, the input textual description in many existing VLM-based trackers contains only class and semantic details, without any contextual information. Some recent VLM-based state-of-the-art (SOTA) trackers address this by implicitly predicting important attributes of the target object and encoding them as textual descriptions within the tracking paradigm, but these methods neglect the contextual relationships among the predicted attributes. In this work, we propose an Attention-based Image-Text alignment Tracker (AITrack) for robust VOT. AITrack simplifies VLM-based tracking using attention-based visual and textual alignment modules. It employs a region-of-interest (ROI) text-guided encoder that leverages existing pre-trained language models to implicitly extract and encode textual features, and a simple image encoder to encode visual features. A lightweight alignment module combines the encoded visual and textual features, inherently exposing the semantic relationship between the template and search frames and their surroundings, and providing rich encodings for improved tracking performance. We employ a simple decoder that takes past predictions as spatiotemporal clues to effectively model target appearance changes without complex customized post-processing or prediction heads.
Extensive experiments on six publicly available VOT benchmark datasets demonstrate the strong capabilities of AITrack, which achieves an average success-rate gain of 2.0%. Our code will be publicly available at https://github.com/BasitAlawode/AITrack.
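The image-text alignment the abstract describes can be sketched, purely illustratively, as scaled dot-product cross-attention between text-token embeddings and image-patch features. The shapes, function names, and residual fusion below are assumptions for the sketch, not the authors' implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_align(image_feats, text_feats):
    """Fuse image-patch features with text-token features via scaled
    dot-product cross-attention (image queries, text keys/values).

    image_feats: (num_patches, d) visual features from an image encoder
    text_feats:  (num_tokens, d)  textual features from a text encoder
    Returns fused features of shape (num_patches, d).
    """
    d = image_feats.shape[-1]
    scores = image_feats @ text_feats.T / np.sqrt(d)  # (patches, tokens)
    weights = softmax(scores, axis=-1)                # attend over tokens
    attended = weights @ text_feats                   # text info per patch
    return image_feats + attended                     # residual fusion

# Toy usage: 16 image patches, 5 text tokens, 32-dim embeddings
rng = np.random.default_rng(0)
img = rng.standard_normal((16, 32))
txt = rng.standard_normal((5, 32))
fused = cross_attention_align(img, txt)
print(fused.shape)  # (16, 32)
```

In a full tracker this fusion would be applied to both template and search features before a prediction head; here it only shows the core alignment operation.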
- Type: article
- Language: en
- Landing Page: https://doi.org/10.1109/access.2025.3555816
- OA Status: gold
- Cited By: 1
- References: 107
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4408941374
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4408941374 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.1109/access.2025.3555816 (Digital Object Identifier)
- Title: AITtrack: Attention-Based Image-Text Alignment for Visual Tracking (work title)
- Type: article (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-01-01 (full publication date if available)
- Authors: Basit Alawode, Sajid Javed (list of authors in order)
- Landing page: https://doi.org/10.1109/access.2025.3555816 (publisher landing page)
- Open access: Yes (whether a free full text is available)
- OA status: gold (open access status per OpenAlex)
- OA URL: https://doi.org/10.1109/access.2025.3555816 (direct OA link when available)
- Concepts: Computer science, Computer vision, Artificial intelligence, Eye tracking, Tracking (education), Image (mathematics), Pedagogy, Psychology (top concepts attached by OpenAlex)
- Cited by: 1 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 1 (per-year citation counts, last 5 years)
- References (count): 107 (number of works referenced by this work)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4408941374 |
|---|---|
| doi | https://doi.org/10.1109/access.2025.3555816 |
| ids.doi | https://doi.org/10.1109/access.2025.3555816 |
| ids.openalex | https://openalex.org/W4408941374 |
| fwci | 4.77340731 |
| type | article |
| title | AITtrack: Attention-Based Image-Text Alignment for Visual Tracking |
| biblio.issue | |
| biblio.volume | 13 |
| biblio.last_page | 67111 |
| biblio.first_page | 67095 |
| topics[0].id | https://openalex.org/T10824 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9926999807357788 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Image Retrieval and Classification Techniques |
| topics[1].id | https://openalex.org/T11439 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9751999974250793 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1707 |
| topics[1].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[1].display_name | Video Analysis and Summarization |
| topics[2].id | https://openalex.org/T10627 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9585999846458435 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1707 |
| topics[2].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[2].display_name | Advanced Image and Video Retrieval Techniques |
| is_xpac | False |
| apc_list.value | 1850 |
| apc_list.currency | USD |
| apc_list.value_usd | 1850 |
| apc_paid.value | 1850 |
| apc_paid.currency | USD |
| apc_paid.value_usd | 1850 |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.7781531810760498 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C31972630 |
| concepts[1].level | 1 |
| concepts[1].score | 0.7333129048347473 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q844240 |
| concepts[1].display_name | Computer vision |
| concepts[2].id | https://openalex.org/C154945302 |
| concepts[2].level | 1 |
| concepts[2].score | 0.6548985838890076 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[2].display_name | Artificial intelligence |
| concepts[3].id | https://openalex.org/C56461940 |
| concepts[3].level | 2 |
| concepts[3].score | 0.5586032867431641 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q970687 |
| concepts[3].display_name | Eye tracking |
| concepts[4].id | https://openalex.org/C2775936607 |
| concepts[4].level | 2 |
| concepts[4].score | 0.45133596658706665 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q466845 |
| concepts[4].display_name | Tracking (education) |
| concepts[5].id | https://openalex.org/C115961682 |
| concepts[5].level | 2 |
| concepts[5].score | 0.42369717359542847 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q860623 |
| concepts[5].display_name | Image (mathematics) |
| concepts[6].id | https://openalex.org/C19417346 |
| concepts[6].level | 1 |
| concepts[6].score | 0.0 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q7922 |
| concepts[6].display_name | Pedagogy |
| concepts[7].id | https://openalex.org/C15744967 |
| concepts[7].level | 0 |
| concepts[7].score | 0.0 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q9418 |
| concepts[7].display_name | Psychology |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.7781531810760498 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/computer-vision |
| keywords[1].score | 0.7333129048347473 |
| keywords[1].display_name | Computer vision |
| keywords[2].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[2].score | 0.6548985838890076 |
| keywords[2].display_name | Artificial intelligence |
| keywords[3].id | https://openalex.org/keywords/eye-tracking |
| keywords[3].score | 0.5586032867431641 |
| keywords[3].display_name | Eye tracking |
| keywords[4].id | https://openalex.org/keywords/tracking |
| keywords[4].score | 0.45133596658706665 |
| keywords[4].display_name | Tracking (education) |
| keywords[5].id | https://openalex.org/keywords/image |
| keywords[5].score | 0.42369717359542847 |
| keywords[5].display_name | Image (mathematics) |
| language | en |
| locations[0].id | doi:10.1109/access.2025.3555816 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S2485537415 |
| locations[0].source.issn | 2169-3536 |
| locations[0].source.type | journal |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | 2169-3536 |
| locations[0].source.is_core | True |
| locations[0].source.is_in_doaj | True |
| locations[0].source.display_name | IEEE Access |
| locations[0].source.host_organization | https://openalex.org/P4310319808 |
| locations[0].source.host_organization_name | Institute of Electrical and Electronics Engineers |
| locations[0].source.host_organization_lineage | https://openalex.org/P4310319808 |
| locations[0].source.host_organization_lineage_names | Institute of Electrical and Electronics Engineers |
| locations[0].license | cc-by |
| locations[0].pdf_url | |
| locations[0].version | publishedVersion |
| locations[0].raw_type | journal-article |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | True |
| locations[0].is_published | True |
| locations[0].raw_source_name | IEEE Access |
| locations[0].landing_page_url | https://doi.org/10.1109/access.2025.3555816 |
| locations[1].id | pmh:oai:doaj.org/article:302ffb21b6bd436a9c608d3a54bf1a3a |
| locations[1].is_oa | False |
| locations[1].source.id | https://openalex.org/S4306401280 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | False |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | DOAJ (DOAJ: Directory of Open Access Journals) |
| locations[1].source.host_organization | |
| locations[1].source.host_organization_name | |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | submittedVersion |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | False |
| locations[1].raw_source_name | IEEE Access, Vol 13, Pp 67095-67111 (2025) |
| locations[1].landing_page_url | https://doaj.org/article/302ffb21b6bd436a9c608d3a54bf1a3a |
| indexed_in | crossref, doaj |
| authorships[0].author.id | https://openalex.org/A5052210676 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-6680-7948 |
| authorships[0].author.display_name | Basit Alawode |
| authorships[0].countries | AE |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I176601375 |
| authorships[0].affiliations[0].raw_affiliation_string | Department of Electrical Engineering and Computer Science, Khalifa University of Science of Technology, P.O No. 127788, Abu Dhabi, U.A.E. |
| authorships[0].institutions[0].id | https://openalex.org/I176601375 |
| authorships[0].institutions[0].ror | https://ror.org/05hffr360 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I176601375 |
| authorships[0].institutions[0].country_code | AE |
| authorships[0].institutions[0].display_name | Khalifa University of Science and Technology |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Basit Alawode |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | Department of Electrical Engineering and Computer Science, Khalifa University of Science of Technology, P.O No. 127788, Abu Dhabi, U.A.E. |
| authorships[1].author.id | https://openalex.org/A5071515463 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-0036-2875 |
| authorships[1].author.display_name | Sajid Javed |
| authorships[1].countries | AE |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I176601375 |
| authorships[1].affiliations[0].raw_affiliation_string | Department of Electrical Engineering and Computer Science, Khalifa University of Science of Technology, P.O No. 127788, Abu Dhabi, U.A.E. |
| authorships[1].institutions[0].id | https://openalex.org/I176601375 |
| authorships[1].institutions[0].ror | https://ror.org/05hffr360 |
| authorships[1].institutions[0].type | education |
| authorships[1].institutions[0].lineage | https://openalex.org/I176601375 |
| authorships[1].institutions[0].country_code | AE |
| authorships[1].institutions[0].display_name | Khalifa University of Science and Technology |
| authorships[1].author_position | last |
| authorships[1].raw_author_name | Sajid Javed |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | Department of Electrical Engineering and Computer Science, Khalifa University of Science of Technology, P.O No. 127788, Abu Dhabi, U.A.E. |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://doi.org/10.1109/access.2025.3555816 |
| open_access.oa_status | gold |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | AITtrack: Attention-Based Image-Text Alignment for Visual Tracking |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T10824 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9926999807357788 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Image Retrieval and Classification Techniques |
| related_works | https://openalex.org/W2772917594, https://openalex.org/W2036807459, https://openalex.org/W2058170566, https://openalex.org/W2755342338, https://openalex.org/W2166024367, https://openalex.org/W3116076068, https://openalex.org/W2229312674, https://openalex.org/W2951359407, https://openalex.org/W2079911747, https://openalex.org/W1969923398 |
| cited_by_count | 1 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | doi:10.1109/access.2025.3555816 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S2485537415 |
| best_oa_location.source.issn | 2169-3536 |
| best_oa_location.source.type | journal |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | 2169-3536 |
| best_oa_location.source.is_core | True |
| best_oa_location.source.is_in_doaj | True |
| best_oa_location.source.display_name | IEEE Access |
| best_oa_location.source.host_organization | https://openalex.org/P4310319808 |
| best_oa_location.source.host_organization_name | Institute of Electrical and Electronics Engineers |
| best_oa_location.source.host_organization_lineage | https://openalex.org/P4310319808 |
| best_oa_location.source.host_organization_lineage_names | Institute of Electrical and Electronics Engineers |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | |
| best_oa_location.version | publishedVersion |
| best_oa_location.raw_type | journal-article |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | True |
| best_oa_location.raw_source_name | IEEE Access |
| best_oa_location.landing_page_url | https://doi.org/10.1109/access.2025.3555816 |
| primary_location.id | doi:10.1109/access.2025.3555816 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S2485537415 |
| primary_location.source.issn | 2169-3536 |
| primary_location.source.type | journal |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | 2169-3536 |
| primary_location.source.is_core | True |
| primary_location.source.is_in_doaj | True |
| primary_location.source.display_name | IEEE Access |
| primary_location.source.host_organization | https://openalex.org/P4310319808 |
| primary_location.source.host_organization_name | Institute of Electrical and Electronics Engineers |
| primary_location.source.host_organization_lineage | https://openalex.org/P4310319808 |
| primary_location.source.host_organization_lineage_names | Institute of Electrical and Electronics Engineers |
| primary_location.license | cc-by |
| primary_location.pdf_url | |
| primary_location.version | publishedVersion |
| primary_location.raw_type | journal-article |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | True |
| primary_location.is_published | True |
| primary_location.raw_source_name | IEEE Access |
| primary_location.landing_page_url | https://doi.org/10.1109/access.2025.3555816 |
| publication_date | 2025-01-01 |
| publication_year | 2025 |
| referenced_works | https://openalex.org/W4304481542, https://openalex.org/W3155373209, https://openalex.org/W6847266680, https://openalex.org/W4388602741, https://openalex.org/W4385800807, https://openalex.org/W3090155371, https://openalex.org/W2794744029, https://openalex.org/W6856402063, https://openalex.org/W4323647665, https://openalex.org/W4386075643, https://openalex.org/W4390871890, https://openalex.org/W2558899534, https://openalex.org/W2557641257, https://openalex.org/W2988310840, https://openalex.org/W3035466700, https://openalex.org/W2987346479, https://openalex.org/W2966759264, https://openalex.org/W3001584168, https://openalex.org/W3035453691, https://openalex.org/W2470394683, https://openalex.org/W2963534981, https://openalex.org/W3035511673, https://openalex.org/W2886910176, https://openalex.org/W3035571898, https://openalex.org/W2963227409, https://openalex.org/W3168663926, https://openalex.org/W4312651496, https://openalex.org/W3204554907, https://openalex.org/W4312821902, https://openalex.org/W3035672751, https://openalex.org/W3167536469, https://openalex.org/W3214586131, https://openalex.org/W4312543717, https://openalex.org/W4312472480, https://openalex.org/W4312532041, https://openalex.org/W3208338480, https://openalex.org/W3217397355, https://openalex.org/W4285603016, https://openalex.org/W4312805142, https://openalex.org/W6852835101, https://openalex.org/W4312751983, https://openalex.org/W4386065544, https://openalex.org/W4386066394, https://openalex.org/W4386083067, https://openalex.org/W4385301850, https://openalex.org/W4386066459, https://openalex.org/W4402704627, https://openalex.org/W4402754169, https://openalex.org/W4385245566, https://openalex.org/W2194775991, https://openalex.org/W3094502228, https://openalex.org/W3204540098, https://openalex.org/W4214759957, https://openalex.org/W3181069167, https://openalex.org/W3173871266, https://openalex.org/W6839144149, https://openalex.org/W2891033863, https://openalex.org/W2898200825, https://openalex.org/W4320036905, https://openalex.org/W2747053578, https://openalex.org/W3111983220, https://openalex.org/W2161969291, https://openalex.org/W2518013266, https://openalex.org/W2963074722, https://openalex.org/W2214352687, https://openalex.org/W2473868734, https://openalex.org/W2955983623, https://openalex.org/W6637373629, https://openalex.org/W2963173190, https://openalex.org/W2097117768, https://openalex.org/W3108235634, https://openalex.org/W2797812763, https://openalex.org/W4206706211, https://openalex.org/W4213019189, https://openalex.org/W4385214091, https://openalex.org/W3167762749, https://openalex.org/W3203510176, https://openalex.org/W4214493665, https://openalex.org/W4313156423, https://openalex.org/W4392172801, https://openalex.org/W6755207826, https://openalex.org/W3138516171, https://openalex.org/W6791353385, https://openalex.org/W4386076084, https://openalex.org/W4386075997, https://openalex.org/W6850991314, https://openalex.org/W4386076397, https://openalex.org/W6849177959, https://openalex.org/W4312708649, https://openalex.org/W4382458695, https://openalex.org/W4390874393, https://openalex.org/W4390874575, https://openalex.org/W4312956471, https://openalex.org/W2964345792, https://openalex.org/W4402753899, https://openalex.org/W3106773277, https://openalex.org/W6754033419, https://openalex.org/W2470139095, https://openalex.org/W6854222408, https://openalex.org/W2108598243, https://openalex.org/W4281790833, https://openalex.org/W2158592639, https://openalex.org/W6757817989, https://openalex.org/W2154889144, https://openalex.org/W4402702990, https://openalex.org/W3106542916, https://openalex.org/W4292828074 |
| referenced_works_count | 107 |
| abstract_inverted_index | (word-to-positions map encoding the abstract; per-word rows omitted, as the plain-text abstract appears above) |
| cited_by_percentile_year.max | 95 |
| cited_by_percentile_year.min | 91 |
| countries_distinct_count | 1 |
| institutions_distinct_count | 2 |
| citation_normalized_percentile.value | 0.85207156 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | True |
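The `abstract_inverted_index` field in the payload stores the abstract as a map from each word to the list of positions where it occurs. A minimal sketch of reconstructing the plain-text abstract from such a map (the field shape follows the OpenAlex API; the index below is a toy excerpt, not the full record):

```python
def reconstruct_abstract(inverted_index):
    """Rebuild an abstract string from an OpenAlex-style
    abstract_inverted_index: {word: [positions, ...]}."""
    positions = {}
    for word, idxs in inverted_index.items():
        for i in idxs:
            positions[i] = word
    # join words in ascending position order
    return " ".join(positions[i] for i in sorted(positions))

# Toy example with the same shape as the real field
toy_index = {"Vision-Language": [0], "Models": [1], "(VLMs)": [2],
             "have": [3], "recently": [4], "advanced": [5]}
print(reconstruct_abstract(toy_index))
# → Vision-Language Models (VLMs) have recently advanced
```

Applying the same function to the full index in the record would yield the abstract text shown near the top of this page.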