An Efficient COarse-to-fiNE Alignment Framework @ Ego4D Natural Language Queries Challenge 2022
2022 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2211.08776
This technical report describes the CONE approach for the Ego4D Natural Language Queries (NLQ) Challenge at ECCV 2022. We leverage our model CONE, an efficient window-centric COarse-to-fiNE alignment framework. Specifically, CONE dynamically slices the long video into candidate windows via a sliding window approach. Centering on windows, CONE (1) learns the inter-window (coarse-grained) semantic variance through contrastive learning and speeds up inference by pre-filtering the candidate windows relevant to the NL query, and (2) ranks intra-window (fine-grained) candidate moments using the powerful multi-modal alignment ability of the contrastive vision-text pre-trained model EgoVLP. On the blind test set, CONE achieves 15.26 and 9.24 for R1@IoU=0.3 and R1@IoU=0.5, respectively.
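The window-centric pipeline in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the fixed window length and stride, the function names `slice_windows`, `prefilter`, and `temporal_iou`, and the toy scores are all assumptions made for illustration. In CONE the coarse scores come from a contrastively trained model, and the slicing is described as dynamic. The IoU helper reflects the usual reading of R1@IoU=0.3, where a top-1 prediction counts as correct if its temporal IoU with the ground-truth moment is at least 0.3.

```python
from typing import List, Tuple

Window = Tuple[float, float]  # (start_sec, end_sec)


def slice_windows(duration: float, window_len: float, stride: float) -> List[Window]:
    """Slice [0, duration] into overlapping windows (fixed stride is an assumption)."""
    windows = []
    start = 0.0
    while start < duration:
        windows.append((start, min(start + window_len, duration)))
        start += stride
    return windows


def prefilter(windows: List[Window], scores: List[float], k: int) -> List[Window]:
    """Coarse stage: keep the k windows with the highest query-window scores."""
    ranked = sorted(zip(scores, windows), key=lambda pair: pair[0], reverse=True)
    return [window for _, window in ranked[:k]]


def temporal_iou(pred: Window, gt: Window) -> float:
    """Intersection over union of two time intervals."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


windows = slice_windows(duration=100.0, window_len=40.0, stride=20.0)
toy_scores = [0.1, 0.9, 0.3, 0.8, 0.2]  # made-up coarse scores, one per window
kept = prefilter(windows, toy_scores, k=2)
print(kept)  # [(20.0, 60.0), (60.0, 100.0)]
print(temporal_iou((20.0, 60.0), (30.0, 50.0)))  # 0.5
```

Only the windows that survive the coarse stage would be passed to the more expensive fine-grained ranking, which is where the inference speed-up described in the abstract comes from.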
Related Topics
- Type
- preprint
- Language
- en
- Landing Page
- http://arxiv.org/abs/2211.08776
- https://arxiv.org/pdf/2211.08776
- OA Status
- green
- Cited By
- 1
- Related Works
- 10
- OpenAlex ID
- https://openalex.org/W4320560852
Raw OpenAlex JSON
- OpenAlex ID
- https://openalex.org/W4320560852
- DOI
- https://doi.org/10.48550/arxiv.2211.08776
- Title
- An Efficient COarse-to-fiNE Alignment Framework @ Ego4D Natural Language Queries Challenge 2022
- Type
- preprint
- Language
- en
- Publication year
- 2022
- Publication date
- 2022-11-16
- Authors
- Zhijian Hou, Wanjun Zhong, Lei Ji, Difei Gao, Kun Yan, W. K. Chan, Chong‐Wah Ngo, Zheng Shou, Nan Duan
- Landing page
- https://arxiv.org/abs/2211.08776
- PDF URL
- https://arxiv.org/pdf/2211.08776
- Open access
- Yes
- OA status
- green
- OA URL
- https://arxiv.org/pdf/2211.08776
- Concepts
- Computer science, Sliding window protocol, Window (computing), Leverage (statistics), Artificial intelligence, Ranking (information retrieval), Set (abstract data type), Inference, Natural language, Natural language processing, Pattern recognition (psychology), Programming language, Operating system
- Cited by
- 1
- Citations by year (recent)
- 2023: 1
- Related works (count)
- 10
Full payload
| id | https://openalex.org/W4320560852 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2211.08776 |
| ids.doi | https://doi.org/10.48550/arxiv.2211.08776 |
| ids.openalex | https://openalex.org/W4320560852 |
| fwci | |
| type | preprint |
| title | An Efficient COarse-to-fiNE Alignment Framework @ Ego4D Natural Language Queries Challenge 2022 |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11714 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9997000098228455 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Multimodal Machine Learning Applications |
| topics[1].id | https://openalex.org/T10028 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9983999729156494 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Topic Modeling |
| topics[2].id | https://openalex.org/T10181 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9976999759674072 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Natural Language Processing Techniques |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.8091060519218445 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C102392041 |
| concepts[1].level | 3 |
| concepts[1].score | 0.7314229011535645 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q592860 |
| concepts[1].display_name | Sliding window protocol |
| concepts[2].id | https://openalex.org/C2778751112 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6749194264411926 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q835016 |
| concepts[2].display_name | Window (computing) |
| concepts[3].id | https://openalex.org/C153083717 |
| concepts[3].level | 2 |
| concepts[3].score | 0.6708053946495056 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q6535263 |
| concepts[3].display_name | Leverage (statistics) |
| concepts[4].id | https://openalex.org/C154945302 |
| concepts[4].level | 1 |
| concepts[4].score | 0.6005695462226868 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[4].display_name | Artificial intelligence |
| concepts[5].id | https://openalex.org/C189430467 |
| concepts[5].level | 2 |
| concepts[5].score | 0.5860366225242615 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q7293293 |
| concepts[5].display_name | Ranking (information retrieval) |
| concepts[6].id | https://openalex.org/C177264268 |
| concepts[6].level | 2 |
| concepts[6].score | 0.48542389273643494 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q1514741 |
| concepts[6].display_name | Set (abstract data type) |
| concepts[7].id | https://openalex.org/C2776214188 |
| concepts[7].level | 2 |
| concepts[7].score | 0.47482427954673767 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q408386 |
| concepts[7].display_name | Inference |
| concepts[8].id | https://openalex.org/C195324797 |
| concepts[8].level | 2 |
| concepts[8].score | 0.4559392035007477 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q33742 |
| concepts[8].display_name | Natural language |
| concepts[9].id | https://openalex.org/C204321447 |
| concepts[9].level | 1 |
| concepts[9].score | 0.40371447801589966 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[9].display_name | Natural language processing |
| concepts[10].id | https://openalex.org/C153180895 |
| concepts[10].level | 2 |
| concepts[10].score | 0.3286973536014557 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q7148389 |
| concepts[10].display_name | Pattern recognition (psychology) |
| concepts[11].id | https://openalex.org/C199360897 |
| concepts[11].level | 1 |
| concepts[11].score | 0.09432867169380188 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q9143 |
| concepts[11].display_name | Programming language |
| concepts[12].id | https://openalex.org/C111919701 |
| concepts[12].level | 1 |
| concepts[12].score | 0.0 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q9135 |
| concepts[12].display_name | Operating system |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.8091060519218445 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/sliding-window-protocol |
| keywords[1].score | 0.7314229011535645 |
| keywords[1].display_name | Sliding window protocol |
| keywords[2].id | https://openalex.org/keywords/window |
| keywords[2].score | 0.6749194264411926 |
| keywords[2].display_name | Window (computing) |
| keywords[3].id | https://openalex.org/keywords/leverage |
| keywords[3].score | 0.6708053946495056 |
| keywords[3].display_name | Leverage (statistics) |
| keywords[4].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[4].score | 0.6005695462226868 |
| keywords[4].display_name | Artificial intelligence |
| keywords[5].id | https://openalex.org/keywords/ranking |
| keywords[5].score | 0.5860366225242615 |
| keywords[5].display_name | Ranking (information retrieval) |
| keywords[6].id | https://openalex.org/keywords/set |
| keywords[6].score | 0.48542389273643494 |
| keywords[6].display_name | Set (abstract data type) |
| keywords[7].id | https://openalex.org/keywords/inference |
| keywords[7].score | 0.47482427954673767 |
| keywords[7].display_name | Inference |
| keywords[8].id | https://openalex.org/keywords/natural-language |
| keywords[8].score | 0.4559392035007477 |
| keywords[8].display_name | Natural language |
| keywords[9].id | https://openalex.org/keywords/natural-language-processing |
| keywords[9].score | 0.40371447801589966 |
| keywords[9].display_name | Natural language processing |
| keywords[10].id | https://openalex.org/keywords/pattern-recognition |
| keywords[10].score | 0.3286973536014557 |
| keywords[10].display_name | Pattern recognition (psychology) |
| keywords[11].id | https://openalex.org/keywords/programming-language |
| keywords[11].score | 0.09432867169380188 |
| keywords[11].display_name | Programming language |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2211.08776 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2211.08776 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2211.08776 |
| locations[1].id | doi:10.48550/arxiv.2211.08776 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2211.08776 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5101556026 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-7297-6817 |
| authorships[0].author.display_name | Zhijian Hou |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Hou, Zhijian |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5019101763 |
| authorships[1].author.orcid | https://orcid.org/0009-0007-2236-228X |
| authorships[1].author.display_name | Wanjun Zhong |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Zhong, Wanjun |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5100754482 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-7753-3313 |
| authorships[2].author.display_name | Lei Ji |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Ji, Lei |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5001133932 |
| authorships[3].author.orcid | https://orcid.org/0000-0001-8494-3492 |
| authorships[3].author.display_name | Difei Gao |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Gao, Difei |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5100615077 |
| authorships[4].author.orcid | https://orcid.org/0000-0003-1532-5935 |
| authorships[4].author.display_name | Kun Yan |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Yan, Kun |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5020936420 |
| authorships[5].author.orcid | https://orcid.org/0000-0001-7726-6235 |
| authorships[5].author.display_name | W. K. Chan |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Chan, Wing-Kwong |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5010722442 |
| authorships[6].author.orcid | https://orcid.org/0000-0003-4182-8261 |
| authorships[6].author.display_name | Chong‐Wah Ngo |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Ngo, Chong-Wah |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5112513463 |
| authorships[7].author.orcid | |
| authorships[7].author.display_name | Zheng Shou |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Shou, Zheng |
| authorships[7].is_corresponding | False |
| authorships[8].author.id | https://openalex.org/A5042018181 |
| authorships[8].author.orcid | https://orcid.org/0000-0002-3387-4674 |
| authorships[8].author.display_name | Nan Duan |
| authorships[8].author_position | last |
| authorships[8].raw_author_name | Duan, Nan |
| authorships[8].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2211.08776 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2023-02-15T00:00:00 |
| display_name | An Efficient COarse-to-fiNE Alignment Framework @ Ego4D Natural Language Queries Challenge 2022 |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11714 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9997000098228455 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Multimodal Machine Learning Applications |
| related_works | https://openalex.org/W2109115373, https://openalex.org/W2390901981, https://openalex.org/W2353818951, https://openalex.org/W1605879311, https://openalex.org/W2611980620, https://openalex.org/W4230691760, https://openalex.org/W2385763735, https://openalex.org/W4391923333, https://openalex.org/W2386394344, https://openalex.org/W3014558862 |
| cited_by_count | 1 |
| counts_by_year[0].year | 2023 |
| counts_by_year[0].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2211.08776 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2211.08776 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2211.08776 |
| primary_location.id | pmh:oai:arXiv.org:2211.08776 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2211.08776 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2211.08776 |
| publication_date | 2022-11-16 |
| publication_year | 2022 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 39 |
| abstract_inverted_index.NL | 69 |
| abstract_inverted_index.On | 92 |
| abstract_inverted_index.We | 17 |
| abstract_inverted_index.an | 22 |
| abstract_inverted_index.at | 44 |
| abstract_inverted_index.by | 61 |
| abstract_inverted_index.in | 14 |
| abstract_inverted_index.of | 85 |
| abstract_inverted_index.to | 67 |
| abstract_inverted_index.up | 59 |
| abstract_inverted_index.(1) | 47 |
| abstract_inverted_index.(2) | 72 |
| abstract_inverted_index.and | 57, 71, 100, 104 |
| abstract_inverted_index.for | 7, 102 |
| abstract_inverted_index.our | 19 |
| abstract_inverted_index.the | 4, 32, 49, 63, 68, 80, 86, 93 |
| abstract_inverted_index.via | 38 |
| abstract_inverted_index.9.24 | 101 |
| abstract_inverted_index.CONE | 5, 29, 46, 97 |
| abstract_inverted_index.ECCV | 15 |
| abstract_inverted_index.This | 0 |
| abstract_inverted_index.into | 35 |
| abstract_inverted_index.long | 33 |
| abstract_inverted_index.set, | 96 |
| abstract_inverted_index.test | 95 |
| abstract_inverted_index.(NLQ) | 12 |
| abstract_inverted_index.15.26 | 99 |
| abstract_inverted_index.2022. | 16 |
| abstract_inverted_index.CONE, | 21 |
| abstract_inverted_index.Ego4D | 8 |
| abstract_inverted_index.blind | 94 |
| abstract_inverted_index.model | 20, 90 |
| abstract_inverted_index.video | 34 |
| abstract_inverted_index.learns | 48 |
| abstract_inverted_index.query, | 70 |
| abstract_inverted_index.report | 2 |
| abstract_inverted_index.slices | 31 |
| abstract_inverted_index.speeds | 58 |
| abstract_inverted_index.window | 41 |
| abstract_inverted_index.EgoVLP. | 91 |
| abstract_inverted_index.Natural | 9 |
| abstract_inverted_index.Queries | 11 |
| abstract_inverted_index.ability | 84 |
| abstract_inverted_index.moments | 77 |
| abstract_inverted_index.ranking | 78 |
| abstract_inverted_index.sliding | 40 |
| abstract_inverted_index.through | 54 |
| abstract_inverted_index.windows | 37, 65 |
| abstract_inverted_index.Language | 10 |
| abstract_inverted_index.achieves | 98 |
| abstract_inverted_index.approach | 6 |
| abstract_inverted_index.conducts | 73 |
| abstract_inverted_index.learning | 56 |
| abstract_inverted_index.leverage | 18 |
| abstract_inverted_index.powerful | 81 |
| abstract_inverted_index.relevant | 66 |
| abstract_inverted_index.semantic | 52 |
| abstract_inverted_index.variance | 53 |
| abstract_inverted_index.windows, | 45 |
| abstract_inverted_index.Centering | 43 |
| abstract_inverted_index.Challenge | 13 |
| abstract_inverted_index.alignment | 26, 83 |
| abstract_inverted_index.approach. | 42 |
| abstract_inverted_index.candidate | 36, 64, 76 |
| abstract_inverted_index.describes | 3 |
| abstract_inverted_index.efficient | 23 |
| abstract_inverted_index.inference | 60 |
| abstract_inverted_index.technical | 1 |
| abstract_inverted_index.utilizing | 79 |
| abstract_inverted_index.R1@IoU=0.3 | 103 |
| abstract_inverted_index.framework. | 27 |
| abstract_inverted_index.R1@IoU=0.5, | 105 |
| abstract_inverted_index.contrastive | 55, 87 |
| abstract_inverted_index.dynamically | 30 |
| abstract_inverted_index.multi-modal | 82 |
| abstract_inverted_index.pre-trained | 89 |
| abstract_inverted_index.vision-text | 88 |
| abstract_inverted_index.inter-window | 50 |
| abstract_inverted_index.intra-window | 74 |
| abstract_inverted_index.Specifically, | 28 |
| abstract_inverted_index.pre-filtering | 62 |
| abstract_inverted_index.respectively. | 106 |
| abstract_inverted_index.(fine-grained) | 75 |
| abstract_inverted_index.COarse-to-fiNE | 25 |
| abstract_inverted_index.window-centric | 24 |
| abstract_inverted_index.(coarse-grained) | 51 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 9 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/4 |
| sustainable_development_goals[0].score | 0.7900000214576721 |
| sustainable_development_goals[0].display_name | Quality Education |
| citation_normalized_percentile |
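
The `abstract_inverted_index.*` rows in the payload map each token of the abstract to the word positions where it occurs. A minimal sketch of turning such a mapping back into running text follows; the function name `rebuild_abstract` is hypothetical, and the `sample` dictionary is a truncated excerpt covering only positions 0 to 6, not the full payload.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Place every token at each of its positions, then join in position order."""
    positions: Dict[int, str] = {}
    for token, indexes in inverted_index.items():
        for index in indexes:
            positions[index] = token
    return " ".join(positions[index] for index in sorted(positions))


# Truncated excerpt of the rows above: only the first occurrence of each token.
sample = {
    "This": [0],
    "technical": [1],
    "report": [2],
    "describes": [3],
    "the": [4],
    "CONE": [5],
    "approach": [6],
}
print(rebuild_abstract(sample))  # This technical report describes the CONE approach
```

Run over the full set of rows, the same loop would reproduce the abstract as indexed, with tokens such as `the` or `CONE` written once for each listed position.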