SeqCo-DETR: Sequence Consistency Training for Self-Supervised Object Detection with Transformers
2023 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2303.08481
Self-supervised pre-training and transformer-based networks have significantly improved the performance of object detection. However, most current self-supervised object detection methods are built on convolution-based architectures. We believe that the sequence characteristics of transformers should be considered when designing a transformer-based self-supervised method for the object detection task. To this end, we propose SeqCo-DETR, a novel Sequence Consistency-based self-supervised method for object DEtection with TRansformers. SeqCo-DETR defines a simple but effective pretext task: it minimizes the discrepancy between the output sequences that the transformer produces for different views of the same image, and it uses bipartite matching to find the most relevant sequence pairs, which improves sequence-level self-supervised representation learning. Furthermore, we provide a mask-based augmentation strategy, combined with the sequence consistency strategy, to extract more representative contextual information about the object for the object detection task. Our method achieves state-of-the-art results on MS COCO (45.8 AP) and PASCAL VOC (64.1 AP), demonstrating the effectiveness of our approach.
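The abstract gives no formulas, so the following is only a minimal sketch of the general idea of matching two output sequences one-to-one and penalizing the discrepancy of the matched pairs. The function names, the cosine-distance cost, the mean-over-pairs loss, and the toy vectors are all assumptions made for illustration and are not taken from the paper. The matching here is an exhaustive search over permutations, which is only practical for a handful of elements; it stands in for a polynomial-time assignment solver such as the Hungarian algorithm.

```python
import math
from itertools import permutations


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def match_sequences(seq_a, seq_b):
    """Find the one-to-one pairing that minimizes the total distance.

    Assumption: both sequences have the same length. The search is
    brute force over all permutations, so it only suits tiny inputs.
    """
    n = len(seq_a)
    cost = [[cosine_distance(a, b) for b in seq_b] for a in seq_a]
    best_perm = min(
        permutations(range(n)),
        key=lambda perm: sum(cost[i][perm[i]] for i in range(n)),
    )
    return list(best_perm), cost


def sequence_consistency_loss(seq_a, seq_b):
    """Mean distance over matched pairs (a hypothetical loss form)."""
    perm, cost = match_sequences(seq_a, seq_b)
    return sum(cost[i][j] for i, j in enumerate(perm)) / len(perm)


# Toy "output sequences" for two views of one image.
# view_b holds rescaled copies of the view_a elements in a different order.
view_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
view_b = [[0.0, 2.0], [3.0, 3.0], [2.0, 0.0]]

perm, _ = match_sequences(view_a, view_b)
loss = sequence_consistency_loss(view_a, view_b)
print(perm, loss)
```

In this toy case the matching recovers the reordering, pairing element 0 of `view_a` with element 2 of `view_b`, and so on, and the loss is zero up to floating-point error because cosine distance ignores the rescaling. Without the matching step, comparing elements position by position would report a large discrepancy even though the two sequences carry the same content.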
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2303.08481, https://arxiv.org/pdf/2303.08481
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4327671092
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4327671092 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2303.08481 (Digital Object Identifier)
- Title: SeqCo-DETR: Sequence Consistency Training for Self-Supervised Object Detection with Transformers (Work title)
- Type: preprint (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2023 (Year of publication)
- Publication date: 2023-03-15 (Full publication date if available)
- Authors: Guoqiang Jin, Fan Yang, Mingshan Sun, Ruyi Zhao, Yakun Liu, Wei Li, Tianpeng Bao, Liwei Wu, Xingyu Zeng, Rui Zhao (List of authors in order)
- Landing page: https://arxiv.org/abs/2303.08481 (Publisher landing page)
- PDF URL: https://arxiv.org/pdf/2303.08481 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2303.08481 (Direct OA link when available)
- Concepts: Computer science, Pascal (unit), Transformer, Artificial intelligence, Object detection, Machine learning, Pattern recognition (psychology), Engineering, Voltage, Programming language, Electrical engineering (Top concepts (fields/topics) attached by OpenAlex)
- Cited by: 0 (Total citation count in OpenAlex)
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| field | value |
|---|---|
| id | https://openalex.org/W4327671092 |
| doi | https://doi.org/10.48550/arxiv.2303.08481 |
| ids.doi | https://doi.org/10.48550/arxiv.2303.08481 |
| ids.openalex | https://openalex.org/W4327671092 |
| fwci | |
| type | preprint |
| title | SeqCo-DETR: Sequence Consistency Training for Self-Supervised Object Detection with Transformers |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10036 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9986000061035156 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Advanced Neural Network Applications |
| topics[1].id | https://openalex.org/T11307 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9954000115394592 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Domain Adaptation and Few-Shot Learning |
| topics[2].id | https://openalex.org/T11714 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9836000204086304 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1707 |
| topics[2].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[2].display_name | Multimodal Machine Learning Applications |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.6500182747840881 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C75608658 |
| concepts[1].level | 2 |
| concepts[1].score | 0.6345034837722778 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q44395 |
| concepts[1].display_name | Pascal (unit) |
| concepts[2].id | https://openalex.org/C66322947 |
| concepts[2].level | 3 |
| concepts[2].score | 0.6207098364830017 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q11658 |
| concepts[2].display_name | Transformer |
| concepts[3].id | https://openalex.org/C154945302 |
| concepts[3].level | 1 |
| concepts[3].score | 0.5463089346885681 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[3].display_name | Artificial intelligence |
| concepts[4].id | https://openalex.org/C2776151529 |
| concepts[4].level | 3 |
| concepts[4].score | 0.4225630462169647 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q3045304 |
| concepts[4].display_name | Object detection |
| concepts[5].id | https://openalex.org/C119857082 |
| concepts[5].level | 1 |
| concepts[5].score | 0.41191551089286804 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[5].display_name | Machine learning |
| concepts[6].id | https://openalex.org/C153180895 |
| concepts[6].level | 2 |
| concepts[6].score | 0.4100342392921448 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q7148389 |
| concepts[6].display_name | Pattern recognition (psychology) |
| concepts[7].id | https://openalex.org/C127413603 |
| concepts[7].level | 0 |
| concepts[7].score | 0.1383705735206604 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q11023 |
| concepts[7].display_name | Engineering |
| concepts[8].id | https://openalex.org/C165801399 |
| concepts[8].level | 2 |
| concepts[8].score | 0.08027300238609314 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q25428 |
| concepts[8].display_name | Voltage |
| concepts[9].id | https://openalex.org/C199360897 |
| concepts[9].level | 1 |
| concepts[9].score | 0.0 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q9143 |
| concepts[9].display_name | Programming language |
| concepts[10].id | https://openalex.org/C119599485 |
| concepts[10].level | 1 |
| concepts[10].score | 0.0 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q43035 |
| concepts[10].display_name | Electrical engineering |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.6500182747840881 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/pascal |
| keywords[1].score | 0.6345034837722778 |
| keywords[1].display_name | Pascal (unit) |
| keywords[2].id | https://openalex.org/keywords/transformer |
| keywords[2].score | 0.6207098364830017 |
| keywords[2].display_name | Transformer |
| keywords[3].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[3].score | 0.5463089346885681 |
| keywords[3].display_name | Artificial intelligence |
| keywords[4].id | https://openalex.org/keywords/object-detection |
| keywords[4].score | 0.4225630462169647 |
| keywords[4].display_name | Object detection |
| keywords[5].id | https://openalex.org/keywords/machine-learning |
| keywords[5].score | 0.41191551089286804 |
| keywords[5].display_name | Machine learning |
| keywords[6].id | https://openalex.org/keywords/pattern-recognition |
| keywords[6].score | 0.4100342392921448 |
| keywords[6].display_name | Pattern recognition (psychology) |
| keywords[7].id | https://openalex.org/keywords/engineering |
| keywords[7].score | 0.1383705735206604 |
| keywords[7].display_name | Engineering |
| keywords[8].id | https://openalex.org/keywords/voltage |
| keywords[8].score | 0.08027300238609314 |
| keywords[8].display_name | Voltage |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2303.08481 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2303.08481 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2303.08481 |
| locations[1].id | doi:10.48550/arxiv.2303.08481 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2303.08481 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5109074680 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Guoqiang Jin |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Jin, Guoqiang |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5100346607 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-1157-8719 |
| authorships[1].author.display_name | Fan Yang |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Yang, Fan |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5100657908 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-0785-7387 |
| authorships[2].author.display_name | Mingshan Sun |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Sun, Mingshan |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5109436432 |
| authorships[3].author.orcid | https://orcid.org/0009-0004-5066-6051 |
| authorships[3].author.display_name | Ruyi Zhao |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Zhao, Ruyi |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5102025662 |
| authorships[4].author.orcid | https://orcid.org/0009-0008-1661-2470 |
| authorships[4].author.display_name | Yakun Liu |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Liu, Yakun |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5100318032 |
| authorships[5].author.orcid | https://orcid.org/0000-0001-8124-4645 |
| authorships[5].author.display_name | Wei Li |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Li, Wei |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5025650981 |
| authorships[6].author.orcid | |
| authorships[6].author.display_name | Tianpeng Bao |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Bao, Tianpeng |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5100675839 |
| authorships[7].author.orcid | |
| authorships[7].author.display_name | Liwei Wu |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Wu, Liwei |
| authorships[7].is_corresponding | False |
| authorships[8].author.id | https://openalex.org/A5060379760 |
| authorships[8].author.orcid | https://orcid.org/0009-0007-8224-4461 |
| authorships[8].author.display_name | Xingyu Zeng |
| authorships[8].author_position | middle |
| authorships[8].raw_author_name | Zeng, Xingyu |
| authorships[8].is_corresponding | False |
| authorships[9].author.id | https://openalex.org/A5100684043 |
| authorships[9].author.orcid | https://orcid.org/0000-0003-2993-2023 |
| authorships[9].author.display_name | Rui Zhao |
| authorships[9].author_position | last |
| authorships[9].raw_author_name | Zhao, Rui |
| authorships[9].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2303.08481 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | SeqCo-DETR: Sequence Consistency Training for Self-Supervised Object Detection with Transformers |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10036 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9986000061035156 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Advanced Neural Network Applications |
| related_works | https://openalex.org/W4376620596, https://openalex.org/W3177249605, https://openalex.org/W2534152068, https://openalex.org/W3138508047, https://openalex.org/W1972515067, https://openalex.org/W1689909837, https://openalex.org/W4293054914, https://openalex.org/W4313315626, https://openalex.org/W4298525700, https://openalex.org/W2963418361 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2303.08481 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2303.08481 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2303.08481 |
| primary_location.id | pmh:oai:arXiv.org:2303.08481 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2303.08481 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2303.08481 |
| publication_date | 2023-03-15 |
| publication_year | 2023 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 39, 54, 67, 110 |
| abstract_inverted_index.MS | 140 |
| abstract_inverted_index.To | 48 |
| abstract_inverted_index.We | 27 |
| abstract_inverted_index.as | 86 |
| abstract_inverted_index.be | 35 |
| abstract_inverted_index.by | 72 |
| abstract_inverted_index.of | 10, 15, 76, 80, 152 |
| abstract_inverted_index.on | 24, 139 |
| abstract_inverted_index.to | 92, 99, 120 |
| abstract_inverted_index.we | 51, 108 |
| abstract_inverted_index.AP) | 143 |
| abstract_inverted_index.Our | 134 |
| abstract_inverted_index.VOC | 146 |
| abstract_inverted_index.and | 2, 88, 144 |
| abstract_inverted_index.are | 22 |
| abstract_inverted_index.but | 69 |
| abstract_inverted_index.for | 43, 60, 129 |
| abstract_inverted_index.our | 153 |
| abstract_inverted_index.the | 8, 16, 30, 44, 74, 77, 94, 101, 116, 127, 130, 150 |
| abstract_inverted_index.AP), | 148 |
| abstract_inverted_index.COCO | 141 |
| abstract_inverted_index.end, | 50 |
| abstract_inverted_index.find | 93 |
| abstract_inverted_index.have | 5 |
| abstract_inverted_index.more | 122 |
| abstract_inverted_index.most | 14, 95 |
| abstract_inverted_index.that | 29 |
| abstract_inverted_index.this | 49 |
| abstract_inverted_index.when | 37 |
| abstract_inverted_index.with | 63, 82, 115 |
| abstract_inverted_index.(45.8 | 142 |
| abstract_inverted_index.(64.1 | 147 |
| abstract_inverted_index.about | 126 |
| abstract_inverted_index.built | 23 |
| abstract_inverted_index.image | 84 |
| abstract_inverted_index.input | 87 |
| abstract_inverted_index.novel | 55 |
| abstract_inverted_index.pairs | 98 |
| abstract_inverted_index.task. | 47, 133 |
| abstract_inverted_index.views | 85 |
| abstract_inverted_index.PASCAL | 145 |
| abstract_inverted_index.method | 42, 59, 135 |
| abstract_inverted_index.object | 11, 19, 45, 61, 128, 131 |
| abstract_inverted_index.output | 78 |
| abstract_inverted_index.should | 34 |
| abstract_inverted_index.simple | 68 |
| abstract_inverted_index.believe | 28 |
| abstract_inverted_index.current | 17 |
| abstract_inverted_index.defines | 66 |
| abstract_inverted_index.extract | 121 |
| abstract_inverted_index.improve | 100 |
| abstract_inverted_index.methods | 21 |
| abstract_inverted_index.pretext | 71 |
| abstract_inverted_index.propose | 52 |
| abstract_inverted_index.provide | 109 |
| abstract_inverted_index.results | 138 |
| abstract_inverted_index.However, | 13 |
| abstract_inverted_index.Sequence | 56 |
| abstract_inverted_index.achieves | 136 |
| abstract_inverted_index.improved | 7 |
| abstract_inverted_index.learning | 105 |
| abstract_inverted_index.matching | 91 |
| abstract_inverted_index.networks | 4 |
| abstract_inverted_index.relevant | 96 |
| abstract_inverted_index.sequence | 32, 97, 117 |
| abstract_inverted_index.strategy | 113, 119 |
| abstract_inverted_index.DEtection | 62 |
| abstract_inverted_index.approach. | 154 |
| abstract_inverted_index.bipartite | 90 |
| abstract_inverted_index.designing | 38 |
| abstract_inverted_index.detection | 20, 46, 132 |
| abstract_inverted_index.different | 83 |
| abstract_inverted_index.effective | 70 |
| abstract_inverted_index.leverages | 89 |
| abstract_inverted_index.minimizes | 73 |
| abstract_inverted_index.sequences | 79 |
| abstract_inverted_index.SeqCo-DETR | 65 |
| abstract_inverted_index.considered | 36 |
| abstract_inverted_index.contextual | 124 |
| abstract_inverted_index.detection. | 12 |
| abstract_inverted_index.mask-based | 111 |
| abstract_inverted_index.SeqCo-DETR, | 53 |
| abstract_inverted_index.consistency | 118 |
| abstract_inverted_index.discrepancy | 75 |
| abstract_inverted_index.information | 125 |
| abstract_inverted_index.performance | 9 |
| abstract_inverted_index.Furthermore, | 107 |
| abstract_inverted_index.augmentation | 112 |
| abstract_inverted_index.incorporated | 114 |
| abstract_inverted_index.performance. | 106 |
| abstract_inverted_index.pre-training | 1 |
| abstract_inverted_index.transformers | 81 |
| abstract_inverted_index.TRansformers. | 64 |
| abstract_inverted_index.demonstrating | 149 |
| abstract_inverted_index.effectiveness | 151 |
| abstract_inverted_index.significantly | 6 |
| abstract_inverted_index.transformers' | 31 |
| abstract_inverted_index.architectures. | 26 |
| abstract_inverted_index.representation | 104 |
| abstract_inverted_index.representative | 123 |
| abstract_inverted_index.sequence-level | 102 |
| abstract_inverted_index.Self-supervised | 0 |
| abstract_inverted_index.characteristics | 33 |
| abstract_inverted_index.self-supervised | 18, 41, 58, 103 |
| abstract_inverted_index.state-of-the-art | 137 |
| abstract_inverted_index.Consistency-based | 57 |
| abstract_inverted_index.transformer-based | 3, 40 |
| abstract_inverted_index.convolutional-based | 25 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 10 |
| citation_normalized_percentile |