Explicit Relational Reasoning Network for Scene Text Detection
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2412.14692
The connected component (CC) is a suitable text shape representation that aligns with human reading intuition. However, CC-based text detection methods have recently hit a bottleneck: their time-consuming post-processing is difficult to eliminate. To address this issue, we introduce an explicit relational reasoning network (ERRNet) that models the relationships between components without post-processing. Concretely, we first represent each text instance as multiple ordered text components, and then treat these components as objects in sequential movement. In this way, scene text detection can be viewed as a tracking problem. From this perspective, we design an end-to-end tracking decoder, yielding a CC-based method that dispenses with post-processing entirely. Additionally, we observe an inconsistency between classification confidence and localization quality, so we propose a Polygon Monte-Carlo method to evaluate the localization quality quickly and accurately. Based on this, we introduce a position-supervised classification loss to guide the task-aligned learning of ERRNet. Experiments on challenging benchmarks demonstrate the effectiveness of ERRNet: it consistently achieves state-of-the-art accuracy while maintaining highly competitive inference speed.
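The abstract names a Polygon Monte-Carlo method for evaluating localization quality but does not describe it. As a rough illustration of the general idea only, the sketch below estimates the IoU of two polygons by uniform random sampling. Everything in it is an assumption rather than the authors' implementation: the function names `point_in_polygon` and `monte_carlo_iou`, the use of IoU as the quality measure, the sample count, and the ray-casting inside test.

```python
import random


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    `polygon` is a list of (x, y) vertices in order (hypothetical format).
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def monte_carlo_iou(pred, gt, num_samples=20000, seed=0):
    """Estimate IoU of two polygons by sampling their joint bounding box.

    This is an illustrative stand-in, not the method from the paper.
    """
    rng = random.Random(seed)
    xs = [p[0] for p in pred + gt]
    ys = [p[1] for p in pred + gt]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    inter = 0
    union = 0
    for _ in range(num_samples):
        x = rng.uniform(x_min, x_max)
        y = rng.uniform(y_min, y_max)
        in_pred = point_in_polygon(x, y, pred)
        in_gt = point_in_polygon(x, y, gt)
        if in_pred and in_gt:
            inter += 1
        if in_pred or in_gt:
            union += 1
    # No sample landed in either polygon: treat the overlap as zero.
    return inter / union if union else 0.0


# Two 2x2 squares offset by 1 along x: exact IoU is 2 / 6 = 1/3.
pred_box = [(0, 0), (2, 0), (2, 2), (0, 2)]
gt_box = [(1, 0), (3, 0), (3, 2), (1, 2)]
estimate = monte_carlo_iou(pred_box, gt_box)
```

With the fixed seed the estimate is reproducible and should land close to the exact value of 1/3; the sampling error shrinks roughly with the square root of `num_samples`, which is the usual accuracy versus speed trade-off of a sampling-based estimate.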
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2412.14692, https://arxiv.org/pdf/2412.14692
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4405627766
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4405627766 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2412.14692 (Digital Object Identifier)
- Title: Explicit Relational Reasoning Network for Scene Text Detection (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-12-19 (full publication date if available)
- Authors: Yuchen Su, Zhineng Chen, Yongkun Du, Zhilong Ji, Kai Hu, Jinfeng Bai, Xieping Gao (list of authors in order)
- Landing page: https://arxiv.org/abs/2412.14692 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2412.14692 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2412.14692 (direct OA link when available)
- Concepts: Computer science, Artificial intelligence, Cognitive science, Natural language processing, Cognitive psychology, Psychology (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4405627766 |
| doi | https://doi.org/10.48550/arxiv.2412.14692 |
| ids.doi | https://doi.org/10.48550/arxiv.2412.14692 |
| ids.openalex | https://openalex.org/W4405627766 |
| fwci | |
| type | preprint |
| title | Explicit Relational Reasoning Network for Scene Text Detection |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T13083 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.8382999897003174 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Advanced Text Analysis Techniques |
| topics[1].id | https://openalex.org/T10215 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.8321999907493591 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Semantic Web and Ontologies |
| topics[2].id | https://openalex.org/T11063 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.7749999761581421 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1703 |
| topics[2].subfield.display_name | Computational Theory and Mathematics |
| topics[2].display_name | Rough Sets and Fuzzy Logic |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.6046226024627686 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C154945302 |
| concepts[1].level | 1 |
| concepts[1].score | 0.45120254158973694 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[1].display_name | Artificial intelligence |
| concepts[2].id | https://openalex.org/C188147891 |
| concepts[2].level | 1 |
| concepts[2].score | 0.4098762273788452 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q147638 |
| concepts[2].display_name | Cognitive science |
| concepts[3].id | https://openalex.org/C204321447 |
| concepts[3].level | 1 |
| concepts[3].score | 0.4028722047805786 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[3].display_name | Natural language processing |
| concepts[4].id | https://openalex.org/C180747234 |
| concepts[4].level | 1 |
| concepts[4].score | 0.3318719267845154 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q23373 |
| concepts[4].display_name | Cognitive psychology |
| concepts[5].id | https://openalex.org/C15744967 |
| concepts[5].level | 0 |
| concepts[5].score | 0.28079527616500854 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q9418 |
| concepts[5].display_name | Psychology |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.6046226024627686 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[1].score | 0.45120254158973694 |
| keywords[1].display_name | Artificial intelligence |
| keywords[2].id | https://openalex.org/keywords/cognitive-science |
| keywords[2].score | 0.4098762273788452 |
| keywords[2].display_name | Cognitive science |
| keywords[3].id | https://openalex.org/keywords/natural-language-processing |
| keywords[3].score | 0.4028722047805786 |
| keywords[3].display_name | Natural language processing |
| keywords[4].id | https://openalex.org/keywords/cognitive-psychology |
| keywords[4].score | 0.3318719267845154 |
| keywords[4].display_name | Cognitive psychology |
| keywords[5].id | https://openalex.org/keywords/psychology |
| keywords[5].score | 0.28079527616500854 |
| keywords[5].display_name | Psychology |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2412.14692 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2412.14692 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2412.14692 |
| locations[1].id | doi:10.48550/arxiv.2412.14692 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2412.14692 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5101940296 |
| authorships[0].author.orcid | https://orcid.org/0009-0009-4034-5883 |
| authorships[0].author.display_name | Yuchen Su |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Su, Yuchen |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5041641853 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Zhineng Chen |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Chen, Zhineng |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5102915298 |
| authorships[2].author.orcid | https://orcid.org/0009-0000-9859-721X |
| authorships[2].author.display_name | Yongkun Du |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Du, Yongkun |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5024537984 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-8799-3409 |
| authorships[3].author.display_name | Zhilong Ji |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Ji, Zhilong |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5008566358 |
| authorships[4].author.orcid | https://orcid.org/0000-0001-8829-1700 |
| authorships[4].author.display_name | Kai Hu |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Hu, Kai |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5036032938 |
| authorships[5].author.orcid | https://orcid.org/0000-0001-8940-480X |
| authorships[5].author.display_name | Jinfeng Bai |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Bai, Jinfeng |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5010870600 |
| authorships[6].author.orcid | https://orcid.org/0000-0002-7764-3616 |
| authorships[6].author.display_name | Xieping Gao |
| authorships[6].author_position | last |
| authorships[6].raw_author_name | Gao, Xieping |
| authorships[6].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2412.14692 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Explicit Relational Reasoning Network for Scene Text Detection |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T13083 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.8382999897003174 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Advanced Text Analysis Techniques |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W4391913857, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W3204019825 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2412.14692 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2412.14692 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2412.14692 |
| primary_location.id | pmh:oai:arXiv.org:2412.14692 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2412.14692 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2412.14692 |
| publication_date | 2024-12-19 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 4, 23, 87, 101, 125, 142 |
| abstract_inverted_index.In | 76 |
| abstract_inverted_index.It | 163 |
| abstract_inverted_index.To | 34 |
| abstract_inverted_index.an | 40, 95, 114 |
| abstract_inverted_index.as | 61, 71, 86 |
| abstract_inverted_index.be | 83 |
| abstract_inverted_index.in | 73 |
| abstract_inverted_index.is | 3, 30, 113 |
| abstract_inverted_index.of | 151, 160 |
| abstract_inverted_index.on | 138, 154 |
| abstract_inverted_index.so | 122 |
| abstract_inverted_index.to | 32, 46, 99, 129, 146 |
| abstract_inverted_index.we | 38, 55, 93, 109, 123, 140 |
| abstract_inverted_index.and | 66, 119, 131 |
| abstract_inverted_index.can | 82 |
| abstract_inverted_index.our | 161 |
| abstract_inverted_index.the | 49, 134, 148, 158 |
| abstract_inverted_index.(CC) | 2 |
| abstract_inverted_index.From | 90 |
| abstract_inverted_index.each | 58 |
| abstract_inverted_index.have | 20 |
| abstract_inverted_index.loss | 145 |
| abstract_inverted_index.text | 6, 17, 59, 64, 80 |
| abstract_inverted_index.that | 9, 26, 111 |
| abstract_inverted_index.then | 67 |
| abstract_inverted_index.this | 36, 77, 91 |
| abstract_inverted_index.way, | 78 |
| abstract_inverted_index.with | 11, 105 |
| abstract_inverted_index.Based | 137 |
| abstract_inverted_index.faced | 22 |
| abstract_inverted_index.first | 56 |
| abstract_inverted_index.guide | 147 |
| abstract_inverted_index.human | 12 |
| abstract_inverted_index.model | 48 |
| abstract_inverted_index.scene | 79 |
| abstract_inverted_index.shape | 7 |
| abstract_inverted_index.their | 27 |
| abstract_inverted_index.there | 112 |
| abstract_inverted_index.these | 69 |
| abstract_inverted_index.this, | 139 |
| abstract_inverted_index.treat | 68 |
| abstract_inverted_index.while | 168 |
| abstract_inverted_index.aligns | 10 |
| abstract_inverted_index.design | 94 |
| abstract_inverted_index.highly | 170 |
| abstract_inverted_index.issue, | 37 |
| abstract_inverted_index.method | 103, 128 |
| abstract_inverted_index.proper | 5 |
| abstract_inverted_index.speed. | 173 |
| abstract_inverted_index.viewed | 85 |
| abstract_inverted_index.ERRNet. | 152, 162 |
| abstract_inverted_index.Polygon | 126 |
| abstract_inverted_index.achieve | 100 |
| abstract_inverted_index.address | 35 |
| abstract_inverted_index.between | 116 |
| abstract_inverted_index.decoder | 98 |
| abstract_inverted_index.holding | 169 |
| abstract_inverted_index.methods | 19 |
| abstract_inverted_index.network | 44 |
| abstract_inverted_index.objects | 72 |
| abstract_inverted_index.observe | 110 |
| abstract_inverted_index.ordered | 63 |
| abstract_inverted_index.propose | 124 |
| abstract_inverted_index.quickly | 130 |
| abstract_inverted_index.reading | 13 |
| abstract_inverted_index.without | 52 |
| abstract_inverted_index.(ERRNet) | 45 |
| abstract_inverted_index.CC-based | 16, 102 |
| abstract_inverted_index.However, | 15 |
| abstract_inverted_index.accuracy | 167 |
| abstract_inverted_index.achieves | 165 |
| abstract_inverted_index.evaluate | 133 |
| abstract_inverted_index.explicit | 41 |
| abstract_inverted_index.instance | 60 |
| abstract_inverted_index.learning | 150 |
| abstract_inverted_index.multiple | 62 |
| abstract_inverted_index.problem. | 89 |
| abstract_inverted_index.quality, | 121 |
| abstract_inverted_index.quality. | 136 |
| abstract_inverted_index.recently | 21 |
| abstract_inverted_index.tracking | 88, 97 |
| abstract_inverted_index.Connected | 0 |
| abstract_inverted_index.component | 1, 50 |
| abstract_inverted_index.detection | 18, 81 |
| abstract_inverted_index.difficult | 31 |
| abstract_inverted_index.elegantly | 47 |
| abstract_inverted_index.entirely. | 107 |
| abstract_inverted_index.inference | 172 |
| abstract_inverted_index.introduce | 39, 141 |
| abstract_inverted_index.movement. | 75 |
| abstract_inverted_index.reasoning | 43 |
| abstract_inverted_index.represent | 57 |
| abstract_inverted_index.accurately | 132 |
| abstract_inverted_index.benchmarks | 156 |
| abstract_inverted_index.bottleneck | 25 |
| abstract_inverted_index.components | 70 |
| abstract_inverted_index.confidence | 118 |
| abstract_inverted_index.dispensing | 104 |
| abstract_inverted_index.eliminate. | 33 |
| abstract_inverted_index.end-to-end | 96 |
| abstract_inverted_index.intuition. | 14 |
| abstract_inverted_index.relational | 42 |
| abstract_inverted_index.sequential | 74 |
| abstract_inverted_index.Concretely, | 54 |
| abstract_inverted_index.Experiments | 153 |
| abstract_inverted_index.Monte-Carlo | 127 |
| abstract_inverted_index.challenging | 155 |
| abstract_inverted_index.competitive | 171 |
| abstract_inverted_index.components, | 65 |
| abstract_inverted_index.demonstrate | 157 |
| abstract_inverted_index.consistently | 164 |
| abstract_inverted_index.innovatively | 84 |
| abstract_inverted_index.localization | 120, 135 |
| abstract_inverted_index.perspective, | 92 |
| abstract_inverted_index.task-aligned | 149 |
| abstract_inverted_index.Additionally, | 108 |
| abstract_inverted_index.developmental | 24 |
| abstract_inverted_index.effectiveness | 159 |
| abstract_inverted_index.inconsistency | 115 |
| abstract_inverted_index.relationships | 51 |
| abstract_inverted_index.classification | 117, 144 |
| abstract_inverted_index.representation | 8 |
| abstract_inverted_index.time-consuming | 28 |
| abstract_inverted_index.post-processing | 29, 106 |
| abstract_inverted_index.post-processing. | 53 |
| abstract_inverted_index.state-of-the-art | 166 |
| abstract_inverted_index.position-supervised | 143 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 7 |
| citation_normalized_percentile |
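
The `abstract_inverted_index.*` rows in the payload store the abstract as a mapping from each token to the word positions where it occurs. A minimal sketch of turning such a mapping back into running text follows; the function name `rebuild_abstract` and the dictionary layout are assumptions made for illustration, and the sample holds only the first nine tokens (positions 0 to 8) taken from the rows above.

```python
def rebuild_abstract(inverted_index):
    """Turn a {token: [positions]} mapping back into running text."""
    by_position = {}
    for token, positions in inverted_index.items():
        for position in positions:
            by_position[position] = token
    # Emit tokens in ascending position order, separated by single spaces.
    return " ".join(by_position[i] for i in sorted(by_position))


# Excerpt limited to positions 0..8; later positions of repeated tokens
# (for example "text" at 17, 59, 64, 80) are left out on purpose.
sample_index = {
    "Connected": [0],
    "component": [1],
    "(CC)": [2],
    "is": [3],
    "a": [4],
    "proper": [5],
    "text": [6],
    "shape": [7],
    "representation": [8],
}

opening = rebuild_abstract(sample_index)
# opening == "Connected component (CC) is a proper text shape representation"
```

Because punctuation is attached to tokens in the index (for example `way,` and `speed.`), joining on spaces is enough to recover the original wording.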