OmniCount: Multi-label Object Counting with Semantic-Geometric Priors
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2403.05435
Object counting is pivotal for understanding the composition of scenes. Previously, this task was dominated by class-specific methods, which have gradually evolved into more adaptable class-agnostic strategies. However, these strategies come with their own set of limitations, such as the need for manual exemplar input and multiple passes for multiple categories, resulting in significant inefficiencies. This paper introduces a more practical approach enabling simultaneous counting of multiple object categories using an open-vocabulary framework. Our solution, OmniCount, stands out by using semantic and geometric insights (priors) from pre-trained models to count multiple categories of objects as specified by users, all without additional training. OmniCount distinguishes itself by generating precise object masks and leveraging varied interactive prompts via the Segment Anything Model for efficient counting. To evaluate OmniCount, we created the OmniCount-191 benchmark, a first-of-its-kind dataset with multi-label object counts, including points, bounding boxes, and VQA annotations. Our comprehensive evaluation on OmniCount-191, alongside other leading benchmarks, demonstrates OmniCount's exceptional performance, significantly outpacing existing solutions. The project webpage is available at https://mondalanindya.github.io/OmniCount.
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2403.05435, https://arxiv.org/pdf/2403.05435
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4392682316
Raw OpenAlex JSON
- OpenAlex ID (canonical identifier for this work in OpenAlex): https://openalex.org/W4392682316
- DOI (Digital Object Identifier): https://doi.org/10.48550/arxiv.2403.05435
- Title (work title): OmniCount: Multi-label Object Counting with Semantic-Geometric Priors
- Type (OpenAlex work type): preprint
- Language (primary language): en
- Publication year (year of publication): 2024
- Publication date (full publication date if available): 2024-03-08
- Authors (list of authors in order): Anindya Mondal, Sauradip Nag, Xiatian Zhu, Anjan Dutta
- Landing page (publisher landing page): https://arxiv.org/abs/2403.05435
- PDF URL (direct link to full text PDF): https://arxiv.org/pdf/2403.05435
- Open access (whether a free full text is available): Yes
- OA status (open access status per OpenAlex): green
- OA URL (direct OA link when available): https://arxiv.org/pdf/2403.05435
- Concepts (top fields/topics attached by OpenAlex): Prior probability, Object (grammar), Computer science, Artificial intelligence, Bayesian probability
- Cited by (total citation count in OpenAlex): 0
- Related works (count) (other works algorithmically related by OpenAlex): 10
Full payload
| id | https://openalex.org/W4392682316 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2403.05435 |
| ids.doi | https://doi.org/10.48550/arxiv.2403.05435 |
| ids.openalex | https://openalex.org/W4392682316 |
| fwci | |
| type | preprint |
| title | OmniCount: Multi-label Object Counting with Semantic-Geometric Priors |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11106 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9907000064849854 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1711 |
| topics[0].subfield.display_name | Signal Processing |
| topics[0].display_name | Data Management and Algorithms |
| topics[1].id | https://openalex.org/T12016 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9858999848365784 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1710 |
| topics[1].subfield.display_name | Information Systems |
| topics[1].display_name | Web Data Mining and Analysis |
| topics[2].id | https://openalex.org/T11439 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9767000079154968 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1707 |
| topics[2].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[2].display_name | Video Analysis and Summarization |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C177769412 |
| concepts[0].level | 3 |
| concepts[0].score | 0.7869936227798462 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q278090 |
| concepts[0].display_name | Prior probability |
| concepts[1].id | https://openalex.org/C2781238097 |
| concepts[1].level | 2 |
| concepts[1].score | 0.6672304272651672 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q175026 |
| concepts[1].display_name | Object (grammar) |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.5821208357810974 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C154945302 |
| concepts[3].level | 1 |
| concepts[3].score | 0.4678873121738434 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[3].display_name | Artificial intelligence |
| concepts[4].id | https://openalex.org/C107673813 |
| concepts[4].level | 2 |
| concepts[4].score | 0.15332147479057312 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q812534 |
| concepts[4].display_name | Bayesian probability |
| keywords[0].id | https://openalex.org/keywords/prior-probability |
| keywords[0].score | 0.7869936227798462 |
| keywords[0].display_name | Prior probability |
| keywords[1].id | https://openalex.org/keywords/object |
| keywords[1].score | 0.6672304272651672 |
| keywords[1].display_name | Object (grammar) |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.5821208357810974 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[3].score | 0.4678873121738434 |
| keywords[3].display_name | Artificial intelligence |
| keywords[4].id | https://openalex.org/keywords/bayesian-probability |
| keywords[4].score | 0.15332147479057312 |
| keywords[4].display_name | Bayesian probability |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2403.05435 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2403.05435 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2403.05435 |
| locations[1].id | doi:10.48550/arxiv.2403.05435 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2403.05435 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5070974200 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-2604-439X |
| authorships[0].author.display_name | Anindya Mondal |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Mondal, Anindya |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5076328294 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-2943-6663 |
| authorships[1].author.display_name | Sauradip Nag |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Nag, Sauradip |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5028643592 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-9284-2955 |
| authorships[2].author.display_name | Xiatian Zhu |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Zhu, Xiatian |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5008386240 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-1667-2245 |
| authorships[3].author.display_name | Anjan Dutta |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Dutta, Anjan |
| authorships[3].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2403.05435 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | OmniCount: Multi-label Object Counting with Semantic-Geometric Priors |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11106 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9907000064849854 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1711 |
| primary_topic.subfield.display_name | Signal Processing |
| primary_topic.display_name | Data Management and Algorithms |
| related_works | https://openalex.org/W2748952813, https://openalex.org/W2580650124, https://openalex.org/W4386190339, https://openalex.org/W2968424575, https://openalex.org/W3142333283, https://openalex.org/W3122088529, https://openalex.org/W3041320102, https://openalex.org/W2111669074, https://openalex.org/W2085259108, https://openalex.org/W2975200075 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2403.05435 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2403.05435 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2403.05435 |
| primary_location.id | pmh:oai:arXiv.org:2403.05435 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2403.05435 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2403.05435 |
| publication_date | 2024-03-08 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 58, 131 |
| abstract_inverted_index.To | 123 |
| abstract_inverted_index.an | 70 |
| abstract_inverted_index.as | 38, 94 |
| abstract_inverted_index.at | 167 |
| abstract_inverted_index.by | 15, 78, 96, 105 |
| abstract_inverted_index.in | 52, 148 |
| abstract_inverted_index.is | 2, 165 |
| abstract_inverted_index.of | 8, 35, 65, 92 |
| abstract_inverted_index.to | 88 |
| abstract_inverted_index.we | 126 |
| abstract_inverted_index.Our | 73, 145 |
| abstract_inverted_index.The | 162 |
| abstract_inverted_index.VQA | 143 |
| abstract_inverted_index.all | 98 |
| abstract_inverted_index.and | 45, 81, 110, 142 |
| abstract_inverted_index.for | 4, 41, 48, 120 |
| abstract_inverted_index.out | 77 |
| abstract_inverted_index.own | 33 |
| abstract_inverted_index.set | 34 |
| abstract_inverted_index.the | 6, 39, 116, 128 |
| abstract_inverted_index.via | 115 |
| abstract_inverted_index.was | 13 |
| abstract_inverted_index.This | 55 |
| abstract_inverted_index.come | 30 |
| abstract_inverted_index.from | 85 |
| abstract_inverted_index.have | 19 |
| abstract_inverted_index.into | 22 |
| abstract_inverted_index.more | 23, 59 |
| abstract_inverted_index.need | 40 |
| abstract_inverted_index.such | 37 |
| abstract_inverted_index.task | 12 |
| abstract_inverted_index.this | 11 |
| abstract_inverted_index.with | 31, 134 |
| abstract_inverted_index.Model | 119 |
| abstract_inverted_index.count | 89 |
| abstract_inverted_index.input | 44 |
| abstract_inverted_index.masks | 109 |
| abstract_inverted_index.other | 151 |
| abstract_inverted_index.paper | 56 |
| abstract_inverted_index.their | 32 |
| abstract_inverted_index.these | 28 |
| abstract_inverted_index.using | 69, 79 |
| abstract_inverted_index.which | 18 |
| abstract_inverted_index.Object | 0 |
| abstract_inverted_index.boxes, | 141 |
| abstract_inverted_index.itself | 104 |
| abstract_inverted_index.manual | 42 |
| abstract_inverted_index.models | 87 |
| abstract_inverted_index.object | 67, 108, 136 |
| abstract_inverted_index.passes | 47 |
| abstract_inverted_index.stands | 76 |
| abstract_inverted_index.users, | 97 |
| abstract_inverted_index.varied | 112 |
| abstract_inverted_index.Segment | 117 |
| abstract_inverted_index.counts, | 137 |
| abstract_inverted_index.created | 127 |
| abstract_inverted_index.dataset | 133 |
| abstract_inverted_index.evolved | 21 |
| abstract_inverted_index.leading | 152 |
| abstract_inverted_index.objects | 93 |
| abstract_inverted_index.pivotal | 3 |
| abstract_inverted_index.points, | 139 |
| abstract_inverted_index.precise | 107 |
| abstract_inverted_index.project | 163 |
| abstract_inverted_index.prompts | 114 |
| abstract_inverted_index.scenes. | 9 |
| abstract_inverted_index.webpage | 164 |
| abstract_inverted_index.without | 99 |
| abstract_inverted_index.(priors) | 84 |
| abstract_inverted_index.Anything | 118 |
| abstract_inverted_index.However, | 27 |
| abstract_inverted_index.approach | 61 |
| abstract_inverted_index.bounding | 140 |
| abstract_inverted_index.counting | 1, 64 |
| abstract_inverted_index.enabling | 62 |
| abstract_inverted_index.evaluate | 124 |
| abstract_inverted_index.exemplar | 43 |
| abstract_inverted_index.existing | 160 |
| abstract_inverted_index.insights | 83 |
| abstract_inverted_index.methods, | 17 |
| abstract_inverted_index.multiple | 46, 49, 66, 90 |
| abstract_inverted_index.semantic | 80 |
| abstract_inverted_index.OmniCount | 102 |
| abstract_inverted_index.adaptable | 24 |
| abstract_inverted_index.alongside | 150 |
| abstract_inverted_index.available | 166 |
| abstract_inverted_index.counting. | 122 |
| abstract_inverted_index.dominated | 14 |
| abstract_inverted_index.efficient | 121 |
| abstract_inverted_index.geometric | 82 |
| abstract_inverted_index.gradually | 20 |
| abstract_inverted_index.including | 138 |
| abstract_inverted_index.outpacing | 159 |
| abstract_inverted_index.practical | 60 |
| abstract_inverted_index.resulting | 51 |
| abstract_inverted_index.solution, | 74 |
| abstract_inverted_index.specified | 95 |
| abstract_inverted_index.training. | 101 |
| abstract_inverted_index.OmniCount, | 75, 125 |
| abstract_inverted_index.additional | 100 |
| abstract_inverted_index.benchmark, | 130 |
| abstract_inverted_index.categories | 68, 91 |
| abstract_inverted_index.evaluation | 147 |
| abstract_inverted_index.framework. | 72 |
| abstract_inverted_index.generating | 106 |
| abstract_inverted_index.introduces | 57 |
| abstract_inverted_index.leveraging | 111 |
| abstract_inverted_index.solutions. | 161 |
| abstract_inverted_index.strategies | 29 |
| abstract_inverted_index.OmniCount's | 155 |
| abstract_inverted_index.Previously, | 10 |
| abstract_inverted_index.benchmarks, | 153 |
| abstract_inverted_index.categories, | 50 |
| abstract_inverted_index.composition | 7 |
| abstract_inverted_index.exceptional | 156 |
| abstract_inverted_index.interactive | 113 |
| abstract_inverted_index.multi-label | 135 |
| abstract_inverted_index.pre-trained | 86 |
| abstract_inverted_index.significant | 53 |
| abstract_inverted_index.strategies. | 26 |
| abstract_inverted_index.annotations. | 144 |
| abstract_inverted_index.demonstrates | 154 |
| abstract_inverted_index.limitations, | 36 |
| abstract_inverted_index.performance, | 157 |
| abstract_inverted_index.simultaneous | 63 |
| abstract_inverted_index.OmniCount-191 | 129 |
| abstract_inverted_index.comprehensive | 146 |
| abstract_inverted_index.distinguishes | 103 |
| abstract_inverted_index.significantly | 158 |
| abstract_inverted_index.understanding | 5 |
| abstract_inverted_index.OmniCount-191, | 149 |
| abstract_inverted_index.class-agnostic | 25 |
| abstract_inverted_index.class-specific | 16 |
| abstract_inverted_index.inefficiencies. | 54 |
| abstract_inverted_index.open-vocabulary | 71 |
| abstract_inverted_index.first-of-its-kind | 132 |
| abstract_inverted_index.https://mondalanindya.github.io/OmniCount. | 168 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| citation_normalized_percentile | |
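
The `abstract_inverted_index.*` rows above store the abstract as a mapping from each token to the word positions where it occurs, rather than as running text. As a minimal sketch of how such a mapping can be turned back into a sentence, the Python snippet below inverts it and joins the tokens in position order. The function name `rebuild_abstract` and the `sample` dictionary are illustrative assumptions: `sample` keeps only the positions 0 to 9 from the rows above, so it rebuilds just the first sentence, and the code assumes the positions form a gap-free sequence.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild running text from a token -> list-of-positions mapping."""
    # Invert the mapping: one entry per word position.
    token_at: Dict[int, str] = {}
    for token, positions in inverted_index.items():
        for position in positions:
            token_at[position] = token
    # Emit tokens in ascending position order, separated by single spaces.
    return " ".join(token_at[i] for i in sorted(token_at))


# Hypothetical, trimmed sample: only positions 0-9 of the rows listed above.
sample = {
    "Object": [0],
    "counting": [1],
    "is": [2],
    "pivotal": [3],
    "for": [4],
    "understanding": [5],
    "the": [6],
    "composition": [7],
    "of": [8],
    "scenes.": [9],
}

first_sentence = rebuild_abstract(sample)
print(first_sentence)
```

Because punctuation is stored as part of each token (for example `scenes.`), joining on single spaces is enough to recover the original wording for this sample.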