MetaSeg: MetaFormer-based Global Contexts-aware Network for Efficient Semantic Segmentation
2024 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2408.07576
Beyond the Transformer, it is important to explore how to exploit the capacity of the MetaFormer, the architecture that underlies the Transformer's performance improvements. Previous studies have exploited it only for the backbone network. In contrast, we explore the capacity of the MetaFormer architecture more extensively in the semantic segmentation task. We propose a powerful semantic segmentation network, MetaSeg, which leverages the MetaFormer architecture from the backbone to the decoder. MetaSeg shows that the MetaFormer architecture plays a significant role in capturing useful contexts for the decoder as well as for the backbone. In addition, recent segmentation methods have shown that using a CNN-based backbone to extract spatial information and a decoder to extract global information is more effective than using a transformer-based backbone with a CNN-based decoder. This motivates us to adopt a CNN-based backbone built from MetaFormer blocks and to design a MetaFormer-based decoder, which contains a novel self-attention module to capture the global contexts. To balance global context extraction against the computational cost of self-attention for semantic segmentation, we propose a Channel Reduction Attention (CRA) module that reduces the channel dimension of the query and key to one dimension. With this design, MetaSeg outperforms previous state-of-the-art methods at lower computational cost on popular semantic segmentation benchmarks and a medical image segmentation benchmark: ADE20K, Cityscapes, COCO-Stuff, and Synapse. The code is available at https://github.com/hyunwoo137/MetaSeg.
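The abstract describes the CRA module only at a high level: the query and key are reduced to a single channel before attention is computed. The following is a minimal, dependency-free sketch of that idea, not the authors' implementation. The function names, the weight vectors `w_q` and `w_k`, and the use of the input tokens themselves as values are all assumptions made for illustration; the released code at the URL above is the authoritative reference.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def channel_reduction_attention(tokens, w_q, w_k):
    """Attention with scalar (one-channel) queries and keys.

    tokens: list of N tokens, each a list of C floats.
    w_q, w_k: hypothetical length-C weight vectors that project each
              token to a single scalar query / key.
    Values are the unprojected tokens (an assumption for brevity).
    """
    # Reduce every token to one query scalar and one key scalar.
    queries = [sum(t * w for t, w in zip(tok, w_q)) for tok in tokens]
    keys = [sum(t * w for t, w in zip(tok, w_k)) for tok in tokens]
    values = tokens
    channels = len(values[0])

    output = []
    for q in queries:
        # Each score is a single multiplication, not a C-length dot product.
        weights = softmax([q * k for k in keys])
        output.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(channels)
        ])
    return output


if __name__ == "__main__":
    demo_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in channel_reduction_attention(demo_tokens, [0.5, -0.5], [1.0, 1.0]):
        print(row)
```

Because the queries and keys are scalars, each attention score costs one multiplication rather than a dot product over all C channels, which is where the saving in the score computation comes from in this sketch.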
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2408.07576, https://arxiv.org/pdf/2408.07576
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4406021705
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4406021705 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2408.07576 (Digital Object Identifier)
- Title: MetaSeg: MetaFormer-based Global Contexts-aware Network for Efficient Semantic Segmentation (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-08-14 (full publication date if available)
- Authors: Beoungwoo Kang, Seunghun Moon, Yubin Cho, Hyunwoo Yu, Suk‐Ju Kang (list of authors in order)
- Landing page: https://arxiv.org/abs/2408.07576 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2408.07576 (direct link to full-text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2408.07576 (direct OA link when available)
- Concepts: Segmentation, Computer science, Artificial intelligence, Natural language processing (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4406021705 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2408.07576 |
| ids.doi | https://doi.org/10.48550/arxiv.2408.07576 |
| ids.openalex | https://openalex.org/W4406021705 |
| fwci | |
| type | preprint |
| title | MetaSeg: MetaFormer-based Global Contexts-aware Network for Efficient Semantic Segmentation |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T13382 |
| topics[0].field.id | https://openalex.org/fields/22 |
| topics[0].field.display_name | Engineering |
| topics[0].score | 0.7803000211715698 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/2207 |
| topics[0].subfield.display_name | Control and Systems Engineering |
| topics[0].display_name | Robotics and Automated Systems |
| topics[1].id | https://openalex.org/T10627 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.7457000017166138 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1707 |
| topics[1].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[1].display_name | Advanced Image and Video Retrieval Techniques |
| topics[2].id | https://openalex.org/T11439 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.70169997215271 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1707 |
| topics[2].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[2].display_name | Video Analysis and Summarization |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C89600930 |
| concepts[0].level | 2 |
| concepts[0].score | 0.686612606048584 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q1423946 |
| concepts[0].display_name | Segmentation |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.6708390712738037 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C154945302 |
| concepts[2].level | 1 |
| concepts[2].score | 0.39844852685928345 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[2].display_name | Artificial intelligence |
| concepts[3].id | https://openalex.org/C204321447 |
| concepts[3].level | 1 |
| concepts[3].score | 0.37515145540237427 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[3].display_name | Natural language processing |
| keywords[0].id | https://openalex.org/keywords/segmentation |
| keywords[0].score | 0.686612606048584 |
| keywords[0].display_name | Segmentation |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.6708390712738037 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[2].score | 0.39844852685928345 |
| keywords[2].display_name | Artificial intelligence |
| keywords[3].id | https://openalex.org/keywords/natural-language-processing |
| keywords[3].score | 0.37515145540237427 |
| keywords[3].display_name | Natural language processing |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2408.07576 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://arxiv.org/pdf/2408.07576 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2408.07576 |
| locations[1].id | doi:10.48550/arxiv.2408.07576 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2408.07576 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5050416000 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Beoungwoo Kang |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Kang, Beoungwoo |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5101274742 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Seunghun Moon |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Moon, Seunghun |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5060160801 |
| authorships[2].author.orcid | https://orcid.org/0009-0001-8604-5431 |
| authorships[2].author.display_name | Yubin Cho |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Cho, Yubin |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5015387461 |
| authorships[3].author.orcid | https://orcid.org/0009-0009-4426-8272 |
| authorships[3].author.display_name | Hyunwoo Yu |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Yu, Hyunwoo |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5084904773 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-4809-956X |
| authorships[4].author.display_name | Suk‐Ju Kang |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Kang, Suk-Ju |
| authorships[4].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2408.07576 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-01-03T00:00:00 |
| display_name | MetaSeg: MetaFormer-based Global Contexts-aware Network for Efficient Semantic Segmentation |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T13382 |
| primary_topic.field.id | https://openalex.org/fields/22 |
| primary_topic.field.display_name | Engineering |
| primary_topic.score | 0.7803000211715698 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/2207 |
| primary_topic.subfield.display_name | Control and Systems Engineering |
| primary_topic.display_name | Robotics and Automated Systems |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W4391913857, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W3204019825 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2408.07576 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2408.07576 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2408.07576 |
| primary_location.id | pmh:oai:arXiv.org:2408.07576 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://arxiv.org/pdf/2408.07576 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2408.07576 |
| publication_date | 2024-08-14 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 58, 83, 109, 118, 130, 134, 157, 185, 226 |
| abstract_inverted_index.In | 100, 205 |
| abstract_inverted_index.To | 166 |
| abstract_inverted_index.We | 56 |
| abstract_inverted_index.an | 16 |
| abstract_inverted_index.as | 94, 96 |
| abstract_inverted_index.at | 241 |
| abstract_inverted_index.in | 51, 86 |
| abstract_inverted_index.is | 4, 19, 125, 239 |
| abstract_inverted_index.it | 3, 32 |
| abstract_inverted_index.of | 13, 25, 45, 156, 177, 196 |
| abstract_inverted_index.on | 221 |
| abstract_inverted_index.to | 6, 9, 21, 72, 140, 161 |
| abstract_inverted_index.us | 139 |
| abstract_inverted_index.we | 41, 183 |
| abstract_inverted_index.Our | 75 |
| abstract_inverted_index.The | 237 |
| abstract_inverted_index.and | 117, 149, 173, 199, 225, 235 |
| abstract_inverted_index.for | 34, 91, 97, 112, 120, 180 |
| abstract_inverted_index.how | 8 |
| abstract_inverted_index.key | 200 |
| abstract_inverted_index.one | 203 |
| abstract_inverted_index.our | 151, 208 |
| abstract_inverted_index.the | 1, 11, 14, 22, 26, 35, 43, 46, 52, 66, 70, 73, 79, 88, 92, 98, 114, 122, 142, 146, 163, 169, 174, 178, 193, 197, 202, 212 |
| abstract_inverted_index.This | 137 |
| abstract_inverted_index.both | 168 |
| abstract_inverted_index.code | 238 |
| abstract_inverted_index.from | 69 |
| abstract_inverted_index.have | 30, 105 |
| abstract_inverted_index.into | 201 |
| abstract_inverted_index.more | 49, 126, 217 |
| abstract_inverted_index.only | 33 |
| abstract_inverted_index.role | 85 |
| abstract_inverted_index.than | 128 |
| abstract_inverted_index.that | 18, 78, 107, 191 |
| abstract_inverted_index.this | 206 |
| abstract_inverted_index.way, | 207 |
| abstract_inverted_index.well | 95 |
| abstract_inverted_index.with | 133, 216 |
| abstract_inverted_index.(CRA) | 189 |
| abstract_inverted_index.adopt | 141 |
| abstract_inverted_index.block | 148 |
| abstract_inverted_index.costs | 220 |
| abstract_inverted_index.image | 228 |
| abstract_inverted_index.novel | 158 |
| abstract_inverted_index.plays | 82 |
| abstract_inverted_index.query | 198 |
| abstract_inverted_index.shown | 106 |
| abstract_inverted_index.shows | 77 |
| abstract_inverted_index.task. | 55 |
| abstract_inverted_index.using | 108, 129, 145 |
| abstract_inverted_index.which | 64, 154 |
| abstract_inverted_index.Beyond | 0 |
| abstract_inverted_index.Unlike | 38 |
| abstract_inverted_index.design | 150 |
| abstract_inverted_index.global | 123, 164, 170 |
| abstract_inverted_index.module | 160, 190 |
| abstract_inverted_index.recent | 102 |
| abstract_inverted_index.useful | 89 |
| abstract_inverted_index.ADE20K, | 232 |
| abstract_inverted_index.Channel | 186 |
| abstract_inverted_index.MetaSeg | 76, 210 |
| abstract_inverted_index.capture | 162 |
| abstract_inverted_index.channel | 194 |
| abstract_inverted_index.decoder | 93, 119 |
| abstract_inverted_index.exploit | 10 |
| abstract_inverted_index.explore | 7, 42 |
| abstract_inverted_index.medical | 227 |
| abstract_inverted_index.methods | 104, 215 |
| abstract_inverted_index.popular | 222 |
| abstract_inverted_index.propose | 57, 184 |
| abstract_inverted_index.reduces | 192 |
| abstract_inverted_index.spatial | 115 |
| abstract_inverted_index.studies | 29 |
| abstract_inverted_index.MetaSeg, | 63 |
| abstract_inverted_index.Previous | 28 |
| abstract_inverted_index.Synapse. | 236 |
| abstract_inverted_index.backbone | 36, 71, 111, 132, 144 |
| abstract_inverted_index.capacity | 12, 44 |
| abstract_inverted_index.consider | 167 |
| abstract_inverted_index.consists | 155 |
| abstract_inverted_index.contexts | 90, 171 |
| abstract_inverted_index.decoder, | 153 |
| abstract_inverted_index.decoder. | 74, 136 |
| abstract_inverted_index.network, | 62 |
| abstract_inverted_index.network. | 37 |
| abstract_inverted_index.powerful | 59 |
| abstract_inverted_index.previous | 39, 213 |
| abstract_inverted_index.proposed | 209 |
| abstract_inverted_index.semantic | 53, 60, 181, 223 |
| abstract_inverted_index.studies, | 40 |
| abstract_inverted_index.Attention | 188 |
| abstract_inverted_index.CNN-based | 110, 135, 143 |
| abstract_inverted_index.Reduction | 187 |
| abstract_inverted_index.addition, | 101 |
| abstract_inverted_index.available | 240 |
| abstract_inverted_index.backbone. | 99 |
| abstract_inverted_index.capturing | 87 |
| abstract_inverted_index.contexts. | 165 |
| abstract_inverted_index.dimension | 195 |
| abstract_inverted_index.effective | 127 |
| abstract_inverted_index.efficient | 218 |
| abstract_inverted_index.exploited | 31 |
| abstract_inverted_index.important | 5 |
| abstract_inverted_index.including | 231 |
| abstract_inverted_index.leverages | 65 |
| abstract_inverted_index.motivates | 138 |
| abstract_inverted_index.MetaFormer | 80, 147 |
| abstract_inverted_index.Metaformer | 47, 67 |
| abstract_inverted_index.benchmark, | 230 |
| abstract_inverted_index.dimension. | 204 |
| abstract_inverted_index.efficiency | 176 |
| abstract_inverted_index.extracting | 113, 121 |
| abstract_inverted_index.extraction | 172 |
| abstract_inverted_index.COCO-stuff, | 234 |
| abstract_inverted_index.Cityscapes, | 233 |
| abstract_inverted_index.MetaFormer, | 15 |
| abstract_inverted_index.extensively | 50 |
| abstract_inverted_index.fundamental | 20 |
| abstract_inverted_index.information | 116, 124 |
| abstract_inverted_index.outperforms | 211 |
| abstract_inverted_index.performance | 23 |
| abstract_inverted_index.significant | 84 |
| abstract_inverted_index.Transformer, | 2 |
| abstract_inverted_index.Transformer. | 27 |
| abstract_inverted_index.architecture | 17, 48, 68, 81 |
| abstract_inverted_index.improvements | 24 |
| abstract_inverted_index.segmentation | 54, 61, 103, 224, 229 |
| abstract_inverted_index.computational | 175, 219 |
| abstract_inverted_index.segmentation, | 182 |
| abstract_inverted_index.self-attention | 159, 179 |
| abstract_inverted_index.MetaFormer-based | 152 |
| abstract_inverted_index.state-of-the-art | 214 |
| abstract_inverted_index.transformer-based | 131 |
| abstract_inverted_index.https://github.com/hyunwoo137/MetaSeg. | 242 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile |
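
The `abstract_inverted_index.*` rows in the table above store the abstract as a map from each token to the word positions where it occurs. A short sketch of how the plain-text abstract could be rebuilt from such a map follows. The function name `rebuild_abstract` is hypothetical, and the sample dictionary holds only the first six tokens taken from the rows above, not the full index.

```python
def rebuild_abstract(inverted_index):
    """Turn a {token: [positions]} map back into running text."""
    by_position = {}
    for token, positions in inverted_index.items():
        for position in positions:
            by_position[position] = token
    # Emit tokens in ascending position order, separated by single spaces.
    return " ".join(by_position[i] for i in sorted(by_position))


# First six positions of the index listed in the table above.
sample_index = {
    "Beyond": [0],
    "the": [1],
    "Transformer,": [2],
    "it": [3],
    "is": [4],
    "important": [5],
}

if __name__ == "__main__":
    print(rebuild_abstract(sample_index))
```

Punctuation stays attached to its token (for example `Transformer,`), so joining on spaces is enough to recover the original wording.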