SegEarth-R1: Geospatial Pixel Reasoning via Large Language Model
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2504.09644
Remote sensing has become critical for understanding environmental dynamics, urban planning, and disaster management. However, traditional remote sensing workflows often rely on explicit segmentation or detection methods, which struggle to handle complex, implicit queries that require reasoning over spatial context, domain knowledge, and implicit user intent. Motivated by this, we introduce a new task, i.e., geospatial pixel reasoning, which allows implicit querying and reasoning and generates the mask of the target region. To advance this task, we construct and release the first large-scale benchmark dataset called EarthReason, which comprises 5,434 manually annotated image masks with over 30,000 implicit question-answer pairs. Moreover, we propose SegEarth-R1, a simple yet effective language-guided segmentation baseline that integrates a hierarchical visual encoder, a large language model (LLM) for instruction parsing, and a tailored mask generator for spatial correlation. The design of SegEarth-R1 incorporates domain-specific adaptations, including aggressive visual token compression to handle ultra-high-resolution remote sensing images, a description projection module to fuse language and multi-scale features, and a streamlined mask prediction pipeline that directly queries description embeddings. Extensive experiments demonstrate that SegEarth-R1 achieves state-of-the-art performance on both reasoning and referring segmentation tasks, significantly outperforming traditional and LLM-based segmentation methods. Our data and code will be released at https://github.com/earth-insights/SegEarth-R1.
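As a rough illustration of two ideas the abstract names, visual token compression and a mask predicted by querying image features with a description embedding, the toy sketch below may help. It is not the SegEarth-R1 implementation: the 2x2 average pooling, the dot-product scoring, the zero threshold, and every name in it (`avg_pool_2x2`, `mask_from_description`, `description`) are assumptions chosen only to keep the example small and dependency-free.

```python
# Toy sketch only: pooling, dot-product scoring, and all names are assumptions,
# not the method or code released with SegEarth-R1.

def avg_pool_2x2(feat):
    """Compress an H x W x C feature map by 2x2 average pooling (H and W even)."""
    h, w, c = len(feat), len(feat[0]), len(feat[0][0])
    pooled = []
    for i in range(0, h, 2):
        row = []
        for j in range(0, w, 2):
            cell = [
                (feat[i][j][k] + feat[i][j + 1][k]
                 + feat[i + 1][j][k] + feat[i + 1][j + 1][k]) / 4.0
                for k in range(c)
            ]
            row.append(cell)
        pooled.append(row)
    return pooled


def mask_from_description(feat, desc_embedding, threshold=0.0):
    """Score each location by its dot product with the description embedding,
    then threshold the score into a binary mask."""
    return [
        [1 if sum(f * d for f, d in zip(pixel, desc_embedding)) > threshold else 0
         for pixel in row]
        for row in feat
    ]


# 4 x 4 map with 2 channels: the left half activates channel 0, the right half channel 1.
features = [[[1.0, 0.0] if j < 2 else [0.0, 1.0] for j in range(4)] for _ in range(4)]

tokens = avg_pool_2x2(features)   # 2 x 2 map, i.e. 4x fewer visual tokens
description = [1.0, -1.0]         # hypothetical embedding that favours channel 0

coarse_mask = mask_from_description(tokens, description)    # [[1, 0], [1, 0]]
full_mask = mask_from_description(features, description)    # left two columns are 1
```

In this toy setting the compressed tokens and the full-resolution features agree on which side of the image matches the description, which is the intuition behind reducing the token count before the language model sees the image.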
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2504.09644, https://arxiv.org/pdf/2504.09644
- OA Status: green
- Cited By: 1
- OpenAlex ID: https://openalex.org/W4415158162
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4415158162 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2504.09644 (Digital Object Identifier)
- Title: SegEarth-R1: Geospatial Pixel Reasoning via Large Language Model (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-04-13 (full publication date if available)
- Authors: Kaiyu Li, Zepeng Xin, Pang Li, Chao Pang, Yupeng Deng, Jing Yao, Gui-Song Xia, Deyu Meng, Zhi Wang, Xiangyong Cao (list of authors in order)
- Landing page: https://arxiv.org/abs/2504.09644 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2504.09644 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2504.09644 (direct OA link when available)
- Cited by: 1 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 1 (per-year citation counts, last 5 years)
Full payload
| id | https://openalex.org/W4415158162 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2504.09644 |
| ids.doi | https://doi.org/10.48550/arxiv.2504.09644 |
| ids.openalex | https://openalex.org/W4415158162 |
| fwci | |
| type | preprint |
| title | SegEarth-R1: Geospatial Pixel Reasoning via Large Language Model |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10757 |
| topics[0].field.id | https://openalex.org/fields/33 |
| topics[0].field.display_name | Social Sciences |
| topics[0].score | 0.9567999839782715 |
| topics[0].domain.id | https://openalex.org/domains/2 |
| topics[0].domain.display_name | Social Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/3305 |
| topics[0].subfield.display_name | Geography, Planning and Development |
| topics[0].display_name | Geographic Information Systems Studies |
| topics[1].id | https://openalex.org/T10215 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9312999844551086 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Semantic Web and Ontologies |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2504.09644 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2504.09644 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2504.09644 |
| locations[1].id | doi:10.48550/arxiv.2504.09644 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2504.09644 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5078760931 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-8015-6245 |
| authorships[0].author.display_name | Kaiyu Li |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Li, Kaiyu |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5116013973 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Zepeng Xin |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Xin, Zepeng |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5101510222 |
| authorships[2].author.orcid | https://orcid.org/0000-0003-1992-2251 |
| authorships[2].author.display_name | Pang Li |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Pang, Li |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5100545207 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Chao Pang |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Pang, Chao |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5101545150 |
| authorships[4].author.orcid | https://orcid.org/0009-0008-9391-718X |
| authorships[4].author.display_name | Yupeng Deng |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Deng, Yupeng |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5107270420 |
| authorships[5].author.orcid | https://orcid.org/0000-0002-8867-3079 |
| authorships[5].author.display_name | Jing Yao |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Yao, Jing |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5073032922 |
| authorships[6].author.orcid | https://orcid.org/0000-0001-7660-6090 |
| authorships[6].author.display_name | Gui-Song Xia |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Xia, Guisong |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5091017287 |
| authorships[7].author.orcid | https://orcid.org/0000-0002-1294-8283 |
| authorships[7].author.display_name | Deyu Meng |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Meng, Deyu |
| authorships[7].is_corresponding | False |
| authorships[8].author.id | https://openalex.org/A5100376418 |
| authorships[8].author.orcid | https://orcid.org/0000-0003-1693-7183 |
| authorships[8].author.display_name | Zhi Wang |
| authorships[8].author_position | middle |
| authorships[8].raw_author_name | Wang, Zhi |
| authorships[8].is_corresponding | False |
| authorships[9].author.id | https://openalex.org/A5028103486 |
| authorships[9].author.orcid | https://orcid.org/0000-0001-7912-3457 |
| authorships[9].author.display_name | Xiangyong Cao |
| authorships[9].author_position | last |
| authorships[9].raw_author_name | Cao, Xiangyong |
| authorships[9].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2504.09644 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-14T00:00:00 |
| display_name | SegEarth-R1: Geospatial Pixel Reasoning via Large Language Model |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10757 |
| primary_topic.field.id | https://openalex.org/fields/33 |
| primary_topic.field.display_name | Social Sciences |
| primary_topic.score | 0.9567999839782715 |
| primary_topic.domain.id | https://openalex.org/domains/2 |
| primary_topic.domain.display_name | Social Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/3305 |
| primary_topic.subfield.display_name | Geography, Planning and Development |
| primary_topic.display_name | Geographic Information Systems Studies |
| cited_by_count | 1 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2504.09644 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2504.09644 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2504.09644 |
| primary_location.id | pmh:oai:arXiv.org:2504.09644 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2504.09644 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2504.09644 |
| publication_date | 2025-04-13 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 51, 104, 113, 117, 126, 151, 162 |
| abstract_inverted_index.To | 72 |
| abstract_inverted_index.at | 201 |
| abstract_inverted_index.be | 199 |
| abstract_inverted_index.by | 47 |
| abstract_inverted_index.of | 68, 135 |
| abstract_inverted_index.on | 21, 180 |
| abstract_inverted_index.or | 24 |
| abstract_inverted_index.to | 29, 145, 155 |
| abstract_inverted_index.we | 49, 76, 101 |
| abstract_inverted_index.Our | 194 |
| abstract_inverted_index.The | 133 |
| abstract_inverted_index.and | 11, 42, 62, 64, 78, 125, 158, 161, 183, 190, 196 |
| abstract_inverted_index.for | 5, 122, 130 |
| abstract_inverted_index.has | 2 |
| abstract_inverted_index.new | 52 |
| abstract_inverted_index.the | 66, 69, 80 |
| abstract_inverted_index.yet | 106 |
| abstract_inverted_index.\ie, | 54 |
| abstract_inverted_index.both | 181 |
| abstract_inverted_index.code | 197 |
| abstract_inverted_index.data | 195 |
| abstract_inverted_index.fuse | 156 |
| abstract_inverted_index.mask | 67, 128, 164 |
| abstract_inverted_index.over | 37, 95 |
| abstract_inverted_index.rely | 20 |
| abstract_inverted_index.that | 34, 111, 167, 175 |
| abstract_inverted_index.this | 74 |
| abstract_inverted_index.user | 44 |
| abstract_inverted_index.will | 198 |
| abstract_inverted_index.with | 94 |
| abstract_inverted_index.(LLM) | 121 |
| abstract_inverted_index.5,434 | 89 |
| abstract_inverted_index.first | 81 |
| abstract_inverted_index.image | 92 |
| abstract_inverted_index.large | 118 |
| abstract_inverted_index.masks | 93 |
| abstract_inverted_index.model | 120 |
| abstract_inverted_index.often | 19 |
| abstract_inverted_index.pixel | 56 |
| abstract_inverted_index.task, | 53, 75 |
| abstract_inverted_index.this, | 48 |
| abstract_inverted_index.token | 143 |
| abstract_inverted_index.urban | 9 |
| abstract_inverted_index.which | 27, 58, 87 |
| abstract_inverted_index.30,000 | 96 |
| abstract_inverted_index.Remote | 0 |
| abstract_inverted_index.allows | 59 |
| abstract_inverted_index.become | 3 |
| abstract_inverted_index.called | 85 |
| abstract_inverted_index.design | 134 |
| abstract_inverted_index.domain | 40 |
| abstract_inverted_index.handle | 30, 146 |
| abstract_inverted_index.module | 154 |
| abstract_inverted_index.pairs. | 99 |
| abstract_inverted_index.remote | 16, 148 |
| abstract_inverted_index.simple | 105 |
| abstract_inverted_index.target | 70 |
| abstract_inverted_index.tasks, | 186 |
| abstract_inverted_index.visual | 115, 142 |
| abstract_inverted_index.advance | 73 |
| abstract_inverted_index.dataset | 84 |
| abstract_inverted_index.images, | 150 |
| abstract_inverted_index.intent. | 45 |
| abstract_inverted_index.propose | 102 |
| abstract_inverted_index.queries | 33, 169 |
| abstract_inverted_index.region. | 71 |
| abstract_inverted_index.release | 79 |
| abstract_inverted_index.require | 35 |
| abstract_inverted_index.sensing | 1, 17, 149 |
| abstract_inverted_index.spatial | 38, 131 |
| abstract_inverted_index.However, | 14 |
| abstract_inverted_index.achieves | 177 |
| abstract_inverted_index.baseline | 110 |
| abstract_inverted_index.complex, | 31 |
| abstract_inverted_index.context, | 39 |
| abstract_inverted_index.critical | 4 |
| abstract_inverted_index.directly | 168 |
| abstract_inverted_index.disaster | 12 |
| abstract_inverted_index.encoder, | 116 |
| abstract_inverted_index.explicit | 22 |
| abstract_inverted_index.implicit | 32, 43, 60, 97 |
| abstract_inverted_index.language | 119, 157 |
| abstract_inverted_index.manually | 90 |
| abstract_inverted_index.methods, | 26 |
| abstract_inverted_index.methods. | 193 |
| abstract_inverted_index.parsing, | 124 |
| abstract_inverted_index.pipeline | 166 |
| abstract_inverted_index.querying | 61 |
| abstract_inverted_index.released | 200 |
| abstract_inverted_index.struggle | 28 |
| abstract_inverted_index.tailored | 127 |
| abstract_inverted_index.Extensive | 172 |
| abstract_inverted_index.LLM-based | 191 |
| abstract_inverted_index.Moreover, | 100 |
| abstract_inverted_index.Motivated | 46 |
| abstract_inverted_index.annotated | 91 |
| abstract_inverted_index.benchmark | 83 |
| abstract_inverted_index.comprises | 88 |
| abstract_inverted_index.construct | 77 |
| abstract_inverted_index.detection | 25 |
| abstract_inverted_index.dynamics, | 8 |
| abstract_inverted_index.effective | 107 |
| abstract_inverted_index.features, | 160 |
| abstract_inverted_index.generates | 65 |
| abstract_inverted_index.generator | 129 |
| abstract_inverted_index.including | 140 |
| abstract_inverted_index.introduce | 50 |
| abstract_inverted_index.planning, | 10 |
| abstract_inverted_index.reasoning | 36, 63, 182 |
| abstract_inverted_index.referring | 184 |
| abstract_inverted_index.workflows | 18 |
| abstract_inverted_index.aggressive | 141 |
| abstract_inverted_index.geospatial | 55 |
| abstract_inverted_index.integrates | 112 |
| abstract_inverted_index.knowledge, | 41 |
| abstract_inverted_index.prediction | 165 |
| abstract_inverted_index.projection | 153 |
| abstract_inverted_index.reasoning, | 57 |
| abstract_inverted_index.SegEarth-R1 | 136, 176 |
| abstract_inverted_index.compression | 144 |
| abstract_inverted_index.demonstrate | 174 |
| abstract_inverted_index.description | 152, 170 |
| abstract_inverted_index.embeddings. | 171 |
| abstract_inverted_index.experiments | 173 |
| abstract_inverted_index.instruction | 123 |
| abstract_inverted_index.large-scale | 82 |
| abstract_inverted_index.management. | 13 |
| abstract_inverted_index.multi-scale | 159 |
| abstract_inverted_index.performance | 179 |
| abstract_inverted_index.streamlined | 163 |
| abstract_inverted_index.traditional | 15, 189 |
| abstract_inverted_index.EarthReason, | 86 |
| abstract_inverted_index.SegEarth-R1, | 103 |
| abstract_inverted_index.adaptations, | 139 |
| abstract_inverted_index.correlation. | 132 |
| abstract_inverted_index.hierarchical | 114 |
| abstract_inverted_index.incorporates | 137 |
| abstract_inverted_index.segmentation | 23, 109, 185, 192 |
| abstract_inverted_index.environmental | 7 |
| abstract_inverted_index.outperforming | 188 |
| abstract_inverted_index.significantly | 187 |
| abstract_inverted_index.understanding | 6 |
| abstract_inverted_index.domain-specific | 138 |
| abstract_inverted_index.language-guided | 108 |
| abstract_inverted_index.question-answer | 98 |
| abstract_inverted_index.state-of-the-art | 178 |
| abstract_inverted_index.ultra-high-resolution | 147 |
| abstract_inverted_index.https://github.com/earth-insights/SegEarth-R1. | 202 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 10 |
| citation_normalized_percentile |
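
The `abstract_inverted_index.*` rows above store the abstract as a map from each word to the zero-based positions where it occurs. A minimal sketch of turning such a map back into running text is shown below; the function name `rebuild_abstract` is hypothetical, and the `sample` dictionary copies only positions 0 through 8 from the table rather than the full index.

```python
def rebuild_abstract(inverted_index):
    """Turn a {word: [positions]} mapping back into running text."""
    positions = {}
    for word, indexes in inverted_index.items():
        for index in indexes:
            positions[index] = word
    # Emit the words in position order, separated by single spaces.
    return " ".join(positions[i] for i in sorted(positions))


# First nine positions of the index listed in the payload table.
sample = {
    "Remote": [0],
    "sensing": [1],
    "has": [2],
    "become": [3],
    "critical": [4],
    "for": [5],
    "understanding": [6],
    "environmental": [7],
    "dynamics,": [8],
}

text = rebuild_abstract(sample)
print(text)  # Remote sensing has become critical for understanding environmental dynamics,
```

Punctuation stays attached to its word in the index (for example `dynamics,`), so joining on spaces is enough to recover the original sentence.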