GeoLLaVA: Efficient Fine-Tuned Vision-Language Models for Temporal Change Detection in Remote Sensing
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2410.19552
Detecting temporal changes in geographical landscapes is critical for applications like environmental monitoring and urban planning. While remote sensing data is abundant, existing vision-language models (VLMs) often fail to capture temporal dynamics effectively. This paper addresses these limitations by introducing an annotated dataset of video frame pairs to track evolving geographical patterns over time. Using fine-tuning techniques like Low-Rank Adaptation (LoRA), quantized LoRA (QLoRA), and model pruning on models such as Video-LLaVA and LLaVA-NeXT-Video, we significantly enhance VLM performance in processing remote sensing temporal changes. Results show significant improvements, with the best performance achieving a BERT score of 0.864 and ROUGE-1 score of 0.576, demonstrating superior accuracy in describing land-use transformations.
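The ROUGE-1 figure quoted in the abstract measures unigram overlap between a generated description and a reference description. As a rough illustration only, the sketch below computes ROUGE-1 precision, recall, and F1 under simplifying assumptions that are not taken from the paper: lowercase whitespace tokenization, no stemming, and a single reference. The function name `rouge1` and the example sentences are hypothetical, and this is not the authors' evaluation code.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Unigram-overlap ROUGE-1 (assumption: whitespace tokens, no stemming)."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each token counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())

    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    if overlap == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical change descriptions, made up for illustration.
    scores = rouge1(
        "new buildings replaced the forest",
        "the forest was replaced by new buildings",
    )
    print(scores)
```

Published ROUGE implementations usually add stemming and more careful tokenization, so scores from this sketch should not be expected to match reported values exactly.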
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2410.19552, https://arxiv.org/pdf/2410.19552
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4404312161
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4404312161 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2410.19552 (Digital Object Identifier)
- Title: GeoLLaVA: Efficient Fine-Tuned Vision-Language Models for Temporal Change Detection in Remote Sensing (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-10-25 (full publication date if available)
- Authors: Hosam Elgendy, Ahmed Sharshar, Ahmed Aboeitta, Yasser Ashraf, Mohsen Guizani (list of authors in order)
- Landing page: https://arxiv.org/abs/2410.19552 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2410.19552 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2410.19552 (direct OA link when available)
- Concepts: Change detection, Computer science, Remote sensing, Computer vision, Artificial intelligence, Geography (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4404312161 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2410.19552 |
| ids.doi | https://doi.org/10.48550/arxiv.2410.19552 |
| ids.openalex | https://openalex.org/W4404312161 |
| fwci | |
| type | preprint |
| title | GeoLLaVA: Efficient Fine-Tuned Vision-Language Models for Temporal Change Detection in Remote Sensing |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10757 |
| topics[0].field.id | https://openalex.org/fields/33 |
| topics[0].field.display_name | Social Sciences |
| topics[0].score | 0.9139999747276306 |
| topics[0].domain.id | https://openalex.org/domains/2 |
| topics[0].domain.display_name | Social Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/3305 |
| topics[0].subfield.display_name | Geography, Planning and Development |
| topics[0].display_name | Geographic Information Systems Studies |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C203595873 |
| concepts[0].level | 2 |
| concepts[0].score | 0.5891968607902527 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q25389927 |
| concepts[0].display_name | Change detection |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.5307592749595642 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C62649853 |
| concepts[2].level | 1 |
| concepts[2].score | 0.5239220261573792 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q199687 |
| concepts[2].display_name | Remote sensing |
| concepts[3].id | https://openalex.org/C31972630 |
| concepts[3].level | 1 |
| concepts[3].score | 0.4661155343055725 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q844240 |
| concepts[3].display_name | Computer vision |
| concepts[4].id | https://openalex.org/C154945302 |
| concepts[4].level | 1 |
| concepts[4].score | 0.4161761999130249 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[4].display_name | Artificial intelligence |
| concepts[5].id | https://openalex.org/C205649164 |
| concepts[5].level | 0 |
| concepts[5].score | 0.2178058922290802 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q1071 |
| concepts[5].display_name | Geography |
| keywords[0].id | https://openalex.org/keywords/change-detection |
| keywords[0].score | 0.5891968607902527 |
| keywords[0].display_name | Change detection |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.5307592749595642 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/remote-sensing |
| keywords[2].score | 0.5239220261573792 |
| keywords[2].display_name | Remote sensing |
| keywords[3].id | https://openalex.org/keywords/computer-vision |
| keywords[3].score | 0.4661155343055725 |
| keywords[3].display_name | Computer vision |
| keywords[4].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[4].score | 0.4161761999130249 |
| keywords[4].display_name | Artificial intelligence |
| keywords[5].id | https://openalex.org/keywords/geography |
| keywords[5].score | 0.2178058922290802 |
| keywords[5].display_name | Geography |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2410.19552 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2410.19552 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2410.19552 |
| locations[1].id | doi:10.48550/arxiv.2410.19552 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2410.19552 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5093124886 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Hosam Elgendy |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Elgendy, Hosam |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5027381508 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-2280-5240 |
| authorships[1].author.display_name | Ahmed Sharshar |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Sharshar, Ahmed |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5094187960 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-8194-7073 |
| authorships[2].author.display_name | Ahmed Aboeitta |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Aboeitta, Ahmed |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5073010015 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-9350-5774 |
| authorships[3].author.display_name | Yasser Ashraf |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Ashraf, Yasser |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5057916222 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-8972-8094 |
| authorships[4].author.display_name | Mohsen Guizani |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Guizani, Mohsen |
| authorships[4].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2410.19552 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | GeoLLaVA: Efficient Fine-Tuned Vision-Language Models for Temporal Change Detection in Remote Sensing |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10757 |
| primary_topic.field.id | https://openalex.org/fields/33 |
| primary_topic.field.display_name | Social Sciences |
| primary_topic.score | 0.9139999747276306 |
| primary_topic.domain.id | https://openalex.org/domains/2 |
| primary_topic.domain.display_name | Social Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/3305 |
| primary_topic.subfield.display_name | Geography, Planning and Development |
| primary_topic.display_name | Geographic Information Systems Studies |
| related_works | https://openalex.org/W2772917594, https://openalex.org/W2036807459, https://openalex.org/W2058170566, https://openalex.org/W2755342338, https://openalex.org/W2166024367, https://openalex.org/W3116076068, https://openalex.org/W2229312674, https://openalex.org/W2951359407, https://openalex.org/W2079911747, https://openalex.org/W1969923398 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2410.19552 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2410.19552 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2410.19552 |
| primary_location.id | pmh:oai:arXiv.org:2410.19552 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2410.19552 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2410.19552 |
| publication_date | 2024-10-25 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 94 |
| abstract_inverted_index.an | 40 |
| abstract_inverted_index.as | 70 |
| abstract_inverted_index.by | 38 |
| abstract_inverted_index.in | 3, 79, 107 |
| abstract_inverted_index.is | 6, 20 |
| abstract_inverted_index.of | 43, 97, 102 |
| abstract_inverted_index.on | 67 |
| abstract_inverted_index.to | 28, 47 |
| abstract_inverted_index.we | 74 |
| abstract_inverted_index.VLM | 77 |
| abstract_inverted_index.and | 13, 64, 72, 99 |
| abstract_inverted_index.for | 8 |
| abstract_inverted_index.the | 90 |
| abstract_inverted_index.BERT | 95 |
| abstract_inverted_index.LoRA | 62 |
| abstract_inverted_index.This | 33 |
| abstract_inverted_index.best | 91 |
| abstract_inverted_index.data | 19 |
| abstract_inverted_index.fail | 27 |
| abstract_inverted_index.like | 10, 57 |
| abstract_inverted_index.over | 52 |
| abstract_inverted_index.show | 86 |
| abstract_inverted_index.such | 69 |
| abstract_inverted_index.with | 89 |
| abstract_inverted_index.0.864 | 98 |
| abstract_inverted_index.Using | 54 |
| abstract_inverted_index.While | 16 |
| abstract_inverted_index.frame | 45 |
| abstract_inverted_index.model | 65 |
| abstract_inverted_index.often | 26 |
| abstract_inverted_index.pairs | 46 |
| abstract_inverted_index.paper | 34 |
| abstract_inverted_index.score | 96, 101 |
| abstract_inverted_index.these | 36 |
| abstract_inverted_index.time. | 53 |
| abstract_inverted_index.track | 48 |
| abstract_inverted_index.urban | 14 |
| abstract_inverted_index.video | 44 |
| abstract_inverted_index.(VLMs) | 25 |
| abstract_inverted_index.0.576, | 103 |
| abstract_inverted_index.models | 24, 68 |
| abstract_inverted_index.remote | 17, 81 |
| abstract_inverted_index.(LoRA), | 60 |
| abstract_inverted_index.ROUGE-1 | 100 |
| abstract_inverted_index.Results | 85 |
| abstract_inverted_index.capture | 29 |
| abstract_inverted_index.changes | 2 |
| abstract_inverted_index.dataset | 42 |
| abstract_inverted_index.enhance | 76 |
| abstract_inverted_index.pruning | 66 |
| abstract_inverted_index.sensing | 18, 82 |
| abstract_inverted_index.(QLoRA), | 63 |
| abstract_inverted_index.Low-Rank | 58 |
| abstract_inverted_index.accuracy | 106 |
| abstract_inverted_index.changes. | 84 |
| abstract_inverted_index.critical | 7 |
| abstract_inverted_index.dynamics | 31 |
| abstract_inverted_index.evolving | 49 |
| abstract_inverted_index.existing | 22 |
| abstract_inverted_index.land-use | 109 |
| abstract_inverted_index.patterns | 51 |
| abstract_inverted_index.superior | 105 |
| abstract_inverted_index.temporal | 1, 30, 83 |
| abstract_inverted_index.Detecting | 0 |
| abstract_inverted_index.abundant, | 21 |
| abstract_inverted_index.achieving | 93 |
| abstract_inverted_index.addresses | 35 |
| abstract_inverted_index.annotated | 41 |
| abstract_inverted_index.planning. | 15 |
| abstract_inverted_index.quantized | 61 |
| abstract_inverted_index.Adaptation | 59 |
| abstract_inverted_index.describing | 108 |
| abstract_inverted_index.landscapes | 5 |
| abstract_inverted_index.monitoring | 12 |
| abstract_inverted_index.processing | 80 |
| abstract_inverted_index.techniques | 56 |
| abstract_inverted_index.Video-LLaVA | 71 |
| abstract_inverted_index.fine-tuning | 55 |
| abstract_inverted_index.introducing | 39 |
| abstract_inverted_index.limitations | 37 |
| abstract_inverted_index.performance | 78, 92 |
| abstract_inverted_index.significant | 87 |
| abstract_inverted_index.applications | 9 |
| abstract_inverted_index.effectively. | 32 |
| abstract_inverted_index.geographical | 4, 50 |
| abstract_inverted_index.demonstrating | 104 |
| abstract_inverted_index.environmental | 11 |
| abstract_inverted_index.improvements, | 88 |
| abstract_inverted_index.significantly | 75 |
| abstract_inverted_index.vision-language | 23 |
| abstract_inverted_index.transformations. | 110 |
| abstract_inverted_index.LLaVA-NeXT-Video, | 73 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile | |
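
The `abstract_inverted_index.*` rows in the payload store the abstract as an inverted index: each token maps to the word positions at which it occurs. A minimal sketch of how the running text can be recovered from such a mapping is given below. The helper name `rebuild_abstract` is hypothetical, and the sample dictionary is a hand-copied subset of the rows above, truncated to positions 0 through 7 so that the example stays short.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild text from a token -> positions mapping."""
    # Invert the mapping so that each position points at its token.
    by_position: dict[int, str] = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token

    # Emit tokens in ascending position order, separated by spaces.
    return " ".join(by_position[pos] for pos in sorted(by_position))


if __name__ == "__main__":
    # Subset of the payload rows, limited to positions 0-7 (assumption:
    # later positions of repeated tokens are dropped for brevity).
    sample = {
        "Detecting": [0],
        "temporal": [1],
        "changes": [2],
        "in": [3],
        "geographical": [4],
        "landscapes": [5],
        "is": [6],
        "critical": [7],
    }
    print(rebuild_abstract(sample))
```

Applied to the full set of rows, the same procedure should reproduce the abstract shown near the top of this record, since punctuation is kept attached to the tokens (for example `time.` and `0.576,`).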