MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficient Mobile Task Automation
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2410.13757
Existing Multimodal Large Language Model (MLLM)-based agents face significant challenges in handling complex GUI (Graphical User Interface) interactions on devices. These challenges arise from the dynamic and structured nature of GUI environments, which integrate text, images, and spatial relationships, as well as the variability in action spaces across different pages and tasks. To address these limitations, we propose MobA, a novel MLLM-based mobile assistant system. MobA introduces an adaptive planning module that incorporates a reflection mechanism for error recovery and dynamically adjusts plans to align with the real environment contexts and action module's execution capacity. Additionally, a multifaceted memory module provides comprehensive memory support to enhance adaptability and efficiency. We also present MobBench, a dataset designed for complex mobile interactions. Experimental results on MobBench and AndroidArena demonstrate MobA's ability to handle dynamic GUI environments and perform complex mobile tasks.
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2410.13757, https://arxiv.org/pdf/2410.13757
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4403580074
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4403580074 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2410.13757 (Digital Object Identifier)
- Title: MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficient Mobile Task Automation (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-10-17 (full publication date if available)
- Authors: Zichen Zhu, Hao Tang, Y.F. Li, Kunyao Lan, Yixuan Jiang, Hao Zhou, Yixiao Wang, Situo Zhang, Liangtai Sun, Lu Chen, Kai Yu (list of authors in order)
- Landing page: https://arxiv.org/abs/2410.13757 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2410.13757 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2410.13757 (direct OA link when available)
- Concepts: Task (project management), Automation, Computer science, Human–computer interaction, Engineering, Systems engineering, Mechanical engineering (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4403580074 |
| doi | https://doi.org/10.48550/arxiv.2410.13757 |
| ids.doi | https://doi.org/10.48550/arxiv.2410.13757 |
| ids.openalex | https://openalex.org/W4403580074 |
| fwci | |
| type | preprint |
| title | MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficient Mobile Task Automation |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T12203 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9868999719619751 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1705 |
| topics[0].subfield.display_name | Computer Networks and Communications |
| topics[0].display_name | Mobile Agent-Based Network Management |
| topics[1].id | https://openalex.org/T12288 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9164000153541565 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1705 |
| topics[1].subfield.display_name | Computer Networks and Communications |
| topics[1].display_name | Optimization and Search Problems |
| topics[2].id | https://openalex.org/T10456 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.907800018787384 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Multi-Agent Systems and Negotiation |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C2780451532 |
| concepts[0].level | 2 |
| concepts[0].score | 0.674557089805603 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q759676 |
| concepts[0].display_name | Task (project management) |
| concepts[1].id | https://openalex.org/C115901376 |
| concepts[1].level | 2 |
| concepts[1].score | 0.6063978672027588 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q184199 |
| concepts[1].display_name | Automation |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.5833155512809753 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C107457646 |
| concepts[3].level | 1 |
| concepts[3].score | 0.4097844064235687 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q207434 |
| concepts[3].display_name | Human–computer interaction |
| concepts[4].id | https://openalex.org/C127413603 |
| concepts[4].level | 0 |
| concepts[4].score | 0.19525879621505737 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q11023 |
| concepts[4].display_name | Engineering |
| concepts[5].id | https://openalex.org/C201995342 |
| concepts[5].level | 1 |
| concepts[5].score | 0.12668541073799133 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q682496 |
| concepts[5].display_name | Systems engineering |
| concepts[6].id | https://openalex.org/C78519656 |
| concepts[6].level | 1 |
| concepts[6].score | 0.0 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q101333 |
| concepts[6].display_name | Mechanical engineering |
| keywords[0].id | https://openalex.org/keywords/task |
| keywords[0].score | 0.674557089805603 |
| keywords[0].display_name | Task (project management) |
| keywords[1].id | https://openalex.org/keywords/automation |
| keywords[1].score | 0.6063978672027588 |
| keywords[1].display_name | Automation |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.5833155512809753 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/human–computer-interaction |
| keywords[3].score | 0.4097844064235687 |
| keywords[3].display_name | Human–computer interaction |
| keywords[4].id | https://openalex.org/keywords/engineering |
| keywords[4].score | 0.19525879621505737 |
| keywords[4].display_name | Engineering |
| keywords[5].id | https://openalex.org/keywords/systems-engineering |
| keywords[5].score | 0.12668541073799133 |
| keywords[5].display_name | Systems engineering |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2410.13757 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2410.13757 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2410.13757 |
| locations[1].id | doi:10.48550/arxiv.2410.13757 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2410.13757 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5002718343 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-9197-4649 |
| authorships[0].author.display_name | Zichen Zhu |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Zhu, Zichen |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5025097010 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-9409-5220 |
| authorships[1].author.display_name | Hao Tang |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Tang, Hao |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5066886597 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Y.F. Li |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Li, Yansi |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5095762232 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Kunyao Lan |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Lan, Kunyao |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5101193401 |
| authorships[4].author.orcid | |
| authorships[4].author.display_name | Yixuan Jiang |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Jiang, Yixuan |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5062361637 |
| authorships[5].author.orcid | https://orcid.org/0000-0002-3150-7029 |
| authorships[5].author.display_name | Hao Zhou |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Zhou, Hao |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5100651552 |
| authorships[6].author.orcid | https://orcid.org/0009-0009-6672-3395 |
| authorships[6].author.display_name | Yixiao Wang |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Wang, Yixiao |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5108750212 |
| authorships[7].author.orcid | |
| authorships[7].author.display_name | Situo Zhang |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Zhang, Situo |
| authorships[7].is_corresponding | False |
| authorships[8].author.id | https://openalex.org/A5015801419 |
| authorships[8].author.orcid | |
| authorships[8].author.display_name | Liangtai Sun |
| authorships[8].author_position | middle |
| authorships[8].raw_author_name | Sun, Liangtai |
| authorships[8].is_corresponding | False |
| authorships[9].author.id | https://openalex.org/A5100432093 |
| authorships[9].author.orcid | https://orcid.org/0000-0002-5685-7017 |
| authorships[9].author.display_name | Lu Chen |
| authorships[9].author_position | middle |
| authorships[9].raw_author_name | Chen, Lu |
| authorships[9].is_corresponding | False |
| authorships[10].author.id | https://openalex.org/A5043098653 |
| authorships[10].author.orcid | https://orcid.org/0000-0002-7102-9826 |
| authorships[10].author.display_name | Kai Yu |
| authorships[10].author_position | last |
| authorships[10].raw_author_name | Yu, Kai |
| authorships[10].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2410.13757 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2024-10-21T00:00:00 |
| display_name | MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficient Mobile Task Automation |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-12T23:11:45.498971 |
| primary_topic.id | https://openalex.org/T12203 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9868999719619751 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1705 |
| primary_topic.subfield.display_name | Computer Networks and Communications |
| primary_topic.display_name | Mobile Agent-Based Network Management |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W4391913857, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2376932109, https://openalex.org/W3196817267, https://openalex.org/W1976600725 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2410.13757 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2410.13757 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2410.13757 |
| primary_location.id | pmh:oai:arXiv.org:2410.13757 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2410.13757 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2410.13757 |
| publication_date | 2024-10-17 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 59, 73, 96, 113 |
| abstract_inverted_index.To | 52 |
| abstract_inverted_index.We | 109 |
| abstract_inverted_index.an | 67 |
| abstract_inverted_index.as | 39, 41 |
| abstract_inverted_index.in | 10, 44 |
| abstract_inverted_index.of | 29 |
| abstract_inverted_index.on | 18, 122 |
| abstract_inverted_index.to | 83, 104, 129 |
| abstract_inverted_index.we | 56 |
| abstract_inverted_index.GUI | 13, 30, 132 |
| abstract_inverted_index.and | 26, 36, 50, 79, 90, 107, 124, 134 |
| abstract_inverted_index.for | 76, 116 |
| abstract_inverted_index.the | 24, 42, 86 |
| abstract_inverted_index.MobA | 65 |
| abstract_inverted_index.User | 15 |
| abstract_inverted_index.also | 110 |
| abstract_inverted_index.face | 7 |
| abstract_inverted_index.from | 23 |
| abstract_inverted_index.real | 87 |
| abstract_inverted_index.that | 71 |
| abstract_inverted_index.well | 40 |
| abstract_inverted_index.with | 85 |
| abstract_inverted_index.Large | 2 |
| abstract_inverted_index.MobA, | 58 |
| abstract_inverted_index.Model | 4 |
| abstract_inverted_index.These | 20 |
| abstract_inverted_index.align | 84 |
| abstract_inverted_index.arise | 22 |
| abstract_inverted_index.error | 77 |
| abstract_inverted_index.novel | 60 |
| abstract_inverted_index.pages | 49 |
| abstract_inverted_index.plans | 82 |
| abstract_inverted_index.text, | 34 |
| abstract_inverted_index.these | 54 |
| abstract_inverted_index.which | 32 |
| abstract_inverted_index.MobA's | 127 |
| abstract_inverted_index.across | 47 |
| abstract_inverted_index.action | 45, 91 |
| abstract_inverted_index.agents | 6 |
| abstract_inverted_index.handle | 130 |
| abstract_inverted_index.memory | 98, 102 |
| abstract_inverted_index.mobile | 62, 118, 137 |
| abstract_inverted_index.module | 70, 99 |
| abstract_inverted_index.nature | 28 |
| abstract_inverted_index.spaces | 46 |
| abstract_inverted_index.tasks. | 51, 138 |
| abstract_inverted_index.ability | 128 |
| abstract_inverted_index.address | 53 |
| abstract_inverted_index.adjusts | 81 |
| abstract_inverted_index.complex | 12, 117, 136 |
| abstract_inverted_index.dataset | 114 |
| abstract_inverted_index.dynamic | 25, 131 |
| abstract_inverted_index.enhance | 105 |
| abstract_inverted_index.images, | 35 |
| abstract_inverted_index.perform | 135 |
| abstract_inverted_index.present | 111 |
| abstract_inverted_index.propose | 57 |
| abstract_inverted_index.results | 121 |
| abstract_inverted_index.spatial | 37 |
| abstract_inverted_index.support | 103 |
| abstract_inverted_index.system. | 64 |
| abstract_inverted_index.Existing | 0 |
| abstract_inverted_index.Language | 3 |
| abstract_inverted_index.MobBench | 123 |
| abstract_inverted_index.adaptive | 68 |
| abstract_inverted_index.contexts | 89 |
| abstract_inverted_index.designed | 115 |
| abstract_inverted_index.devices. | 19 |
| abstract_inverted_index.handling | 11 |
| abstract_inverted_index.module's | 92 |
| abstract_inverted_index.planning | 69 |
| abstract_inverted_index.provides | 100 |
| abstract_inverted_index.recovery | 78 |
| abstract_inverted_index.MobBench, | 112 |
| abstract_inverted_index.assistant | 63 |
| abstract_inverted_index.capacity. | 94 |
| abstract_inverted_index.different | 48 |
| abstract_inverted_index.execution | 93 |
| abstract_inverted_index.integrate | 33 |
| abstract_inverted_index.mechanism | 75 |
| abstract_inverted_index.(Graphical | 14 |
| abstract_inverted_index.Interface) | 16 |
| abstract_inverted_index.MLLM-based | 61 |
| abstract_inverted_index.Multimodal | 1 |
| abstract_inverted_index.challenges | 9, 21 |
| abstract_inverted_index.introduces | 66 |
| abstract_inverted_index.reflection | 74 |
| abstract_inverted_index.structured | 27 |
| abstract_inverted_index.demonstrate | 126 |
| abstract_inverted_index.dynamically | 80 |
| abstract_inverted_index.efficiency. | 108 |
| abstract_inverted_index.environment | 88 |
| abstract_inverted_index.significant | 8 |
| abstract_inverted_index.variability | 43 |
| abstract_inverted_index.(MLLM)-based | 5 |
| abstract_inverted_index.AndroidArena | 125 |
| abstract_inverted_index.Experimental | 120 |
| abstract_inverted_index.adaptability | 106 |
| abstract_inverted_index.environments | 133 |
| abstract_inverted_index.incorporates | 72 |
| abstract_inverted_index.interactions | 17 |
| abstract_inverted_index.limitations, | 55 |
| abstract_inverted_index.multifaceted | 97 |
| abstract_inverted_index.Additionally, | 95 |
| abstract_inverted_index.comprehensive | 101 |
| abstract_inverted_index.environments, | 31 |
| abstract_inverted_index.interactions. | 119 |
| abstract_inverted_index.relationships, | 38 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 11 |
| citation_normalized_percentile |
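
The `abstract_inverted_index.*` rows above store the abstract as a mapping from each token to the zero-based positions where it occurs, so the plain text can be recovered by sorting tokens by position. The following is a minimal sketch of that reconstruction, not an official OpenAlex utility: the function name `rebuild_abstract` is hypothetical, and the `sample` dictionary copies only the first eight tokens from the table, assuming each position list has already been parsed into integers.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild plain text from a token -> positions mapping (hypothetical helper)."""
    # Flatten the mapping into (position, token) pairs, one per occurrence.
    positioned = [
        (pos, token)
        for token, positions in inverted_index.items()
        for pos in positions
    ]
    # Sorting by position restores the original word order.
    positioned.sort()
    return " ".join(token for _, token in positioned)


# First eight tokens of this record's abstract, taken from the table above.
sample = {
    "Existing": [0],
    "Multimodal": [1],
    "Large": [2],
    "Language": [3],
    "Model": [4],
    "(MLLM)-based": [5],
    "agents": [6],
    "face": [7],
}

print(rebuild_abstract(sample))
# prints: Existing Multimodal Large Language Model (MLLM)-based agents face
```

Applied to the full set of rows, the same function should yield the abstract shown at the top of this record. Tokens keep their attached punctuation (for example `tasks.` and `MobA,`), so joining on single spaces is enough.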