Mobile-Agent-V: A Video-Guided Approach for Effortless and Efficient Operational Knowledge Injection in Mobile Automation
2025 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2505.13887
The exponential rise in mobile device usage necessitates streamlined automation for effective task management, yet many AI frameworks fall short due to inadequate operational expertise. While manually written knowledge can bridge this gap, it is often burdensome and inefficient. We introduce Mobile-Agent-V, an innovative framework that utilizes video as a guiding tool to effortlessly and efficiently inject operational knowledge into mobile automation processes. By deriving knowledge directly from video content, Mobile-Agent-V eliminates manual intervention, significantly reducing the effort and time required for knowledge acquisition. To rigorously evaluate this approach, we propose Mobile-Knowledge, a benchmark tailored to assess the impact of external knowledge on mobile agent performance. Our experimental findings demonstrate that Mobile-Agent-V enhances performance by 36% compared to existing methods, underscoring its effortless and efficient advantages in mobile automation.
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2505.13887, https://arxiv.org/pdf/2505.13887
- OA Status: green
- OpenAlex ID: https://openalex.org/W4415320129
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4415320129 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2505.13887 (Digital Object Identifier)
- Title: Mobile-Agent-V: A Video-Guided Approach for Effortless and Efficient Operational Knowledge Injection in Mobile Automation (Work title)
- Type: preprint (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2025 (Year of publication)
- Publication date: 2025-05-20 (Full publication date if available)
- Authors: Junyang Wang, Haiyang Xu, Xi Zhang, Ming Yan, Ji Zhang, Fei Huang, Jitao Sang (List of authors in order)
- Landing page: https://arxiv.org/abs/2505.13887 (Publisher landing page)
- PDF URL: https://arxiv.org/pdf/2505.13887 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2505.13887 (Direct OA link when available)
- Cited by: 0 (Total citation count in OpenAlex)
Full payload
| id | https://openalex.org/W4415320129 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2505.13887 |
| ids.doi | https://doi.org/10.48550/arxiv.2505.13887 |
| ids.openalex | https://openalex.org/W4415320129 |
| fwci | |
| type | preprint |
| title | Mobile-Agent-V: A Video-Guided Approach for Effortless and Efficient Operational Knowledge Injection in Mobile Automation |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10456 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9811999797821045 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Multi-Agent Systems and Negotiation |
| topics[1].id | https://openalex.org/T12203 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9769999980926514 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1705 |
| topics[1].subfield.display_name | Computer Networks and Communications |
| topics[1].display_name | Mobile Agent-Based Network Management |
| topics[2].id | https://openalex.org/T13382 |
| topics[2].field.id | https://openalex.org/fields/22 |
| topics[2].field.display_name | Engineering |
| topics[2].score | 0.9376999735832214 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/2207 |
| topics[2].subfield.display_name | Control and Systems Engineering |
| topics[2].display_name | Robotics and Automated Systems |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2505.13887 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2505.13887 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2505.13887 |
| locations[1].id | doi:10.48550/arxiv.2505.13887 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2505.13887 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5018357502 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-3204-6607 |
| authorships[0].author.display_name | Junyang Wang |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Wang, Junyang |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5038379561 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-4486-0590 |
| authorships[1].author.display_name | Haiyang Xu |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Xu, Haiyang |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5100430858 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-8994-8185 |
| authorships[2].author.display_name | Xi Zhang |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Zhang, Xi |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5000844861 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-4388-6708 |
| authorships[3].author.display_name | Ming Yan |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Yan, Ming |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5100329266 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-3835-7975 |
| authorships[4].author.display_name | Ji Zhang |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Zhang, Ji |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5101488344 |
| authorships[5].author.orcid | https://orcid.org/0000-0002-3709-5053 |
| authorships[5].author.display_name | Fei Huang |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Huang, Fei |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5023834030 |
| authorships[6].author.orcid | https://orcid.org/0000-0002-0699-3205 |
| authorships[6].author.display_name | Jitao Sang |
| authorships[6].author_position | last |
| authorships[6].raw_author_name | Sang, Jitao |
| authorships[6].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2505.13887 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-18T00:00:00 |
| display_name | Mobile-Agent-V: A Video-Guided Approach for Effortless and Efficient Operational Knowledge Injection in Mobile Automation |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10456 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9811999797821045 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Multi-Agent Systems and Negotiation |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2505.13887 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2505.13887 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2505.13887 |
| primary_location.id | pmh:oai:arXiv.org:2505.13887 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2505.13887 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2505.13887 |
| publication_date | 2025-05-20 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 49, 92 |
| abstract_inverted_index.AI | 16 |
| abstract_inverted_index.By | 63 |
| abstract_inverted_index.To | 84 |
| abstract_inverted_index.We | 39 |
| abstract_inverted_index.an | 42 |
| abstract_inverted_index.as | 48 |
| abstract_inverted_index.by | 114 |
| abstract_inverted_index.in | 3, 126 |
| abstract_inverted_index.is | 34 |
| abstract_inverted_index.it | 33 |
| abstract_inverted_index.of | 99 |
| abstract_inverted_index.on | 102 |
| abstract_inverted_index.to | 21, 52, 95, 117 |
| abstract_inverted_index.we | 89 |
| abstract_inverted_index.36% | 115 |
| abstract_inverted_index.Our | 106 |
| abstract_inverted_index.The | 0 |
| abstract_inverted_index.and | 37, 54, 78, 123 |
| abstract_inverted_index.can | 29 |
| abstract_inverted_index.due | 20 |
| abstract_inverted_index.for | 10, 81 |
| abstract_inverted_index.its | 121 |
| abstract_inverted_index.the | 76, 97 |
| abstract_inverted_index.yet | 14 |
| abstract_inverted_index.fall | 18 |
| abstract_inverted_index.from | 67 |
| abstract_inverted_index.gap, | 32 |
| abstract_inverted_index.into | 59 |
| abstract_inverted_index.many | 15 |
| abstract_inverted_index.rise | 2 |
| abstract_inverted_index.task | 12 |
| abstract_inverted_index.that | 45, 110 |
| abstract_inverted_index.this | 31, 87 |
| abstract_inverted_index.time | 79 |
| abstract_inverted_index.tool | 51 |
| abstract_inverted_index.While | 25 |
| abstract_inverted_index.agent | 104 |
| abstract_inverted_index.often | 35 |
| abstract_inverted_index.short | 19 |
| abstract_inverted_index.usage | 6 |
| abstract_inverted_index.video | 47, 68 |
| abstract_inverted_index.assess | 96 |
| abstract_inverted_index.bridge | 30 |
| abstract_inverted_index.device | 5 |
| abstract_inverted_index.effort | 77 |
| abstract_inverted_index.impact | 98 |
| abstract_inverted_index.inject | 56 |
| abstract_inverted_index.manual | 72 |
| abstract_inverted_index.mobile | 4, 60, 103, 127 |
| abstract_inverted_index.guiding | 50 |
| abstract_inverted_index.propose | 90 |
| abstract_inverted_index.written | 27 |
| abstract_inverted_index.compared | 116 |
| abstract_inverted_index.content, | 69 |
| abstract_inverted_index.deriving | 64 |
| abstract_inverted_index.directly | 66 |
| abstract_inverted_index.enhances | 112 |
| abstract_inverted_index.evaluate | 86 |
| abstract_inverted_index.existing | 118 |
| abstract_inverted_index.external | 100 |
| abstract_inverted_index.findings | 108 |
| abstract_inverted_index.manually | 26 |
| abstract_inverted_index.methods, | 119 |
| abstract_inverted_index.reducing | 75 |
| abstract_inverted_index.required | 80 |
| abstract_inverted_index.tailored | 94 |
| abstract_inverted_index.utilizes | 46 |
| abstract_inverted_index.approach, | 88 |
| abstract_inverted_index.benchmark | 93 |
| abstract_inverted_index.effective | 11 |
| abstract_inverted_index.efficient | 124 |
| abstract_inverted_index.framework | 44 |
| abstract_inverted_index.introduce | 40 |
| abstract_inverted_index.knowledge | 28, 58, 65, 82, 101 |
| abstract_inverted_index.advantages | 125 |
| abstract_inverted_index.automation | 9, 61 |
| abstract_inverted_index.burdensome | 36 |
| abstract_inverted_index.effortless | 122 |
| abstract_inverted_index.eliminates | 71 |
| abstract_inverted_index.expertise. | 24 |
| abstract_inverted_index.frameworks | 17 |
| abstract_inverted_index.inadequate | 22 |
| abstract_inverted_index.innovative | 43 |
| abstract_inverted_index.processes. | 62 |
| abstract_inverted_index.rigorously | 85 |
| abstract_inverted_index.automation. | 128 |
| abstract_inverted_index.demonstrate | 109 |
| abstract_inverted_index.efficiently | 55 |
| abstract_inverted_index.exponential | 1 |
| abstract_inverted_index.management, | 13 |
| abstract_inverted_index.operational | 23, 57 |
| abstract_inverted_index.performance | 113 |
| abstract_inverted_index.streamlined | 8 |
| abstract_inverted_index.acquisition. | 83 |
| abstract_inverted_index.effortlessly | 53 |
| abstract_inverted_index.experimental | 107 |
| abstract_inverted_index.inefficient. | 38 |
| abstract_inverted_index.necessitates | 7 |
| abstract_inverted_index.performance. | 105 |
| abstract_inverted_index.underscoring | 120 |
| abstract_inverted_index.intervention, | 73 |
| abstract_inverted_index.significantly | 74 |
| abstract_inverted_index.Mobile-Agent-V | 70, 111 |
| abstract_inverted_index.Mobile-Agent-V, | 41 |
| abstract_inverted_index.Mobile-Knowledge, | 91 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 7 |
| citation_normalized_percentile | |
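
The `abstract_inverted_index.*` rows in the payload map each token of the abstract to the zero-based positions where it occurs. The following is a minimal sketch of how that mapping can be turned back into readable text. It assumes the index is available as a Python dict of token to list of integer positions (the table above shows the positions as comma-separated text instead). The function name `rebuild_abstract` and the `sample` excerpt are illustrative, not part of OpenAlex.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild text from a token -> positions mapping.

    Assumes every position holds exactly one token; if two tokens
    claim the same position, the one seen last wins.
    """
    tokens_by_position: dict[int, str] = {}
    for token, positions in inverted_index.items():
        for position in positions:
            tokens_by_position[position] = token
    # Emit tokens in ascending position order, separated by single spaces.
    return " ".join(tokens_by_position[p] for p in sorted(tokens_by_position))


# Hand-copied excerpt of the payload, restricted to positions 0-6.
# Later occurrences of "in" and "mobile" are deliberately left out here.
sample = {
    "The": [0],
    "exponential": [1],
    "rise": [2],
    "in": [3],
    "mobile": [4],
    "device": [5],
    "usage": [6],
}

print(rebuild_abstract(sample))
```

Applied to the complete index rather than this excerpt, the same approach should yield the abstract shown at the top of the record, since the positions run from 0 to 128 without gaps.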