Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2501.05020
Motion-controllable image animation is a fundamental task with a wide range of potential applications. Recent works have made progress in controlling camera or object motion via various motion representations, while they still struggle to support collaborative camera and object motion control with adaptive control granularity. To this end, we introduce 3D-aware motion representation and propose an image animation framework, called Perception-as-Control, to achieve fine-grained collaborative motion control. Specifically, we construct 3D-aware motion representation from a reference image, manipulate it based on interpreted user instructions, and perceive it from different viewpoints. In this way, camera and object motions are transformed into intuitive and consistent visual changes. Then, our framework leverages the perception results as motion control signals, enabling it to support various motion-related video synthesis tasks in a unified and flexible way. Experiments demonstrate the superiority of the proposed approach. For more details and qualitative results, please refer to our anonymous project webpage: https://chen-yingjie.github.io/projects/Perception-as-Control.
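The construct, manipulate, and perceive steps described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a plain pinhole camera with no rotation, and the names `Point3D`, `translate`, and `perceive` are hypothetical helpers introduced only for illustration. It shows how both an object translation and a camera move end up as 2D changes in the projected points, which is the kind of signal the abstract describes using for motion control.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point3D:
    """A point of a toy 3D scene (hypothetical stand-in for a 3D-aware representation)."""
    x: float
    y: float
    z: float


def translate(points, dx, dy, dz):
    """Object motion: shift every point of an object by (dx, dy, dz)."""
    return [Point3D(p.x + dx, p.y + dy, p.z + dz) for p in points]


def perceive(points, cam_x=0.0, cam_y=0.0, cam_z=0.0, focal=1.0):
    """Project points with an assumed pinhole camera looking down +z (no rotation)."""
    projected = []
    for p in points:
        depth = p.z - cam_z
        if depth <= 0:
            raise ValueError("point is behind the camera")
        projected.append((focal * (p.x - cam_x) / depth,
                          focal * (p.y - cam_y) / depth))
    return projected


# Two points of one object, 4 units in front of the initial camera.
scene_object = [Point3D(0.0, 0.0, 4.0), Point3D(1.0, 0.0, 4.0)]

before = perceive(scene_object)
# Object motion only: move the object 1 unit along x.
after_object = perceive(translate(scene_object, 1.0, 0.0, 0.0))
# Camera motion only: dolly the camera 2 units forward along z.
after_camera = perceive(scene_object, cam_z=2.0)

print(before, after_object, after_camera)
```

In this sketch the two kinds of motion are applied to the same scene and observed through the same projection, so they can be combined freely before projecting.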
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2501.05020, https://arxiv.org/pdf/2501.05020
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4406272370
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4406272370 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2501.05020 (Digital Object Identifier)
- Title: Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-01-09 (full publication date if available)
- Authors: Yingjie Chen, Yifang Men, Yuan Yao, Miaomiao Cui, Liefeng Bo (list of authors in order)
- Landing page: https://arxiv.org/abs/2501.05020 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2501.05020 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2501.05020 (direct OA link when available)
- Concepts: Representation (politics), Animation, Perception, Computer science, Computer vision, Motion (physics), Image (mathematics), Computer graphics (images), Artificial intelligence, Control (management), Psychology, Political science, Neuroscience, Law, Politics (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4406272370 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2501.05020 |
| ids.doi | https://doi.org/10.48550/arxiv.2501.05020 |
| ids.openalex | https://openalex.org/W4406272370 |
| fwci | |
| type | preprint |
| title | Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10481 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9962000250816345 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1704 |
| topics[0].subfield.display_name | Computer Graphics and Computer-Aided Design |
| topics[0].display_name | Computer Graphics and Visualization Techniques |
| topics[1].id | https://openalex.org/T10531 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9878000020980835 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1707 |
| topics[1].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[1].display_name | Advanced Vision and Imaging |
| topics[2].id | https://openalex.org/T10719 |
| topics[2].field.id | https://openalex.org/fields/22 |
| topics[2].field.display_name | Engineering |
| topics[2].score | 0.9822999835014343 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/2206 |
| topics[2].subfield.display_name | Computational Mechanics |
| topics[2].display_name | 3D Shape Modeling and Analysis |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C2776359362 |
| concepts[0].level | 3 |
| concepts[0].score | 0.787050187587738 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q2145286 |
| concepts[0].display_name | Representation (politics) |
| concepts[1].id | https://openalex.org/C502989409 |
| concepts[1].level | 2 |
| concepts[1].score | 0.7593898773193359 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q11425 |
| concepts[1].display_name | Animation |
| concepts[2].id | https://openalex.org/C26760741 |
| concepts[2].level | 2 |
| concepts[2].score | 0.734490692615509 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q160402 |
| concepts[2].display_name | Perception |
| concepts[3].id | https://openalex.org/C41008148 |
| concepts[3].level | 0 |
| concepts[3].score | 0.6066927909851074 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[3].display_name | Computer science |
| concepts[4].id | https://openalex.org/C31972630 |
| concepts[4].level | 1 |
| concepts[4].score | 0.570254921913147 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q844240 |
| concepts[4].display_name | Computer vision |
| concepts[5].id | https://openalex.org/C104114177 |
| concepts[5].level | 2 |
| concepts[5].score | 0.5554514527320862 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q79782 |
| concepts[5].display_name | Motion (physics) |
| concepts[6].id | https://openalex.org/C115961682 |
| concepts[6].level | 2 |
| concepts[6].score | 0.5550951361656189 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q860623 |
| concepts[6].display_name | Image (mathematics) |
| concepts[7].id | https://openalex.org/C121684516 |
| concepts[7].level | 1 |
| concepts[7].score | 0.5292341709136963 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q7600677 |
| concepts[7].display_name | Computer graphics (images) |
| concepts[8].id | https://openalex.org/C154945302 |
| concepts[8].level | 1 |
| concepts[8].score | 0.5019469261169434 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[8].display_name | Artificial intelligence |
| concepts[9].id | https://openalex.org/C2775924081 |
| concepts[9].level | 2 |
| concepts[9].score | 0.45250874757766724 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q55608371 |
| concepts[9].display_name | Control (management) |
| concepts[10].id | https://openalex.org/C15744967 |
| concepts[10].level | 0 |
| concepts[10].score | 0.19791319966316223 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q9418 |
| concepts[10].display_name | Psychology |
| concepts[11].id | https://openalex.org/C17744445 |
| concepts[11].level | 0 |
| concepts[11].score | 0.05840221047401428 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q36442 |
| concepts[11].display_name | Political science |
| concepts[12].id | https://openalex.org/C169760540 |
| concepts[12].level | 1 |
| concepts[12].score | 0.0 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q207011 |
| concepts[12].display_name | Neuroscience |
| concepts[13].id | https://openalex.org/C199539241 |
| concepts[13].level | 1 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q7748 |
| concepts[13].display_name | Law |
| concepts[14].id | https://openalex.org/C94625758 |
| concepts[14].level | 2 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q7163 |
| concepts[14].display_name | Politics |
| keywords[0].id | https://openalex.org/keywords/representation |
| keywords[0].score | 0.787050187587738 |
| keywords[0].display_name | Representation (politics) |
| keywords[1].id | https://openalex.org/keywords/animation |
| keywords[1].score | 0.7593898773193359 |
| keywords[1].display_name | Animation |
| keywords[2].id | https://openalex.org/keywords/perception |
| keywords[2].score | 0.734490692615509 |
| keywords[2].display_name | Perception |
| keywords[3].id | https://openalex.org/keywords/computer-science |
| keywords[3].score | 0.6066927909851074 |
| keywords[3].display_name | Computer science |
| keywords[4].id | https://openalex.org/keywords/computer-vision |
| keywords[4].score | 0.570254921913147 |
| keywords[4].display_name | Computer vision |
| keywords[5].id | https://openalex.org/keywords/motion |
| keywords[5].score | 0.5554514527320862 |
| keywords[5].display_name | Motion (physics) |
| keywords[6].id | https://openalex.org/keywords/image |
| keywords[6].score | 0.5550951361656189 |
| keywords[6].display_name | Image (mathematics) |
| keywords[7].id | https://openalex.org/keywords/computer-graphics |
| keywords[7].score | 0.5292341709136963 |
| keywords[7].display_name | Computer graphics (images) |
| keywords[8].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[8].score | 0.5019469261169434 |
| keywords[8].display_name | Artificial intelligence |
| keywords[9].id | https://openalex.org/keywords/control |
| keywords[9].score | 0.45250874757766724 |
| keywords[9].display_name | Control (management) |
| keywords[10].id | https://openalex.org/keywords/psychology |
| keywords[10].score | 0.19791319966316223 |
| keywords[10].display_name | Psychology |
| keywords[11].id | https://openalex.org/keywords/political-science |
| keywords[11].score | 0.05840221047401428 |
| keywords[11].display_name | Political science |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2501.05020 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2501.05020 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2501.05020 |
| locations[1].id | doi:10.48550/arxiv.2501.05020 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2501.05020 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5101769380 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-2361-3291 |
| authorships[0].author.display_name | Yingjie Chen |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Chen, Yingjie |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5036805135 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-2495-2869 |
| authorships[1].author.display_name | Yifang Men |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Men, Yifang |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5103072099 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-3616-2496 |
| authorships[2].author.display_name | Yuan Yao |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Yao, Yuan |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5101564882 |
| authorships[3].author.orcid | https://orcid.org/0000-0001-5546-3967 |
| authorships[3].author.display_name | Miaomiao Cui |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Cui, Miaomiao |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5085032007 |
| authorships[4].author.orcid | |
| authorships[4].author.display_name | Liefeng Bo |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Bo, Liefeng |
| authorships[4].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2501.05020 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10481 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9962000250816345 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1704 |
| primary_topic.subfield.display_name | Computer Graphics and Computer-Aided Design |
| primary_topic.display_name | Computer Graphics and Visualization Techniques |
| related_works | https://openalex.org/W4310844315, https://openalex.org/W2532377291, https://openalex.org/W2000013817, https://openalex.org/W4296190881, https://openalex.org/W2366362996, https://openalex.org/W2517624617, https://openalex.org/W2378422373, https://openalex.org/W2129566390, https://openalex.org/W2360905385, https://openalex.org/W2215755978 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2501.05020 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2501.05020 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2501.05020 |
| primary_location.id | pmh:oai:arXiv.org:2501.05020 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2501.05020 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2501.05020 |
| publication_date | 2025-01-09 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 4, 8, 74, 126 |
| abstract_inverted_index.In | 90 |
| abstract_inverted_index.To | 45 |
| abstract_inverted_index.an | 55 |
| abstract_inverted_index.as | 112 |
| abstract_inverted_index.in | 19, 125 |
| abstract_inverted_index.is | 3 |
| abstract_inverted_index.it | 78, 86, 117 |
| abstract_inverted_index.of | 11, 135 |
| abstract_inverted_index.on | 80 |
| abstract_inverted_index.or | 22 |
| abstract_inverted_index.to | 33, 61, 118, 147 |
| abstract_inverted_index.we | 48, 68 |
| abstract_inverted_index.For | 139 |
| abstract_inverted_index.and | 37, 53, 84, 94, 101, 128, 142 |
| abstract_inverted_index.are | 97 |
| abstract_inverted_index.our | 106, 148 |
| abstract_inverted_index.the | 109, 133, 136 |
| abstract_inverted_index.via | 25 |
| abstract_inverted_index.end, | 47 |
| abstract_inverted_index.from | 73, 87 |
| abstract_inverted_index.have | 16 |
| abstract_inverted_index.into | 99 |
| abstract_inverted_index.made | 17 |
| abstract_inverted_index.more | 140 |
| abstract_inverted_index.task | 6 |
| abstract_inverted_index.they | 30 |
| abstract_inverted_index.this | 46, 91 |
| abstract_inverted_index.user | 82 |
| abstract_inverted_index.way, | 92 |
| abstract_inverted_index.way. | 130 |
| abstract_inverted_index.wide | 9 |
| abstract_inverted_index.with | 7, 41 |
| abstract_inverted_index.Then, | 105 |
| abstract_inverted_index.based | 79 |
| abstract_inverted_index.image | 1, 56 |
| abstract_inverted_index.range | 10 |
| abstract_inverted_index.refer | 146 |
| abstract_inverted_index.still | 31 |
| abstract_inverted_index.tasks | 124 |
| abstract_inverted_index.video | 122 |
| abstract_inverted_index.while | 29 |
| abstract_inverted_index.works | 15 |
| abstract_inverted_index.Recent | 14 |
| abstract_inverted_index.called | 59 |
| abstract_inverted_index.camera | 21, 36, 93 |
| abstract_inverted_index.image, | 76 |
| abstract_inverted_index.motion | 24, 27, 39, 51, 65, 71, 113 |
| abstract_inverted_index.object | 23, 38, 95 |
| abstract_inverted_index.please | 145 |
| abstract_inverted_index.visual | 103 |
| abstract_inverted_index.achieve | 62 |
| abstract_inverted_index.control | 40, 43, 114 |
| abstract_inverted_index.details | 141 |
| abstract_inverted_index.motions | 96 |
| abstract_inverted_index.project | 150 |
| abstract_inverted_index.propose | 54 |
| abstract_inverted_index.results | 111 |
| abstract_inverted_index.support | 34, 119 |
| abstract_inverted_index.unified | 127 |
| abstract_inverted_index.various | 26, 120 |
| abstract_inverted_index.3D-aware | 50, 70 |
| abstract_inverted_index.adaptive | 42 |
| abstract_inverted_index.changes. | 104 |
| abstract_inverted_index.control. | 66 |
| abstract_inverted_index.enabling | 116 |
| abstract_inverted_index.flexible | 129 |
| abstract_inverted_index.perceive | 85 |
| abstract_inverted_index.progress | 18 |
| abstract_inverted_index.proposed | 137 |
| abstract_inverted_index.results, | 144 |
| abstract_inverted_index.signals, | 115 |
| abstract_inverted_index.struggle | 32 |
| abstract_inverted_index.webpage: | 151 |
| abstract_inverted_index.animation | 2, 57 |
| abstract_inverted_index.anonymous | 149 |
| abstract_inverted_index.approach. | 138 |
| abstract_inverted_index.construct | 69 |
| abstract_inverted_index.different | 88 |
| abstract_inverted_index.framework | 107 |
| abstract_inverted_index.introduce | 49 |
| abstract_inverted_index.intuitive | 100 |
| abstract_inverted_index.leverages | 108 |
| abstract_inverted_index.potential | 12 |
| abstract_inverted_index.reference | 75 |
| abstract_inverted_index.synthesis | 123 |
| abstract_inverted_index.consistent | 102 |
| abstract_inverted_index.framework, | 58 |
| abstract_inverted_index.manipulate | 77 |
| abstract_inverted_index.perception | 110 |
| abstract_inverted_index.Experiments | 131 |
| abstract_inverted_index.controlling | 20 |
| abstract_inverted_index.demonstrate | 132 |
| abstract_inverted_index.fundamental | 5 |
| abstract_inverted_index.interpreted | 81 |
| abstract_inverted_index.qualitative | 143 |
| abstract_inverted_index.superiority | 134 |
| abstract_inverted_index.transformed | 98 |
| abstract_inverted_index.viewpoints. | 89 |
| abstract_inverted_index.fine-grained | 63 |
| abstract_inverted_index.granularity. | 44 |
| abstract_inverted_index.Specifically, | 67 |
| abstract_inverted_index.applications. | 13 |
| abstract_inverted_index.collaborative | 35, 64 |
| abstract_inverted_index.instructions, | 83 |
| abstract_inverted_index.motion-related | 121 |
| abstract_inverted_index.representation | 52, 72 |
| abstract_inverted_index.representations, | 28 |
| abstract_inverted_index.Motion-controllable | 0 |
| abstract_inverted_index.Perception-as-Control, | 60 |
| abstract_inverted_index.https://chen-yingjie.github.io/projects/Perception-as-Control. | 152 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile |
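
The `abstract_inverted_index.*` rows above store the abstract as a map from each token to the zero-based positions where it occurs. A minimal sketch of turning such a map back into text follows. The function name `rebuild_abstract` is a hypothetical helper, and `sample` is a hand-copied subset of the rows above restricted to positions 0 through 10, not the full payload.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {token: [positions]} map by sorting tokens by position."""
    token_at = {}
    for token, positions in inverted_index.items():
        for position in positions:
            token_at[position] = token
    return " ".join(token_at[i] for i in sorted(token_at))


# Subset of the table above, keeping only positions 0..10 (assumption: truncated for brevity).
sample = {
    "Motion-controllable": [0],
    "image": [1],
    "animation": [2],
    "is": [3],
    "a": [4, 8],
    "fundamental": [5],
    "task": [6],
    "with": [7],
    "wide": [9],
    "range": [10],
}

opening = rebuild_abstract(sample)
print(opening)
```

Applied to the complete set of rows, the same function should reproduce the abstract shown at the top of this record, since tokens keep their punctuation and capitalization.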