OnlyFlow: Optical Flow based Motion Conditioning for Video Diffusion Models
2024 · Open Access
DOI: https://doi.org/10.48550/arxiv.2411.10501
We consider the problem of text-to-video generation tasks with precise control for various applications such as camera movement control and video-to-video editing. Most methods tackling this problem rely on user-defined controls, such as binary masks or camera movement embeddings. We propose OnlyFlow, an approach that leverages optical flow extracted from an input video to condition the motion of generated videos. Given a text prompt and an input video, OnlyFlow lets the user generate videos that respect both the motion of the input video and the text prompt. This is implemented by applying an optical flow estimation model to the input video and feeding the result to a trainable optical flow encoder. The output feature maps are then injected into the text-to-video backbone model. We perform quantitative, qualitative and user preference studies to show that OnlyFlow compares favorably with state-of-the-art methods on a wide range of tasks, even though OnlyFlow was not specifically trained for such tasks. OnlyFlow thus constitutes a versatile, lightweight yet efficient method for controlling motion in text-to-video generation. Models and code will be made available on GitHub and HuggingFace.
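
The data flow described in the abstract (flow estimation, then a trainable encoder, then injection into the backbone) can be sketched roughly as follows. This is a toy illustration under loud assumptions, not the authors' implementation: `estimate_flow_standin` is a hypothetical placeholder that uses per-pixel temporal differences instead of a real optical flow model, `FlowEncoder` is a hypothetical encoder reduced to a single gain and bias, and additive injection in `inject` is an assumption, since the abstract does not say how the feature maps are injected.

```python
from typing import List

Frame = List[List[float]]  # one H x W grayscale frame (toy representation)


def estimate_flow_standin(frames: List[Frame]) -> List[Frame]:
    """Placeholder for an optical flow estimator (NOT real optical flow).

    Returns one per-pixel temporal difference map per consecutive frame
    pair, so T input frames yield T - 1 maps. A real estimator would
    return a 2-channel (dx, dy) displacement field instead.
    """
    return [
        [[b - a for a, b in zip(row0, row1)] for row0, row1 in zip(f0, f1)]
        for f0, f1 in zip(frames, frames[1:])
    ]


class FlowEncoder:
    """Hypothetical trainable encoder, reduced to one gain and one bias."""

    def __init__(self, gain: float = 0.5, bias: float = 0.0) -> None:
        self.gain = gain  # would be learned during training
        self.bias = bias  # would be learned during training

    def __call__(self, flows: List[Frame]) -> List[Frame]:
        return [
            [[self.gain * v + self.bias for v in row] for row in flow]
            for flow in flows
        ]


def inject(backbone_feats: List[Frame], flow_feats: List[Frame]) -> List[Frame]:
    """Assumed injection scheme: element-wise addition of feature maps."""
    return [
        [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(fx, fy)]
        for fx, fy in zip(backbone_feats, flow_feats)
    ]


# Three 1x2 input frames give two motion maps.
frames = [[[0.0, 0.0]], [[1.0, 0.0]], [[1.0, 2.0]]]
flows = estimate_flow_standin(frames)
flow_feats = FlowEncoder(gain=0.5)(flows)

# Stand-in for intermediate features of the text-to-video backbone.
backbone = [[[10.0, 10.0]], [[10.0, 10.0]]]
conditioned = inject(backbone, flow_feats)
print(conditioned)
```

Only the encoder carries parameters in this sketch, which mirrors the abstract's description of a trainable optical flow encoder feeding a text-to-video backbone.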
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2411.10501, https://arxiv.org/pdf/2411.10501
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4404569835
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4404569835 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2411.10501 (Digital Object Identifier)
- Title: OnlyFlow: Optical Flow based Motion Conditioning for Video Diffusion Models (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024
- Publication date: 2024-11-15
- Authors: M. Taha Koroglu, Hugo Caselles-Dupré, Guillaume Jeanneret Sanmiguel, Matthieu Cord (in author order)
- Landing page: https://arxiv.org/abs/2411.10501 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2411.10501 (direct link to full-text PDF)
- Open access: Yes (a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2411.10501 (direct OA link)
- Concepts: Optical flow, Conditioning, Diffusion, Flow (mathematics), Motion (physics), Computer science, Computer vision, Mechanics, Physics, Mathematics, Thermodynamics, Statistics, Image (mathematics) (top concepts attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4404569835 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2411.10501 |
| ids.doi | https://doi.org/10.48550/arxiv.2411.10501 |
| ids.openalex | https://openalex.org/W4404569835 |
| fwci | |
| type | preprint |
| title | OnlyFlow: Optical Flow based Motion Conditioning for Video Diffusion Models |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10531 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.998199999332428 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Advanced Vision and Imaging |
| topics[1].id | https://openalex.org/T10741 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9905999898910522 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1711 |
| topics[1].subfield.display_name | Signal Processing |
| topics[1].display_name | Video Coding and Compression Technologies |
| topics[2].id | https://openalex.org/T10481 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9728999733924866 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1704 |
| topics[2].subfield.display_name | Computer Graphics and Computer-Aided Design |
| topics[2].display_name | Computer Graphics and Visualization Techniques |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C155542232 |
| concepts[0].level | 3 |
| concepts[0].score | 0.7482978701591492 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q736111 |
| concepts[0].display_name | Optical flow |
| concepts[1].id | https://openalex.org/C45262634 |
| concepts[1].level | 2 |
| concepts[1].score | 0.6402352452278137 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q5159291 |
| concepts[1].display_name | Conditioning |
| concepts[2].id | https://openalex.org/C69357855 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6140692830085754 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q163214 |
| concepts[2].display_name | Diffusion |
| concepts[3].id | https://openalex.org/C38349280 |
| concepts[3].level | 2 |
| concepts[3].score | 0.5533405542373657 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q1434290 |
| concepts[3].display_name | Flow (mathematics) |
| concepts[4].id | https://openalex.org/C104114177 |
| concepts[4].level | 2 |
| concepts[4].score | 0.5325030088424683 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q79782 |
| concepts[4].display_name | Motion (physics) |
| concepts[5].id | https://openalex.org/C41008148 |
| concepts[5].level | 0 |
| concepts[5].score | 0.4518047869205475 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[5].display_name | Computer science |
| concepts[6].id | https://openalex.org/C31972630 |
| concepts[6].level | 1 |
| concepts[6].score | 0.3340630829334259 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q844240 |
| concepts[6].display_name | Computer vision |
| concepts[7].id | https://openalex.org/C57879066 |
| concepts[7].level | 1 |
| concepts[7].score | 0.26082202792167664 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q41217 |
| concepts[7].display_name | Mechanics |
| concepts[8].id | https://openalex.org/C121332964 |
| concepts[8].level | 0 |
| concepts[8].score | 0.21936443448066711 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q413 |
| concepts[8].display_name | Physics |
| concepts[9].id | https://openalex.org/C33923547 |
| concepts[9].level | 0 |
| concepts[9].score | 0.16320344805717468 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[9].display_name | Mathematics |
| concepts[10].id | https://openalex.org/C97355855 |
| concepts[10].level | 1 |
| concepts[10].score | 0.16176161170005798 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q11473 |
| concepts[10].display_name | Thermodynamics |
| concepts[11].id | https://openalex.org/C105795698 |
| concepts[11].level | 1 |
| concepts[11].score | 0.08481785655021667 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q12483 |
| concepts[11].display_name | Statistics |
| concepts[12].id | https://openalex.org/C115961682 |
| concepts[12].level | 2 |
| concepts[12].score | 0.08063516020774841 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q860623 |
| concepts[12].display_name | Image (mathematics) |
| keywords[0].id | https://openalex.org/keywords/optical-flow |
| keywords[0].score | 0.7482978701591492 |
| keywords[0].display_name | Optical flow |
| keywords[1].id | https://openalex.org/keywords/conditioning |
| keywords[1].score | 0.6402352452278137 |
| keywords[1].display_name | Conditioning |
| keywords[2].id | https://openalex.org/keywords/diffusion |
| keywords[2].score | 0.6140692830085754 |
| keywords[2].display_name | Diffusion |
| keywords[3].id | https://openalex.org/keywords/flow |
| keywords[3].score | 0.5533405542373657 |
| keywords[3].display_name | Flow (mathematics) |
| keywords[4].id | https://openalex.org/keywords/motion |
| keywords[4].score | 0.5325030088424683 |
| keywords[4].display_name | Motion (physics) |
| keywords[5].id | https://openalex.org/keywords/computer-science |
| keywords[5].score | 0.4518047869205475 |
| keywords[5].display_name | Computer science |
| keywords[6].id | https://openalex.org/keywords/computer-vision |
| keywords[6].score | 0.3340630829334259 |
| keywords[6].display_name | Computer vision |
| keywords[7].id | https://openalex.org/keywords/mechanics |
| keywords[7].score | 0.26082202792167664 |
| keywords[7].display_name | Mechanics |
| keywords[8].id | https://openalex.org/keywords/physics |
| keywords[8].score | 0.21936443448066711 |
| keywords[8].display_name | Physics |
| keywords[9].id | https://openalex.org/keywords/mathematics |
| keywords[9].score | 0.16320344805717468 |
| keywords[9].display_name | Mathematics |
| keywords[10].id | https://openalex.org/keywords/thermodynamics |
| keywords[10].score | 0.16176161170005798 |
| keywords[10].display_name | Thermodynamics |
| keywords[11].id | https://openalex.org/keywords/statistics |
| keywords[11].score | 0.08481785655021667 |
| keywords[11].display_name | Statistics |
| keywords[12].id | https://openalex.org/keywords/image |
| keywords[12].score | 0.08063516020774841 |
| keywords[12].display_name | Image (mathematics) |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2411.10501 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2411.10501 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2411.10501 |
| locations[1].id | doi:10.48550/arxiv.2411.10501 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2411.10501 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5078730877 |
| authorships[0].author.orcid | https://orcid.org/0009-0002-9312-5365 |
| authorships[0].author.display_name | M. Taha Koroglu |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Koroglu, Mathis |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5025408105 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Hugo Caselles-Dupré |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Caselles-Dupré, Hugo |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5114730077 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Guillaume Jeanneret Sanmiguel |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Sanmiguel, Guillaume Jeanneret |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5108118084 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-0627-5844 |
| authorships[3].author.display_name | Matthieu Cord |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Cord, Matthieu |
| authorships[3].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2411.10501 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2024-11-21T00:00:00 |
| display_name | OnlyFlow: Optical Flow based Motion Conditioning for Video Diffusion Models |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10531 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.998199999332428 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Advanced Vision and Imaging |
| related_works | https://openalex.org/W2344176346, https://openalex.org/W2052307325, https://openalex.org/W1489675464, https://openalex.org/W1881653995, https://openalex.org/W2365635896, https://openalex.org/W2062287200, https://openalex.org/W2004526709, https://openalex.org/W4286646204, https://openalex.org/W2564375980, https://openalex.org/W3092720353 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2411.10501 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2411.10501 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2411.10501 |
| primary_location.id | pmh:oai:arXiv.org:2411.10501 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2411.10501 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2411.10501 |
| publication_date | 2024-11-15 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 66, 113, 148, 166 |
| abstract_inverted_index.In | 40 |
| abstract_inverted_index.We | 0, 130 |
| abstract_inverted_index.an | 46, 55, 70, 98 |
| abstract_inverted_index.as | 15, 33, 88, 90 |
| abstract_inverted_index.be | 182 |
| abstract_inverted_index.in | 175 |
| abstract_inverted_index.is | 95, 109 |
| abstract_inverted_index.of | 4, 62, 84, 151 |
| abstract_inverted_index.on | 28, 104, 147, 185 |
| abstract_inverted_index.or | 36 |
| abstract_inverted_index.to | 58, 77, 112, 138, 144 |
| abstract_inverted_index.we | 43 |
| abstract_inverted_index.The | 118 |
| abstract_inverted_index.and | 19, 69, 134, 179, 187 |
| abstract_inverted_index.are | 122 |
| abstract_inverted_index.fed | 111 |
| abstract_inverted_index.for | 11, 160, 172 |
| abstract_inverted_index.not | 157 |
| abstract_inverted_index.our | 41 |
| abstract_inverted_index.the | 2, 49, 60, 75, 82, 85, 91, 105, 126 |
| abstract_inverted_index.was | 156 |
| abstract_inverted_index.yet | 169 |
| abstract_inverted_index.Most | 22 |
| abstract_inverted_index.This | 94 |
| abstract_inverted_index.code | 180 |
| abstract_inverted_index.even | 153 |
| abstract_inverted_index.flow | 51, 100, 116 |
| abstract_inverted_index.from | 54 |
| abstract_inverted_index.into | 125 |
| abstract_inverted_index.made | 183 |
| abstract_inverted_index.maps | 121 |
| abstract_inverted_index.rely | 27 |
| abstract_inverted_index.show | 139 |
| abstract_inverted_index.such | 14, 32, 161 |
| abstract_inverted_index.text | 67, 92 |
| abstract_inverted_index.that | 80, 140 |
| abstract_inverted_index.then | 110, 123 |
| abstract_inverted_index.this | 25 |
| abstract_inverted_index.thus | 164 |
| abstract_inverted_index.user | 76, 135 |
| abstract_inverted_index.well | 89 |
| abstract_inverted_index.wide | 149 |
| abstract_inverted_index.will | 181 |
| abstract_inverted_index.with | 8 |
| abstract_inverted_index.Using | 65 |
| abstract_inverted_index.input | 56, 71, 86, 106 |
| abstract_inverted_index.masks | 35 |
| abstract_inverted_index.model | 102 |
| abstract_inverted_index.range | 150 |
| abstract_inverted_index.tasks | 7 |
| abstract_inverted_index.video | 57, 87 |
| abstract_inverted_index.which | 108 |
| abstract_inverted_index.GitHub | 186 |
| abstract_inverted_index.Models | 178 |
| abstract_inverted_index.allows | 74 |
| abstract_inverted_index.binary | 34 |
| abstract_inverted_index.camera | 16, 37 |
| abstract_inverted_index.method | 171 |
| abstract_inverted_index.model. | 129 |
| abstract_inverted_index.motion | 61, 83, 174 |
| abstract_inverted_index.output | 119 |
| abstract_inverted_index.prompt | 68 |
| abstract_inverted_index.tasks, | 152 |
| abstract_inverted_index.tasks. | 162 |
| abstract_inverted_index.though | 154 |
| abstract_inverted_index.video, | 72, 107 |
| abstract_inverted_index.videos | 79 |
| abstract_inverted_index.applied | 103 |
| abstract_inverted_index.control | 10, 18 |
| abstract_inverted_index.feature | 120 |
| abstract_inverted_index.firstly | 52 |
| abstract_inverted_index.methods | 23, 146 |
| abstract_inverted_index.optical | 50, 99, 115 |
| abstract_inverted_index.perform | 131 |
| abstract_inverted_index.precise | 9 |
| abstract_inverted_index.problem | 3, 26 |
| abstract_inverted_index.prompt. | 93 |
| abstract_inverted_index.propose | 44 |
| abstract_inverted_index.respect | 81 |
| abstract_inverted_index.studies | 137 |
| abstract_inverted_index.tacking | 24 |
| abstract_inverted_index.through | 97 |
| abstract_inverted_index.trained | 159 |
| abstract_inverted_index.various | 12 |
| abstract_inverted_index.videos. | 64 |
| abstract_inverted_index.OnlyFlow | 73, 141, 155, 163 |
| abstract_inverted_index.approach | 42, 47 |
| abstract_inverted_index.backbone | 128 |
| abstract_inverted_index.compares | 143 |
| abstract_inverted_index.consider | 1 |
| abstract_inverted_index.editing. | 21 |
| abstract_inverted_index.encoder. | 117 |
| abstract_inverted_index.generate | 78 |
| abstract_inverted_index.injected | 124 |
| abstract_inverted_index.movement | 17, 38 |
| abstract_inverted_index.OnlyFlow, | 45 |
| abstract_inverted_index.available | 184 |
| abstract_inverted_index.condition | 59 |
| abstract_inverted_index.controls, | 31 |
| abstract_inverted_index.efficient | 170 |
| abstract_inverted_index.extracted | 53 |
| abstract_inverted_index.generated | 63 |
| abstract_inverted_index.providing | 29 |
| abstract_inverted_index.trainable | 114 |
| abstract_inverted_index.estimation | 101 |
| abstract_inverted_index.generation | 6 |
| abstract_inverted_index.leveraging | 48 |
| abstract_inverted_index.positively | 142 |
| abstract_inverted_index.preference | 136 |
| abstract_inverted_index.versatile, | 167 |
| abstract_inverted_index.constitutes | 165 |
| abstract_inverted_index.controlling | 173 |
| abstract_inverted_index.embeddings. | 39 |
| abstract_inverted_index.generation. | 177 |
| abstract_inverted_index.implemented | 96 |
| abstract_inverted_index.lightweight | 168 |
| abstract_inverted_index.qualitative | 133 |
| abstract_inverted_index.HuggingFace. | 188 |
| abstract_inverted_index.applications | 13 |
| abstract_inverted_index.specifically | 158 |
| abstract_inverted_index.user-defined | 30 |
| abstract_inverted_index.quantitative, | 132 |
| abstract_inverted_index.text-to-video | 5, 127, 176 |
| abstract_inverted_index.video-to-video | 20 |
| abstract_inverted_index.state-of-the-art | 145 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| citation_normalized_percentile |
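
The `abstract_inverted_index.*` rows above store the abstract as a map from each token to the word positions where it occurs. A minimal sketch of how such an index can be turned back into running text is shown below; the helper name `rebuild_abstract` is hypothetical, and the `sample` dictionary is a hand-copied excerpt that keeps only positions 0 to 7 from the table, not the full payload.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Place every token at each of its recorded positions, then join."""
    by_position: Dict[int, str] = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Sort by position so the tokens come out in reading order.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Excerpt of the table above, restricted to the first eight positions.
sample = {
    "We": [0],
    "consider": [1],
    "the": [2],
    "problem": [3],
    "of": [4],
    "text-to-video": [5],
    "generation": [6],
    "tasks": [7],
}

print(rebuild_abstract(sample))
# prints: We consider the problem of text-to-video generation tasks
```

Because the index records the text exactly as ingested, a rebuilt abstract reproduces any typos present in the source record.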