PoCo: Policy Composition from and for Heterogeneous Robot Learning
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2402.02511
Training general robotic policies from heterogeneous data for different tasks is a significant challenge. Existing robotic datasets vary in modality, such as color, depth, tactile, and proprioceptive information, and are collected in different domains, such as simulation, real robots, and human videos. Current methods usually collect and pool all data from one domain to train a single policy to handle such heterogeneity in tasks and domains, which is prohibitively expensive and difficult. In this work, we present a flexible approach, dubbed Policy Composition, to combine information across such diverse modalities and domains for learning scene-level and task-level generalized manipulation skills, by composing different data distributions represented with diffusion models. Our method can use task-level composition for multi-task manipulation and be composed with analytic cost functions to adapt policy behaviors at inference time. We train our method on simulation, human, and real robot data and evaluate in tool-use tasks. The composed policy achieves robust and dexterous performance under varying scenes and tasks and outperforms baselines from a single data source in both simulation and real-world experiments. See https://liruiw.github.io/policycomp for more details.
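The idea of composing distributions and adding an analytic cost at inference time can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes two hypothetical 1-D Gaussian "policies" (named `sim_policy` and `real_policy` here purely for illustration), sums their score functions to represent the product of the two distributions, and uses plain deterministic gradient ascent in place of a real diffusion sampler. The optional `cost_grad` argument is likewise an assumed stand-in for an analytic cost term.

```python
def gaussian_score(x, mean, std):
    """Gradient of log N(x; mean, std^2) with respect to x."""
    return (mean - x) / (std ** 2)


def composed_score(x, components, weights, cost_grad=None):
    """Weighted sum of component scores, minus an optional cost gradient.

    Summing scores corresponds to multiplying the (weighted) densities.
    """
    total = sum(
        w * gaussian_score(x, mean, std)
        for w, (mean, std) in zip(weights, components)
    )
    if cost_grad is not None:
        total -= cost_grad(x)
    return total


def ascend(x, components, weights, cost_grad=None, step=0.05, iters=2000):
    """Follow the composed score uphill (a stand-in for iterative denoising)."""
    for _ in range(iters):
        x += step * composed_score(x, components, weights, cost_grad)
    return x


# Hypothetical 1-D "policies", each given as (mean, std).
sim_policy = (0.0, 1.0)
real_policy = (2.0, 0.5)

# Composition only: converges to the mode of the product distribution.
mode = ascend(5.0, [sim_policy, real_policy], [1.0, 1.0])

# Composition plus a quadratic cost 0.5 * 5 * x^2 that pulls actions toward 0.
steered = ascend(
    5.0, [sim_policy, real_policy], [1.0, 1.0], cost_grad=lambda x: 5.0 * x
)

print(round(mode, 6), round(steered, 6))
```

In this toy setting the composed result sits between the two component means, weighted by their precisions, and the cost term shifts it further toward the cost minimum without retraining either component.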
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2402.02511, https://arxiv.org/pdf/2402.02511
- OA Status: green
- Cited By: 1
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4391591153
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4391591153 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2402.02511 (Digital Object Identifier)
- Title: PoCo: Policy Composition from and for Heterogeneous Robot Learning (Work title)
- Type: preprint (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2024 (Year of publication)
- Publication date: 2024-02-04 (Full publication date if available)
- Authors: Lirui Wang, Jialiang Zhao, Yilun Du, Edward H. Adelson, Russ Tedrake (List of authors in order)
- Landing page: https://arxiv.org/abs/2402.02511 (Publisher landing page)
- PDF URL: https://arxiv.org/pdf/2402.02511 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2402.02511 (Direct OA link when available)
- Concepts: Composition (language), Policy learning, Robot, Computer science, Artificial intelligence, Machine learning, Art, Literature (Top concepts (fields/topics) attached by OpenAlex)
- Cited by: 1 (Total citation count in OpenAlex)
- Citations by year (recent): 2025: 1 (Per-year citation counts (last 5 years))
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4391591153 |
| doi | https://doi.org/10.48550/arxiv.2402.02511 |
| ids.doi | https://doi.org/10.48550/arxiv.2402.02511 |
| ids.openalex | https://openalex.org/W4391591153 |
| fwci | |
| type | preprint |
| title | PoCo: Policy Composition from and for Heterogeneous Robot Learning |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10462 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.8287000060081482 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Reinforcement Learning in Robotics |
| topics[1].id | https://openalex.org/T12423 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.7689999938011169 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1712 |
| topics[1].subfield.display_name | Software |
| topics[1].display_name | Software Reliability and Analysis Research |
| topics[2].id | https://openalex.org/T10772 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.7059000134468079 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1705 |
| topics[2].subfield.display_name | Computer Networks and Communications |
| topics[2].display_name | Distributed systems and fault tolerance |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C40231798 |
| concepts[0].level | 2 |
| concepts[0].score | 0.7174683213233948 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q1333743 |
| concepts[0].display_name | Composition (language) |
| concepts[1].id | https://openalex.org/C2779436431 |
| concepts[1].level | 2 |
| concepts[1].score | 0.502429723739624 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q30672407 |
| concepts[1].display_name | Policy learning |
| concepts[2].id | https://openalex.org/C90509273 |
| concepts[2].level | 2 |
| concepts[2].score | 0.4755953252315521 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q11012 |
| concepts[2].display_name | Robot |
| concepts[3].id | https://openalex.org/C41008148 |
| concepts[3].level | 0 |
| concepts[3].score | 0.46916016936302185 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[3].display_name | Computer science |
| concepts[4].id | https://openalex.org/C154945302 |
| concepts[4].level | 1 |
| concepts[4].score | 0.36503371596336365 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[4].display_name | Artificial intelligence |
| concepts[5].id | https://openalex.org/C119857082 |
| concepts[5].level | 1 |
| concepts[5].score | 0.15771332383155823 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[5].display_name | Machine learning |
| concepts[6].id | https://openalex.org/C142362112 |
| concepts[6].level | 0 |
| concepts[6].score | 0.08044889569282532 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q735 |
| concepts[6].display_name | Art |
| concepts[7].id | https://openalex.org/C124952713 |
| concepts[7].level | 1 |
| concepts[7].score | 0.0 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q8242 |
| concepts[7].display_name | Literature |
| keywords[0].id | https://openalex.org/keywords/composition |
| keywords[0].score | 0.7174683213233948 |
| keywords[0].display_name | Composition (language) |
| keywords[1].id | https://openalex.org/keywords/policy-learning |
| keywords[1].score | 0.502429723739624 |
| keywords[1].display_name | Policy learning |
| keywords[2].id | https://openalex.org/keywords/robot |
| keywords[2].score | 0.4755953252315521 |
| keywords[2].display_name | Robot |
| keywords[3].id | https://openalex.org/keywords/computer-science |
| keywords[3].score | 0.46916016936302185 |
| keywords[3].display_name | Computer science |
| keywords[4].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[4].score | 0.36503371596336365 |
| keywords[4].display_name | Artificial intelligence |
| keywords[5].id | https://openalex.org/keywords/machine-learning |
| keywords[5].score | 0.15771332383155823 |
| keywords[5].display_name | Machine learning |
| keywords[6].id | https://openalex.org/keywords/art |
| keywords[6].score | 0.08044889569282532 |
| keywords[6].display_name | Art |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2402.02511 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2402.02511 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2402.02511 |
| locations[1].id | doi:10.48550/arxiv.2402.02511 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2402.02511 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5007415599 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-9285-7084 |
| authorships[0].author.display_name | Lirui Wang |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Wang, Lirui |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5110964546 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Jialiang Zhao |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Zhao, Jialiang |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5101182304 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Yilun Du |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Du, Yilun |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5021989698 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Edward H. Adelson |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Adelson, Edward H. |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5074291890 |
| authorships[4].author.orcid | |
| authorships[4].author.display_name | Russ Tedrake |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Tedrake, Russ |
| authorships[4].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2402.02511 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | PoCo: Policy Composition from and for Heterogeneous Robot Learning |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10462 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.8287000060081482 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Reinforcement Learning in Robotics |
| related_works | https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W2358668433, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W2382290278, https://openalex.org/W2478288626, https://openalex.org/W2350741829, https://openalex.org/W2530322880, https://openalex.org/W1596801655 |
| cited_by_count | 1 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2402.02511 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2402.02511 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2402.02511 |
| primary_location.id | pmh:oai:arXiv.org:2402.02511 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2402.02511 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2402.02511 |
| publication_date | 2024-02-04 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.. | 180 |
| abstract_inverted_index.a | 11, 55, 77, 165 |
| abstract_inverted_index.In | 72 |
| abstract_inverted_index.We | 132 |
| abstract_inverted_index.as | 22, 35 |
| abstract_inverted_index.at | 129 |
| abstract_inverted_index.be | 119 |
| abstract_inverted_index.by | 100 |
| abstract_inverted_index.in | 18, 31, 62, 145, 169 |
| abstract_inverted_index.is | 10, 67 |
| abstract_inverted_index.on | 136 |
| abstract_inverted_index.to | 53, 58, 83, 125 |
| abstract_inverted_index.we | 75 |
| abstract_inverted_index.Our | 109 |
| abstract_inverted_index.See | 175 |
| abstract_inverted_index.The | 148 |
| abstract_inverted_index.all | 48 |
| abstract_inverted_index.and | 26, 29, 39, 46, 64, 70, 90, 95, 118, 139, 143, 153, 159, 161, 172 |
| abstract_inverted_index.can | 111 |
| abstract_inverted_index.for | 7, 92, 115, 177 |
| abstract_inverted_index.one | 51 |
| abstract_inverted_index.our | 134 |
| abstract_inverted_index.use | 112 |
| abstract_inverted_index.both | 170 |
| abstract_inverted_index.cost | 123 |
| abstract_inverted_index.data | 6, 49, 103, 142, 167 |
| abstract_inverted_index.from | 4, 50, 164 |
| abstract_inverted_index.more | 178 |
| abstract_inverted_index.pool | 47 |
| abstract_inverted_index.real | 37, 140 |
| abstract_inverted_index.such | 21, 34, 60, 87 |
| abstract_inverted_index.this | 73 |
| abstract_inverted_index.vary | 17 |
| abstract_inverted_index.with | 106, 121 |
| abstract_inverted_index.adapt | 126 |
| abstract_inverted_index.human | 40 |
| abstract_inverted_index.robot | 141 |
| abstract_inverted_index.tasks | 9, 63, 160 |
| abstract_inverted_index.time. | 131 |
| abstract_inverted_index.train | 54, 133 |
| abstract_inverted_index.under | 156 |
| abstract_inverted_index.which | 66 |
| abstract_inverted_index.work, | 74 |
| abstract_inverted_index.Policy | 81 |
| abstract_inverted_index.across | 86 |
| abstract_inverted_index.color, | 23 |
| abstract_inverted_index.depth, | 24 |
| abstract_inverted_index.domain | 52 |
| abstract_inverted_index.dubbed | 80 |
| abstract_inverted_index.handle | 59 |
| abstract_inverted_index.human, | 138 |
| abstract_inverted_index.method | 110, 135 |
| abstract_inverted_index.policy | 57, 127, 150 |
| abstract_inverted_index.robust | 152 |
| abstract_inverted_index.scenes | 158 |
| abstract_inverted_index.single | 56, 166 |
| abstract_inverted_index.source | 168 |
| abstract_inverted_index.tasks. | 147 |
| abstract_inverted_index.Current | 42 |
| abstract_inverted_index.collect | 45 |
| abstract_inverted_index.combine | 84 |
| abstract_inverted_index.details | 179 |
| abstract_inverted_index.diverse | 88 |
| abstract_inverted_index.domains | 33, 91 |
| abstract_inverted_index.general | 1 |
| abstract_inverted_index.methods | 43 |
| abstract_inverted_index.models. | 108 |
| abstract_inverted_index.present | 76 |
| abstract_inverted_index.robotic | 2, 15 |
| abstract_inverted_index.robots, | 38 |
| abstract_inverted_index.skills, | 99 |
| abstract_inverted_index.usually | 44 |
| abstract_inverted_index.varying | 157 |
| abstract_inverted_index.videos. | 41 |
| abstract_inverted_index.Existing | 14 |
| abstract_inverted_index.Training | 0 |
| abstract_inverted_index.achieves | 151 |
| abstract_inverted_index.analytic | 122 |
| abstract_inverted_index.composed | 120, 149 |
| abstract_inverted_index.datasets | 16 |
| abstract_inverted_index.domains, | 65 |
| abstract_inverted_index.evaluate | 144 |
| abstract_inverted_index.flexible | 78 |
| abstract_inverted_index.learning | 93 |
| abstract_inverted_index.policies | 3 |
| abstract_inverted_index.tactile, | 25 |
| abstract_inverted_index.tool-use | 146 |
| abstract_inverted_index.approach, | 79 |
| abstract_inverted_index.baselines | 163 |
| abstract_inverted_index.behaviors | 128 |
| abstract_inverted_index.collected | 30 |
| abstract_inverted_index.composing | 101 |
| abstract_inverted_index.dexterous | 154 |
| abstract_inverted_index.different | 8, 19, 32, 102 |
| abstract_inverted_index.diffusion | 107 |
| abstract_inverted_index.expensive | 69 |
| abstract_inverted_index.functions | 124 |
| abstract_inverted_index.inference | 130 |
| abstract_inverted_index.challenge. | 13 |
| abstract_inverted_index.difficult. | 71 |
| abstract_inverted_index.modalities | 20, 89 |
| abstract_inverted_index.multi-task | 116 |
| abstract_inverted_index.real-world | 173 |
| abstract_inverted_index.simulation | 171 |
| abstract_inverted_index.task-level | 96, 113 |
| abstract_inverted_index.composition | 114 |
| abstract_inverted_index.generalized | 97 |
| abstract_inverted_index.information | 85 |
| abstract_inverted_index.outperforms | 162 |
| abstract_inverted_index.performance | 155 |
| abstract_inverted_index.represented | 105 |
| abstract_inverted_index.scene-level | 94 |
| abstract_inverted_index.significant | 12 |
| abstract_inverted_index.simulation, | 36, 137 |
| abstract_inverted_index.Composition, | 82 |
| abstract_inverted_index.experiments. | 174 |
| abstract_inverted_index.information, | 28 |
| abstract_inverted_index.manipulation | 98, 117 |
| abstract_inverted_index.distributions | 104 |
| abstract_inverted_index.heterogeneity | 61 |
| abstract_inverted_index.heterogeneous | 5 |
| abstract_inverted_index.prohibitively | 68 |
| abstract_inverted_index.proprioceptive | 27 |
| abstract_inverted_index.https://liruiw.github.io/policycomp | 176 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile |
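
The `abstract_inverted_index.*` rows above store the abstract as a mapping from each token to the word positions where it occurs. A minimal sketch of how the plain-text abstract could be rebuilt from such a mapping is shown below; the function name `rebuild_abstract` is hypothetical, and the sample dictionary is a hand-copied excerpt limited to positions 0 through 9 of the payload, not the full index.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {token: [positions]} mapping."""
    by_position = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Emit tokens in ascending position order, separated by single spaces.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Excerpt of the payload above, restricted to positions 0-9.
sample_index = {
    "Training": [0],
    "general": [1],
    "robotic": [2],
    "policies": [3],
    "from": [4],
    "heterogeneous": [5],
    "data": [6],
    "for": [7],
    "different": [8],
    "tasks": [9],
}

text = rebuild_abstract(sample_index)
print(text)
```

Because punctuation is attached to tokens in the index (for example `challenge.` and `work,`), joining on single spaces is enough to recover the original sentence boundaries.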