Hunyuan3D 1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2411.02293
While 3D generative models have greatly improved artists' workflows, the existing diffusion models for 3D generation suffer from slow generation and poor generalization. To address this issue, we propose a two-stage approach named Hunyuan3D 1.0, which includes a lite version and a standard version, both of which support text- and image-conditioned generation. In the first stage, we employ a multi-view diffusion model that efficiently generates multi-view RGB images in approximately 4 seconds. These multi-view images capture rich details of the 3D asset from different viewpoints, relaxing the task from single-view to multi-view reconstruction. In the second stage, we introduce a feed-forward reconstruction model that rapidly and faithfully reconstructs the 3D asset from the generated multi-view images in approximately 7 seconds. The reconstruction network learns to handle the noise and inconsistency introduced by the multi-view diffusion and leverages the available information from the condition image to efficiently recover the 3D structure. Our framework incorporates a text-to-image model, Hunyuan-DiT, making it a unified framework that supports both text- and image-conditioned 3D generation. Our standard version has 3x more parameters than our lite version and other existing models. Hunyuan3D 1.0 achieves a strong balance between speed and quality, significantly reducing generation time while maintaining the quality and diversity of the produced assets.
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2411.02293, https://arxiv.org/pdf/2411.02293
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4404355685
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4404355685 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2411.02293 (Digital Object Identifier)
- Title: Hunyuan3D 1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation (Work title)
- Type: preprint (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2024 (Year of publication)
- Publication date: 2024-11-04 (Full publication date if available)
- Authors: Xianghui Yang, Huiwen Shi, B. Zhang, Fan Yang, Jiacheng Wang, Hongxu Zhao, Xinhai Liu, Xinzhou Wang, Qingxiang Lin, Jiaao Yu, Lifu Wang, Zhuo Chen, Sicong Liu, Yuhong Liu, Yong Yang, Di Wang, Jie Jiang, Chunchao Guo (List of authors in order)
- Landing page: https://arxiv.org/abs/2411.02293 (Publisher landing page)
- PDF URL: https://arxiv.org/pdf/2411.02293 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2411.02293 (Direct OA link when available)
- Concepts: Image (mathematics), Computer science, Computer vision (Top concepts (fields/topics) attached by OpenAlex)
- Cited by: 0 (Total citation count in OpenAlex)
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4404355685 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2411.02293 |
| ids.doi | https://doi.org/10.48550/arxiv.2411.02293 |
| ids.openalex | https://openalex.org/W4404355685 |
| fwci | |
| type | preprint |
| title | Hunyuan3D 1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T14339 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9427000284194946 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Image Processing and 3D Reconstruction |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C115961682 |
| concepts[0].level | 2 |
| concepts[0].score | 0.5429449081420898 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q860623 |
| concepts[0].display_name | Image (mathematics) |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.48970988392829895 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C31972630 |
| concepts[2].level | 1 |
| concepts[2].score | 0.3310452103614807 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q844240 |
| concepts[2].display_name | Computer vision |
| keywords[0].id | https://openalex.org/keywords/image |
| keywords[0].score | 0.5429449081420898 |
| keywords[0].display_name | Image (mathematics) |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.48970988392829895 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/computer-vision |
| keywords[2].score | 0.3310452103614807 |
| keywords[2].display_name | Computer vision |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2411.02293 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2411.02293 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2411.02293 |
| locations[1].id | doi:10.48550/arxiv.2411.02293 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2411.02293 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5103089178 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-3308-3108 |
| authorships[0].author.display_name | Xianghui Yang |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Yang, Xianghui |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5026199715 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-2713-9507 |
| authorships[1].author.display_name | Huiwen Shi |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Shi, Huiwen |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5101577028 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-9726-6707 |
| authorships[2].author.display_name | B. Zhang |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Zhang, Bowen |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5087949383 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-5972-5466 |
| authorships[3].author.display_name | Fan Yang |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Yang, Fan |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5018181104 |
| authorships[4].author.orcid | https://orcid.org/0000-0003-2595-265X |
| authorships[4].author.display_name | Jiacheng Wang |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Wang, Jiacheng |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5071831230 |
| authorships[5].author.orcid | |
| authorships[5].author.display_name | Hongxu Zhao |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Zhao, Hongxu |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5007736125 |
| authorships[6].author.orcid | https://orcid.org/0000-0003-4200-4862 |
| authorships[6].author.display_name | Xinhai Liu |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Liu, Xinhai |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5063636252 |
| authorships[7].author.orcid | https://orcid.org/0000-0003-3972-3073 |
| authorships[7].author.display_name | Xinzhou Wang |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Wang, Xinzhou |
| authorships[7].is_corresponding | False |
| authorships[8].author.id | https://openalex.org/A5066989412 |
| authorships[8].author.orcid | |
| authorships[8].author.display_name | Qingxiang Lin |
| authorships[8].author_position | middle |
| authorships[8].raw_author_name | Lin, Qingxiang |
| authorships[8].is_corresponding | False |
| authorships[9].author.id | https://openalex.org/A5050610245 |
| authorships[9].author.orcid | https://orcid.org/0009-0008-2243-5933 |
| authorships[9].author.display_name | Jiaao Yu |
| authorships[9].author_position | middle |
| authorships[9].raw_author_name | Yu, Jiaao |
| authorships[9].is_corresponding | False |
| authorships[10].author.id | https://openalex.org/A5101922066 |
| authorships[10].author.orcid | https://orcid.org/0000-0001-5172-9932 |
| authorships[10].author.display_name | Lifu Wang |
| authorships[10].author_position | middle |
| authorships[10].raw_author_name | Wang, Lifu |
| authorships[10].is_corresponding | False |
| authorships[11].author.id | https://openalex.org/A5100345101 |
| authorships[11].author.orcid | https://orcid.org/0000-0003-1597-9058 |
| authorships[11].author.display_name | Zhuo Chen |
| authorships[11].author_position | middle |
| authorships[11].raw_author_name | Chen, Zhuo |
| authorships[11].is_corresponding | False |
| authorships[12].author.id | https://openalex.org/A5100684828 |
| authorships[12].author.orcid | https://orcid.org/0000-0003-4402-1260 |
| authorships[12].author.display_name | Sicong Liu |
| authorships[12].author_position | middle |
| authorships[12].raw_author_name | Liu, Sicong |
| authorships[12].is_corresponding | False |
| authorships[13].author.id | https://openalex.org/A5005799407 |
| authorships[13].author.orcid | https://orcid.org/0000-0002-2565-8701 |
| authorships[13].author.display_name | Yuhong Liu |
| authorships[13].author_position | middle |
| authorships[13].raw_author_name | Liu, Yuhong |
| authorships[13].is_corresponding | False |
| authorships[14].author.id | https://openalex.org/A5103660861 |
| authorships[14].author.orcid | |
| authorships[14].author.display_name | Yong Yang |
| authorships[14].author_position | middle |
| authorships[14].raw_author_name | Yang, Yong |
| authorships[14].is_corresponding | False |
| authorships[15].author.id | https://openalex.org/A5100401482 |
| authorships[15].author.orcid | https://orcid.org/0000-0003-4908-0243 |
| authorships[15].author.display_name | Di Wang |
| authorships[15].author_position | middle |
| authorships[15].raw_author_name | Wang, Di |
| authorships[15].is_corresponding | False |
| authorships[16].author.id | https://openalex.org/A5061877876 |
| authorships[16].author.orcid | https://orcid.org/0000-0001-9182-335X |
| authorships[16].author.display_name | Jie Jiang |
| authorships[16].author_position | middle |
| authorships[16].raw_author_name | Jiang, Jie |
| authorships[16].is_corresponding | False |
| authorships[17].author.id | https://openalex.org/A5034317074 |
| authorships[17].author.orcid | |
| authorships[17].author.display_name | Chunchao Guo |
| authorships[17].author_position | last |
| authorships[17].raw_author_name | Guo, Chunchao |
| authorships[17].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2411.02293 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2024-11-15T00:00:00 |
| display_name | Hunyuan3D 1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T14339 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9427000284194946 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Image Processing and 3D Reconstruction |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W4391913857, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W4396696052 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2411.02293 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2411.02293 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2411.02293 |
| primary_location.id | pmh:oai:arXiv.org:2411.02293 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2411.02293 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2411.02293 |
| publication_date | 2024-11-04 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.4 | 67 |
| abstract_inverted_index.7 | 115 |
| abstract_inverted_index.a | 29, 36, 40, 56, 96, 156 |
| abstract_inverted_index.3D | 1, 14, 77, 106, 144, 165 |
| abstract_inverted_index.3x | 171 |
| abstract_inverted_index.In | 50, 90 |
| abstract_inverted_index.To | 23 |
| abstract_inverted_index.an | 185 |
| abstract_inverted_index.by | 127 |
| abstract_inverted_index.in | 65, 113 |
| abstract_inverted_index.it | 155 |
| abstract_inverted_index.of | 75, 202 |
| abstract_inverted_index.to | 87, 121, 140, 159 |
| abstract_inverted_index.we | 27, 54, 94 |
| abstract_inverted_index.1.0 | 34, 183 |
| abstract_inverted_index.Our | 146, 167, 181 |
| abstract_inverted_index.RGB | 64 |
| abstract_inverted_index.The | 117 |
| abstract_inverted_index.and | 20, 39, 47, 102, 124, 131, 163, 177, 190, 200 |
| abstract_inverted_index.for | 13 |
| abstract_inverted_index.has | 170 |
| abstract_inverted_index.our | 175 |
| abstract_inverted_index.the | 9, 51, 76, 83, 91, 105, 109, 128, 133, 137, 143, 149, 198, 203 |
| abstract_inverted_index.both | 44, 161 |
| abstract_inverted_index.from | 17, 79, 85, 136 |
| abstract_inverted_index.have | 4 |
| abstract_inverted_index.lite | 37, 176 |
| abstract_inverted_index.more | 172 |
| abstract_inverted_index.poor | 21 |
| abstract_inverted_index.rich | 73 |
| abstract_inverted_index.slow | 18 |
| abstract_inverted_index.than | 174 |
| abstract_inverted_index.that | 43, 60, 100 |
| abstract_inverted_index.this | 25 |
| abstract_inverted_index.time | 195 |
| abstract_inverted_index.These | 69 |
| abstract_inverted_index.While | 0 |
| abstract_inverted_index.asset | 78, 107 |
| abstract_inverted_index.first | 52 |
| abstract_inverted_index.given | 108 |
| abstract_inverted_index.i.e., | 152 |
| abstract_inverted_index.image | 139 |
| abstract_inverted_index.model | 59, 99 |
| abstract_inverted_index.named | 32 |
| abstract_inverted_index.other | 178 |
| abstract_inverted_index.speed | 189 |
| abstract_inverted_index.tasks | 84 |
| abstract_inverted_index.text- | 46, 162 |
| abstract_inverted_index.while | 196 |
| abstract_inverted_index.employ | 55 |
| abstract_inverted_index.handle | 122 |
| abstract_inverted_index.images | 71, 112 |
| abstract_inverted_index.issue, | 26 |
| abstract_inverted_index.learns | 120 |
| abstract_inverted_index.making | 154 |
| abstract_inverted_index.model, | 151 |
| abstract_inverted_index.model. | 180 |
| abstract_inverted_index.models | 3, 12 |
| abstract_inverted_index.noises | 123 |
| abstract_inverted_index.second | 92 |
| abstract_inverted_index.stage, | 53, 93 |
| abstract_inverted_index.suffer | 16 |
| abstract_inverted_index.address | 24 |
| abstract_inverted_index.assets. | 205 |
| abstract_inverted_index.balance | 187 |
| abstract_inverted_index.between | 188 |
| abstract_inverted_index.capture | 72 |
| abstract_inverted_index.details | 74 |
| abstract_inverted_index.greatly | 5 |
| abstract_inverted_index.network | 119 |
| abstract_inverted_index.propose | 28 |
| abstract_inverted_index.quality | 199 |
| abstract_inverted_index.rapidly | 101 |
| abstract_inverted_index.recover | 142 |
| abstract_inverted_index.support | 45, 160 |
| abstract_inverted_index.unified | 157 |
| abstract_inverted_index.version | 38, 169 |
| abstract_inverted_index.achieves | 184 |
| abstract_inverted_index.approach | 31 |
| abstract_inverted_index.artists' | 7 |
| abstract_inverted_index.existing | 10, 179 |
| abstract_inverted_index.improved | 6 |
| abstract_inverted_index.involves | 148 |
| abstract_inverted_index.produced | 204 |
| abstract_inverted_index.quality, | 191 |
| abstract_inverted_index.reducing | 193 |
| abstract_inverted_index.relaxing | 82 |
| abstract_inverted_index.seconds. | 68, 116 |
| abstract_inverted_index.standard | 41, 168 |
| abstract_inverted_index.version, | 42 |
| abstract_inverted_index.Hunyuan3D | 33, 182 |
| abstract_inverted_index.available | 134 |
| abstract_inverted_index.condition | 138 |
| abstract_inverted_index.different | 80 |
| abstract_inverted_index.diffusion | 11, 58, 130 |
| abstract_inverted_index.diversity | 201 |
| abstract_inverted_index.framework | 147, 158 |
| abstract_inverted_index.generated | 110 |
| abstract_inverted_index.generates | 62 |
| abstract_inverted_index.including | 35 |
| abstract_inverted_index.introduce | 95 |
| abstract_inverted_index.leverages | 132 |
| abstract_inverted_index.two-stage | 30 |
| abstract_inverted_index.faithfully | 103 |
| abstract_inverted_index.generation | 15, 19, 194 |
| abstract_inverted_index.generative | 2 |
| abstract_inverted_index.impressive | 186 |
| abstract_inverted_index.introduced | 126 |
| abstract_inverted_index.multi-view | 57, 63, 70, 88, 111, 129 |
| abstract_inverted_index.parameters | 173 |
| abstract_inverted_index.structure. | 145 |
| abstract_inverted_index.workflows, | 8 |
| abstract_inverted_index.efficiently | 61, 141 |
| abstract_inverted_index.generation. | 49, 166 |
| abstract_inverted_index.information | 135 |
| abstract_inverted_index.maintaining | 197 |
| abstract_inverted_index.single-view | 86 |
| abstract_inverted_index.viewpoints, | 81 |
| abstract_inverted_index.Hunyuan-DiT, | 153 |
| abstract_inverted_index.feed-forward | 97 |
| abstract_inverted_index.reconstructs | 104 |
| abstract_inverted_index.approximately | 66, 114 |
| abstract_inverted_index.significantly | 192 |
| abstract_inverted_index.text-to-image | 150 |
| abstract_inverted_index.in-consistency | 125 |
| abstract_inverted_index.reconstruction | 98, 118 |
| abstract_inverted_index.generalization. | 22 |
| abstract_inverted_index.reconstruction. | 89 |
| abstract_inverted_index.image-conditioned | 48, 164 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 18 |
| citation_normalized_percentile | |
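
The `abstract_inverted_index.*` rows in the payload above store the abstract as a mapping from each token to the zero-based word positions where it occurs. A minimal sketch of turning such a mapping back into running text follows. It assumes the index has already been loaded into a Python dict; the function name `rebuild_abstract` and the `sample` dict are illustrative only, and `sample` covers just the first nine positions, copied from the rows above.

```python
def rebuild_abstract(inverted_index):
    """Rebuild plain text from a token -> list-of-positions mapping."""
    # Invert the mapping: each position holds exactly one token.
    token_at = {}
    for token, positions in inverted_index.items():
        for position in positions:
            token_at[position] = token
    # Emit tokens in position order, separated by single spaces.
    return " ".join(token_at[position] for position in sorted(token_at))


# Hypothetical, truncated sample: only positions 0-8 of the abstract.
sample = {
    "While": [0],
    "3D": [1],
    "generative": [2],
    "models": [3],
    "have": [4],
    "greatly": [5],
    "improved": [6],
    "artists'": [7],
    "workflows,": [8],
}

print(rebuild_abstract(sample))
```

Because punctuation stays attached to the tokens, the rebuilt string reproduces the abstract exactly as it was indexed, including any spelling in the indexed version.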