MegaFold: System-Level Optimizations for Accelerating Protein Structure Prediction Models
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2506.20686
Protein structure prediction models such as AlphaFold3 (AF3) push the frontier of biomolecular modeling by incorporating science-informed architectural changes to the transformer architecture. However, these advances come at a steep system cost, introducing compute- and memory-intensive operators, 2D attention mechanisms, and retrieval-augmented data pipelines, which collectively hinder the scalability of AF3 training. In this work, we present MegaFold, a cross-platform system to accelerate AF3 training. MegaFold tackles key bottlenecks through ahead-of-time caching to eliminate GPU idle time from the retrieval-augmented data pipeline, Triton-based kernels for memory-efficient EvoAttention on heterogeneous devices, and deep fusion for common and critical small operators in AF3. Evaluation on both NVIDIA H200 and AMD MI250 GPUs shows that MegaFold reduces peak memory usage of AF3 training by up to 1.23$\times$ and improves per-iteration training time by up to 1.73$\times$ and 1.62$\times$, respectively. More importantly, MegaFold enables training on 1.35$\times$ longer sequence lengths compared to PyTorch baselines without running out of memory, significantly improving the scalability of modern protein folding models. We open-source our code at https://github.com/Supercomputing-System-AI-Lab/MegaFold/.
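The improvement factors quoted in the abstract can be read as fractions of the baseline. The snippet below is a minimal sketch of that arithmetic, assuming each factor means baseline divided by optimized (the abstract does not state the definition explicitly); the function name and the dictionary labels are illustrative and do not come from the MegaFold code.

```python
def fraction_of_baseline(improvement_factor: float) -> float:
    """Return optimized / baseline, assuming factor = baseline / optimized."""
    if improvement_factor <= 0:
        raise ValueError("improvement factor must be positive")
    return 1.0 / improvement_factor


# Upper-bound ("up to") factors quoted in the abstract; labels are illustrative.
reported = {
    "peak memory": 1.23,
    "per-iteration time (first figure)": 1.73,
    "per-iteration time (second figure)": 1.62,
}

for name, factor in reported.items():
    remaining = fraction_of_baseline(factor)
    # Print what is left of the baseline and the corresponding saving.
    print(f"{name}: {factor}x -> {remaining:.1%} of baseline, {1 - remaining:.1%} saved")
```

Under that assumption, a 1.23$\times$ reduction leaves roughly 81% of the baseline peak memory, so the saving is closer to 19% than to 23%.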
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2506.20686, https://arxiv.org/pdf/2506.20686
- OA Status: green
- OpenAlex ID: https://openalex.org/W4415164655
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4415164655 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2506.20686 (Digital Object Identifier)
- Title: MegaFold: System-Level Optimizations for Accelerating Protein Structure Prediction Models (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-06-24 (full publication date if available)
- Authors: Hoa La, Ahan Gupta, Alex Morehead, Jianlin Cheng, Minjia Zhang (list of authors in order)
- Landing page: https://arxiv.org/abs/2506.20686 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2506.20686 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2506.20686 (direct OA link when available)
- Cited by: 0 (total citation count in OpenAlex)
Full payload
| id | https://openalex.org/W4415164655 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2506.20686 |
| ids.doi | https://doi.org/10.48550/arxiv.2506.20686 |
| ids.openalex | https://openalex.org/W4415164655 |
| fwci | |
| type | preprint |
| title | MegaFold: System-Level Optimizations for Accelerating Protein Structure Prediction Models |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10044 |
| topics[0].field.id | https://openalex.org/fields/13 |
| topics[0].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[0].score | 0.9822999835014343 |
| topics[0].domain.id | https://openalex.org/domains/1 |
| topics[0].domain.display_name | Life Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1312 |
| topics[0].subfield.display_name | Molecular Biology |
| topics[0].display_name | Protein Structure and Dynamics |
| topics[1].id | https://openalex.org/T10932 |
| topics[1].field.id | https://openalex.org/fields/13 |
| topics[1].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[1].score | 0.9388999938964844 |
| topics[1].domain.id | https://openalex.org/domains/1 |
| topics[1].domain.display_name | Life Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1312 |
| topics[1].subfield.display_name | Molecular Biology |
| topics[1].display_name | Microbial Metabolic Engineering and Bioproduction |
| topics[2].id | https://openalex.org/T12254 |
| topics[2].field.id | https://openalex.org/fields/13 |
| topics[2].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[2].score | 0.9261000156402588 |
| topics[2].domain.id | https://openalex.org/domains/1 |
| topics[2].domain.display_name | Life Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1312 |
| topics[2].subfield.display_name | Molecular Biology |
| topics[2].display_name | Machine Learning in Bioinformatics |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2506.20686 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2506.20686 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2506.20686 |
| locations[1].id | doi:10.48550/arxiv.2506.20686 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2506.20686 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5119991517 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Hoa La |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | La, Hoa |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5087918614 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-0725-2166 |
| authorships[1].author.display_name | Aman Gupta |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Gupta, Ahan |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5072705391 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-0586-6191 |
| authorships[2].author.display_name | Alex Morehead |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Morehead, Alex |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5044354277 |
| authorships[3].author.orcid | https://orcid.org/0000-0003-0305-2853 |
| authorships[3].author.display_name | Jianlin Cheng |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Cheng, Jianlin |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5036851105 |
| authorships[4].author.orcid | https://orcid.org/0000-0001-8774-5256 |
| authorships[4].author.display_name | Michael H. Zhang |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Zhang, Minjia |
| authorships[4].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2506.20686 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-15T00:00:00 |
| display_name | MegaFold: System-Level Optimizations for Accelerating Protein Structure Prediction Models |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10044 |
| primary_topic.field.id | https://openalex.org/fields/13 |
| primary_topic.field.display_name | Biochemistry, Genetics and Molecular Biology |
| primary_topic.score | 0.9822999835014343 |
| primary_topic.domain.id | https://openalex.org/domains/1 |
| primary_topic.domain.display_name | Life Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1312 |
| primary_topic.subfield.display_name | Molecular Biology |
| primary_topic.display_name | Protein Structure and Dynamics |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2506.20686 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2506.20686 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2506.20686 |
| primary_location.id | pmh:oai:arXiv.org:2506.20686 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2506.20686 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2506.20686 |
| publication_date | 2025-06-24 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 28, 58 |
| abstract_inverted_index.2D | 37 |
| abstract_inverted_index.In | 52 |
| abstract_inverted_index.We | 161 |
| abstract_inverted_index.as | 5 |
| abstract_inverted_index.at | 27, 166 |
| abstract_inverted_index.by | 14, 120, 129 |
| abstract_inverted_index.in | 99 |
| abstract_inverted_index.of | 11, 49, 117, 156 |
| abstract_inverted_index.on | 87, 102, 140 |
| abstract_inverted_index.to | 19, 61, 72, 122, 146 |
| abstract_inverted_index.up | 121 |
| abstract_inverted_index.we | 55 |
| abstract_inverted_index.AF3 | 50, 63, 118 |
| abstract_inverted_index.AMD | 107 |
| abstract_inverted_index.GPU | 74 |
| abstract_inverted_index.and | 34, 40, 90, 95, 106, 124, 132 |
| abstract_inverted_index.for | 84, 93 |
| abstract_inverted_index.key | 67 |
| abstract_inverted_index.our | 164 |
| abstract_inverted_index.the | 9, 20, 47, 78, 154 |
| abstract_inverted_index.AF3. | 100 |
| abstract_inverted_index.GPUs | 109 |
| abstract_inverted_index.H200 | 105 |
| abstract_inverted_index.More | 135 |
| abstract_inverted_index.both | 103 |
| abstract_inverted_index.code | 165 |
| abstract_inverted_index.come | 26 |
| abstract_inverted_index.data | 42, 80 |
| abstract_inverted_index.deep | 91 |
| abstract_inverted_index.from | 77 |
| abstract_inverted_index.idle | 75 |
| abstract_inverted_index.open | 162 |
| abstract_inverted_index.peak | 114 |
| abstract_inverted_index.push | 8 |
| abstract_inverted_index.such | 4 |
| abstract_inverted_index.that | 111 |
| abstract_inverted_index.this | 53 |
| abstract_inverted_index.time | 76, 128 |
| abstract_inverted_index.(AF3) | 7 |
| abstract_inverted_index.MI250 | 108 |
| abstract_inverted_index.cost, | 31 |
| abstract_inverted_index.shows | 110 |
| abstract_inverted_index.small | 97 |
| abstract_inverted_index.steep | 29 |
| abstract_inverted_index.these | 24 |
| abstract_inverted_index.up-to | 130 |
| abstract_inverted_index.usage | 116 |
| abstract_inverted_index.which | 44 |
| abstract_inverted_index.work, | 54 |
| abstract_inverted_index.NVIDIA | 104 |
| abstract_inverted_index.common | 94 |
| abstract_inverted_index.fusion | 92 |
| abstract_inverted_index.hinder | 46 |
| abstract_inverted_index.longer | 142 |
| abstract_inverted_index.memory | 115 |
| abstract_inverted_index.models | 3 |
| abstract_inverted_index.modern | 157 |
| abstract_inverted_index.source | 163 |
| abstract_inverted_index.system | 30, 60 |
| abstract_inverted_index.Protein | 0 |
| abstract_inverted_index.PyTorch | 147 |
| abstract_inverted_index.caching | 71 |
| abstract_inverted_index.changes | 18 |
| abstract_inverted_index.enables | 138 |
| abstract_inverted_index.folding | 159 |
| abstract_inverted_index.kernels | 83 |
| abstract_inverted_index.lengths | 144 |
| abstract_inverted_index.models. | 160 |
| abstract_inverted_index.present | 56 |
| abstract_inverted_index.protein | 158 |
| abstract_inverted_index.reduces | 113 |
| abstract_inverted_index.running | 150 |
| abstract_inverted_index.tackles | 66 |
| abstract_inverted_index.through | 69 |
| abstract_inverted_index.without | 149 |
| abstract_inverted_index.However, | 23 |
| abstract_inverted_index.MegaFold | 65, 112, 137 |
| abstract_inverted_index.advances | 25 |
| abstract_inverted_index.compared | 145 |
| abstract_inverted_index.compute- | 33 |
| abstract_inverted_index.critical | 96 |
| abstract_inverted_index.devices, | 89 |
| abstract_inverted_index.frontier | 10 |
| abstract_inverted_index.improves | 125 |
| abstract_inverted_index.modeling | 13 |
| abstract_inverted_index.sequence | 143 |
| abstract_inverted_index.training | 119, 127, 139 |
| abstract_inverted_index.MegaFold, | 57 |
| abstract_inverted_index.attention | 38 |
| abstract_inverted_index.baselines | 148 |
| abstract_inverted_index.eliminate | 73 |
| abstract_inverted_index.improving | 153 |
| abstract_inverted_index.operators | 98 |
| abstract_inverted_index.pipeline, | 81 |
| abstract_inverted_index.structure | 1 |
| abstract_inverted_index.training. | 51, 64 |
| abstract_inverted_index.AlphaFold3 | 6 |
| abstract_inverted_index.Evaluation | 101 |
| abstract_inverted_index.accelerate | 62 |
| abstract_inverted_index.operators, | 36 |
| abstract_inverted_index.pipelines, | 43 |
| abstract_inverted_index.prediction | 2 |
| abstract_inverted_index.bottlenecks | 68 |
| abstract_inverted_index.mechanisms, | 39 |
| abstract_inverted_index.scalability | 48, 155 |
| abstract_inverted_index.transformer | 21 |
| abstract_inverted_index.1.23$\times$ | 123 |
| abstract_inverted_index.1.35$\times$ | 141 |
| abstract_inverted_index.1.62$\times$ | 133 |
| abstract_inverted_index.1.73$\times$ | 131 |
| abstract_inverted_index.EvoAttention | 86 |
| abstract_inverted_index.Triton-based | 82 |
| abstract_inverted_index.biomolecular | 12 |
| abstract_inverted_index.collectively | 45 |
| abstract_inverted_index.importantly, | 136 |
| abstract_inverted_index.introducing: | 32 |
| abstract_inverted_index.ahead-of-time | 70 |
| abstract_inverted_index.architectural | 17 |
| abstract_inverted_index.architecture. | 22 |
| abstract_inverted_index.heterogeneous | 88 |
| abstract_inverted_index.incorporating | 15 |
| abstract_inverted_index.per-iteration | 126 |
| abstract_inverted_index.respectively. | 134 |
| abstract_inverted_index.significantly | 152 |
| abstract_inverted_index.cross-platform | 59 |
| abstract_inverted_index.out-of-memory, | 151 |
| abstract_inverted_index.memory-efficient | 85 |
| abstract_inverted_index.memory-intensive | 35 |
| abstract_inverted_index.science-informed | 16 |
| abstract_inverted_index.retrieval-augmented | 41, 79 |
| abstract_inverted_index.https://github.com/Supercomputing-System-AI-Lab/MegaFold/. | 167 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile | |
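
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each token to the word positions where it occurs. The sketch below shows one way to turn such a map back into running text. The function name `rebuild_abstract` is hypothetical, and the sample dictionary is only an excerpt of the rows above, limited to positions 0 through 10 (the full entry for "the" also lists 20, 47, 78, and 154).

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild text from a token -> positions map by sorting on position."""
    by_position: dict[int, str] = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Join tokens in ascending position order with single spaces.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Excerpt of the payload rows above, truncated to positions 0-10.
sample = {
    "Protein": [0],
    "structure": [1],
    "prediction": [2],
    "models": [3],
    "such": [4],
    "as": [5],
    "AlphaFold3": [6],
    "(AF3)": [7],
    "push": [8],
    "the": [9],
    "frontier": [10],
}

print(rebuild_abstract(sample))
```

Because tokens keep their punctuation (for example `cost,` and `AF3.` in the rows above), joining on spaces is enough to recover readable text; gaps in the position sequence are simply skipped in this sketch.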