DeepFlow: A Cross-Stack Pathfinding Framework for Distributed AI Systems
2022 · Open Access · DOI: https://doi.org/10.48550/arxiv.2211.03309
Over the past decade, machine learning model complexity has grown at an extraordinary rate, as has the scale of the systems that train such large models. However, hardware utilization in large-scale AI systems is alarmingly low (5-20%). This low utilization is the cumulative effect of minor losses across different layers of the stack, exacerbated by the disconnect between the engineers who design the different layers, which span different industries. We propose CrossFlow, a novel framework that enables cross-layer analysis all the way from the technology layer to the algorithmic layer. We also propose DeepFlow, built on top of CrossFlow using machine learning techniques, to automate design space exploration and co-optimization across the layers of the stack. We have validated CrossFlow's accuracy against distributed training on real commercial hardware, and we showcase several DeepFlow case studies that demonstrate the pitfalls of not optimizing across the technology-hardware-software stack for what is likely the most important workload driving large development investments in all aspects of the computing stack.
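To make the "cumulative effect of minor losses" concrete, the sketch below multiplies per-layer efficiencies to get an end-to-end figure. The multiplicative model, the layer names, and every number in it are illustrative assumptions and are not taken from the paper; the abstract only states that the losses accumulate across layers.

```python
from math import prod


def cumulative_utilization(layer_efficiencies):
    """Multiply per-layer efficiencies (each in 0..1) into one end-to-end value.

    Assumes losses compound multiplicatively, which is a simplification.
    """
    return prod(layer_efficiencies)


# Hypothetical per-layer efficiencies, chosen only for illustration.
layers = {
    "technology": 0.80,
    "microarchitecture": 0.75,
    "memory": 0.70,
    "network": 0.70,
    "parallelization": 0.65,
    "algorithm": 0.75,
}

util = cumulative_utilization(layers.values())
print(f"End-to-end utilization: {util:.1%}")
```

Under these assumed values no single layer loses more than 35%, yet the product falls inside the 5-20% range quoted in the abstract, which is the kind of compounding that a cross-layer analysis is meant to expose.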
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2211.03309 (PDF: https://arxiv.org/pdf/2211.03309)
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4308614636
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4308614636 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2211.03309 (Digital Object Identifier)
- Title: DeepFlow: A Cross-Stack Pathfinding Framework for Distributed AI Systems (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2022 (year of publication)
- Publication date: 2022-11-07 (full publication date if available)
- Authors: Newsha Ardalani, Saptadeep Pal, Puneet Gupta (list of authors in order)
- Landing page: https://arxiv.org/abs/2211.03309 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2211.03309 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2211.03309 (direct OA link when available)
- Concepts: Stack (abstract data type), Computer science, Workload, Layer (electronics), Distributed computing, Software, Operating system, Chemistry, Organic chemistry (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4308614636 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2211.03309 |
| ids.doi | https://doi.org/10.48550/arxiv.2211.03309 |
| ids.openalex | https://openalex.org/W4308614636 |
| fwci | |
| type | preprint |
| title | DeepFlow: A Cross-Stack Pathfinding Framework for Distributed AI Systems |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T12808 |
| topics[0].field.id | https://openalex.org/fields/22 |
| topics[0].field.display_name | Engineering |
| topics[0].score | 0.9965999722480774 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/2208 |
| topics[0].subfield.display_name | Electrical and Electronic Engineering |
| topics[0].display_name | Ferroelectric and Negative Capacitance Devices |
| topics[1].id | https://openalex.org/T10502 |
| topics[1].field.id | https://openalex.org/fields/22 |
| topics[1].field.display_name | Engineering |
| topics[1].score | 0.9945999979972839 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/2208 |
| topics[1].subfield.display_name | Electrical and Electronic Engineering |
| topics[1].display_name | Advanced Memory and Neural Computing |
| topics[2].id | https://openalex.org/T10273 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9937000274658203 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1705 |
| topics[2].subfield.display_name | Computer Networks and Communications |
| topics[2].display_name | IoT and Edge/Fog Computing |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C9395851 |
| concepts[0].level | 2 |
| concepts[0].score | 0.8181943893432617 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q177929 |
| concepts[0].display_name | Stack (abstract data type) |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.6682251691818237 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C2778476105 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6242268681526184 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q628539 |
| concepts[2].display_name | Workload |
| concepts[3].id | https://openalex.org/C2779227376 |
| concepts[3].level | 2 |
| concepts[3].score | 0.5961121916770935 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q6505497 |
| concepts[3].display_name | Layer (electronics) |
| concepts[4].id | https://openalex.org/C120314980 |
| concepts[4].level | 1 |
| concepts[4].score | 0.50809645652771 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q180634 |
| concepts[4].display_name | Distributed computing |
| concepts[5].id | https://openalex.org/C2777904410 |
| concepts[5].level | 2 |
| concepts[5].score | 0.4877501130104065 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q7397 |
| concepts[5].display_name | Software |
| concepts[6].id | https://openalex.org/C111919701 |
| concepts[6].level | 1 |
| concepts[6].score | 0.20411700010299683 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q9135 |
| concepts[6].display_name | Operating system |
| concepts[7].id | https://openalex.org/C185592680 |
| concepts[7].level | 0 |
| concepts[7].score | 0.0 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q2329 |
| concepts[7].display_name | Chemistry |
| concepts[8].id | https://openalex.org/C178790620 |
| concepts[8].level | 1 |
| concepts[8].score | 0.0 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q11351 |
| concepts[8].display_name | Organic chemistry |
| keywords[0].id | https://openalex.org/keywords/stack |
| keywords[0].score | 0.8181943893432617 |
| keywords[0].display_name | Stack (abstract data type) |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.6682251691818237 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/workload |
| keywords[2].score | 0.6242268681526184 |
| keywords[2].display_name | Workload |
| keywords[3].id | https://openalex.org/keywords/layer |
| keywords[3].score | 0.5961121916770935 |
| keywords[3].display_name | Layer (electronics) |
| keywords[4].id | https://openalex.org/keywords/distributed-computing |
| keywords[4].score | 0.50809645652771 |
| keywords[4].display_name | Distributed computing |
| keywords[5].id | https://openalex.org/keywords/software |
| keywords[5].score | 0.4877501130104065 |
| keywords[5].display_name | Software |
| keywords[6].id | https://openalex.org/keywords/operating-system |
| keywords[6].score | 0.20411700010299683 |
| keywords[6].display_name | Operating system |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2211.03309 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2211.03309 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2211.03309 |
| locations[1].id | doi:10.48550/arxiv.2211.03309 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2211.03309 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5031627270 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-9975-4819 |
| authorships[0].author.display_name | Newsha Ardalani |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Ardalani, Newsha |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5036103216 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-8777-8573 |
| authorships[1].author.display_name | Saptadeep Pal |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Pal, Saptadeep |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5084229134 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-6188-1134 |
| authorships[2].author.display_name | Puneet Gupta |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Gupta, Puneet |
| authorships[2].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2211.03309 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | DeepFlow: A Cross-Stack Pathfinding Framework for Distributed AI Systems |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T12808 |
| primary_topic.field.id | https://openalex.org/fields/22 |
| primary_topic.field.display_name | Engineering |
| primary_topic.score | 0.9965999722480774 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/2208 |
| primary_topic.subfield.display_name | Electrical and Electronic Engineering |
| primary_topic.display_name | Ferroelectric and Negative Capacitance Devices |
| related_works | https://openalex.org/W2000785801, https://openalex.org/W986318368, https://openalex.org/W2384410913, https://openalex.org/W2352878646, https://openalex.org/W2004734601, https://openalex.org/W2130149817, https://openalex.org/W2990194547, https://openalex.org/W1480123525, https://openalex.org/W2620865396, https://openalex.org/W2382601015 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2211.03309 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2211.03309 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2211.03309 |
| primary_location.id | pmh:oai:arXiv.org:2211.03309 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2211.03309 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2211.03309 |
| publication_date | 2022-11-07 |
| publication_year | 2022 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 44, 72 |
| abstract_inverted_index.AI | 37 |
| abstract_inverted_index.We | 69, 90, 117 |
| abstract_inverted_index.an | 11, 28 |
| abstract_inverted_index.as | 14 |
| abstract_inverted_index.at | 10 |
| abstract_inverted_index.by | 57 |
| abstract_inverted_index.in | 34, 156 |
| abstract_inverted_index.is | 27, 43, 146 |
| abstract_inverted_index.of | 18, 47, 53, 97, 114, 137, 159 |
| abstract_inverted_index.on | 95, 125 |
| abstract_inverted_index.to | 86, 103 |
| abstract_inverted_index.The | 39 |
| abstract_inverted_index.all | 79, 157 |
| abstract_inverted_index.and | 109, 129 |
| abstract_inverted_index.for | 144 |
| abstract_inverted_index.has | 8, 15 |
| abstract_inverted_index.low | 30, 40 |
| abstract_inverted_index.not | 138 |
| abstract_inverted_index.the | 1, 16, 19, 54, 58, 80, 83, 87, 105, 115, 141, 148 |
| abstract_inverted_index.top | 96 |
| abstract_inverted_index.way | 81 |
| abstract_inverted_index.Over | 0 |
| abstract_inverted_index.also | 91 |
| abstract_inverted_index.case | 133 |
| abstract_inverted_index.from | 82 |
| abstract_inverted_index.have | 118 |
| abstract_inverted_index.most | 149 |
| abstract_inverted_index.past | 2 |
| abstract_inverted_index.real | 126 |
| abstract_inverted_index.such | 22 |
| abstract_inverted_index.that | 75 |
| abstract_inverted_index.what | 145 |
| abstract_inverted_index.with | 122 |
| abstract_inverted_index.grown | 9 |
| abstract_inverted_index.large | 23, 35, 153 |
| abstract_inverted_index.layer | 85 |
| abstract_inverted_index.minor | 48 |
| abstract_inverted_index.model | 6 |
| abstract_inverted_index.novel | 73 |
| abstract_inverted_index.rate, | 13 |
| abstract_inverted_index.scale | 17, 36 |
| abstract_inverted_index.space | 107 |
| abstract_inverted_index.stack | 143 |
| abstract_inverted_index.there | 26 |
| abstract_inverted_index.using | 99 |
| abstract_inverted_index.(built | 94 |
| abstract_inverted_index.across | 50, 66, 111, 140 |
| abstract_inverted_index.design | 106 |
| abstract_inverted_index.effect | 46 |
| abstract_inverted_index.layer. | 89 |
| abstract_inverted_index.layers | 52, 64, 113 |
| abstract_inverted_index.losses | 49 |
| abstract_inverted_index.stack, | 55 |
| abstract_inverted_index.stack. | 116, 161 |
| abstract_inverted_index.system | 41 |
| abstract_inverted_index.(5-20%) | 33 |
| abstract_inverted_index.However | 25 |
| abstract_inverted_index.aspects | 158 |
| abstract_inverted_index.between | 60 |
| abstract_inverted_index.decade, | 3 |
| abstract_inverted_index.driving | 152 |
| abstract_inverted_index.enables | 76 |
| abstract_inverted_index.likely, | 147 |
| abstract_inverted_index.machine | 4, 100 |
| abstract_inverted_index.models. | 24 |
| abstract_inverted_index.propose | 70, 92 |
| abstract_inverted_index.several | 131 |
| abstract_inverted_index.studies | 134 |
| abstract_inverted_index.systems | 20 |
| abstract_inverted_index.DeepFlow | 93, 132 |
| abstract_inverted_index.accuracy | 121 |
| abstract_inverted_index.analysis | 78 |
| abstract_inverted_index.automate | 104 |
| abstract_inverted_index.hardware | 31, 128 |
| abstract_inverted_index.learning | 5, 101 |
| abstract_inverted_index.pitfalls | 136 |
| abstract_inverted_index.showcase | 130 |
| abstract_inverted_index.spanning | 65 |
| abstract_inverted_index.systems. | 38 |
| abstract_inverted_index.training | 21, 124 |
| abstract_inverted_index.workload | 151 |
| abstract_inverted_index.CrossFlow | 98, 120 |
| abstract_inverted_index.computing | 160 |
| abstract_inverted_index.designing | 62 |
| abstract_inverted_index.different | 51, 63, 67, 112 |
| abstract_inverted_index.engineers | 61 |
| abstract_inverted_index.framework | 74 |
| abstract_inverted_index.important | 150 |
| abstract_inverted_index.validated | 119 |
| abstract_inverted_index.CrossFlow, | 71 |
| abstract_inverted_index.alarmingly | 29 |
| abstract_inverted_index.commercial | 127 |
| abstract_inverted_index.complexity | 7 |
| abstract_inverted_index.cumulative | 45 |
| abstract_inverted_index.disconnect | 59 |
| abstract_inverted_index.optimizing | 139 |
| abstract_inverted_index.technology | 84 |
| abstract_inverted_index.algorithmic | 88 |
| abstract_inverted_index.cross-layer | 77 |
| abstract_inverted_index.development | 154 |
| abstract_inverted_index.distributed | 123 |
| abstract_inverted_index.exacerbated | 56 |
| abstract_inverted_index.exploration | 108 |
| abstract_inverted_index.industries. | 68 |
| abstract_inverted_index.investments | 155 |
| abstract_inverted_index.techniques) | 102 |
| abstract_inverted_index.utilization | 32, 42 |
| abstract_inverted_index.demonstrating | 135 |
| abstract_inverted_index.extraordinary | 12 |
| abstract_inverted_index.co-optimization | 110 |
| abstract_inverted_index.technology-hardware-software | 142 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 3 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/9 |
| sustainable_development_goals[0].score | 0.6399999856948853 |
| sustainable_development_goals[0].display_name | Industry, innovation and infrastructure |
| citation_normalized_percentile |
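
The `abstract_inverted_index.*` rows above store the abstract as a map from each word to the positions where it occurs. A minimal sketch of turning such a map back into running text follows. It assumes the index is available as a Python dict of word to list of integer positions; the function name `rebuild_abstract` is hypothetical, and the sample uses only the first ten positions copied from the rows above.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {word: [positions]} inverted index."""
    # Flatten into (position, word) pairs, then order by position.
    positioned = [
        (pos, word)
        for word, positions in inverted_index.items()
        for pos in positions
    ]
    return " ".join(word for _, word in sorted(positioned))


# Positions 0-9 only, copied from the payload rows above.
sample = {
    "Over": [0],
    "the": [1],
    "past": [2],
    "decade,": [3],
    "machine": [4],
    "learning": [5],
    "model": [6],
    "complexity": [7],
    "has": [8],
    "grown": [9],
}

text = rebuild_abstract(sample)
print(text)
```

Punctuation stays attached to the words (for example `decade,`), so joining on single spaces is enough to recover the original wording.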