Profiling Apple Silicon Performance for ML Training
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2501.14925
Apple Silicon has attracted much attention for its performance and role in machine learning (ML) training. Unlike NVIDIA GPUs, which have traditionally dominated ML training, Apple Silicon has a significant difference in memory architecture. It uses Unified Memory, which integrates CPU and GPU memory instead of separate CPU memory and GPU VRAM. However, it is difficult to tell whether Unified Memory means more performance benefits. This paper investigates the performance differences by training several large language model (LLM) workloads end-to-end under different memory scenarios. The results show a significant performance gap between Apple Silicon and NVIDIA GPUs. This paper attributes this gap to system-level factors such as page faults, power consumption, and kernel launch time. In addition, the performance difference of basic linear algebra subprograms (BLAS) on the NVIDIA GPUs and Apple Silicon chips is analyzed to further explain the observed gap.
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2501.14925, https://arxiv.org/pdf/2501.14925
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4406879736
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4406879736 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2501.14925 (Digital Object Identifier)
- Title: Profiling Apple Silicon Performance for ML Training (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-01-24 (full publication date if available)
- Authors: Dahua Feng, Zhiming Xu, Rongxiang Wang, Felix Xiaozhu Lin (list of authors in order, as given in the raw author names of the record)
- Landing page: https://arxiv.org/abs/2501.14925 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2501.14925 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2501.14925 (direct OA link when available)
- Concepts: Profiling (computer programming), Silicon, Computer science, Materials science, Optoelectronics, Operating system (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4406879736 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2501.14925 |
| ids.doi | https://doi.org/10.48550/arxiv.2501.14925 |
| ids.openalex | https://openalex.org/W4406879736 |
| fwci | |
| type | preprint |
| title | Profiling Apple Silicon Performance for ML Training |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10494 |
| topics[0].field.id | https://openalex.org/fields/11 |
| topics[0].field.display_name | Agricultural and Biological Sciences |
| topics[0].score | 0.3549000024795532 |
| topics[0].domain.id | https://openalex.org/domains/1 |
| topics[0].domain.display_name | Life Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1110 |
| topics[0].subfield.display_name | Plant Science |
| topics[0].display_name | Plant Virus Research Studies |
| topics[1].id | https://openalex.org/T12795 |
| topics[1].field.id | https://openalex.org/fields/11 |
| topics[1].field.display_name | Agricultural and Biological Sciences |
| topics[1].score | 0.3546000123023987 |
| topics[1].domain.id | https://openalex.org/domains/1 |
| topics[1].domain.display_name | Life Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1110 |
| topics[1].subfield.display_name | Plant Science |
| topics[1].display_name | Banana Cultivation and Research |
| topics[2].id | https://openalex.org/T11750 |
| topics[2].field.id | https://openalex.org/fields/11 |
| topics[2].field.display_name | Agricultural and Biological Sciences |
| topics[2].score | 0.302700012922287 |
| topics[2].domain.id | https://openalex.org/domains/1 |
| topics[2].domain.display_name | Life Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1110 |
| topics[2].subfield.display_name | Plant Science |
| topics[2].display_name | Phytoplasmas and Hemiptera pathogens |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C187191949 |
| concepts[0].level | 2 |
| concepts[0].score | 0.8463044166564941 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q1138496 |
| concepts[0].display_name | Profiling (computer programming) |
| concepts[1].id | https://openalex.org/C544956773 |
| concepts[1].level | 2 |
| concepts[1].score | 0.4997248649597168 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q670 |
| concepts[1].display_name | Silicon |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.43206527829170227 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C192562407 |
| concepts[3].level | 0 |
| concepts[3].score | 0.26339656114578247 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q228736 |
| concepts[3].display_name | Materials science |
| concepts[4].id | https://openalex.org/C49040817 |
| concepts[4].level | 1 |
| concepts[4].score | 0.17729753255844116 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q193091 |
| concepts[4].display_name | Optoelectronics |
| concepts[5].id | https://openalex.org/C111919701 |
| concepts[5].level | 1 |
| concepts[5].score | 0.12484273314476013 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q9135 |
| concepts[5].display_name | Operating system |
| keywords[0].id | https://openalex.org/keywords/profiling |
| keywords[0].score | 0.8463044166564941 |
| keywords[0].display_name | Profiling (computer programming) |
| keywords[1].id | https://openalex.org/keywords/silicon |
| keywords[1].score | 0.4997248649597168 |
| keywords[1].display_name | Silicon |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.43206527829170227 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/materials-science |
| keywords[3].score | 0.26339656114578247 |
| keywords[3].display_name | Materials science |
| keywords[4].id | https://openalex.org/keywords/optoelectronics |
| keywords[4].score | 0.17729753255844116 |
| keywords[4].display_name | Optoelectronics |
| keywords[5].id | https://openalex.org/keywords/operating-system |
| keywords[5].score | 0.12484273314476013 |
| keywords[5].display_name | Operating system |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2501.14925 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2501.14925 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2501.14925 |
| locations[1].id | doi:10.48550/arxiv.2501.14925 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2501.14925 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5009991479 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-6310-0066 |
| authorships[0].author.display_name | Dawei Feng |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Feng, Dahua |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5026967592 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-1654-7361 |
| authorships[1].author.display_name | Zhiming Xu |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Xu, Zhiming |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5031830492 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-0385-3784 |
| authorships[2].author.display_name | W. Walkowiak |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Wang, Rongxiang |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5025585492 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-1615-6419 |
| authorships[3].author.display_name | Felix Xiaozhu Lin |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Lin, Felix Xiaozhu |
| authorships[3].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2501.14925 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Profiling Apple Silicon Performance for ML Training |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10494 |
| primary_topic.field.id | https://openalex.org/fields/11 |
| primary_topic.field.display_name | Agricultural and Biological Sciences |
| primary_topic.score | 0.3549000024795532 |
| primary_topic.domain.id | https://openalex.org/domains/1 |
| primary_topic.domain.display_name | Life Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1110 |
| primary_topic.subfield.display_name | Plant Science |
| primary_topic.display_name | Plant Virus Research Studies |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W4391913857, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W4396696052 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2501.14925 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2501.14925 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2501.14925 |
| primary_location.id | pmh:oai:arXiv.org:2501.14925 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2501.14925 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2501.14925 |
| publication_date | 2025-01-24 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 28, 87 |
| abstract_inverted_index.In | 115 |
| abstract_inverted_index.It | 34 |
| abstract_inverted_index.ML | 23 |
| abstract_inverted_index.as | 106 |
| abstract_inverted_index.by | 71 |
| abstract_inverted_index.in | 11, 31 |
| abstract_inverted_index.is | 54, 134 |
| abstract_inverted_index.it | 53 |
| abstract_inverted_index.of | 45, 120 |
| abstract_inverted_index.on | 126 |
| abstract_inverted_index.to | 56, 102, 136 |
| abstract_inverted_index.CPU | 40, 47 |
| abstract_inverted_index.GPU | 42, 50 |
| abstract_inverted_index.The | 84 |
| abstract_inverted_index.and | 9, 41, 49, 94, 111, 130 |
| abstract_inverted_index.for | 6 |
| abstract_inverted_index.gap | 90, 101 |
| abstract_inverted_index.has | 2, 27 |
| abstract_inverted_index.its | 7 |
| abstract_inverted_index.the | 68, 117, 127, 139 |
| abstract_inverted_index.(ML) | 14 |
| abstract_inverted_index.GPUs | 129 |
| abstract_inverted_index.This | 65, 97 |
| abstract_inverted_index.gap. | 141 |
| abstract_inverted_index.have | 20 |
| abstract_inverted_index.more | 62 |
| abstract_inverted_index.much | 4 |
| abstract_inverted_index.page | 107 |
| abstract_inverted_index.role | 10 |
| abstract_inverted_index.show | 86 |
| abstract_inverted_index.such | 105 |
| abstract_inverted_index.tell | 57 |
| abstract_inverted_index.this | 100 |
| abstract_inverted_index.uses | 35 |
| abstract_inverted_index.(LLM) | 77 |
| abstract_inverted_index.Apple | 0, 25, 92, 131 |
| abstract_inverted_index.GPUs, | 18 |
| abstract_inverted_index.GPUs. | 96 |
| abstract_inverted_index.VRAM. | 51 |
| abstract_inverted_index.basic | 121 |
| abstract_inverted_index.chips | 133 |
| abstract_inverted_index.large | 74 |
| abstract_inverted_index.means | 61 |
| abstract_inverted_index.model | 76 |
| abstract_inverted_index.paper | 66, 98 |
| abstract_inverted_index.power | 109 |
| abstract_inverted_index.time. | 114 |
| abstract_inverted_index.under | 80 |
| abstract_inverted_index.which | 19, 38 |
| abstract_inverted_index.(BLAS) | 125 |
| abstract_inverted_index.Memory | 60 |
| abstract_inverted_index.NVIDIA | 17, 95, 128 |
| abstract_inverted_index.Unlike | 16 |
| abstract_inverted_index.kernel | 112 |
| abstract_inverted_index.launch | 113 |
| abstract_inverted_index.linear | 122 |
| abstract_inverted_index.memory | 32, 43, 48, 82 |
| abstract_inverted_index.Memory, | 37 |
| abstract_inverted_index.Silicon | 1, 26, 93, 132 |
| abstract_inverted_index.Unified | 36, 59 |
| abstract_inverted_index.algebra | 123 |
| abstract_inverted_index.between | 91 |
| abstract_inverted_index.explain | 138 |
| abstract_inverted_index.factors | 104 |
| abstract_inverted_index.faults, | 108 |
| abstract_inverted_index.further | 137 |
| abstract_inverted_index.instead | 44 |
| abstract_inverted_index.machine | 12 |
| abstract_inverted_index.results | 85 |
| abstract_inverted_index.several | 73 |
| abstract_inverted_index.whether | 58 |
| abstract_inverted_index.However, | 52 |
| abstract_inverted_index.analyzed | 135 |
| abstract_inverted_index.language | 75 |
| abstract_inverted_index.learning | 13 |
| abstract_inverted_index.observed | 140 |
| abstract_inverted_index.separate | 46 |
| abstract_inverted_index.training | 72 |
| abstract_inverted_index.addition, | 116 |
| abstract_inverted_index.attention | 5 |
| abstract_inverted_index.attracted | 3 |
| abstract_inverted_index.benefits. | 64 |
| abstract_inverted_index.different | 81 |
| abstract_inverted_index.difficult | 55 |
| abstract_inverted_index.dominated | 22 |
| abstract_inverted_index.training, | 24 |
| abstract_inverted_index.training. | 15 |
| abstract_inverted_index.workloads | 78 |
| abstract_inverted_index.attributes | 99 |
| abstract_inverted_index.difference | 30, 119 |
| abstract_inverted_index.end-to-end | 79 |
| abstract_inverted_index.integrates | 39 |
| abstract_inverted_index.scenarios. | 83 |
| abstract_inverted_index.differences | 70 |
| abstract_inverted_index.performance | 8, 63, 69, 89, 118 |
| abstract_inverted_index.significant | 29, 88 |
| abstract_inverted_index.subprograms | 124 |
| abstract_inverted_index.consumption, | 110 |
| abstract_inverted_index.investigates | 67 |
| abstract_inverted_index.system-level | 103 |
| abstract_inverted_index.architecture. | 33 |
| abstract_inverted_index.traditionally | 21 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| citation_normalized_percentile |
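
The `abstract_inverted_index.*` rows in the payload store the abstract as a mapping from each word to the zero-based positions where it occurs. As a minimal sketch of how that structure can be turned back into running text, the following Python snippet inverts the mapping and joins the words in position order. The function name `rebuild_abstract` and the `sample` dictionary are illustrative assumptions, not part of OpenAlex. The sample uses only the entries for positions 0 through 8 from the rows above, with later positions of repeated words omitted.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild text from an inverted index that maps word -> list of positions.

    Hypothetical helper for illustration; it assumes positions are unique
    across words, as they are in the payload rows above.
    """
    # Invert the mapping: position -> word.
    word_at: Dict[int, str] = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            word_at[pos] = word
    # Emit the words in ascending position order.
    return " ".join(word_at[pos] for pos in sorted(word_at))


# Positions 0-8 only, copied from the abstract_inverted_index rows above.
sample = {
    "Apple": [0],
    "Silicon": [1],
    "has": [2],
    "attracted": [3],
    "much": [4],
    "attention": [5],
    "for": [6],
    "its": [7],
    "performance": [8],
}

print(rebuild_abstract(sample))
# prints: Apple Silicon has attracted much attention for its performance
```

Passing the complete set of rows (with every position list intact) would reproduce the full abstract. Gaps in the position sequence are simply skipped by this sketch rather than reported.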