Fast Neural Tangent Kernel Alignment, Norm and Effective Rank via Trace Estimation
The Neural Tangent Kernel (NTK) characterizes how a model's state evolves under gradient descent. Computing the full NTK matrix is often infeasible, especially for recurrent architectures. Here, we introduce a matrix-free perspective, using trace estimation to rapidly analyze the empirical, finite-width NTK. This enables fast computation of the NTK's trace, Frobenius norm, effective rank, and alignment. We provide numerical recipes based on the Hutch++ trace estimator, which has provably fast convergence guarantees. In addition, we show that, due to the structure of the NTK, one can compute the trace using only forward-mode or only reverse-mode automatic differentiation, without requiring both modes. We show that these so-called one-sided estimators can outperform Hutch++ in the low-sample regime, especially when the gap between the size of the model state and the parameter count is large. Overall, our results demonstrate that matrix-free randomized approaches can yield speedups of many orders of magnitude, leading to faster analysis and applications of the NTK.
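The one-sided idea in the abstract can be illustrated with a small NumPy sketch. It is not the author's implementation and it does not implement Hutch++; it uses plain Hutchinson-style Rademacher probes. It assumes the empirical NTK has the form K = J Jᵀ for a Jacobian J of the model state with respect to the parameters, and it stands in for automatic differentiation with an explicit toy matrix. The names `J`, `jvp`, `vjp`, `trace_reverse_only`, and `trace_forward_only` are hypothetical.

```python
import numpy as np

def rademacher(rng, n):
    """Random probe vector with independent +/-1 entries."""
    return rng.choice([-1.0, 1.0], size=n)

def trace_reverse_only(vjp, n_state, n_samples, rng):
    """Estimate tr(J J^T) as the mean of ||J^T v||^2 over state-space probes v.

    Only vector-Jacobian products (reverse mode) are needed.
    """
    total = 0.0
    for _ in range(n_samples):
        w = vjp(rademacher(rng, n_state))
        total += float(w @ w)
    return total / n_samples

def trace_forward_only(jvp, n_params, n_samples, rng):
    """Estimate tr(J J^T) = tr(J^T J) as the mean of ||J u||^2 over parameter-space probes u.

    Only Jacobian-vector products (forward mode) are needed.
    """
    total = 0.0
    for _ in range(n_samples):
        w = jvp(rademacher(rng, n_params))
        total += float(w @ w)
    return total / n_samples

rng = np.random.default_rng(0)
n_state, n_params = 5, 50                      # toy sizes (assumption)
J = rng.standard_normal((n_state, n_params))   # explicit toy Jacobian (assumption)

# Matrix-free products; in practice these would come from an autodiff library.
jvp = lambda u: J @ u      # forward mode: parameter direction -> state direction
vjp = lambda v: J.T @ v    # reverse mode: state direction -> parameter direction

exact = float(np.trace(J @ J.T))
est_reverse = trace_reverse_only(vjp, n_state, 2000, rng)
est_forward = trace_forward_only(jvp, n_params, 2000, rng)
print(f"exact={exact:.2f} reverse-only={est_reverse:.2f} forward-only={est_forward:.2f}")
```

Both estimators are unbiased because Rademacher probes have identity covariance. The two probe spaces have different dimensions, so their variances need not match, which is one reason the choice of side can matter when the state size and the parameter count differ greatly.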
Related Topics
- Type: preprint
- Landing Page: https://doi.org/10.48550/arxiv.2511.10796
- OA Status: green
- OpenAlex ID: https://openalex.org/W7105897366
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W7105897366 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2511.10796 (Digital Object Identifier)
- Title: Fast Neural Tangent Kernel Alignment, Norm and Effective Rank via Trace Estimation (work title)
- Type: preprint (OpenAlex work type)
- Publication year: 2025 (year of publication)
- Publication date: 2025-11-13 (full publication date if available)
- Authors: Hazelden, James (list of authors in order)
- Landing page: https://doi.org/10.48550/arxiv.2511.10796 (publisher landing page)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://doi.org/10.48550/arxiv.2511.10796 (direct OA link when available)
- Concepts: Estimator, TRACE (psycholinguistics), Computation, Algorithm, Kernel (algebra), Tangent, Convergence (economics), Rank (graph theory), Mathematics, Computer science, Artificial neural network, Applied mathematics, Matrix (chemical analysis), Matrix norm, Norm (philosophy), Kernel method, Kernel density estimation, Mathematical optimization, Low-rank approximation, State (computer science), Rate of convergence, Determinantal point process, Estimation theory, Generalization, Regularization (linguistics), Numerical linear algebra, Tangent vector, Positive-definite matrix (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
Full payload
| id | https://openalex.org/W7105897366 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2511.10796 |
| ids.doi | https://doi.org/10.48550/arxiv.2511.10796 |
| ids.openalex | https://openalex.org/W7105897366 |
| fwci | 0.0 |
| type | preprint |
| title | Fast Neural Tangent Kernel Alignment, Norm and Effective Rank via Trace Estimation |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C185429906 |
| concepts[0].level | 2 |
| concepts[0].score | 0.7338117957115173 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q1130160 |
| concepts[0].display_name | Estimator |
| concepts[1].id | https://openalex.org/C75291252 |
| concepts[1].level | 2 |
| concepts[1].score | 0.7029515504837036 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q1315756 |
| concepts[1].display_name | TRACE (psycholinguistics) |
| concepts[2].id | https://openalex.org/C45374587 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6162897944450378 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q12525525 |
| concepts[2].display_name | Computation |
| concepts[3].id | https://openalex.org/C11413529 |
| concepts[3].level | 1 |
| concepts[3].score | 0.5913434624671936 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q8366 |
| concepts[3].display_name | Algorithm |
| concepts[4].id | https://openalex.org/C74193536 |
| concepts[4].level | 2 |
| concepts[4].score | 0.5536198616027832 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q574844 |
| concepts[4].display_name | Kernel (algebra) |
| concepts[5].id | https://openalex.org/C138187205 |
| concepts[5].level | 2 |
| concepts[5].score | 0.5074775218963623 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q131251 |
| concepts[5].display_name | Tangent |
| concepts[6].id | https://openalex.org/C2777303404 |
| concepts[6].level | 2 |
| concepts[6].score | 0.5065313577651978 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q759757 |
| concepts[6].display_name | Convergence (economics) |
| concepts[7].id | https://openalex.org/C164226766 |
| concepts[7].level | 2 |
| concepts[7].score | 0.48469027876853943 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q7293202 |
| concepts[7].display_name | Rank (graph theory) |
| concepts[8].id | https://openalex.org/C33923547 |
| concepts[8].level | 0 |
| concepts[8].score | 0.478423148393631 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[8].display_name | Mathematics |
| concepts[9].id | https://openalex.org/C41008148 |
| concepts[9].level | 0 |
| concepts[9].score | 0.457979291677475 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[9].display_name | Computer science |
| concepts[10].id | https://openalex.org/C50644808 |
| concepts[10].level | 2 |
| concepts[10].score | 0.4226260483264923 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q192776 |
| concepts[10].display_name | Artificial neural network |
| concepts[11].id | https://openalex.org/C28826006 |
| concepts[11].level | 1 |
| concepts[11].score | 0.4154963493347168 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q33521 |
| concepts[11].display_name | Applied mathematics |
| concepts[12].id | https://openalex.org/C106487976 |
| concepts[12].level | 2 |
| concepts[12].score | 0.3821755051612854 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q685816 |
| concepts[12].display_name | Matrix (chemical analysis) |
| concepts[13].id | https://openalex.org/C92207270 |
| concepts[13].level | 3 |
| concepts[13].score | 0.3791339695453644 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q939253 |
| concepts[13].display_name | Matrix norm |
| concepts[14].id | https://openalex.org/C191795146 |
| concepts[14].level | 2 |
| concepts[14].score | 0.3763812780380249 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q3878446 |
| concepts[14].display_name | Norm (philosophy) |
| concepts[15].id | https://openalex.org/C122280245 |
| concepts[15].level | 3 |
| concepts[15].score | 0.3687765896320343 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q620622 |
| concepts[15].display_name | Kernel method |
| concepts[16].id | https://openalex.org/C71134354 |
| concepts[16].level | 3 |
| concepts[16].score | 0.34446263313293457 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q458825 |
| concepts[16].display_name | Kernel density estimation |
| concepts[17].id | https://openalex.org/C126255220 |
| concepts[17].level | 1 |
| concepts[17].score | 0.33303776383399963 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q141495 |
| concepts[17].display_name | Mathematical optimization |
| concepts[18].id | https://openalex.org/C90199385 |
| concepts[18].level | 3 |
| concepts[18].score | 0.3129071295261383 |
| concepts[18].wikidata | https://www.wikidata.org/wiki/Q6692777 |
| concepts[18].display_name | Low-rank approximation |
| concepts[19].id | https://openalex.org/C48103436 |
| concepts[19].level | 2 |
| concepts[19].score | 0.3042488992214203 |
| concepts[19].wikidata | https://www.wikidata.org/wiki/Q599031 |
| concepts[19].display_name | State (computer science) |
| concepts[20].id | https://openalex.org/C57869625 |
| concepts[20].level | 3 |
| concepts[20].score | 0.30343925952911377 |
| concepts[20].wikidata | https://www.wikidata.org/wiki/Q1783502 |
| concepts[20].display_name | Rate of convergence |
| concepts[21].id | https://openalex.org/C72010251 |
| concepts[21].level | 4 |
| concepts[21].score | 0.2910251021385193 |
| concepts[21].wikidata | https://www.wikidata.org/wiki/Q5265688 |
| concepts[21].display_name | Determinantal point process |
| concepts[22].id | https://openalex.org/C167928553 |
| concepts[22].level | 2 |
| concepts[22].score | 0.2767001986503601 |
| concepts[22].wikidata | https://www.wikidata.org/wiki/Q1376021 |
| concepts[22].display_name | Estimation theory |
| concepts[23].id | https://openalex.org/C177148314 |
| concepts[23].level | 2 |
| concepts[23].score | 0.2677038013935089 |
| concepts[23].wikidata | https://www.wikidata.org/wiki/Q170084 |
| concepts[23].display_name | Generalization |
| concepts[24].id | https://openalex.org/C2776135515 |
| concepts[24].level | 2 |
| concepts[24].score | 0.2635955512523651 |
| concepts[24].wikidata | https://www.wikidata.org/wiki/Q17143721 |
| concepts[24].display_name | Regularization (linguistics) |
| concepts[25].id | https://openalex.org/C163834973 |
| concepts[25].level | 3 |
| concepts[25].score | 0.2635423243045807 |
| concepts[25].wikidata | https://www.wikidata.org/wiki/Q2004891 |
| concepts[25].display_name | Numerical linear algebra |
| concepts[26].id | https://openalex.org/C47890412 |
| concepts[26].level | 3 |
| concepts[26].score | 0.2609897255897522 |
| concepts[26].wikidata | https://www.wikidata.org/wiki/Q1179296 |
| concepts[26].display_name | Tangent vector |
| concepts[27].id | https://openalex.org/C49712288 |
| concepts[27].level | 3 |
| concepts[27].score | 0.26000282168388367 |
| concepts[27].wikidata | https://www.wikidata.org/wiki/Q77601250 |
| concepts[27].display_name | Positive-definite matrix |
| keywords[0].id | https://openalex.org/keywords/estimator |
| keywords[0].score | 0.7338117957115173 |
| keywords[0].display_name | Estimator |
| keywords[1].id | https://openalex.org/keywords/trace |
| keywords[1].score | 0.7029515504837036 |
| keywords[1].display_name | TRACE (psycholinguistics) |
| keywords[2].id | https://openalex.org/keywords/computation |
| keywords[2].score | 0.6162897944450378 |
| keywords[2].display_name | Computation |
| keywords[3].id | https://openalex.org/keywords/kernel |
| keywords[3].score | 0.5536198616027832 |
| keywords[3].display_name | Kernel (algebra) |
| keywords[4].id | https://openalex.org/keywords/tangent |
| keywords[4].score | 0.5074775218963623 |
| keywords[4].display_name | Tangent |
| keywords[5].id | https://openalex.org/keywords/convergence |
| keywords[5].score | 0.5065313577651978 |
| keywords[5].display_name | Convergence (economics) |
| keywords[6].id | https://openalex.org/keywords/rank |
| keywords[6].score | 0.48469027876853943 |
| keywords[6].display_name | Rank (graph theory) |
| keywords[7].id | https://openalex.org/keywords/artificial-neural-network |
| keywords[7].score | 0.4226260483264923 |
| keywords[7].display_name | Artificial neural network |
| keywords[8].id | https://openalex.org/keywords/matrix |
| keywords[8].score | 0.3821755051612854 |
| keywords[8].display_name | Matrix (chemical analysis) |
| language | |
| locations[0].id | doi:10.48550/arxiv.2511.10796 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | |
| locations[0].version | |
| locations[0].raw_type | article |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | https://doi.org/10.48550/arxiv.2511.10796 |
| indexed_in | datacite |
| authorships[0].author.id | |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Hazelden, James |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Hazelden, James |
| authorships[0].is_corresponding | True |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://doi.org/10.48550/arxiv.2511.10796 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-11-18T00:00:00 |
| display_name | Fast Neural Tangent Kernel Alignment, Norm and Effective Rank via Trace Estimation |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-18T23:46:17.205004 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 1 |
| best_oa_location.id | doi:10.48550/arxiv.2511.10796 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | |
| best_oa_location.version | |
| best_oa_location.raw_type | article |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | https://doi.org/10.48550/arxiv.2511.10796 |
| primary_location.id | doi:10.48550/arxiv.2511.10796 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | |
| primary_location.version | |
| primary_location.raw_type | article |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | https://doi.org/10.48550/arxiv.2511.10796 |
| publication_date | 2025-11-13 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 1 |
| citation_normalized_percentile |