What Algorithms can Transformers Learn? A Study in Length Generalization
2023 · Open Access · DOI: https://doi.org/10.48550/arxiv.2310.16028
Large language models exhibit surprising emergent generalization properties, yet also struggle on many simple reasoning tasks such as arithmetic and parity. This raises the question of if and when Transformer models can learn the true algorithm for solving a task. We study the scope of Transformers' abilities in the specific setting of length generalization on algorithmic tasks. Here, we propose a unifying framework to understand when and how Transformers can exhibit strong length generalization on a given task. Specifically, we leverage RASP (Weiss et al., 2021) -- a programming language designed for the computational model of a Transformer -- and introduce the RASP-Generalization Conjecture: Transformers tend to length generalize on a task if the task can be solved by a short RASP program which works for all input lengths. This simple conjecture remarkably captures most known instances of length generalization on algorithmic tasks. Moreover, we leverage our insights to drastically improve generalization performance on traditionally hard tasks (such as parity and addition). On the theoretical side, we give a simple example where the "min-degree-interpolator" model of learning from Abbe et al. (2023) does not correctly predict Transformers' out-of-distribution behavior, but our conjecture does. Overall, our work provides a novel perspective on the mechanisms of compositional generalization and the algorithmic capabilities of Transformers.
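To make the abstract's notion of length generalization concrete, here is a minimal Python sketch on the parity task. It is not the paper's experimental setup and involves no Transformer: the split sizes, the `make_memorizer` baseline, and every function name are illustrative assumptions. It only shows the evaluation idea, which is to train on short inputs, test on strictly longer ones, and check whether a predictor behaves like a rule that works for all lengths.

```python
import random


def parity(bits):
    """Length-independent rule: 1 if the count of 1s is odd, else 0."""
    return sum(bits) % 2


def sample_bits(length, rng):
    """Draw a uniformly random bit string of the given length."""
    return tuple(rng.randint(0, 1) for _ in range(length))


def make_split(train_max_len, test_len, n_examples, seed=0):
    """Train on lengths 1..train_max_len, test on one strictly longer length."""
    assert test_len > train_max_len
    rng = random.Random(seed)
    train = [sample_bits(rng.randint(1, train_max_len), rng) for _ in range(n_examples)]
    test = [sample_bits(test_len, rng) for _ in range(n_examples)]
    return train, test


def make_memorizer(train):
    """Hypothetical baseline: stores training answers, outputs 0 for unseen inputs."""
    table = {bits: parity(bits) for bits in train}
    return lambda bits: table.get(bits, 0)


def accuracy(predict, data):
    """Fraction of examples where `predict` matches the true parity."""
    return sum(predict(bits) == parity(bits) for bits in data) / len(data)


train, test = make_split(train_max_len=8, test_len=20, n_examples=500)
memorizer = make_memorizer(train)

print("memorizer, train lengths:", accuracy(memorizer, train))
print("memorizer, longer inputs:", accuracy(memorizer, test))
print("parity rule, longer inputs:", accuracy(parity, test))
```

The memorizer is perfect on the training lengths but can only guess on longer inputs, whereas the length-independent rule stays correct. That gap is the in-distribution versus out-of-length behavior the abstract refers to.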
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2310.16028, https://arxiv.org/pdf/2310.16028
- OA Status: green
- Cited By: 6
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4387947637
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4387947637 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2310.16028 (Digital Object Identifier)
- Title: What Algorithms can Transformers Learn? A Study in Length Generalization (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2023 (year of publication)
- Publication date: 2023-10-24 (full publication date if available)
- Authors: Hattie Zhou, Arwen Bradley, Etai Littwin, Noam Razin, Omid Saremi, Josh Susskind, Samy Bengio, Preetum Nakkiran (list of authors in order)
- Landing page: https://arxiv.org/abs/2310.16028 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2310.16028 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2310.16028 (direct OA link when available)
- Concepts: Transformer, Generalization, Conjecture, Computer science, Algorithm, Leverage (statistics), Theoretical computer science, Arithmetic, Artificial intelligence, Mathematics, Discrete mathematics, Engineering, Mathematical analysis, Electrical engineering, Voltage (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 6 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 5, 2024: 1 (per-year citation counts, last 5 years)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4387947637 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2310.16028 |
| ids.doi | https://doi.org/10.48550/arxiv.2310.16028 |
| ids.openalex | https://openalex.org/W4387947637 |
| fwci | |
| type | preprint |
| title | What Algorithms can Transformers Learn? A Study in Length Generalization |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10181 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9990000128746033 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Natural Language Processing Techniques |
| topics[1].id | https://openalex.org/T10028 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9990000128746033 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Topic Modeling |
| topics[2].id | https://openalex.org/T13629 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9897000193595886 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Text Readability and Simplification |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C66322947 |
| concepts[0].level | 3 |
| concepts[0].score | 0.6675440073013306 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q11658 |
| concepts[0].display_name | Transformer |
| concepts[1].id | https://openalex.org/C177148314 |
| concepts[1].level | 2 |
| concepts[1].score | 0.6389241814613342 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q170084 |
| concepts[1].display_name | Generalization |
| concepts[2].id | https://openalex.org/C2780990831 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6199935674667358 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q319141 |
| concepts[2].display_name | Conjecture |
| concepts[3].id | https://openalex.org/C41008148 |
| concepts[3].level | 0 |
| concepts[3].score | 0.5977458953857422 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[3].display_name | Computer science |
| concepts[4].id | https://openalex.org/C11413529 |
| concepts[4].level | 1 |
| concepts[4].score | 0.5013082027435303 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q8366 |
| concepts[4].display_name | Algorithm |
| concepts[5].id | https://openalex.org/C153083717 |
| concepts[5].level | 2 |
| concepts[5].score | 0.49433112144470215 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q6535263 |
| concepts[5].display_name | Leverage (statistics) |
| concepts[6].id | https://openalex.org/C80444323 |
| concepts[6].level | 1 |
| concepts[6].score | 0.4129173457622528 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q2878974 |
| concepts[6].display_name | Theoretical computer science |
| concepts[7].id | https://openalex.org/C94375191 |
| concepts[7].level | 1 |
| concepts[7].score | 0.34316760301589966 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q11205 |
| concepts[7].display_name | Arithmetic |
| concepts[8].id | https://openalex.org/C154945302 |
| concepts[8].level | 1 |
| concepts[8].score | 0.30090826749801636 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[8].display_name | Artificial intelligence |
| concepts[9].id | https://openalex.org/C33923547 |
| concepts[9].level | 0 |
| concepts[9].score | 0.28169864416122437 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[9].display_name | Mathematics |
| concepts[10].id | https://openalex.org/C118615104 |
| concepts[10].level | 1 |
| concepts[10].score | 0.2113114297389984 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q121416 |
| concepts[10].display_name | Discrete mathematics |
| concepts[11].id | https://openalex.org/C127413603 |
| concepts[11].level | 0 |
| concepts[11].score | 0.06454560160636902 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q11023 |
| concepts[11].display_name | Engineering |
| concepts[12].id | https://openalex.org/C134306372 |
| concepts[12].level | 1 |
| concepts[12].score | 0.0 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q7754 |
| concepts[12].display_name | Mathematical analysis |
| concepts[13].id | https://openalex.org/C119599485 |
| concepts[13].level | 1 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q43035 |
| concepts[13].display_name | Electrical engineering |
| concepts[14].id | https://openalex.org/C165801399 |
| concepts[14].level | 2 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q25428 |
| concepts[14].display_name | Voltage |
| keywords[0].id | https://openalex.org/keywords/transformer |
| keywords[0].score | 0.6675440073013306 |
| keywords[0].display_name | Transformer |
| keywords[1].id | https://openalex.org/keywords/generalization |
| keywords[1].score | 0.6389241814613342 |
| keywords[1].display_name | Generalization |
| keywords[2].id | https://openalex.org/keywords/conjecture |
| keywords[2].score | 0.6199935674667358 |
| keywords[2].display_name | Conjecture |
| keywords[3].id | https://openalex.org/keywords/computer-science |
| keywords[3].score | 0.5977458953857422 |
| keywords[3].display_name | Computer science |
| keywords[4].id | https://openalex.org/keywords/algorithm |
| keywords[4].score | 0.5013082027435303 |
| keywords[4].display_name | Algorithm |
| keywords[5].id | https://openalex.org/keywords/leverage |
| keywords[5].score | 0.49433112144470215 |
| keywords[5].display_name | Leverage (statistics) |
| keywords[6].id | https://openalex.org/keywords/theoretical-computer-science |
| keywords[6].score | 0.4129173457622528 |
| keywords[6].display_name | Theoretical computer science |
| keywords[7].id | https://openalex.org/keywords/arithmetic |
| keywords[7].score | 0.34316760301589966 |
| keywords[7].display_name | Arithmetic |
| keywords[8].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[8].score | 0.30090826749801636 |
| keywords[8].display_name | Artificial intelligence |
| keywords[9].id | https://openalex.org/keywords/mathematics |
| keywords[9].score | 0.28169864416122437 |
| keywords[9].display_name | Mathematics |
| keywords[10].id | https://openalex.org/keywords/discrete-mathematics |
| keywords[10].score | 0.2113114297389984 |
| keywords[10].display_name | Discrete mathematics |
| keywords[11].id | https://openalex.org/keywords/engineering |
| keywords[11].score | 0.06454560160636902 |
| keywords[11].display_name | Engineering |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2310.16028 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2310.16028 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2310.16028 |
| locations[1].id | doi:10.48550/arxiv.2310.16028 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2310.16028 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5075759362 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Hattie Zhou |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Zhou, Hattie |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5076803200 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Arwen Bradley |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Bradley, Arwen |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5049929304 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Etai Littwin |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Littwin, Etai |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5030945372 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Noam Razin |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Razin, Noam |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5011650393 |
| authorships[4].author.orcid | https://orcid.org/0000-0001-6718-093X |
| authorships[4].author.display_name | Omid Saremi |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Saremi, Omid |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5043808400 |
| authorships[5].author.orcid | |
| authorships[5].author.display_name | Josh Susskind |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Susskind, Josh |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5017529415 |
| authorships[6].author.orcid | |
| authorships[6].author.display_name | Samy Bengio |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Bengio, Samy |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5110716768 |
| authorships[7].author.orcid | |
| authorships[7].author.display_name | Preetum Nakkiran |
| authorships[7].author_position | last |
| authorships[7].raw_author_name | Nakkiran, Preetum |
| authorships[7].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2310.16028 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | What Algorithms can Transformers Learn? A Study in Length Generalization |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10181 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9990000128746033 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Natural Language Processing Techniques |
| related_works | https://openalex.org/W3162204513, https://openalex.org/W4300899577, https://openalex.org/W2371138613, https://openalex.org/W3194471551, https://openalex.org/W2048963458, https://openalex.org/W4375956809, https://openalex.org/W1581373162, https://openalex.org/W43109613, https://openalex.org/W2359952343, https://openalex.org/W4302778552 |
| cited_by_count | 6 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 5 |
| counts_by_year[1].year | 2024 |
| counts_by_year[1].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2310.16028 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2310.16028 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2310.16028 |
| primary_location.id | pmh:oai:arXiv.org:2310.16028 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2310.16028 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2310.16028 |
| publication_date | 2023-10-24 |
| publication_year | 2023 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 38, 60, 75, 87, 96, 110, 119, 168, 197 |
| abstract_inverted_index.-- | 86, 98 |
| abstract_inverted_index.On | 162 |
| abstract_inverted_index.We | 40 |
| abstract_inverted_index.as | 17, 158 |
| abstract_inverted_index.be | 116 |
| abstract_inverted_index.by | 118 |
| abstract_inverted_index.et | 83, 179 |
| abstract_inverted_index.if | 26, 112 |
| abstract_inverted_index.in | 47 |
| abstract_inverted_index.of | 25, 44, 51, 95, 137, 175, 203, 210 |
| abstract_inverted_index.on | 11, 54, 74, 109, 140, 153, 200 |
| abstract_inverted_index.to | 63, 106, 148 |
| abstract_inverted_index.we | 58, 79, 144, 166 |
| abstract_inverted_index.al. | 180 |
| abstract_inverted_index.all | 126 |
| abstract_inverted_index.and | 19, 27, 66, 99, 160, 206 |
| abstract_inverted_index.but | 189 |
| abstract_inverted_index.can | 31, 69, 115 |
| abstract_inverted_index.for | 36, 91, 125 |
| abstract_inverted_index.how | 67 |
| abstract_inverted_index.not | 183 |
| abstract_inverted_index.our | 146, 190, 194 |
| abstract_inverted_index.the | 23, 33, 42, 48, 92, 101, 113, 163, 172, 201, 207 |
| abstract_inverted_index.yet | 8 |
| abstract_inverted_index.Abbe | 178 |
| abstract_inverted_index.RASP | 81, 121 |
| abstract_inverted_index.This | 21, 129 |
| abstract_inverted_index.al., | 84 |
| abstract_inverted_index.also | 9 |
| abstract_inverted_index.does | 182 |
| abstract_inverted_index.from | 177 |
| abstract_inverted_index.give | 167 |
| abstract_inverted_index.hard | 155 |
| abstract_inverted_index.many | 12 |
| abstract_inverted_index.most | 134 |
| abstract_inverted_index.such | 16 |
| abstract_inverted_index.task | 111, 114 |
| abstract_inverted_index.tend | 105 |
| abstract_inverted_index.true | 34 |
| abstract_inverted_index.when | 28, 65 |
| abstract_inverted_index.work | 195 |
| abstract_inverted_index.(such | 157 |
| abstract_inverted_index.2021) | 85 |
| abstract_inverted_index.Here, | 57 |
| abstract_inverted_index.Large | 0 |
| abstract_inverted_index.does. | 192 |
| abstract_inverted_index.given | 76 |
| abstract_inverted_index.input | 127 |
| abstract_inverted_index.known | 135 |
| abstract_inverted_index.learn | 32 |
| abstract_inverted_index.model | 94, 174 |
| abstract_inverted_index.novel | 198 |
| abstract_inverted_index.scope | 43 |
| abstract_inverted_index.short | 120 |
| abstract_inverted_index.side, | 165 |
| abstract_inverted_index.study | 41 |
| abstract_inverted_index.task. | 39, 77 |
| abstract_inverted_index.tasks | 15, 156 |
| abstract_inverted_index.where | 171 |
| abstract_inverted_index.which | 123 |
| abstract_inverted_index.works | 124 |
| abstract_inverted_index.(2023) | 181 |
| abstract_inverted_index.(Weiss | 82 |
| abstract_inverted_index.length | 52, 72, 107, 138 |
| abstract_inverted_index.models | 2, 30 |
| abstract_inverted_index.parity | 159 |
| abstract_inverted_index.raises | 22 |
| abstract_inverted_index.simple | 13, 130, 169 |
| abstract_inverted_index.solved | 117 |
| abstract_inverted_index.strong | 71 |
| abstract_inverted_index.tasks. | 56, 142 |
| abstract_inverted_index.example | 170 |
| abstract_inverted_index.exhibit | 3, 70 |
| abstract_inverted_index.improve | 150 |
| abstract_inverted_index.parity. | 20 |
| abstract_inverted_index.predict | 185 |
| abstract_inverted_index.program | 122 |
| abstract_inverted_index.propose | 59 |
| abstract_inverted_index.setting | 50 |
| abstract_inverted_index.solving | 37 |
| abstract_inverted_index.Overall, | 193 |
| abstract_inverted_index.captures | 133 |
| abstract_inverted_index.designed | 90 |
| abstract_inverted_index.emergent | 5 |
| abstract_inverted_index.insights | 147 |
| abstract_inverted_index.language | 1, 89 |
| abstract_inverted_index.learning | 176 |
| abstract_inverted_index.lengths. | 128 |
| abstract_inverted_index.leverage | 80, 145 |
| abstract_inverted_index.provides | 196 |
| abstract_inverted_index.question | 24 |
| abstract_inverted_index.specific | 49 |
| abstract_inverted_index.struggle | 10 |
| abstract_inverted_index.unifying | 61 |
| abstract_inverted_index.Moreover, | 143 |
| abstract_inverted_index.abilities | 46 |
| abstract_inverted_index.algorithm | 35 |
| abstract_inverted_index.behavior, | 188 |
| abstract_inverted_index.correctly | 184 |
| abstract_inverted_index.framework | 62 |
| abstract_inverted_index.instances | 136 |
| abstract_inverted_index.introduce | 100 |
| abstract_inverted_index.reasoning | 14 |
| abstract_inverted_index.addition). | 161 |
| abstract_inverted_index.arithmetic | 18 |
| abstract_inverted_index.conjecture | 131, 191 |
| abstract_inverted_index.generalize | 108 |
| abstract_inverted_index.mechanisms | 202 |
| abstract_inverted_index.remarkably | 132 |
| abstract_inverted_index.surprising | 4 |
| abstract_inverted_index.understand | 64 |
| abstract_inverted_index.Conjecture: | 103 |
| abstract_inverted_index.Transformer | 29, 97 |
| abstract_inverted_index.algorithmic | 55, 141, 208 |
| abstract_inverted_index.drastically | 149 |
| abstract_inverted_index.performance | 152 |
| abstract_inverted_index.perspective | 199 |
| abstract_inverted_index.programming | 88 |
| abstract_inverted_index.properties, | 7 |
| abstract_inverted_index.theoretical | 164 |
| abstract_inverted_index.Transformers | 68, 104 |
| abstract_inverted_index.capabilities | 209 |
| abstract_inverted_index.Specifically, | 78 |
| abstract_inverted_index.Transformers' | 45, 186 |
| abstract_inverted_index.Transformers. | 211 |
| abstract_inverted_index.compositional | 204 |
| abstract_inverted_index.computational | 93 |
| abstract_inverted_index.traditionally | 154 |
| abstract_inverted_index.generalization | 6, 53, 73, 139, 151, 205 |
| abstract_inverted_index.RASP-Generalization | 102 |
| abstract_inverted_index.out-of-distribution | 187 |
| abstract_inverted_index."min-degree-interpolator" | 173 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 8 |
| citation_normalized_percentile | |
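
The `abstract_inverted_index.*` rows in the payload store the abstract as a mapping from each word to the positions where it occurs. A minimal Python sketch of how such a mapping could be turned back into running text follows. The function name `rebuild_abstract`, its `max_position` parameter, and the `sample` dictionary are illustrative assumptions; the sample entries are copied from eight of the rows in the table.

```python
def rebuild_abstract(inverted_index, max_position=None):
    """Rebuild text from a {word: [positions]} mapping.

    If max_position is given, only word slots up to that index are kept,
    which is convenient when only part of the index is available.
    """
    slots = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            if max_position is None or pos <= max_position:
                slots[pos] = word
    # Emit words in position order, separated by single spaces.
    return " ".join(slots[pos] for pos in sorted(slots))


# Entries copied from the payload rows (word -> positions).
sample = {
    "Large": [0],
    "language": [1, 89],
    "models": [2, 30],
    "exhibit": [3, 70],
    "surprising": [4],
    "emergent": [5],
    "generalization": [6, 53, 73, 139, 151, 205],
    "properties,": [7],
}

# Positions 0..7 are fully covered by the sample, so only those are rebuilt.
print(rebuild_abstract(sample, max_position=7))
# prints "Large language models exhibit surprising emergent generalization properties,"
```

Passing the complete index with `max_position=None` would recover the whole abstract, assuming every position from 0 through 211 appears exactly once across the rows.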