SLAP: A Split Latency Adaptive VLIW pipeline architecture which enables on-the-fly variable SIMD vector-length
2021 · Open Access · DOI: https://doi.org/10.48550/arxiv.2102.13301
Over the last decade, the relative latency of access to shared memory by multicore processors has increased, as wire resistance came to dominate latency and low wire density layout pushed multiport memories farther away from their ports. Various techniques were deployed to improve average memory access latency, such as speculative prefetching and branch prediction, often leading to high variance in execution time, which is unacceptable in real-time systems. Smart DMAs can be used to copy data directly into a layer-1 SRAM, but with overhead. The VLIW architecture, the de facto signal processing engine, suffers badly from a breakdown in lockstep execution of scalar and vector instructions. We describe the Split Latency Adaptive Pipeline (SLAP) VLIW architecture, a cache performance improvement technology that requires zero change to object code while removing smart DMAs and their overhead. SLAP builds on the Decoupled Access and Execute concept by 1) breaking lockstep execution of functional units, 2) enabling variable vector length for variable data-level parallelism, and 3) adding a novel triangular load mechanism. We discuss the SLAP architecture and demonstrate the performance benefits on real traces from a wireless baseband system (where even the most compute-intensive functions suffer from an Amdahl's law limitation due to a mixture of scalar and vector processing).
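The Amdahl's law limitation mentioned at the end of the abstract can be illustrated with a minimal sketch. The function name, the 80% vectorizable fraction, and the 16x vector speedup are hypothetical values chosen for illustration; they are not taken from the paper.

```python
def amdahl_speedup(vector_fraction: float, vector_speedup: float) -> float:
    """Overall speedup when only `vector_fraction` of the runtime is accelerated.

    The remaining scalar share (1 - vector_fraction) runs at the original speed,
    which is what caps the achievable gain.
    """
    if not 0.0 <= vector_fraction <= 1.0:
        raise ValueError("vector_fraction must be in [0, 1]")
    if vector_speedup <= 0:
        raise ValueError("vector_speedup must be positive")
    scalar_fraction = 1.0 - vector_fraction
    return 1.0 / (scalar_fraction + vector_fraction / vector_speedup)


# Hypothetical workload: 80% of the time is vectorizable and runs 16x faster.
print(round(amdahl_speedup(0.8, 16.0), 2))  # 4.0
```

Under these assumed numbers, the 20% scalar portion limits the overall gain to about 4x even though the vector unit is 16x faster, which is the kind of ceiling a mixed scalar and vector workload runs into.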
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2102.13301, https://arxiv.org/pdf/2102.13301
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4287323403
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4287323403 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2102.13301 (Digital Object Identifier)
- Title: SLAP: A Split Latency Adaptive VLIW pipeline architecture which enables on-the-fly variable SIMD vector-length (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2021 (year of publication)
- Publication date: 2021-02-25 (full publication date if available)
- Authors: Ashish Shrivastava, Alan Gatherer, Tong Sun, Sushma Wokhlu, Alex Chandra (list of authors in order)
- Landing page: https://arxiv.org/abs/2102.13301 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2102.13301 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2102.13301 (direct OA link when available)
- Concepts: Very long instruction word, Computer science, Parallel computing, Cache, Latency (audio), Pipeline (software), SIMD, Embedded system, Operating system, Telecommunications (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4287323403 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2102.13301 |
| ids.openalex | https://openalex.org/W4287323403 |
| fwci | 0.0 |
| type | preprint |
| title | SLAP: A Split Latency Adaptive VLIW pipeline architecture which enables on-the-fly variable SIMD vector-length |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10054 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9998000264167786 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1708 |
| topics[0].subfield.display_name | Hardware and Architecture |
| topics[0].display_name | Parallel Computing and Optimization Techniques |
| topics[1].id | https://openalex.org/T12808 |
| topics[1].field.id | https://openalex.org/fields/22 |
| topics[1].field.display_name | Engineering |
| topics[1].score | 0.9998000264167786 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/2208 |
| topics[1].subfield.display_name | Electrical and Electronic Engineering |
| topics[1].display_name | Ferroelectric and Negative Capacitance Devices |
| topics[2].id | https://openalex.org/T11181 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9995999932289124 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1705 |
| topics[2].subfield.display_name | Computer Networks and Communications |
| topics[2].display_name | Advanced Data Storage Technologies |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C170595534 |
| concepts[0].level | 2 |
| concepts[0].score | 0.8242573142051697 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q249743 |
| concepts[0].display_name | Very long instruction word |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.8219065070152283 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C173608175 |
| concepts[2].level | 1 |
| concepts[2].score | 0.6080173254013062 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q232661 |
| concepts[2].display_name | Parallel computing |
| concepts[3].id | https://openalex.org/C115537543 |
| concepts[3].level | 2 |
| concepts[3].score | 0.5657479166984558 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q165596 |
| concepts[3].display_name | Cache |
| concepts[4].id | https://openalex.org/C82876162 |
| concepts[4].level | 2 |
| concepts[4].score | 0.5048957467079163 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q17096504 |
| concepts[4].display_name | Latency (audio) |
| concepts[5].id | https://openalex.org/C43521106 |
| concepts[5].level | 2 |
| concepts[5].score | 0.4467332363128662 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q2165493 |
| concepts[5].display_name | Pipeline (software) |
| concepts[6].id | https://openalex.org/C150552126 |
| concepts[6].level | 2 |
| concepts[6].score | 0.42420607805252075 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q339387 |
| concepts[6].display_name | SIMD |
| concepts[7].id | https://openalex.org/C149635348 |
| concepts[7].level | 1 |
| concepts[7].score | 0.40564990043640137 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q193040 |
| concepts[7].display_name | Embedded system |
| concepts[8].id | https://openalex.org/C111919701 |
| concepts[8].level | 1 |
| concepts[8].score | 0.152139812707901 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q9135 |
| concepts[8].display_name | Operating system |
| concepts[9].id | https://openalex.org/C76155785 |
| concepts[9].level | 1 |
| concepts[9].score | 0.0 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q418 |
| concepts[9].display_name | Telecommunications |
| keywords[0].id | https://openalex.org/keywords/very-long-instruction-word |
| keywords[0].score | 0.8242573142051697 |
| keywords[0].display_name | Very long instruction word |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.8219065070152283 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/parallel-computing |
| keywords[2].score | 0.6080173254013062 |
| keywords[2].display_name | Parallel computing |
| keywords[3].id | https://openalex.org/keywords/cache |
| keywords[3].score | 0.5657479166984558 |
| keywords[3].display_name | Cache |
| keywords[4].id | https://openalex.org/keywords/latency |
| keywords[4].score | 0.5048957467079163 |
| keywords[4].display_name | Latency (audio) |
| keywords[5].id | https://openalex.org/keywords/pipeline |
| keywords[5].score | 0.4467332363128662 |
| keywords[5].display_name | Pipeline (software) |
| keywords[6].id | https://openalex.org/keywords/simd |
| keywords[6].score | 0.42420607805252075 |
| keywords[6].display_name | SIMD |
| keywords[7].id | https://openalex.org/keywords/embedded-system |
| keywords[7].score | 0.40564990043640137 |
| keywords[7].display_name | Embedded system |
| keywords[8].id | https://openalex.org/keywords/operating-system |
| keywords[8].score | 0.152139812707901 |
| keywords[8].display_name | Operating system |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2102.13301 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | public-domain |
| locations[0].pdf_url | https://arxiv.org/pdf/2102.13301 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | https://openalex.org/licenses/public-domain |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2102.13301 |
| indexed_in | arxiv |
| authorships[0].author.id | https://openalex.org/A5002740493 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-2465-0708 |
| authorships[0].author.display_name | Ashish Shrivastava |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Shrivastava, Ashish |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5002590438 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-5937-8599 |
| authorships[1].author.display_name | Alan Gatherer |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Gatherer, Alan |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5100372581 |
| authorships[2].author.orcid | https://orcid.org/0000-0003-3861-8933 |
| authorships[2].author.display_name | Tong Sun |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Sun, Tong |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5038146132 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Sushma Wokhlu |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Wokhlu, Sushma |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5012738469 |
| authorships[4].author.orcid | |
| authorships[4].author.display_name | Alex Chandra |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Chandra, Alex |
| authorships[4].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2102.13301 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | SLAP: A Split Latency Adaptive VLIW pipeline architecture which enables on-the-fly variable SIMD vector-length |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T10054 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9998000264167786 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1708 |
| primary_topic.subfield.display_name | Hardware and Architecture |
| primary_topic.display_name | Parallel Computing and Optimization Techniques |
| related_works | https://openalex.org/W2115688358, https://openalex.org/W1503212777, https://openalex.org/W2072728786, https://openalex.org/W3131666633, https://openalex.org/W2158867373, https://openalex.org/W2146636354, https://openalex.org/W1896855786, https://openalex.org/W4379115909, https://openalex.org/W2161750270, https://openalex.org/W2066454338 |
| cited_by_count | 0 |
| locations_count | 1 |
| best_oa_location.id | pmh:oai:arXiv.org:2102.13301 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | public-domain |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2102.13301 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | https://openalex.org/licenses/public-domain |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2102.13301 |
| primary_location.id | pmh:oai:arXiv.org:2102.13301 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | public-domain |
| primary_location.pdf_url | https://arxiv.org/pdf/2102.13301 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | https://openalex.org/licenses/public-domain |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2102.13301 |
| publication_date | 2021-02-25 |
| publication_year | 2021 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 68, 103, 147, 182 |
| abstract_inverted_index.1) | 129 |
| abstract_inverted_index.2) | 135 |
| abstract_inverted_index.3) | 145 |
| abstract_inverted_index.We | 94, 152 |
| abstract_inverted_index.an | 176 |
| abstract_inverted_index.as | 14 |
| abstract_inverted_index.by | 128 |
| abstract_inverted_index.de | 77 |
| abstract_inverted_index.in | 55, 86 |
| abstract_inverted_index.is | 53 |
| abstract_inverted_index.of | 7, 89, 133, 184 |
| abstract_inverted_index.on | 122, 161 |
| abstract_inverted_index.to | 9, 34, 47, 63, 111, 181 |
| abstract_inverted_index.The | 74 |
| abstract_inverted_index.and | 19, 43, 91, 117, 125, 156, 186 |
| abstract_inverted_index.but | 71 |
| abstract_inverted_index.can | 61 |
| abstract_inverted_index.due | 180 |
| abstract_inverted_index.for | 140 |
| abstract_inverted_index.law | 178 |
| abstract_inverted_index.low | 20 |
| abstract_inverted_index.the | 1, 4, 76, 123, 154, 158, 170 |
| abstract_inverted_index.DMAs | 60, 116 |
| abstract_inverted_index.Over | 0 |
| abstract_inverted_index.SLAP | 120 |
| abstract_inverted_index.VLIW | 101 |
| abstract_inverted_index.away | 27 |
| abstract_inverted_index.copy | 65 |
| abstract_inverted_index.data | 66, 142 |
| abstract_inverted_index.even | 169 |
| abstract_inverted_index.from | 28, 84, 164, 175 |
| abstract_inverted_index.high | 48 |
| abstract_inverted_index.into | 67 |
| abstract_inverted_index.last | 2 |
| abstract_inverted_index.load | 150 |
| abstract_inverted_index.most | 171 |
| abstract_inverted_index.real | 56, 162 |
| abstract_inverted_index.such | 40 |
| abstract_inverted_index.that | 107 |
| abstract_inverted_index.time | 51, 57 |
| abstract_inverted_index.were | 32 |
| abstract_inverted_index.wire | 15, 21 |
| abstract_inverted_index.with | 72 |
| abstract_inverted_index.zero | 109 |
| abstract_inverted_index.SRAM, | 70 |
| abstract_inverted_index.Smart | 59 |
| abstract_inverted_index.Split | 96 |
| abstract_inverted_index.badly | 83 |
| abstract_inverted_index.facto | 78 |
| abstract_inverted_index.level | 143 |
| abstract_inverted_index.novel | 148 |
| abstract_inverted_index.often | 45 |
| abstract_inverted_index.smart | 115 |
| abstract_inverted_index.their | 29, 118 |
| abstract_inverted_index.which | 52 |
| abstract_inverted_index.(SLAP) | 100 |
| abstract_inverted_index.(where | 168 |
| abstract_inverted_index.access | 8, 38 |
| abstract_inverted_index.adding | 146 |
| abstract_inverted_index.builds | 121 |
| abstract_inverted_index.change | 110 |
| abstract_inverted_index.decade | 3 |
| abstract_inverted_index.layer1 | 69 |
| abstract_inverted_index.length | 139 |
| abstract_inverted_index.memory | 11, 37 |
| abstract_inverted_index.object | 112 |
| abstract_inverted_index.ports. | 30 |
| abstract_inverted_index.pushed | 23 |
| abstract_inverted_index.scalar | 90, 185 |
| abstract_inverted_index.shared | 10 |
| abstract_inverted_index.signal | 79 |
| abstract_inverted_index.system | 167 |
| abstract_inverted_index.traces | 163 |
| abstract_inverted_index.vector | 92, 138 |
| abstract_inverted_index.Amdahls | 177 |
| abstract_inverted_index.Execute | 126 |
| abstract_inverted_index.Latency | 97 |
| abstract_inverted_index.average | 36 |
| abstract_inverted_index.compute | 172 |
| abstract_inverted_index.concept | 127 |
| abstract_inverted_index.discuss | 153 |
| abstract_inverted_index.engine, | 81 |
| abstract_inverted_index.farther | 26 |
| abstract_inverted_index.improve | 35 |
| abstract_inverted_index.latency | 6, 18 |
| abstract_inverted_index.leading | 46 |
| abstract_inverted_index.mixture | 183 |
| abstract_inverted_index.suffers | 82 |
| abstract_inverted_index.Adaptive | 98 |
| abstract_inverted_index.Pipeline | 99 |
| abstract_inverted_index.baseband | 166 |
| abstract_inverted_index.be\nused | 62 |
| abstract_inverted_index.benefits | 160 |
| abstract_inverted_index.breaking | 130 |
| abstract_inverted_index.deployed | 33 |
| abstract_inverted_index.directly | 64 |
| abstract_inverted_index.enabling | 136 |
| abstract_inverted_index.lockstep | 87, 131 |
| abstract_inverted_index.memories | 25 |
| abstract_inverted_index.relative | 5 |
| abstract_inverted_index.removing | 114 |
| abstract_inverted_index.requires | 108 |
| abstract_inverted_index.systems. | 58 |
| abstract_inverted_index.variable | 137, 141 |
| abstract_inverted_index.dominated | 17 |
| abstract_inverted_index.execution | 50, 88, 132 |
| abstract_inverted_index.increased | 13 |
| abstract_inverted_index.intensive | 173 |
| abstract_inverted_index.multiport | 24 |
| abstract_inverted_index.overhead. | 73, 119 |
| abstract_inverted_index.latencies, | 39 |
| abstract_inverted_index.limitation | 179 |
| abstract_inverted_index.mechanism. | 151 |
| abstract_inverted_index.processing | 80 |
| abstract_inverted_index.resistance | 16 |
| abstract_inverted_index.technology | 106 |
| abstract_inverted_index.triangular | 149 |
| abstract_inverted_index.a\nwireless | 165 |
| abstract_inverted_index.demonstrate | 157 |
| abstract_inverted_index.improvement | 105 |
| abstract_inverted_index.performance | 159 |
| abstract_inverted_index.a\nbreakdown | 85 |
| abstract_inverted_index.code,\nwhile | 113 |
| abstract_inverted_index.pre-fetching | 42 |
| abstract_inverted_index.unacceptable | 54 |
| abstract_inverted_index.variance\nin | 49 |
| abstract_inverted_index.architecture, | 102 |
| abstract_inverted_index.by\nmulticore | 12 |
| abstract_inverted_index.describe\nthe | 95 |
| abstract_inverted_index.instructions. | 93 |
| abstract_inverted_index.as\nspeculative | 41 |
| abstract_inverted_index.density\nlayout | 22 |
| abstract_inverted_index.Decoupled\nAccess | 124 |
| abstract_inverted_index.functions\nsuffer | 174 |
| abstract_inverted_index.parallelism,\nand | 144 |
| abstract_inverted_index.SLAP\narchitecture | 155 |
| abstract_inverted_index.branch-prediction, | 44 |
| abstract_inverted_index.cache\nperformance | 104 |
| abstract_inverted_index.functional\nunits, | 134 |
| abstract_inverted_index.VLIW\narchitecture, | 75 |
| abstract_inverted_index.Various\ntechniques | 31 |
| abstract_inverted_index.vector\nprocessing).\n | 187 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile.value | 0.27823266 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | False |
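
The `abstract_inverted_index.*` rows in the payload map each token to the word positions where it occurs. As a minimal sketch, assuming the payload has been loaded into a Python dict of token to position list, the readable abstract can be rebuilt by sorting on position. The function name `rebuild_abstract` and the `sample` excerpt are illustrative; the excerpt covers only the first seven positions listed in the table.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild running text from a token -> positions mapping."""
    by_position = {}
    for token, positions in inverted_index.items():
        for position in positions:
            by_position[position] = token
    ordered = [by_position[p] for p in sorted(by_position)]
    # Some tokens carry an embedded newline (e.g. "be\nused");
    # split() followed by join() collapses any whitespace to single spaces.
    return " ".join(" ".join(ordered).split())


# Excerpt of the payload: positions 0 through 6 only.
sample = {
    "Over": [0],
    "the": [1, 4],
    "last": [2],
    "decade": [3],
    "relative": [5],
    "latency": [6],
}
print(rebuild_abstract(sample))  # Over the last decade the relative latency
```

Sorting by position rather than relying on dict order keeps the result independent of how the payload rows happen to be listed.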