Barrier-Free Large-Scale Sparse Tensor Accelerator (BARISTA) For Convolutional Neural Networks
· 2021
· Open Access
· DOI: https://doi.org/10.48550/arxiv.2104.08734
Convolutional neural networks (CNNs) are emerging as powerful tools for visual recognition. Recent architecture proposals for sparse CNNs exploit zeros in the feature maps and filters for performance and energy without losing accuracy. Sparse architectures that exploit two-sided sparsity in both feature maps and filters have been studied only at small scales (e.g., 1K multiply-accumulate (MAC) units). However, to realize their advantages in full, the sparse architectures have to be scaled up to the levels of the dense architectures (e.g., 32K MACs in the TPU). Such scaling is challenging since achieving reuse through broadcasts incurs an implicit barrier cost and raises the inter-related issues of load imbalance, buffering, and on-chip bandwidth demand. SparTen, a previous scheme, addresses one aspect of load balancing but not other aspects, nor the other issues of buffering and bandwidth. To that end, we propose the barrier-free large-scale sparse tensor accelerator (BARISTA). BARISTA (1) is the first architecture for scaling up sparse CNN accelerators; (2) reduces on-chip bandwidth demand by telescoping request-combining the input map requests and snarfing the filter requests; (3) reduces buffering via basic buffer sharing and avoids the ensuing barriers between consecutive input maps by coloring the output buffers; (4) load balances intra-filter work via dynamic round-robin work assignment; and (5) employs hierarchical buffering, which achieves high cache bandwidth via a few wide, shared buffers and low buffering via narrower, private buffers at the compute. Our simulations show that, on average, BARISTA performs 5.4x, 2.2x, 1.7x, and 2.5x better than a dense, a one-sided, a naively-scaled two-sided, and an iso-area two-sided architecture, respectively. Using 45-nm technology, ASIC synthesis of our RTL design for four clusters of 8K MACs at 1 GHz clock speed reports 213 mm$^2$ area and 170 W power.
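To make the term "two-sided sparsity" in the abstract concrete, here is a minimal software sketch of a dot product that performs a multiply-accumulate only where both the feature-map value and the filter weight are nonzero. It illustrates the general idea only and is not BARISTA's hardware datapath; the function names `compress` and `two_sided_dot` and the sample vectors are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def compress(dense: Sequence[float]) -> Dict[int, float]:
    """Keep only the nonzero entries, keyed by their position."""
    return {i: v for i, v in enumerate(dense) if v != 0}


def two_sided_dot(fmap: Sequence[float], filt: Sequence[float]) -> Tuple[float, int]:
    """Dot product that multiplies only where BOTH operands are nonzero.

    Returns (result, number of multiply-accumulates actually performed).
    """
    a, b = compress(fmap), compress(filt)
    matches = a.keys() & b.keys()  # positions that are nonzero on both sides
    total = sum(a[i] * b[i] for i in matches)
    return total, len(matches)


# Hypothetical 8-element feature-map and filter vectors (illustrative data).
fmap = [0, 3, 0, 0, 2, 0, 1, 0]
filt = [4, 0, 0, 5, 1, 0, 2, 0]

result, macs = two_sided_dot(fmap, filt)
dense_result = sum(x * y for x, y in zip(fmap, filt))

print(result, macs, dense_result)
```

In this toy input, a dense design would issue 8 multiplies and a one-sided design that skips only the zero feature-map values would issue 3, while the two-sided version issues 2 and produces the same result.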
Related Topics
- Type
- preprint
- Landing Page
- http://arxiv.org/abs/2104.08734
- https://arxiv.org/pdf/2104.08734
- OA Status
- green
- Related Works
- 10
- OpenAlex ID
- https://openalex.org/W4287207682
Raw OpenAlex JSON
- OpenAlex ID
-
https://openalex.org/W4287207682 (Canonical identifier for this work in OpenAlex)
- DOI
-
https://doi.org/10.48550/arxiv.2104.08734 (Digital Object Identifier)
- Title
-
Barrier-Free Large-Scale Sparse Tensor Accelerator (BARISTA) For Convolutional Neural Networks (Work title)
- Type
-
preprint (OpenAlex work type)
- Publication year
-
2021 (Year of publication)
- Publication date
-
2021-04-18 (Full publication date if available)
- Authors
-
Ashish Gondimalla, Sree Charan Gundabolu, T. N. Vijaykumar, Mithuna Thottethodi (List of authors in order)
- Landing page
-
https://arxiv.org/abs/2104.08734 (Publisher landing page)
- PDF URL
-
https://arxiv.org/pdf/2104.08734 (Direct link to full text PDF)
- Open access
-
Yes (Whether a free full text is available)
- OA status
-
green (Open access status per OpenAlex)
- OA URL
-
https://arxiv.org/pdf/2104.08734 (Direct OA link when available)
- Concepts
-
Computer science, Convolutional neural network, Exploit, Bandwidth (computing), Parallel computing, Cache, Scaling, Computer engineering, Memory bandwidth, Reuse, Artificial intelligence, Computer network, Geometry, Computer security, Mathematics, Ecology, Biology (Top concepts (fields/topics) attached by OpenAlex)
- Cited by
-
0 (Total citation count in OpenAlex)
- Related works (count)
-
10 (Other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4287207682 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2104.08734 |
| ids.openalex | https://openalex.org/W4287207682 |
| fwci | 0.0 |
| type | preprint |
| title | Barrier-Free Large-Scale Sparse Tensor Accelerator (BARISTA) For Convolutional Neural Networks |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T12303 |
| topics[0].field.id | https://openalex.org/fields/26 |
| topics[0].field.display_name | Mathematics |
| topics[0].score | 0.9955999851226807 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/2605 |
| topics[0].subfield.display_name | Computational Mathematics |
| topics[0].display_name | Tensor decomposition and applications |
| topics[1].id | https://openalex.org/T10036 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.951200008392334 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1707 |
| topics[1].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[1].display_name | Advanced Neural Network Applications |
| topics[2].id | https://openalex.org/T10054 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9254000186920166 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1708 |
| topics[2].subfield.display_name | Hardware and Architecture |
| topics[2].display_name | Parallel Computing and Optimization Techniques |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.8267182111740112 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C81363708 |
| concepts[1].level | 2 |
| concepts[1].score | 0.6739274263381958 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q17084460 |
| concepts[1].display_name | Convolutional neural network |
| concepts[2].id | https://openalex.org/C165696696 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6509492993354797 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q11287 |
| concepts[2].display_name | Exploit |
| concepts[3].id | https://openalex.org/C2776257435 |
| concepts[3].level | 2 |
| concepts[3].score | 0.6358124613761902 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q1576430 |
| concepts[3].display_name | Bandwidth (computing) |
| concepts[4].id | https://openalex.org/C173608175 |
| concepts[4].level | 1 |
| concepts[4].score | 0.5374183654785156 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q232661 |
| concepts[4].display_name | Parallel computing |
| concepts[5].id | https://openalex.org/C115537543 |
| concepts[5].level | 2 |
| concepts[5].score | 0.50034499168396 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q165596 |
| concepts[5].display_name | Cache |
| concepts[6].id | https://openalex.org/C99844830 |
| concepts[6].level | 2 |
| concepts[6].score | 0.4523800015449524 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q102441924 |
| concepts[6].display_name | Scaling |
| concepts[7].id | https://openalex.org/C113775141 |
| concepts[7].level | 1 |
| concepts[7].score | 0.4364683926105499 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q428691 |
| concepts[7].display_name | Computer engineering |
| concepts[8].id | https://openalex.org/C188045654 |
| concepts[8].level | 2 |
| concepts[8].score | 0.43367788195610046 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q17148339 |
| concepts[8].display_name | Memory bandwidth |
| concepts[9].id | https://openalex.org/C206588197 |
| concepts[9].level | 2 |
| concepts[9].score | 0.43071386218070984 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q846574 |
| concepts[9].display_name | Reuse |
| concepts[10].id | https://openalex.org/C154945302 |
| concepts[10].level | 1 |
| concepts[10].score | 0.21360698342323303 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[10].display_name | Artificial intelligence |
| concepts[11].id | https://openalex.org/C31258907 |
| concepts[11].level | 1 |
| concepts[11].score | 0.14665672183036804 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q1301371 |
| concepts[11].display_name | Computer network |
| concepts[12].id | https://openalex.org/C2524010 |
| concepts[12].level | 1 |
| concepts[12].score | 0.0 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q8087 |
| concepts[12].display_name | Geometry |
| concepts[13].id | https://openalex.org/C38652104 |
| concepts[13].level | 1 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q3510521 |
| concepts[13].display_name | Computer security |
| concepts[14].id | https://openalex.org/C33923547 |
| concepts[14].level | 0 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[14].display_name | Mathematics |
| concepts[15].id | https://openalex.org/C18903297 |
| concepts[15].level | 1 |
| concepts[15].score | 0.0 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q7150 |
| concepts[15].display_name | Ecology |
| concepts[16].id | https://openalex.org/C86803240 |
| concepts[16].level | 0 |
| concepts[16].score | 0.0 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q420 |
| concepts[16].display_name | Biology |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.8267182111740112 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/convolutional-neural-network |
| keywords[1].score | 0.6739274263381958 |
| keywords[1].display_name | Convolutional neural network |
| keywords[2].id | https://openalex.org/keywords/exploit |
| keywords[2].score | 0.6509492993354797 |
| keywords[2].display_name | Exploit |
| keywords[3].id | https://openalex.org/keywords/bandwidth |
| keywords[3].score | 0.6358124613761902 |
| keywords[3].display_name | Bandwidth (computing) |
| keywords[4].id | https://openalex.org/keywords/parallel-computing |
| keywords[4].score | 0.5374183654785156 |
| keywords[4].display_name | Parallel computing |
| keywords[5].id | https://openalex.org/keywords/cache |
| keywords[5].score | 0.50034499168396 |
| keywords[5].display_name | Cache |
| keywords[6].id | https://openalex.org/keywords/scaling |
| keywords[6].score | 0.4523800015449524 |
| keywords[6].display_name | Scaling |
| keywords[7].id | https://openalex.org/keywords/computer-engineering |
| keywords[7].score | 0.4364683926105499 |
| keywords[7].display_name | Computer engineering |
| keywords[8].id | https://openalex.org/keywords/memory-bandwidth |
| keywords[8].score | 0.43367788195610046 |
| keywords[8].display_name | Memory bandwidth |
| keywords[9].id | https://openalex.org/keywords/reuse |
| keywords[9].score | 0.43071386218070984 |
| keywords[9].display_name | Reuse |
| keywords[10].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[10].score | 0.21360698342323303 |
| keywords[10].display_name | Artificial intelligence |
| keywords[11].id | https://openalex.org/keywords/computer-network |
| keywords[11].score | 0.14665672183036804 |
| keywords[11].display_name | Computer network |
| language | |
| locations[0].id | pmh:oai:arXiv.org:2104.08734 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2104.08734 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2104.08734 |
| indexed_in | arxiv |
| authorships[0].author.id | https://openalex.org/A5001458388 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-3370-3576 |
| authorships[0].author.display_name | Ashish Gondimalla |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Gondimalla, Ashish |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5060960403 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Sree Charan Gundabolu |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Gundabolu, Sree Charan |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5103145581 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-6624-4372 |
| authorships[2].author.display_name | T. N. Vijaykumar |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Vijaykumar, T. N. |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5069139257 |
| authorships[3].author.orcid | https://orcid.org/0000-0003-4164-4542 |
| authorships[3].author.display_name | Mithuna Thottethodi |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Thottethodi, Mithuna |
| authorships[3].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2104.08734 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2022-07-25T00:00:00 |
| display_name | Barrier-Free Large-Scale Sparse Tensor Accelerator (BARISTA) For Convolutional Neural Networks |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-10-24T21:50:52.558619 |
| primary_topic.id | https://openalex.org/T12303 |
| primary_topic.field.id | https://openalex.org/fields/26 |
| primary_topic.field.display_name | Mathematics |
| primary_topic.score | 0.9955999851226807 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/2605 |
| primary_topic.subfield.display_name | Computational Mathematics |
| primary_topic.display_name | Tensor decomposition and applications |
| related_works | https://openalex.org/W17155033, https://openalex.org/W3207760230, https://openalex.org/W1496222301, https://openalex.org/W1590307681, https://openalex.org/W2536018345, https://openalex.org/W4312814274, https://openalex.org/W4285370786, https://openalex.org/W2296488620, https://openalex.org/W2358353312, https://openalex.org/W2353836703 |
| cited_by_count | 0 |
| locations_count | 1 |
| best_oa_location.id | pmh:oai:arXiv.org:2104.08734 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2104.08734 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2104.08734 |
| primary_location.id | pmh:oai:arXiv.org:2104.08734 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2104.08734 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2104.08734 |
| publication_date | 2021-04-18 |
| publication_year | 2021 |
| referenced_works_count | 0 |
| abstract_inverted_index.1 | 247 |
| abstract_inverted_index.W | 256 |
| abstract_inverted_index.a | 99, 194, 220, 222, 224 |
| abstract_inverted_index.8K | 244 |
| abstract_inverted_index.To | 119 |
| abstract_inverted_index.as | 6 |
| abstract_inverted_index.at | 45, 205, 246 |
| abstract_inverted_index.be | 62 |
| abstract_inverted_index.by | 145, 170 |
| abstract_inverted_index.in | 36, 56, 73 |
| abstract_inverted_index.is | 78, 131 |
| abstract_inverted_index.of | 67, 91, 105, 115, 236, 243 |
| abstract_inverted_index.on | 211 |
| abstract_inverted_index.to | 52, 61, 65 |
| abstract_inverted_index.up | 64, 137 |
| abstract_inverted_index.we | 121 |
| abstract_inverted_index.(1) | 130 |
| abstract_inverted_index.(2) | 140 |
| abstract_inverted_index.(4) | 175 |
| abstract_inverted_index.170 | 255 |
| abstract_inverted_index.213 | 251 |
| abstract_inverted_index.32K | 71 |
| abstract_inverted_index.GHz | 248 |
| abstract_inverted_index.Our | 207 |
| abstract_inverted_index.RTL | 238 |
| abstract_inverted_index.and | 22, 26, 39, 95, 117, 151, 162, 184, 198, 227, 254 |
| abstract_inverted_index.are | 4 |
| abstract_inverted_index.for | 14, 24, 135, 240 |
| abstract_inverted_index.low | 199 |
| abstract_inverted_index.map | 149 |
| abstract_inverted_index.nor | 111 |
| abstract_inverted_index.one | 103 |
| abstract_inverted_index.our | 237 |
| abstract_inverted_index.the | 19, 68, 74, 112, 123, 132, 147, 153, 164, 172 |
| abstract_inverted_index.via | 158, 179, 193, 201 |
| abstract_inverted_index.2.5x | 217 |
| abstract_inverted_index.CNNs | 16 |
| abstract_inverted_index.MACs | 72, 245 |
| abstract_inverted_index.Such | 76 |
| abstract_inverted_index.area | 253 |
| abstract_inverted_index.been | 42 |
| abstract_inverted_index.both | 37 |
| abstract_inverted_index.cost | 87 |
| abstract_inverted_index.four | 241 |
| abstract_inverted_index.have | 41, 60 |
| abstract_inverted_index.high | 190 |
| abstract_inverted_index.load | 92, 106 |
| abstract_inverted_index.maps | 21, 169 |
| abstract_inverted_index.only | 44 |
| abstract_inverted_index.show | 209 |
| abstract_inverted_index.than | 219 |
| abstract_inverted_index.that | 32 |
| abstract_inverted_index.work | 178, 182 |
| abstract_inverted_index.45-nm | 233 |
| abstract_inverted_index.5.4x, | 215 |
| abstract_inverted_index.TPU). | 75 |
| abstract_inverted_index.Using | 232 |
| abstract_inverted_index.basic | 159 |
| abstract_inverted_index.cache | 191 |
| abstract_inverted_index.clock | 249 |
| abstract_inverted_index.first | 133 |
| abstract_inverted_index.input | 148, 168 |
| abstract_inverted_index.other | 109, 113 |
| abstract_inverted_index.reuse | 81 |
| abstract_inverted_index.small | 46 |
| abstract_inverted_index.that, | 210 |
| abstract_inverted_index.their | 54 |
| abstract_inverted_index.tools | 8 |
| abstract_inverted_index.which | 188 |
| abstract_inverted_index.(CNNs) | 3 |
| abstract_inverted_index.(e.g., | 48, 70 |
| abstract_inverted_index.Recent | 11 |
| abstract_inverted_index.Sparse | 30 |
| abstract_inverted_index.aspect | 104 |
| abstract_inverted_index.avoids | 163 |
| abstract_inverted_index.better | 218 |
| abstract_inverted_index.buffer | 160 |
| abstract_inverted_index.demand | 144 |
| abstract_inverted_index.dense, | 221 |
| abstract_inverted_index.design | 239 |
| abstract_inverted_index.energy | 27 |
| abstract_inverted_index.filter | 154 |
| abstract_inverted_index.incurs | 84 |
| abstract_inverted_index.issues | 90, 114 |
| abstract_inverted_index.levels | 66 |
| abstract_inverted_index.mm$^2$ | 252 |
| abstract_inverted_index.neural | 1 |
| abstract_inverted_index.output | 173 |
| abstract_inverted_index.raises | 88 |
| abstract_inverted_index.scaled | 63 |
| abstract_inverted_index.scales | 47 |
| abstract_inverted_index.shared | 196 |
| abstract_inverted_index.sparse | 15, 58, 126, 138 |
| abstract_inverted_index.tensor | 127 |
| abstract_inverted_index.BARISTA | 129 |
| abstract_inverted_index.barista | 213 |
| abstract_inverted_index.barrier | 86 |
| abstract_inverted_index.buffers | 197, 204 |
| abstract_inverted_index.dynamic | 180 |
| abstract_inverted_index.ensuing | 165 |
| abstract_inverted_index.exploit | 17, 33 |
| abstract_inverted_index.feature | 20 |
| abstract_inverted_index.filters | 23, 40 |
| abstract_inverted_index.on-chip | 96, 142 |
| abstract_inverted_index.private | 203 |
| abstract_inverted_index.propose | 122 |
| abstract_inverted_index.realize | 53 |
| abstract_inverted_index.reduces | 141 |
| abstract_inverted_index.scaling | 77, 136 |
| abstract_inverted_index.scheme, | 101 |
| abstract_inverted_index.sharing | 161 |
| abstract_inverted_index.studied | 43 |
| abstract_inverted_index.through | 82 |
| abstract_inverted_index.units). | 50 |
| abstract_inverted_index.without | 28 |
| abstract_inverted_index.However, | 51 |
| abstract_inverted_index.SparTen, | 98 |
| abstract_inverted_index.achieves | 189 |
| abstract_inverted_index.aspects, | 110 |
| abstract_inverted_index.average, | 212 |
| abstract_inverted_index.buffers; | 174 |
| abstract_inverted_index.but\nnot | 108 |
| abstract_inverted_index.clusters | 242 |
| abstract_inverted_index.coloring | 171 |
| abstract_inverted_index.emerging | 5 |
| abstract_inverted_index.implicit | 85 |
| abstract_inverted_index.networks | 2 |
| abstract_inverted_index.performs | 214 |
| abstract_inverted_index.power.\n | 257 |
| abstract_inverted_index.powerful | 7 |
| abstract_inverted_index.previous | 100 |
| abstract_inverted_index.requests | 150 |
| abstract_inverted_index.snarfing | 152 |
| abstract_inverted_index.sparsity | 35 |
| abstract_inverted_index.addresses | 102 |
| abstract_inverted_index.balancing | 107 |
| abstract_inverted_index.bandwidth | 143, 192 |
| abstract_inverted_index.buffering | 116, 157, 187, 200 |
| abstract_inverted_index.narrower, | 202 |
| abstract_inverted_index.proposals | 13 |
| abstract_inverted_index.requests; | 155 |
| abstract_inverted_index.two-sided | 34, 229 |
| abstract_inverted_index.zeros\nin | 18 |
| abstract_inverted_index.advantages | 55 |
| abstract_inverted_index.bandwidth. | 118 |
| abstract_inverted_index.broadcasts | 83 |
| abstract_inverted_index.buffering, | 94 |
| abstract_inverted_index.full,\nthe | 57 |
| abstract_inverted_index.imbalance, | 93 |
| abstract_inverted_index.one-sided, | 223 |
| abstract_inverted_index.that\nend, | 120 |
| abstract_inverted_index.two-sided, | 226 |
| abstract_inverted_index.assignment; | 183 |
| abstract_inverted_index.challenging | 79 |
| abstract_inverted_index.consecutive | 167 |
| abstract_inverted_index.few,\nwide, | 195 |
| abstract_inverted_index.for\nvisual | 9 |
| abstract_inverted_index.large-scale | 125 |
| abstract_inverted_index.performance | 25 |
| abstract_inverted_index.round-robin | 181 |
| abstract_inverted_index.simulations | 208 |
| abstract_inverted_index.technology, | 234 |
| abstract_inverted_index.(3)\nreduces | 156 |
| abstract_inverted_index.(5)\nemploys | 185 |
| abstract_inverted_index.2.2x,\n1.7x, | 216 |
| abstract_inverted_index.an\niso-area | 228 |
| abstract_inverted_index.architecture | 12, 134 |
| abstract_inverted_index.barrier-free | 124 |
| abstract_inverted_index.hierarchical | 186 |
| abstract_inverted_index.intra-filter | 177 |
| abstract_inverted_index.recognition. | 10 |
| abstract_inverted_index.Convolutional | 0 |
| abstract_inverted_index.architecture, | 230 |
| abstract_inverted_index.architectures | 31, 59 |
| abstract_inverted_index.feature\nmaps | 38 |
| abstract_inverted_index.respectively. | 231 |
| abstract_inverted_index.the\ncompute. | 206 |
| abstract_inverted_index.load\nbalances | 176 |
| abstract_inverted_index.naively-scaled | 225 |
| abstract_inverted_index.ASIC\nsynthesis | 235 |
| abstract_inverted_index.speed,\nreports | 250 |
| abstract_inverted_index.since\nachieving | 80 |
| abstract_inverted_index.barriers\nbetween | 166 |
| abstract_inverted_index.losing\naccuracy. | 29 |
| abstract_inverted_index.CNN\naccelerators; | 139 |
| abstract_inverted_index.bandwidth\ndemand. | 97 |
| abstract_inverted_index.the\ninter-related | 89 |
| abstract_inverted_index.dense\narchitectures | 69 |
| abstract_inverted_index.accelerator\n(BARISTA). | 128 |
| abstract_inverted_index.1K\nmultiply-accumulate(MAC) | 49 |
| abstract_inverted_index.telescoping\nrequest-combining | 146 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/7 |
| sustainable_development_goals[0].score | 0.8199999928474426 |
| sustainable_development_goals[0].display_name | Affordable and clean energy |
| citation_normalized_percentile.value | 0.1875 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | False |
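The `abstract_inverted_index.*` rows in the payload map each token of the abstract to the word positions where it occurs. A minimal sketch of turning such a mapping back into readable text follows; the helper name `rebuild_abstract` is hypothetical, and the sample dictionary holds only the first nine tokens copied from the rows above, not the full index.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Place every token at each of its positions, then join in position order."""
    positions: Dict[int, str] = {}
    for token, indexes in inverted_index.items():
        for i in indexes:
            positions[i] = token
    return " ".join(positions[i] for i in sorted(positions))


# First nine tokens of this record's inverted index (positions 0 to 8).
sample = {
    "Convolutional": [0],
    "neural": [1],
    "networks": [2],
    "(CNNs)": [3],
    "are": [4],
    "emerging": [5],
    "as": [6],
    "powerful": [7],
    "tools": [8],
}

text = rebuild_abstract(sample)
print(text)
```

Some tokens in the full index contain embedded line breaks (shown as `\n` in the table), so a caller may want to normalize whitespace in the joined string.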