Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification, and Local Computations
· 2019
· Open Access
· DOI: https://doi.org/10.48550/arxiv.1906.02367
Communication bottleneck has been identified as a significant issue in distributed optimization of large-scale learning models. Recently, several approaches to mitigate this problem have been proposed, including different forms of gradient compression or computing local models and mixing them iteratively. In this paper, we propose Qsparse-local-SGD algorithm, which combines aggressive sparsification with quantization and local computation along with error compensation, by keeping track of the difference between the true and compressed gradients. We propose both synchronous and asynchronous implementations of Qsparse-local-SGD. We analyze convergence for Qsparse-local-SGD in the distributed setting for smooth non-convex and convex objective functions. We demonstrate that Qsparse-local-SGD converges at the same rate as vanilla distributed SGD for many important classes of sparsifiers and quantizers. We use Qsparse-local-SGD to train ResNet-50 on ImageNet and show that it results in significant savings over the state-of-the-art, in the number of bits transmitted to reach target accuracy.
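The combination described in the abstract (local steps, sparsification, quantization, and an error memory that tracks what compression dropped) can be illustrated with a minimal synchronous sketch. This is not the authors' reference implementation: the top-k sparsifier, the scaled-sign quantizer, and every function name below (`top_k`, `sign_quantize`, `worker_round`, `server_step`, `demo`) are assumptions chosen for illustration, and the toy quadratic objective stands in for a real model.

```python
def top_k(vec, k):
    """Sparsifier (assumed): keep the k largest-magnitude entries, zero the rest."""
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]


def sign_quantize(vec):
    """Quantizer (assumed): replace each nonzero entry by +/- the mean magnitude."""
    magnitudes = [abs(v) for v in vec if v != 0.0]
    if not magnitudes:
        return list(vec)
    scale = sum(magnitudes) / len(magnitudes)
    return [scale if v > 0 else -scale if v < 0 else 0.0 for v in vec]


def worker_round(x_global, memory, grad_fn, lr, local_steps, k):
    """Run local SGD steps, then compress (net progress + error memory)."""
    x_local = list(x_global)
    for _ in range(local_steps):
        grad = grad_fn(x_local)
        x_local = [x - lr * g for x, g in zip(x_local, grad)]
    # What the worker would like to send: accumulated error plus net local progress.
    delta = [m + (xg - xl) for m, xg, xl in zip(memory, x_global, x_local)]
    sent = sign_quantize(top_k(delta, k))
    # Error compensation: remember whatever the compressor dropped.
    new_memory = [d - s for d, s in zip(delta, sent)]
    return sent, new_memory


def server_step(x_global, updates):
    """Average the compressed updates and apply them to the global model."""
    n = len(updates)
    avg = [sum(u[i] for u in updates) / n for i in range(len(x_global))]
    return [x - a for x, a in zip(x_global, avg)]


def demo(rounds=200):
    """Toy run: each worker minimizes 0.5 * ||x - target||^2 for its own target."""
    targets = [[1.0, -2.0, 3.0, 0.5], [3.0, 0.0, 1.0, -0.5]]
    x = [0.0] * 4
    memories = [[0.0] * 4 for _ in targets]
    for _ in range(rounds):
        updates = []
        for r, target in enumerate(targets):
            grad_fn = lambda p, t=target: [pi - ti for pi, ti in zip(p, t)]
            sent, memories[r] = worker_round(
                x, memories[r], grad_fn, lr=0.1, local_steps=4, k=2
            )
            updates.append(sent)
        x = server_step(x, updates)
    return x


if __name__ == "__main__":
    print(demo())
```

The design point to notice is in `worker_round`: the transmitted vector plus the new memory always equals the uncompressed quantity, so nothing the compressor discards is lost, only delayed to a later round.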
Related Topics
- Type
- preprint
- Landing Page
- http://arxiv.org/abs/1906.02367
- https://arxiv.org/pdf/1906.02367
- OA Status
- green
- Cited By
- 1
- Related Works
- 10
- OpenAlex ID
- https://openalex.org/W4288335809
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4288335809 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.1906.02367 (Digital Object Identifier)
- Title: Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification, and Local Computations (work title)
- Type: preprint (OpenAlex work type)
- Publication year: 2019 (year of publication)
- Publication date: 2019-06-05 (full publication date if available)
- Authors: Debraj Basu, Deepesh Data, Can Karakus, Suhas Diggavi (list of authors in order)
- Landing page: https://arxiv.org/abs/1906.02367 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/1906.02367 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/1906.02367 (direct OA link when available)
- Concepts: Bottleneck, Computation, Quantization (signal processing), Computer science, Asynchronous communication, Rate of convergence, Regular polygon, Algorithm, Theoretical computer science, Mathematical optimization, Mathematics, Key (lock), Geometry, Embedded system, Computer network, Computer security (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 1 (total citation count in OpenAlex)
- Citations by year (recent): 2023: 1 (per-year citation counts, last 5 years)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| field | value |
|---|---|
| id | https://openalex.org/W4288335809 |
| doi | https://doi.org/10.48550/arxiv.1906.02367 |
| ids.openalex | https://openalex.org/W4288335809 |
| fwci | 0.0 |
| type | preprint |
| title | Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification, and Local Computations |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10500 |
| topics[0].field.id | https://openalex.org/fields/22 |
| topics[0].field.display_name | Engineering |
| topics[0].score | 0.9987999796867371 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/2206 |
| topics[0].subfield.display_name | Computational Mechanics |
| topics[0].display_name | Sparse and Compressive Sensing Techniques |
| topics[1].id | https://openalex.org/T11612 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9966999888420105 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Stochastic Gradient Optimization Techniques |
| topics[2].id | https://openalex.org/T10129 |
| topics[2].field.id | https://openalex.org/fields/27 |
| topics[2].field.display_name | Medicine |
| topics[2].score | 0.9959999918937683 |
| topics[2].domain.id | https://openalex.org/domains/4 |
| topics[2].domain.display_name | Health Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/2716 |
| topics[2].subfield.display_name | Genetics |
| topics[2].display_name | Glioma Diagnosis and Treatment |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C2780513914 |
| concepts[0].level | 2 |
| concepts[0].score | 0.6348944306373596 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q18210350 |
| concepts[0].display_name | Bottleneck |
| concepts[1].id | https://openalex.org/C45374587 |
| concepts[1].level | 2 |
| concepts[1].score | 0.6312220692634583 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q12525525 |
| concepts[1].display_name | Computation |
| concepts[2].id | https://openalex.org/C28855332 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6137451529502869 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q198099 |
| concepts[2].display_name | Quantization (signal processing) |
| concepts[3].id | https://openalex.org/C41008148 |
| concepts[3].level | 0 |
| concepts[3].score | 0.6035768389701843 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[3].display_name | Computer science |
| concepts[4].id | https://openalex.org/C151319957 |
| concepts[4].level | 2 |
| concepts[4].score | 0.5242000818252563 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q752739 |
| concepts[4].display_name | Asynchronous communication |
| concepts[5].id | https://openalex.org/C57869625 |
| concepts[5].level | 3 |
| concepts[5].score | 0.49476784467697144 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q1783502 |
| concepts[5].display_name | Rate of convergence |
| concepts[6].id | https://openalex.org/C112680207 |
| concepts[6].level | 2 |
| concepts[6].score | 0.48405593633651733 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q714886 |
| concepts[6].display_name | Regular polygon |
| concepts[7].id | https://openalex.org/C11413529 |
| concepts[7].level | 1 |
| concepts[7].score | 0.4145740866661072 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q8366 |
| concepts[7].display_name | Algorithm |
| concepts[8].id | https://openalex.org/C80444323 |
| concepts[8].level | 1 |
| concepts[8].score | 0.332227885723114 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q2878974 |
| concepts[8].display_name | Theoretical computer science |
| concepts[9].id | https://openalex.org/C126255220 |
| concepts[9].level | 1 |
| concepts[9].score | 0.32286179065704346 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q141495 |
| concepts[9].display_name | Mathematical optimization |
| concepts[10].id | https://openalex.org/C33923547 |
| concepts[10].level | 0 |
| concepts[10].score | 0.3058509826660156 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[10].display_name | Mathematics |
| concepts[11].id | https://openalex.org/C26517878 |
| concepts[11].level | 2 |
| concepts[11].score | 0.1363551914691925 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q228039 |
| concepts[11].display_name | Key (lock) |
| concepts[12].id | https://openalex.org/C2524010 |
| concepts[12].level | 1 |
| concepts[12].score | 0.0886954665184021 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q8087 |
| concepts[12].display_name | Geometry |
| concepts[13].id | https://openalex.org/C149635348 |
| concepts[13].level | 1 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q193040 |
| concepts[13].display_name | Embedded system |
| concepts[14].id | https://openalex.org/C31258907 |
| concepts[14].level | 1 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q1301371 |
| concepts[14].display_name | Computer network |
| concepts[15].id | https://openalex.org/C38652104 |
| concepts[15].level | 1 |
| concepts[15].score | 0.0 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q3510521 |
| concepts[15].display_name | Computer security |
| keywords[0].id | https://openalex.org/keywords/bottleneck |
| keywords[0].score | 0.6348944306373596 |
| keywords[0].display_name | Bottleneck |
| keywords[1].id | https://openalex.org/keywords/computation |
| keywords[1].score | 0.6312220692634583 |
| keywords[1].display_name | Computation |
| keywords[2].id | https://openalex.org/keywords/quantization |
| keywords[2].score | 0.6137451529502869 |
| keywords[2].display_name | Quantization (signal processing) |
| keywords[3].id | https://openalex.org/keywords/computer-science |
| keywords[3].score | 0.6035768389701843 |
| keywords[3].display_name | Computer science |
| keywords[4].id | https://openalex.org/keywords/asynchronous-communication |
| keywords[4].score | 0.5242000818252563 |
| keywords[4].display_name | Asynchronous communication |
| keywords[5].id | https://openalex.org/keywords/rate-of-convergence |
| keywords[5].score | 0.49476784467697144 |
| keywords[5].display_name | Rate of convergence |
| keywords[6].id | https://openalex.org/keywords/regular-polygon |
| keywords[6].score | 0.48405593633651733 |
| keywords[6].display_name | Regular polygon |
| keywords[7].id | https://openalex.org/keywords/algorithm |
| keywords[7].score | 0.4145740866661072 |
| keywords[7].display_name | Algorithm |
| keywords[8].id | https://openalex.org/keywords/theoretical-computer-science |
| keywords[8].score | 0.332227885723114 |
| keywords[8].display_name | Theoretical computer science |
| keywords[9].id | https://openalex.org/keywords/mathematical-optimization |
| keywords[9].score | 0.32286179065704346 |
| keywords[9].display_name | Mathematical optimization |
| keywords[10].id | https://openalex.org/keywords/mathematics |
| keywords[10].score | 0.3058509826660156 |
| keywords[10].display_name | Mathematics |
| keywords[11].id | https://openalex.org/keywords/key |
| keywords[11].score | 0.1363551914691925 |
| keywords[11].display_name | Key (lock) |
| keywords[12].id | https://openalex.org/keywords/geometry |
| keywords[12].score | 0.0886954665184021 |
| keywords[12].display_name | Geometry |
| language | |
| locations[0].id | pmh:oai:arXiv.org:1906.02367 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/1906.02367 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/1906.02367 |
| indexed_in | arxiv |
| authorships[0].author.id | https://openalex.org/A5020834141 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-8731-690X |
| authorships[0].author.display_name | Debraj Basu |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Basu, Debraj |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5023777978 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-3544-8414 |
| authorships[1].author.display_name | Deepesh Data |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Data, Deepesh |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5103177913 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-6878-6984 |
| authorships[2].author.display_name | Can Karakus |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Karakus, Can |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5083980887 |
| authorships[3].author.orcid | https://orcid.org/0000-0001-7313-9861 |
| authorships[3].author.display_name | Suhas Diggavi |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Diggavi, Suhas |
| authorships[3].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/1906.02367 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2022-07-29T00:00:00 |
| display_name | Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification, and Local Computations |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-10-24T21:50:52.558619 |
| primary_topic.id | https://openalex.org/T10500 |
| primary_topic.field.id | https://openalex.org/fields/22 |
| primary_topic.field.display_name | Engineering |
| primary_topic.score | 0.9987999796867371 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/2206 |
| primary_topic.subfield.display_name | Computational Mechanics |
| primary_topic.display_name | Sparse and Compressive Sensing Techniques |
| related_works | https://openalex.org/W2595172197, https://openalex.org/W2084856301, https://openalex.org/W2127970246, https://openalex.org/W4382618745, https://openalex.org/W2885125400, https://openalex.org/W1001352512, https://openalex.org/W1989889224, https://openalex.org/W1973775000, https://openalex.org/W2748922771, https://openalex.org/W2114711060 |
| cited_by_count | 1 |
| counts_by_year[0].year | 2023 |
| counts_by_year[0].cited_by_count | 1 |
| locations_count | 1 |
| best_oa_location.id | pmh:oai:arXiv.org:1906.02367 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/1906.02367 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/1906.02367 |
| primary_location.id | pmh:oai:arXiv.org:1906.02367 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/1906.02367 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/1906.02367 |
| publication_date | 2019-06-05 |
| publication_year | 2019 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 6 |
| abstract_inverted_index.In | 36 |
| abstract_inverted_index.We | 65, 73, 87, 106 |
| abstract_inverted_index.as | 5, 95 |
| abstract_inverted_index.at | 91 |
| abstract_inverted_index.by | 54 |
| abstract_inverted_index.in | 77, 117, 123 |
| abstract_inverted_index.of | 11, 26, 57, 71, 102, 126 |
| abstract_inverted_index.on | 111 |
| abstract_inverted_index.or | 29 |
| abstract_inverted_index.to | 17, 108, 128 |
| abstract_inverted_index.we | 39 |
| abstract_inverted_index.and | 33, 48, 62, 83, 104, 113 |
| abstract_inverted_index.for | 75 |
| abstract_inverted_index.has | 2 |
| abstract_inverted_index.the | 58, 60, 78, 92, 121, 124 |
| abstract_inverted_index.been | 3, 22 |
| abstract_inverted_index.both | 67 |
| abstract_inverted_index.have | 21 |
| abstract_inverted_index.many | 99 |
| abstract_inverted_index.over | 120 |
| abstract_inverted_index.rate | 94 |
| abstract_inverted_index.same | 93 |
| abstract_inverted_index.show | 114 |
| abstract_inverted_index.that | 115 |
| abstract_inverted_index.this | 19, 37 |
| abstract_inverted_index.true | 61 |
| abstract_inverted_index.with | 46, 51 |
| abstract_inverted_index.along | 50 |
| abstract_inverted_index.error | 52 |
| abstract_inverted_index.issue | 8 |
| abstract_inverted_index.local | 31 |
| abstract_inverted_index.reach | 129 |
| abstract_inverted_index.track | 56 |
| abstract_inverted_index.train | 109 |
| abstract_inverted_index.convex | 84 |
| abstract_inverted_index.mixing | 34 |
| abstract_inverted_index.models | 32 |
| abstract_inverted_index.number | 125 |
| abstract_inverted_index.paper, | 38 |
| abstract_inverted_index.target | 130 |
| abstract_inverted_index.classes | 101 |
| abstract_inverted_index.keeping | 55 |
| abstract_inverted_index.models. | 14 |
| abstract_inverted_index.problem | 20 |
| abstract_inverted_index.propose | 40, 66 |
| abstract_inverted_index.savings | 119 |
| abstract_inverted_index.setting | 80 |
| abstract_inverted_index.vanilla | 96 |
| abstract_inverted_index.ImageNet | 112 |
| abstract_inverted_index.SGD\nfor | 98 |
| abstract_inverted_index.combines | 43 |
| abstract_inverted_index.gradient | 27 |
| abstract_inverted_index.learning | 13 |
| abstract_inverted_index.mitigate | 18 |
| abstract_inverted_index.Recently, | 15 |
| abstract_inverted_index.ResNet-50 | 110 |
| abstract_inverted_index.computing | 30 |
| abstract_inverted_index.converges | 90 |
| abstract_inverted_index.important | 100 |
| abstract_inverted_index.including | 24 |
| abstract_inverted_index.objective | 85 |
| abstract_inverted_index.proposed, | 23 |
| abstract_inverted_index.aggressive | 44 |
| abstract_inverted_index.bottleneck | 1 |
| abstract_inverted_index.compressed | 63 |
| abstract_inverted_index.functions. | 86 |
| abstract_inverted_index.gradients. | 64 |
| abstract_inverted_index.identified | 4 |
| abstract_inverted_index.non-convex | 82 |
| abstract_inverted_index.accuracy.\n | 131 |
| abstract_inverted_index.compression | 28 |
| abstract_inverted_index.demonstrate | 88 |
| abstract_inverted_index.distributed | 97 |
| abstract_inverted_index.for\nsmooth | 81 |
| abstract_inverted_index.it\nresults | 116 |
| abstract_inverted_index.large-scale | 12 |
| abstract_inverted_index.quantizers. | 105 |
| abstract_inverted_index.significant | 7, 118 |
| abstract_inverted_index.sparsifiers | 103 |
| abstract_inverted_index.synchronous | 68 |
| abstract_inverted_index.optimization | 10 |
| abstract_inverted_index.quantization | 47 |
| abstract_inverted_index.Communication | 0 |
| abstract_inverted_index.compensation, | 53 |
| abstract_inverted_index.sparsification | 45 |
| abstract_inverted_index.implementations | 70 |
| abstract_inverted_index.in\ndistributed | 9 |
| abstract_inverted_index.different\nforms | 25 |
| abstract_inverted_index.algorithm,\nwhich | 42 |
| abstract_inverted_index.and\nasynchronous | 69 |
| abstract_inverted_index.bits\ntransmitted | 127 |
| abstract_inverted_index.state-of-the-art, | 122 |
| abstract_inverted_index.local\ncomputation | 49 |
| abstract_inverted_index.them\niteratively. | 35 |
| abstract_inverted_index.\\emph{distributed} | 79 |
| abstract_inverted_index.difference\nbetween | 59 |
| abstract_inverted_index.several\napproaches | 16 |
| abstract_inverted_index.analyze\nconvergence | 74 |
| abstract_inverted_index.\\emph{Qsparse-local-SGD} | 41, 76 |
| abstract_inverted_index.\\emph{Qsparse-local-SGD}. | 72 |
| abstract_inverted_index.use\n\\emph{Qsparse-local-SGD} | 107 |
| abstract_inverted_index.that\n\\emph{Qsparse-local-SGD} | 89 |
| cited_by_percentile_year.max | 94 |
| cited_by_percentile_year.min | 89 |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| citation_normalized_percentile.value | 0.35293825 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | False |
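
The `abstract_inverted_index.*` rows above map each token to the word positions at which it occurs in the abstract. A minimal sketch of how such an index could be turned back into running text follows; the function name `rebuild_abstract` and the dictionary `sample` are hypothetical, and `sample` is only an excerpt restricted to positions 0 through 8 of the rows listed in the table.

```python
def rebuild_abstract(inverted_index):
    """Invert a token -> positions mapping back into a space-joined string."""
    by_position = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Emit tokens in ascending position order.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Excerpt of the index above, truncated to positions 0-8 (illustrative only).
sample = {
    "Communication": [0],
    "bottleneck": [1],
    "has": [2],
    "been": [3],
    "identified": [4],
    "as": [5],
    "a": [6],
    "significant": [7],
    "issue": [8],
}

if __name__ == "__main__":
    print(rebuild_abstract(sample))
```

Tokens in the full index keep whatever characters the source text carried, so a complete reconstruction would likely need a cleanup pass for embedded line breaks and LaTeX markup.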