SSVQ: Unleashing the Potential of Vector Quantization with Sign-Splitting
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2503.08668
Vector Quantization (VQ) has emerged as a prominent weight compression technique, showcasing substantially lower quantization errors than uniform quantization across diverse models, particularly in extreme compression scenarios. However, its efficacy during fine-tuning is limited by the constraint of the compression format, where weight vectors assigned to the same codeword are restricted to updates in the same direction. Consequently, many quantized weights are compelled to move in directions contrary to their local gradient information. To mitigate this issue, we introduce a novel VQ paradigm, Sign-Splitting VQ (SSVQ), which decouples the sign bit of weights from the codebook. Our approach involves extracting the sign bits of uncompressed weights and performing clustering and compression on all-positive weights. We then introduce latent variables for the sign bit and jointly optimize both the signs and the codebook. Additionally, we implement a progressive freezing strategy for the learnable sign to ensure training stability. Extensive experiments on various modern models and tasks demonstrate that SSVQ achieves a significantly superior compression-accuracy trade-off compared to conventional VQ. Furthermore, we validate our algorithm on a hardware accelerator, showing that SSVQ achieves a 3$\times$ speedup over the 8-bit compressed model by reducing memory access. Our code is available at https://github.com/list0830/SSVQ.
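The sign-splitting step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see their repository for that): the function names, the vector dimension `dim`, the codebook size `k`, and the use of plain Lloyd's k-means are all assumptions made for illustration. It covers only sign extraction, clustering of the all-positive magnitudes, and reconstruction; the learnable sign latents and the progressive freezing strategy are not modeled.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in for the paper's clustering)."""
    rng = np.random.default_rng(seed)
    # Initialise the codebook from k distinct input vectors.
    codebook = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every vector to every codeword.
        dist = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        assign = dist.argmin(1)
        for j in range(k):
            members = x[assign == j]
            if len(members):
                codebook[j] = members.mean(0)
    # Final assignment against the updated codebook.
    dist = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return codebook, dist.argmin(1)


def sign_split_quantize(weights, dim=4, k=16):
    """Split off the signs, then cluster the magnitudes (hypothetical helper)."""
    vecs = weights.reshape(-1, dim)
    signs = np.where(vecs >= 0, 1.0, -1.0)      # one bit per weight
    codebook, assign = kmeans(np.abs(vecs), k)  # codebook holds magnitudes only
    return signs, codebook, assign


def reconstruct(signs, codebook, assign, shape):
    """Rebuild weights as sign * codeword magnitude."""
    return (signs * codebook[assign]).reshape(shape)


rng = np.random.default_rng(0)
w = rng.standard_normal((8, 16))
signs, codebook, assign = sign_split_quantize(w)
w_hat = reconstruct(signs, codebook, assign, w.shape)
print("mean squared error:", float(((w - w_hat) ** 2).mean()))
```

Because the signs are stored per weight, two vectors that share a codeword can still differ in sign element by element, which is the decoupling the abstract refers to. The cost in this sketch is one extra bit per weight on top of the codeword index.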
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2503.08668, https://arxiv.org/pdf/2503.08668
- OA Status: green
- OpenAlex ID: https://openalex.org/W4414578663
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4414578663 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2503.08668 (Digital Object Identifier)
- Title: SSVQ: Unleashing the Potential of Vector Quantization with Sign-Splitting (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-03-11 (full publication date if available)
- Authors: Shuaiting Li, Juncan Deng, Chenxuan Wang, Kedong Xu, Robert H. Deng, Hong Gu, Haibin Shen, Kejie Huang (list of authors in order)
- Landing page: https://arxiv.org/abs/2503.08668 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2503.08668 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2503.08668 (direct OA link when available)
- Cited by: 0 (total citation count in OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4414578663 |
| doi | https://doi.org/10.48550/arxiv.2503.08668 |
| ids.doi | https://doi.org/10.48550/arxiv.2503.08668 |
| ids.openalex | https://openalex.org/W4414578663 |
| fwci | |
| type | preprint |
| title | SSVQ: Unleashing the Potential of Vector Quantization with Sign-Splitting |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10052 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.5170000195503235 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Medical Image Segmentation Techniques |
| topics[1].id | https://openalex.org/T10320 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.4691999852657318 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Neural Networks and Applications |
| topics[2].id | https://openalex.org/T10531 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.45739999413490295 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1707 |
| topics[2].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[2].display_name | Advanced Vision and Imaging |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2503.08668 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2503.08668 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2503.08668 |
| locations[1].id | doi:10.48550/arxiv.2503.08668 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2503.08668 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5019049217 |
| authorships[0].author.orcid | https://orcid.org/0009-0002-7726-4883 |
| authorships[0].author.display_name | Shuaiting Li |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Li, Shuaiting |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5065813707 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-0860-4442 |
| authorships[1].author.display_name | Juncan Deng |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Deng, Juncan |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5101484908 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-6045-7908 |
| authorships[2].author.display_name | Chenxuan Wang |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Wang, Chenxuan |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5111345577 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Kedong Xu |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Xu, Kedong |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5001712801 |
| authorships[4].author.orcid | https://orcid.org/0000-0003-3491-8146 |
| authorships[4].author.display_name | Robert H. Deng |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Deng, Rongtao |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5101434406 |
| authorships[5].author.orcid | https://orcid.org/0000-0002-8224-146X |
| authorships[5].author.display_name | Hong Gu |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Gu, Hong |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5029202186 |
| authorships[6].author.orcid | https://orcid.org/0000-0002-5431-609X |
| authorships[6].author.display_name | Haibin Shen |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Shen, Haibin |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5091073331 |
| authorships[7].author.orcid | https://orcid.org/0000-0003-3722-9979 |
| authorships[7].author.display_name | Kejie Huang |
| authorships[7].author_position | last |
| authorships[7].raw_author_name | Huang, Kejie |
| authorships[7].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2503.08668 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | SSVQ: Unleashing the Potential of Vector Quantization with Sign-Splitting |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10052 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.5170000195503235 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Medical Image Segmentation Techniques |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2503.08668 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2503.08668 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2503.08668 |
| primary_location.id | pmh:oai:arXiv.org:2503.08668 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2503.08668 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2503.08668 |
| publication_date | 2025-03-11 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 6, 79, 135, 159, 174, 181 |
| abstract_inverted_index.To | 73 |
| abstract_inverted_index.VQ | 81, 84 |
| abstract_inverted_index.We | 114 |
| abstract_inverted_index.as | 5 |
| abstract_inverted_index.at | 197 |
| abstract_inverted_index.by | 34, 189 |
| abstract_inverted_index.in | 23, 53, 65 |
| abstract_inverted_index.is | 32, 195 |
| abstract_inverted_index.of | 37, 91, 103 |
| abstract_inverted_index.on | 111, 149, 173 |
| abstract_inverted_index.to | 45, 51, 63, 68, 143, 165 |
| abstract_inverted_index.we | 77, 133, 169 |
| abstract_inverted_index.Our | 96, 193 |
| abstract_inverted_index.VQ. | 167 |
| abstract_inverted_index.and | 106, 109, 123, 129, 153 |
| abstract_inverted_index.are | 49, 61 |
| abstract_inverted_index.bit | 90, 122 |
| abstract_inverted_index.for | 119, 139 |
| abstract_inverted_index.has | 3 |
| abstract_inverted_index.its | 28 |
| abstract_inverted_index.our | 171 |
| abstract_inverted_index.the | 35, 38, 46, 54, 88, 94, 100, 120, 127, 130, 140, 185 |
| abstract_inverted_index.(VQ) | 2 |
| abstract_inverted_index.SSVQ | 157, 179 |
| abstract_inverted_index.bits | 102 |
| abstract_inverted_index.both | 126 |
| abstract_inverted_index.code | 194 |
| abstract_inverted_index.from | 93 |
| abstract_inverted_index.many | 58 |
| abstract_inverted_index.move | 64 |
| abstract_inverted_index.over | 184 |
| abstract_inverted_index.same | 47, 55 |
| abstract_inverted_index.sign | 89, 101, 121, 142 |
| abstract_inverted_index.than | 16 |
| abstract_inverted_index.that | 156, 178 |
| abstract_inverted_index.then | 115 |
| abstract_inverted_index.this | 75 |
| abstract_inverted_index.8-bit | 186 |
| abstract_inverted_index.local | 70 |
| abstract_inverted_index.lower | 13 |
| abstract_inverted_index.model | 188 |
| abstract_inverted_index.novel | 80 |
| abstract_inverted_index.signs | 128 |
| abstract_inverted_index.tasks | 154 |
| abstract_inverted_index.their | 69 |
| abstract_inverted_index.where | 41 |
| abstract_inverted_index.which | 86 |
| abstract_inverted_index.Vector | 0 |
| abstract_inverted_index.across | 19 |
| abstract_inverted_index.during | 30 |
| abstract_inverted_index.ensure | 144 |
| abstract_inverted_index.errors | 15 |
| abstract_inverted_index.issue, | 76 |
| abstract_inverted_index.latent | 117 |
| abstract_inverted_index.memory | 191 |
| abstract_inverted_index.models | 152 |
| abstract_inverted_index.modern | 151 |
| abstract_inverted_index.weight | 8, 42 |
| abstract_inverted_index.(SSVQ), | 85 |
| abstract_inverted_index.access. | 192 |
| abstract_inverted_index.diverse | 20 |
| abstract_inverted_index.emerged | 4 |
| abstract_inverted_index.extreme | 24 |
| abstract_inverted_index.format, | 40 |
| abstract_inverted_index.jointly | 124 |
| abstract_inverted_index.limited | 33 |
| abstract_inverted_index.models, | 21 |
| abstract_inverted_index.showing | 177 |
| abstract_inverted_index.speedup | 183 |
| abstract_inverted_index.uniform | 17 |
| abstract_inverted_index.updates | 52 |
| abstract_inverted_index.various | 150 |
| abstract_inverted_index.vectors | 43 |
| abstract_inverted_index.weights | 60, 92, 105 |
| abstract_inverted_index.However, | 27 |
| abstract_inverted_index.achieves | 158, 180 |
| abstract_inverted_index.approach | 97 |
| abstract_inverted_index.assigned | 44 |
| abstract_inverted_index.codeword | 48 |
| abstract_inverted_index.compared | 164 |
| abstract_inverted_index.contrary | 67 |
| abstract_inverted_index.efficacy | 29 |
| abstract_inverted_index.freezing | 137 |
| abstract_inverted_index.gradient | 71 |
| abstract_inverted_index.hardware | 175 |
| abstract_inverted_index.involves | 98 |
| abstract_inverted_index.mitigate | 74 |
| abstract_inverted_index.optimize | 125 |
| abstract_inverted_index.reducing | 190 |
| abstract_inverted_index.strategy | 138 |
| abstract_inverted_index.superior | 161 |
| abstract_inverted_index.training | 145 |
| abstract_inverted_index.validate | 170 |
| abstract_inverted_index.weights. | 113 |
| abstract_inverted_index.3$\times$ | 182 |
| abstract_inverted_index.Extensive | 147 |
| abstract_inverted_index.algorithm | 172 |
| abstract_inverted_index.available | 196 |
| abstract_inverted_index.codebook. | 95, 131 |
| abstract_inverted_index.compelled | 62 |
| abstract_inverted_index.decouples | 87 |
| abstract_inverted_index.implement | 134 |
| abstract_inverted_index.introduce | 78, 116 |
| abstract_inverted_index.learnable | 141 |
| abstract_inverted_index.paradigm, | 82 |
| abstract_inverted_index.prominent | 7 |
| abstract_inverted_index.quantized | 59 |
| abstract_inverted_index.trade-off | 163 |
| abstract_inverted_index.variables | 118 |
| abstract_inverted_index.clustering | 108 |
| abstract_inverted_index.compressed | 187 |
| abstract_inverted_index.constraint | 36 |
| abstract_inverted_index.direction. | 56 |
| abstract_inverted_index.directions | 66 |
| abstract_inverted_index.extracting | 99 |
| abstract_inverted_index.performing | 107 |
| abstract_inverted_index.restricted | 50 |
| abstract_inverted_index.scenarios. | 26 |
| abstract_inverted_index.showcasing | 11 |
| abstract_inverted_index.stability. | 146 |
| abstract_inverted_index.technique, | 10 |
| abstract_inverted_index.compression | 9, 25, 39, 110 |
| abstract_inverted_index.demonstrate | 155 |
| abstract_inverted_index.experiments | 148 |
| abstract_inverted_index.fine-tuning | 31 |
| abstract_inverted_index.progressive | 136 |
| abstract_inverted_index.Furthermore, | 168 |
| abstract_inverted_index.Quantization | 1 |
| abstract_inverted_index.accelerator, | 176 |
| abstract_inverted_index.all-positive | 112 |
| abstract_inverted_index.conventional | 166 |
| abstract_inverted_index.information. | 72 |
| abstract_inverted_index.particularly | 22 |
| abstract_inverted_index.quantization | 14, 18 |
| abstract_inverted_index.uncompressed | 104 |
| abstract_inverted_index.Additionally, | 132 |
| abstract_inverted_index.Consequently, | 57 |
| abstract_inverted_index.significantly | 160 |
| abstract_inverted_index.substantially | 12 |
| abstract_inverted_index.Sign-Splitting | 83 |
| abstract_inverted_index.compression-accuracy | 162 |
| abstract_inverted_index.https://github.com/list0830/SSVQ. | 198 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 8 |
| citation_normalized_percentile | |