TruncFormer: Private LLM Inference Using Only Truncations
2024 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2412.01042
Private inference (PI) plays an important role in guaranteeing the privacy of user data when interfacing with proprietary machine learning models such as LLMs. However, PI remains practically intractable due to the massive latency costs associated with the nonlinear functions present in LLMs. Existing works have focused on improving the latency of specific LLM nonlinearities (such as Softmax or GeLU) via approximations. However, new types of nonlinearities are regularly introduced with new LLM architectures, which has led to a constant game of catch-up in which PI researchers attempt to optimize the newest nonlinear function. We introduce TruncFormer, a framework for taking any LLM and transforming it into a plaintext emulation of PI. Our framework leverages the fact that nonlinearities in LLMs are differentiable and can be accurately approximated with a sequence of additions, multiplications, and truncations. Further, we decouple the add/multiply and truncation operations, and statically determine where truncations should be inserted based on a given field size and input representation size. This leads to latency improvements over existing cryptographic protocols that enforce truncation after every multiplication operation. We open-source our code for community use.
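To make the idea of decoupling multiplication from truncation concrete, here is a minimal plaintext sketch in Python. It is not the authors' implementation: the constants `FRAC_BITS`, `OPERAND_BITS`, and `FIELD_BITS`, the helper names, and the simple worst-case bit-growth rule are all assumptions chosen for illustration. The planner walks a chain of fixed-point multiplications and schedules a truncation only when the next product could no longer fit in the field, instead of truncating after every multiplication.

```python
FRAC_BITS = 12     # assumed fractional bits of every fresh fixed-point input
OPERAND_BITS = 16  # assumed magnitude bound (integer + fractional bits) per input
FIELD_BITS = 64    # assumed field size; one bit is reserved for the sign


def encode(x, frac_bits=FRAC_BITS):
    """Map a real number to a fixed-point integer with `frac_bits` fractional bits."""
    return round(x * (1 << frac_bits))


def plan_truncations(n_mults, operand_bits=OPERAND_BITS,
                     frac_bits=FRAC_BITS, field_bits=FIELD_BITS):
    """Statically decide, per multiplication, whether to truncate first.

    Uses a worst-case bound: a product needs the sum of its operands' bits.
    """
    plan = []
    acc_bits, acc_frac = operand_bits, frac_bits
    for _ in range(n_mults):
        truncate_first = acc_bits + operand_bits > field_bits - 1
        if truncate_first:
            # Drop the surplus fractional bits accumulated so far.
            acc_bits -= acc_frac - frac_bits
            acc_frac = frac_bits
            if acc_bits + operand_bits > field_bits - 1:
                raise ValueError("field too small even after truncation")
        plan.append(truncate_first)
        acc_bits += operand_bits
        acc_frac += frac_bits
    return plan


def run_chain(values, plan, frac_bits=FRAC_BITS, field_bits=FIELD_BITS):
    """Multiply `values` together in fixed point, truncating only where planned."""
    acc, acc_frac = encode(values[0], frac_bits), frac_bits
    for x, truncate_first in zip(values[1:], plan):
        if truncate_first:
            acc >>= acc_frac - frac_bits  # truncation = arithmetic right shift
            acc_frac = frac_bits
        acc *= encode(x, frac_bits)
        acc_frac += frac_bits
        assert abs(acc) < 1 << (field_bits - 1), "overflowed the field"
    return acc / (1 << acc_frac)


values = [1.5, 0.5, 2.0, 0.25, 3.0, 0.5]
plan = plan_truncations(len(values) - 1)
result = run_chain(values, plan)
print(plan, result)
```

With these assumed parameters the plan schedules two truncations for five multiplications, whereas a truncate-after-every-multiply rule would perform five. A real protocol would operate on secret shares in a finite field rather than on Python integers; the sketch only emulates the bit-width bookkeeping.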
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2412.01042, https://arxiv.org/pdf/2412.01042
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4405033862
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4405033862 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2412.01042 (Digital Object Identifier)
- Title: TruncFormer: Private LLM Inference Using Only Truncations (Work title)
- Type: preprint (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2024 (Year of publication)
- Publication date: 2024-12-02 (Full publication date if available)
- Authors: Patrick Yubeaton, Jianqiao Mo, Karthik Garimella, Nandan Kumar Jha, Brandon Reagen, Chinmay Hegde, Siddharth Garg (List of authors in order)
- Landing page: https://arxiv.org/abs/2412.01042 (Publisher landing page)
- PDF URL: https://arxiv.org/pdf/2412.01042 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2412.01042 (Direct OA link when available)
- Concepts: Inference, Computer science, Artificial intelligence (Top concepts (fields/topics) attached by OpenAlex)
- Cited by: 0 (Total citation count in OpenAlex)
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4405033862 |
| doi | https://doi.org/10.48550/arxiv.2412.01042 |
| ids.doi | https://doi.org/10.48550/arxiv.2412.01042 |
| ids.openalex | https://openalex.org/W4405033862 |
| fwci | |
| type | preprint |
| title | TruncFormer: Private LLM Inference Using Only Truncations |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11762 |
| topics[0].field.id | https://openalex.org/fields/20 |
| topics[0].field.display_name | Economics, Econometrics and Finance |
| topics[0].score | 0.9074000120162964 |
| topics[0].domain.id | https://openalex.org/domains/2 |
| topics[0].domain.display_name | Social Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/2002 |
| topics[0].subfield.display_name | Economics and Econometrics |
| topics[0].display_name | Law, Economics, and Judicial Systems |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C2776214188 |
| concepts[0].level | 2 |
| concepts[0].score | 0.7139106392860413 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q408386 |
| concepts[0].display_name | Inference |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.470562219619751 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C154945302 |
| concepts[2].level | 1 |
| concepts[2].score | 0.26027193665504456 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[2].display_name | Artificial intelligence |
| keywords[0].id | https://openalex.org/keywords/inference |
| keywords[0].score | 0.7139106392860413 |
| keywords[0].display_name | Inference |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.470562219619751 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[2].score | 0.26027193665504456 |
| keywords[2].display_name | Artificial intelligence |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2412.01042 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2412.01042 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2412.01042 |
| locations[1].id | doi:10.48550/arxiv.2412.01042 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2412.01042 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5114482444 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Patrick Yubeaton |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Yubeaton, Patrick |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5016851675 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-9533-8183 |
| authorships[1].author.display_name | Jianqiao Mo |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Mo, Jianqiao Cambridge |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5032975353 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-1914-4907 |
| authorships[2].author.display_name | Karthik Garimella |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Garimella, Karthik |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5012767033 |
| authorships[3].author.orcid | https://orcid.org/0000-0003-2090-8094 |
| authorships[3].author.display_name | Nandan Kumar Jha |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Jha, Nandan Kumar |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5089173037 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-1932-2750 |
| authorships[4].author.display_name | Brandon Reagen |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Reagen, Brandon |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5066142047 |
| authorships[5].author.orcid | https://orcid.org/0000-0003-4574-8066 |
| authorships[5].author.display_name | Chinmay Hegde |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Hegde, Chinmay |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5010950688 |
| authorships[6].author.orcid | https://orcid.org/0000-0002-6158-9512 |
| authorships[6].author.display_name | Siddharth Garg |
| authorships[6].author_position | last |
| authorships[6].raw_author_name | Garg, Siddharth |
| authorships[6].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2412.01042 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | TruncFormer: Private LLM Inference Using Only Truncations |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11762 |
| primary_topic.field.id | https://openalex.org/fields/20 |
| primary_topic.field.display_name | Economics, Econometrics and Finance |
| primary_topic.score | 0.9074000120162964 |
| primary_topic.domain.id | https://openalex.org/domains/2 |
| primary_topic.domain.display_name | Social Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/2002 |
| primary_topic.subfield.display_name | Economics and Econometrics |
| primary_topic.display_name | Law, Economics, and Judicial Systems |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W4391913857, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W4396696052 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2412.01042 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2412.01042 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2412.01042 |
| primary_location.id | pmh:oai:arXiv.org:2412.01042 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2412.01042 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2412.01042 |
| publication_date | 2024-12-02 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 79, 97, 107, 129, 154 |
| abstract_inverted_index.PI | 25, 85 |
| abstract_inverted_index.We | 94, 178 |
| abstract_inverted_index.an | 4 |
| abstract_inverted_index.as | 22, 54 |
| abstract_inverted_index.be | 125, 150 |
| abstract_inverted_index.in | 7, 40, 119 |
| abstract_inverted_index.it | 105 |
| abstract_inverted_index.of | 11, 49, 65, 82, 110, 131 |
| abstract_inverted_index.on | 46, 153 |
| abstract_inverted_index.or | 57 |
| abstract_inverted_index.to | 30, 78, 88, 164 |
| abstract_inverted_index.we | 137 |
| abstract_inverted_index.LLM | 51, 72, 102 |
| abstract_inverted_index.Our | 112 |
| abstract_inverted_index.PI. | 111 |
| abstract_inverted_index.and | 74, 103, 123, 134, 141, 144, 158 |
| abstract_inverted_index.any | 101 |
| abstract_inverted_index.are | 67, 121 |
| abstract_inverted_index.can | 124 |
| abstract_inverted_index.due | 29 |
| abstract_inverted_index.for | 99, 183 |
| abstract_inverted_index.has | 76 |
| abstract_inverted_index.led | 77 |
| abstract_inverted_index.new | 63, 71 |
| abstract_inverted_index.our | 181 |
| abstract_inverted_index.the | 9, 31, 55, 58, 90, 115, 139 |
| abstract_inverted_index.via | 60 |
| abstract_inverted_index.(PI) | 2 |
| abstract_inverted_index.LLMs | 120 |
| abstract_inverted_index.This | 162 |
| abstract_inverted_index.code | 182 |
| abstract_inverted_index.data | 13 |
| abstract_inverted_index.fact | 116 |
| abstract_inverted_index.game | 81 |
| abstract_inverted_index.have | 44 |
| abstract_inverted_index.into | 106 |
| abstract_inverted_index.open | 179 |
| abstract_inverted_index.over | 167 |
| abstract_inverted_index.role | 6 |
| abstract_inverted_index.size | 157 |
| abstract_inverted_index.such | 21 |
| abstract_inverted_index.that | 117, 171 |
| abstract_inverted_index.this | 75 |
| abstract_inverted_index.use. | 185 |
| abstract_inverted_index.user | 12 |
| abstract_inverted_index.when | 14 |
| abstract_inverted_index.with | 16, 36, 70, 128 |
| abstract_inverted_index.(such | 53 |
| abstract_inverted_index.GeLU) | 59 |
| abstract_inverted_index.LLMs. | 23, 41 |
| abstract_inverted_index.after | 174 |
| abstract_inverted_index.based | 152 |
| abstract_inverted_index.costs | 34 |
| abstract_inverted_index.every | 175 |
| abstract_inverted_index.field | 156 |
| abstract_inverted_index.given | 155 |
| abstract_inverted_index.input | 159 |
| abstract_inverted_index.leads | 163 |
| abstract_inverted_index.size. | 161 |
| abstract_inverted_index.types | 64 |
| abstract_inverted_index.where | 84, 147 |
| abstract_inverted_index.works | 43 |
| abstract_inverted_index.models | 20 |
| abstract_inverted_index.newest | 91 |
| abstract_inverted_index.serves | 3 |
| abstract_inverted_index.should | 149 |
| abstract_inverted_index.source | 180 |
| abstract_inverted_index.taking | 100 |
| abstract_inverted_index.Private | 0 |
| abstract_inverted_index.attempt | 87 |
| abstract_inverted_index.enforce | 172 |
| abstract_inverted_index.focused | 45 |
| abstract_inverted_index.latency | 33, 48, 165 |
| abstract_inverted_index.machine | 18 |
| abstract_inverted_index.massive | 32 |
| abstract_inverted_index.present | 39 |
| abstract_inverted_index.privacy | 10 |
| abstract_inverted_index.remains | 26 |
| abstract_inverted_index.Existing | 42 |
| abstract_inverted_index.Further, | 136 |
| abstract_inverted_index.However, | 24, 62 |
| abstract_inverted_index.Softmax, | 56 |
| abstract_inverted_index.catch-up | 83 |
| abstract_inverted_index.constant | 80 |
| abstract_inverted_index.decouple | 138 |
| abstract_inverted_index.existing | 168 |
| abstract_inverted_index.inserted | 151 |
| abstract_inverted_index.learning | 19 |
| abstract_inverted_index.optimize | 89 |
| abstract_inverted_index.sequence | 130 |
| abstract_inverted_index.specific | 50 |
| abstract_inverted_index.community | 184 |
| abstract_inverted_index.determine | 146 |
| abstract_inverted_index.emulation | 109 |
| abstract_inverted_index.framework | 98, 113 |
| abstract_inverted_index.function. | 93 |
| abstract_inverted_index.functions | 38 |
| abstract_inverted_index.important | 5 |
| abstract_inverted_index.improving | 47 |
| abstract_inverted_index.inference | 1 |
| abstract_inverted_index.introduce | 95 |
| abstract_inverted_index.leverages | 114 |
| abstract_inverted_index.nonlinear | 37, 92 |
| abstract_inverted_index.plaintext | 108 |
| abstract_inverted_index.protocols | 170 |
| abstract_inverted_index.regularly | 68 |
| abstract_inverted_index.accurately | 126 |
| abstract_inverted_index.additions, | 132 |
| abstract_inverted_index.associated | 35 |
| abstract_inverted_index.introduced | 69 |
| abstract_inverted_index.operation. | 177 |
| abstract_inverted_index.statically | 145 |
| abstract_inverted_index.truncation | 142, 173 |
| abstract_inverted_index.interfacing | 15 |
| abstract_inverted_index.intractable | 28 |
| abstract_inverted_index.operations, | 143 |
| abstract_inverted_index.practically | 27 |
| abstract_inverted_index.proprietary | 17 |
| abstract_inverted_index.researchers | 86 |
| abstract_inverted_index.truncations | 148 |
| abstract_inverted_index.TruncFormer, | 96 |
| abstract_inverted_index.add/multiply | 140 |
| abstract_inverted_index.approximated | 127 |
| abstract_inverted_index.guaranteeing | 8 |
| abstract_inverted_index.improvements | 166 |
| abstract_inverted_index.transforming | 104 |
| abstract_inverted_index.truncations. | 135 |
| abstract_inverted_index.cryptographic | 169 |
| abstract_inverted_index.architectures, | 73 |
| abstract_inverted_index.differentiable | 122 |
| abstract_inverted_index.multiplication | 176 |
| abstract_inverted_index.nonlinearities | 52, 66, 118 |
| abstract_inverted_index.representation | 160 |
| abstract_inverted_index.approximations. | 61 |
| abstract_inverted_index.multiplications, | 133 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 7 |
| citation_normalized_percentile | |
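
The `abstract_inverted_index.*` rows in the payload above store the abstract as a map from each word to the positions where it occurs. As a minimal sketch of how such an index can be turned back into running text, the Python below uses a hypothetical helper name, `rebuild_abstract`, and a sample that contains only the first seven entries copied from the table.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {word: [positions]} inverted index."""
    by_position = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = word
    # Emit words in ascending position order, separated by single spaces.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# First seven positions, taken from the payload rows above.
sample = {
    "Private": [0],
    "inference": [1],
    "(PI)": [2],
    "serves": [3],
    "an": [4],
    "important": [5],
    "role": [6],
}

text = rebuild_abstract(sample)
print(text)
```

Feeding the helper the complete set of rows would yield the full abstract; punctuation stays attached to words because the index keys already include it.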