TeleLoRA: Teleporting Model-Specific Alignment Across LLMs
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2503.20228
Mitigating Trojans in Large Language Models (LLMs) is one of many tasks where alignment data is LLM specific, as different LLMs have different Trojan triggers and trigger behaviors to be removed. In this paper, we introduce TeleLoRA (Teleporting Low-Rank Adaptation), a novel framework that synergizes model-specific alignment data across multiple LLMs to enable zero-shot Trojan mitigation on unseen LLMs without alignment data. TeleLoRA learns a unified generator of LoRA adapter weights by leveraging local activation information across multiple LLMs. This generator is designed to be permutation symmetric to generalize across models with different architectures and sizes. We optimize the model design for memory efficiency, making it feasible to learn with large-scale LLMs with minimal computational resources. Experiments on LLM Trojan mitigation benchmarks demonstrate that TeleLoRA effectively reduces attack success rates while preserving the benign performance of the models.
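The abstract names two ideas that a small example can make concrete: a LoRA adapter is a low-rank update to a weight matrix, and a generator that treats every neuron with the same shared function is permutation symmetric and independent of layer width. The sketch below is only an illustration under stated assumptions, not the authors' architecture. The per-neuron feature matrices (`feat_in`, `feat_out`), the shared projections (`proj_in`, `proj_out`), and the function `toy_generator` are hypothetical names introduced here; the abstract does not specify how activation information is encoded.

```python
import math
import random

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]

def lora_delta(A, B, alpha=1.0):
    """Standard LoRA update: delta_W = (alpha / r) * B @ A.

    A has shape (r, d_in) and B has shape (d_out, r), so delta_W has rank <= r.
    """
    r = len(A)
    return [[(alpha / r) * v for v in row] for row in matmul(B, A)]

def toy_generator(feat_in, feat_out, proj_in, proj_out):
    """Hypothetical generator: map per-neuron features to LoRA factors.

    The same projection is applied to every neuron, so the function does not
    depend on neuron order or on the layer widths d_in and d_out.
    """
    A = transpose(matmul(feat_in, proj_in))  # shape (r, d_in)
    B = matmul(feat_out, proj_out)           # shape (d_out, r)
    return A, B

def rand_matrix(rng, rows, cols):
    """Matrix of standard normal samples."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
n_feat, rank = 4, 2                      # assumed feature size and LoRA rank
proj_in = rand_matrix(rng, n_feat, rank)
proj_out = rand_matrix(rng, n_feat, rank)

d_in, d_out = 6, 5                       # widths of one illustrative layer
feat_in = rand_matrix(rng, d_in, n_feat)
feat_out = rand_matrix(rng, d_out, n_feat)
delta = lora_delta(*toy_generator(feat_in, feat_out, proj_in, proj_out))

# Relabel the neurons and regenerate the adapter.
perm_in = list(range(d_in))
perm_out = list(range(d_out))
rng.shuffle(perm_in)
rng.shuffle(perm_out)
delta_perm = lora_delta(*toy_generator(
    [feat_in[i] for i in perm_in],
    [feat_out[i] for i in perm_out],
    proj_in, proj_out,
))

# Equivariance: permuting neurons permutes the rows and columns of delta_W.
expected = [[delta[i][j] for j in perm_in] for i in perm_out]
equivariant = all(
    math.isclose(a, b, abs_tol=1e-12)
    for row_a, row_b in zip(delta_perm, expected)
    for a, b in zip(row_a, row_b)
)
```

Because the shared projections never see the layer width, the same `proj_in` and `proj_out` can be reused for layers of any size, which is one simple way a single generator could serve models with different architectures.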
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2503.20228, https://arxiv.org/pdf/2503.20228
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4409051414
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4409051414 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2503.20228 (Digital Object Identifier)
- Title: TeleLoRA: Teleporting Model-Specific Alignment Across LLMs (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-03-26 (full publication date if available)
- Authors: Xiao Lin, Manoj Acharya, Anirban Roy, Susmit Jha (list of authors in order)
- Landing page: https://arxiv.org/abs/2503.20228 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2503.20228 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2503.20228 (direct OA link when available)
- Concepts: Political science (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4409051414 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2503.20228 |
| ids.doi | https://doi.org/10.48550/arxiv.2503.20228 |
| ids.openalex | https://openalex.org/W4409051414 |
| fwci | |
| type | preprint |
| title | TeleLoRA: Teleporting Model-Specific Alignment Across LLMs |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10703 |
| topics[0].field.id | https://openalex.org/fields/14 |
| topics[0].field.display_name | Business, Management and Accounting |
| topics[0].score | 0.98089998960495 |
| topics[0].domain.id | https://openalex.org/domains/2 |
| topics[0].domain.display_name | Social Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1404 |
| topics[0].subfield.display_name | Management Information Systems |
| topics[0].display_name | Business Process Modeling and Analysis |
| topics[1].id | https://openalex.org/T10215 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9722999930381775 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Semantic Web and Ontologies |
| topics[2].id | https://openalex.org/T10679 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9485999941825867 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1710 |
| topics[2].subfield.display_name | Information Systems |
| topics[2].display_name | Service-Oriented Architecture and Web Services |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C17744445 |
| concepts[0].level | 0 |
| concepts[0].score | 0.32051074504852295 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q36442 |
| concepts[0].display_name | Political science |
| keywords[0].id | https://openalex.org/keywords/political-science |
| keywords[0].score | 0.32051074504852295 |
| keywords[0].display_name | Political science |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2503.20228 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2503.20228 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2503.20228 |
| locations[1].id | doi:10.48550/arxiv.2503.20228 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2503.20228 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5116868324 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Xiao Lin |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Lin, Xiao |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5116868325 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Manoj Acharya |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Acharya, Manoj |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5101531634 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-0702-4553 |
| authorships[2].author.display_name | Anirban Roy |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Roy, Anirban |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5035902535 |
| authorships[3].author.orcid | https://orcid.org/0000-0001-5983-9095 |
| authorships[3].author.display_name | Susmit Jha |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Jha, Susmit |
| authorships[3].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2503.20228 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | TeleLoRA: Teleporting Model-Specific Alignment Across LLMs |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10703 |
| primary_topic.field.id | https://openalex.org/fields/14 |
| primary_topic.field.display_name | Business, Management and Accounting |
| primary_topic.score | 0.98089998960495 |
| primary_topic.domain.id | https://openalex.org/domains/2 |
| primary_topic.domain.display_name | Social Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1404 |
| primary_topic.subfield.display_name | Management Information Systems |
| primary_topic.display_name | Business Process Modeling and Analysis |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W2949263084, https://openalex.org/W2743539335, https://openalex.org/W594353338, https://openalex.org/W2922049016, https://openalex.org/W4390697879, https://openalex.org/W2070214669, https://openalex.org/W2724734218 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2503.20228 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2503.20228 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2503.20228 |
| primary_location.id | pmh:oai:arXiv.org:2503.20228 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2503.20228 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2503.20228 |
| publication_date | 2025-03-26 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 40, 64 |
| abstract_inverted_index.In | 31 |
| abstract_inverted_index.We | 96 |
| abstract_inverted_index.as | 18 |
| abstract_inverted_index.be | 29, 84 |
| abstract_inverted_index.by | 71 |
| abstract_inverted_index.in | 2 |
| abstract_inverted_index.is | 7, 15, 81 |
| abstract_inverted_index.it | 105 |
| abstract_inverted_index.of | 9, 67, 135 |
| abstract_inverted_index.on | 56, 117 |
| abstract_inverted_index.to | 28, 51, 83, 87, 107 |
| abstract_inverted_index.we | 34 |
| abstract_inverted_index.LLM | 16, 118 |
| abstract_inverted_index.and | 25, 94 |
| abstract_inverted_index.for | 101 |
| abstract_inverted_index.one | 8 |
| abstract_inverted_index.the | 98, 132, 136 |
| abstract_inverted_index.LLMs | 20, 50, 58, 111 |
| abstract_inverted_index.LoRA | 68 |
| abstract_inverted_index.This | 79 |
| abstract_inverted_index.data | 14, 47 |
| abstract_inverted_index.have | 21 |
| abstract_inverted_index.many | 10 |
| abstract_inverted_index.that | 43, 123 |
| abstract_inverted_index.this | 32 |
| abstract_inverted_index.with | 91, 109, 112 |
| abstract_inverted_index.LLMs. | 78 |
| abstract_inverted_index.Large | 3 |
| abstract_inverted_index.data. | 61 |
| abstract_inverted_index.learn | 108 |
| abstract_inverted_index.local | 73 |
| abstract_inverted_index.model | 99 |
| abstract_inverted_index.novel | 41 |
| abstract_inverted_index.rates | 129 |
| abstract_inverted_index.tasks | 11 |
| abstract_inverted_index.where | 12 |
| abstract_inverted_index.while | 130 |
| abstract_inverted_index.(LLMs) | 6 |
| abstract_inverted_index.Models | 5 |
| abstract_inverted_index.Trojan | 23, 54, 119 |
| abstract_inverted_index.across | 48, 76, 89 |
| abstract_inverted_index.attack | 127 |
| abstract_inverted_index.benign | 133 |
| abstract_inverted_index.design | 100 |
| abstract_inverted_index.enable | 52 |
| abstract_inverted_index.learns | 63 |
| abstract_inverted_index.making | 104 |
| abstract_inverted_index.memory | 102 |
| abstract_inverted_index.models | 90 |
| abstract_inverted_index.paper, | 33 |
| abstract_inverted_index.sizes. | 95 |
| abstract_inverted_index.unseen | 57 |
| abstract_inverted_index.Trojans | 1 |
| abstract_inverted_index.adapter | 69 |
| abstract_inverted_index.minimal | 113 |
| abstract_inverted_index.models. | 137 |
| abstract_inverted_index.reduces | 126 |
| abstract_inverted_index.success | 128 |
| abstract_inverted_index.trigger | 26 |
| abstract_inverted_index.unified | 65 |
| abstract_inverted_index.weights | 70 |
| abstract_inverted_index.without | 59 |
| abstract_inverted_index.Language | 4 |
| abstract_inverted_index.Low-Rank | 38 |
| abstract_inverted_index.TeleLoRA | 36, 62, 124 |
| abstract_inverted_index.designed | 82 |
| abstract_inverted_index.feasible | 106 |
| abstract_inverted_index.multiple | 49, 77 |
| abstract_inverted_index.optimize | 97 |
| abstract_inverted_index.removed. | 30 |
| abstract_inverted_index.triggers | 24 |
| abstract_inverted_index.alignment | 13, 46, 60 |
| abstract_inverted_index.behaviors | 27 |
| abstract_inverted_index.different | 19, 22, 92 |
| abstract_inverted_index.framework | 42 |
| abstract_inverted_index.generator | 66, 80 |
| abstract_inverted_index.introduce | 35 |
| abstract_inverted_index.specific, | 17 |
| abstract_inverted_index.symmetric | 86 |
| abstract_inverted_index.zero-shot | 53 |
| abstract_inverted_index.Mitigating | 0 |
| abstract_inverted_index.activation | 74 |
| abstract_inverted_index.benchmarks | 121 |
| abstract_inverted_index.generalize | 88 |
| abstract_inverted_index.leveraging | 72 |
| abstract_inverted_index.mitigation | 55, 120 |
| abstract_inverted_index.preserving | 131 |
| abstract_inverted_index.resources. | 115 |
| abstract_inverted_index.synergizes | 44 |
| abstract_inverted_index.Experiments | 116 |
| abstract_inverted_index.demonstrate | 122 |
| abstract_inverted_index.effectively | 125 |
| abstract_inverted_index.efficiency, | 103 |
| abstract_inverted_index.information | 75 |
| abstract_inverted_index.large-scale | 110 |
| abstract_inverted_index.performance | 134 |
| abstract_inverted_index.permutation | 85 |
| abstract_inverted_index.(Teleporting | 37 |
| abstract_inverted_index.Adaptation), | 39 |
| abstract_inverted_index.architectures | 93 |
| abstract_inverted_index.computational | 114 |
| abstract_inverted_index.model-specific | 45 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| citation_normalized_percentile |
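
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each word to the positions where it occurs. A minimal sketch of how such rows could be turned back into running text follows, assuming rows shaped like the ones in the table above. The helper names `parse_rows` and `rebuild_abstract` are hypothetical and are not part of any OpenAlex client library; the sample rows are copied from the table and cover only the first seven words.

```python
PREFIX = "abstract_inverted_index."

def parse_rows(rows):
    """Parse flattened table rows into a {word: [positions]} dictionary."""
    index = {}
    for row in rows:
        cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
        if len(cells) != 2 or not cells[0].startswith(PREFIX):
            continue  # skip rows that are not inverted-index entries
        key, value = cells
        index[key[len(PREFIX):]] = [int(p) for p in value.split(",")]
    return index

def rebuild_abstract(inverted_index):
    """Place every word at each of its positions and join in position order."""
    by_position = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = word
    return " ".join(by_position[pos] for pos in sorted(by_position))

# Rows for positions 0 through 6, copied from the payload table.
sample_rows = [
    "| abstract_inverted_index.in | 2 |",
    "| abstract_inverted_index.Large | 3 |",
    "| abstract_inverted_index.(LLMs) | 6 |",
    "| abstract_inverted_index.Models | 5 |",
    "| abstract_inverted_index.Trojans | 1 |",
    "| abstract_inverted_index.Language | 4 |",
    "| abstract_inverted_index.Mitigating | 0 |",
]

rebuilt = rebuild_abstract(parse_rows(sample_rows))
print(rebuilt)  # Mitigating Trojans in Large Language Models (LLMs)
```

Feeding all inverted-index rows through the same two functions should reproduce the full abstract, since words that occur several times simply list several positions.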