Optimizing Specific and Shared Parameters for Efficient Parameter Tuning

2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2504.03450
Foundation models, with a vast number of parameters and pretraining on massive datasets, achieve state-of-the-art performance across various applications. However, efficiently adapting them to downstream tasks with minimal computational overhead remains a challenge. Parameter-Efficient Transfer Learning (PETL) addresses this by fine-tuning only a small subset of parameters while preserving pre-trained knowledge. In this paper, we propose SaS, a novel PETL method that effectively mitigates distributional shifts during fine-tuning. SaS integrates (1) a shared module that captures common statistical characteristics across layers using low-rank projections and (2) a layer-specific module that employs hypernetworks to generate tailored parameters for each layer. This dual design ensures an optimal balance between performance and parameter efficiency while introducing less than 0.05% additional parameters, making it significantly more compact than existing methods. Extensive experiments on diverse downstream tasks, few-shot settings and domain generalization demonstrate that SaS significantly enhances performance while maintaining superior parameter efficiency compared to existing methods, highlighting the importance of capturing both shared and layer-specific information in transfer learning. Code and data are available at https://anonymous.4open.science/r/SaS-PETL-3565.
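The abstract describes a dual design: a low-rank module shared across layers plus a hypernetwork that emits layer-specific parameters, with the backbone kept frozen. Below is a minimal, illustrative PyTorch sketch of that idea; it is not the authors' released implementation (see the repository link above), and the module names, rank, and per-layer scale/shift parameterization are assumptions made for the example.

```python
import torch
import torch.nn as nn

class SharedLowRankAdapter(nn.Module):
    """One low-rank down/up projection shared by every layer, meant to
    capture statistics common across layers (illustrative only)."""
    def __init__(self, dim: int, rank: int = 4):
        super().__init__()
        self.down = nn.Linear(dim, rank, bias=False)
        self.up = nn.Linear(rank, dim, bias=False)
        nn.init.zeros_(self.up.weight)  # residual branch starts at zero

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.up(self.down(x))

class LayerSpecificHypernet(nn.Module):
    """Tiny hypernetwork mapping a learned per-layer embedding to a
    per-layer scale/shift, so each layer gets tailored parameters
    without storing a full adapter per layer."""
    def __init__(self, num_layers: int, dim: int, embed_dim: int = 8):
        super().__init__()
        self.layer_embed = nn.Embedding(num_layers, embed_dim)
        self.to_params = nn.Linear(embed_dim, 2 * dim)  # -> (scale, shift)

    def forward(self, layer_idx: int, x: torch.Tensor) -> torch.Tensor:
        idx = torch.tensor(layer_idx, device=x.device)
        scale, shift = self.to_params(self.layer_embed(idx)).chunk(2, dim=-1)
        return x * (1.0 + scale) + shift

class SaSBlock(nn.Module):
    """Frozen backbone layer wrapped with the shared + specific modules."""
    def __init__(self, frozen_layer: nn.Module, shared: SharedLowRankAdapter,
                 hyper: LayerSpecificHypernet, layer_idx: int):
        super().__init__()
        self.frozen_layer = frozen_layer
        for p in self.frozen_layer.parameters():
            p.requires_grad_(False)  # only the adapter parameters are tuned
        self.shared, self.hyper, self.layer_idx = shared, hyper, layer_idx

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.frozen_layer(x)
        return self.hyper(self.layer_idx, h + self.shared(h))

# Toy usage: 12 frozen "layers" of width 768.
dim, num_layers = 768, 12
shared = SharedLowRankAdapter(dim, rank=4)
hyper = LayerSpecificHypernet(num_layers, dim)
blocks = nn.ModuleList(
    SaSBlock(nn.Linear(dim, dim), shared, hyper, i) for i in range(num_layers)
)
x = torch.randn(2, dim)
for blk in blocks:
    x = blk(x)
print(x.shape)  # torch.Size([2, 768])
```

Because the low-rank adapter is shared and the hypernetwork stores only a small embedding per layer, the trainable parameter count stays tiny relative to the frozen backbone, which mirrors the sub-0.05% overhead claim in the abstract.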
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2504.03450 (PDF: https://arxiv.org/pdf/2504.03450)
- OA Status: green
- OpenAlex ID: https://openalex.org/W4415980111
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4415980111 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2504.03450 (Digital Object Identifier)
- Title: Optimizing Specific and Shared Parameters for Efficient Parameter Tuning (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-04-04 (full publication date if available)
- Authors: Nguyễn Thị Vân Anh, Thanh-Toan Do, Mehrtash Harandi, Dinh Phung, Trung Le (list of authors in order)
- Landing page: https://arxiv.org/abs/2504.03450 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2504.03450 (direct link to full-text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2504.03450 (direct OA link when available)
- Cited by: 0 (total citation count in OpenAlex)
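The fields above are a flattened view of a single OpenAlex work object. A minimal sketch for pulling the same record directly, assuming the public OpenAlex REST API endpoint pattern https://api.openalex.org/works/{id}; the printed field names are taken from the payload below.

```python
import requests

# Fetch the raw OpenAlex record for this work; the ID comes from the
# metadata above. Endpoint pattern assumed: https://api.openalex.org/works/{id}
work_id = "W4415980111"
resp = requests.get(f"https://api.openalex.org/works/{work_id}", timeout=30)
resp.raise_for_status()
work = resp.json()

print(work["display_name"])           # title
print(work["open_access"]["oa_url"])  # direct OA link
print(work["cited_by_count"])         # citation count
```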
Full payload
| id | https://openalex.org/W4415980111 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2504.03450 |
| ids.doi | https://doi.org/10.48550/arxiv.2504.03450 |
| ids.openalex | https://openalex.org/W4415980111 |
| fwci | |
| type | preprint |
| title | Optimizing Specific and Shared Parameters for Efficient Parameter Tuning |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2504.03450 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2504.03450 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2504.03450 |
| locations[1].id | doi:10.48550/arxiv.2504.03450 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2504.03450 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5101581517 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-1215-6910 |
| authorships[0].author.display_name | Nguyễn Thị Vân Anh |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Nguyen, Van-Anh |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5025723803 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-6249-0848 |
| authorships[1].author.display_name | Thanh-Toan Do |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Do, Thanh-Toan |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5021790939 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-6937-6300 |
| authorships[2].author.display_name | Mehrtash Harandi |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Harandi, Mehrtash |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5036447132 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-9977-8247 |
| authorships[3].author.display_name | Dinh Phung |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Phung, Dinh |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5103082579 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-4328-2138 |
| authorships[4].author.display_name | Trung Le |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Le, Trung |
| authorships[4].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2504.03450 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Optimizing Specific and Shared Parameters for Efficient Parameter Tuning |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-08T23:21:52.890332 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2504.03450 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2504.03450 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2504.03450 |
| primary_location.id | pmh:oai:arXiv.org:2504.03450 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2504.03450 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2504.03450 |
| publication_date | 2025-04-04 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index | (word-to-positions map encoding the same abstract text reproduced above; see the reconstruction sketch after this table) |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile | |
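The abstract_inverted_index row collapsed above stores the abstract as a map from each word to the positions where it occurs (e.g. "Foundation" at 0, "models," at 1, "with" at 2 and 26). A minimal sketch of how the plain text can be rebuilt from that structure, using a few single-position entries from the original index as example input:

```python
def abstract_from_inverted_index(inv_index: dict[str, list[int]]) -> str:
    """Rebuild the plain-text abstract from OpenAlex's inverted index,
    which maps each word to every position where it occurs."""
    positions = {}
    for word, idxs in inv_index.items():
        for i in idxs:
            positions[i] = word
    # Sorting by position and joining with spaces recovers the abstract.
    return " ".join(positions[i] for i in sorted(positions))

# Example input drawn from the first entries of the index above
# (the full index lists every occurrence, e.g. "with" also appears at 26).
example = {"Foundation": [0], "models,": [1], "with": [2], "a": [3]}
print(abstract_from_inverted_index(example))  # Foundation models, with a
```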