SAKE: Towards Editing Auditory Attribute Knowledge of Large Audio-Language Models
2025 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2510.16917
Knowledge editing offers an efficient way to update model knowledge without full retraining, but prior work has concentrated almost exclusively on textual or visual modalities. We introduce SAKE, the first benchmark specifically designed for editing auditory attribute knowledge in Large Audio-Language Models (LALMs). Unlike factual updates, SAKE targets several abstract auditory attributes, capturing knowledge types that go beyond conventional textual and visual domains. We benchmark seven editing methods on two LALMs along four dimensions: reliability, generality, audio/text locality, and portability. Results highlight challenges such as preserving intra-attribute knowledge unrelated to the edit, generalizing edits to multimodal reasoning, and maintaining edits under sequential updates. SAKE provides a principled framework to study how knowledge editing extends to the auditory modalities, opening new directions for maintaining and adapting LALMs in more diverse real-world scenarios.
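The four evaluation dimensions named in the abstract can be illustrated with a small sketch. The definitions below follow common usage in the knowledge-editing literature and are an assumption here, since the abstract does not give SAKE's exact formulas. The `Model` type, the helper names, and the string queries (stand-ins for audio-plus-text inputs) are all hypothetical.

```python
from typing import Callable, Sequence

# Hypothetical stand-in: a "model" maps a query string to an answer string.
# In an LALM setting the query would be an audio clip plus a text prompt.
Model = Callable[[str], str]


def accuracy(model: Model, queries: Sequence[str], targets: Sequence[str]) -> float:
    """Fraction of queries answered with the desired target."""
    if not queries:
        return 0.0
    hits = sum(model(q) == t for q, t in zip(queries, targets))
    return hits / len(queries)


def locality(pre: Model, post: Model, queries: Sequence[str]) -> float:
    """Fraction of unrelated queries whose answer is unchanged by the edit."""
    if not queries:
        return 0.0
    unchanged = sum(pre(q) == post(q) for q in queries)
    return unchanged / len(queries)


# Toy lookup tables playing the role of the model before and after one edit.
pre_table = {"edit_query": "A", "rephrased_query": "A", "unrelated_query": "C"}
post_table = {"edit_query": "B", "rephrased_query": "A", "unrelated_query": "C"}


def pre_model(q: str) -> str:
    return pre_table.get(q, "")


def post_model(q: str) -> str:
    return post_table.get(q, "")


# Reliability: the edited query now yields the new target "B".
reliability = accuracy(post_model, ["edit_query"], ["B"])
# Generality: an equivalent (rephrased) query should also yield "B".
generality = accuracy(post_model, ["rephrased_query"], ["B"])
# Locality: unrelated knowledge should be untouched by the edit.
loc = locality(pre_model, post_model, ["unrelated_query"])

print(reliability, generality, loc)
```

In this toy case the edit is reliable and local but fails to generalize to the rephrased query. Portability would be scored the same way as generality, using queries that require reasoning from the edited knowledge.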
- Type: preprint
- Landing Page: http://arxiv.org/abs/2510.16917, https://arxiv.org/pdf/2510.16917
- OA Status: green
- OpenAlex ID: https://openalex.org/W4415960250
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4415960250 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2510.16917 (Digital Object Identifier)
- Title: SAKE: Towards Editing Auditory Attribute Knowledge of Large Audio-Language Models (work title)
- Type: preprint (OpenAlex work type)
- Publication year: 2025 (year of publication)
- Publication date: 2025-10-19 (full publication date if available)
- Authors: Chih-Kai Yang, Yen-Ting Piao, Tian‐Chuan Hsu, Szu‐Wei Fu, Zhehuai Chen, Ke-Han Lu, Sung-Feng Huang, Chao-Han Huck Yang, Yu-Chiang Frank Wang, Yun-Nung Chen, Hung-yi Lee (list of authors in order)
- Landing page: https://arxiv.org/abs/2510.16917 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2510.16917 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2510.16917 (direct OA link when available)
- Cited by: 0 (total citation count in OpenAlex)
Full payload
| id | https://openalex.org/W4415960250 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2510.16917 |
| ids.doi | https://doi.org/10.48550/arxiv.2510.16917 |
| ids.openalex | https://openalex.org/W4415960250 |
| fwci | |
| type | preprint |
| title | SAKE: Towards Editing Auditory Attribute Knowledge of Large Audio-Language Models |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | |
| locations[0].id | pmh:oai:arXiv.org:2510.16917 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2510.16917 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2510.16917 |
| locations[1].id | doi:10.48550/arxiv.2510.16917 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2510.16917 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5073025981 |
| authorships[0].author.orcid | https://orcid.org/0009-0009-7368-7521 |
| authorships[0].author.display_name | Chih-Kai Yang |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Yang, Chih-Kai |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5120093887 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Yen-Ting Piao |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Piao, Yen-Ting |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5043215122 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-2612-1885 |
| authorships[2].author.display_name | Tian‐Chuan Hsu |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Hsu, Tzu-Wen |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5071471469 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-3487-8212 |
| authorships[3].author.display_name | Szu‐Wei Fu |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Fu, Szu-Wei |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5002433660 |
| authorships[4].author.orcid | https://orcid.org/0000-0003-4400-5340 |
| authorships[4].author.display_name | Zhehuai Chen |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Chen, Zhehuai |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5015216145 |
| authorships[5].author.orcid | |
| authorships[5].author.display_name | Ke-Han Lu |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Lu, Ke-Han |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5071504898 |
| authorships[6].author.orcid | https://orcid.org/0000-0002-9720-811X |
| authorships[6].author.display_name | Sung-Feng Huang |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Huang, Sung-Feng |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5020376803 |
| authorships[7].author.orcid | https://orcid.org/0000-0003-2879-8811 |
| authorships[7].author.display_name | Chao-Han Huck Yang |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Yang, Chao-Han Huck |
| authorships[7].is_corresponding | False |
| authorships[8].author.id | https://openalex.org/A5090045508 |
| authorships[8].author.orcid | https://orcid.org/0000-0002-2333-157X |
| authorships[8].author.display_name | Yu-Chiang Frank Wang |
| authorships[8].author_position | middle |
| authorships[8].raw_author_name | Wang, Yu-Chiang Frank |
| authorships[8].is_corresponding | False |
| authorships[9].author.id | https://openalex.org/A5032336004 |
| authorships[9].author.orcid | |
| authorships[9].author.display_name | Yun-Nung Chen |
| authorships[9].author_position | middle |
| authorships[9].raw_author_name | Chen, Yun-Nung |
| authorships[9].is_corresponding | False |
| authorships[10].author.id | https://openalex.org/A5040508737 |
| authorships[10].author.orcid | https://orcid.org/0000-0002-9654-5747 |
| authorships[10].author.display_name | Hung-yi Lee |
| authorships[10].author_position | last |
| authorships[10].raw_author_name | Lee, Hung-yi |
| authorships[10].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2510.16917 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-22T00:00:00 |
| display_name | SAKE: Towards Editing Auditory Attribute Knowledge of Large Audio-Language Models |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-07T23:20:04.922697 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2510.16917 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2510.16917 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2510.16917 |
| primary_location.id | pmh:oai:arXiv.org:2510.16917 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2510.16917 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2510.16917 |
| publication_date | 2025-10-19 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 105 |
| abstract_inverted_index.We | 25, 63 |
| abstract_inverted_index.an | 3 |
| abstract_inverted_index.as | 84 |
| abstract_inverted_index.go | 56 |
| abstract_inverted_index.in | 38, 126 |
| abstract_inverted_index.on | 20, 68 |
| abstract_inverted_index.or | 22 |
| abstract_inverted_index.to | 6, 89, 94, 108, 114 |
| abstract_inverted_index.and | 60, 78, 97, 123 |
| abstract_inverted_index.but | 13 |
| abstract_inverted_index.for | 33, 121 |
| abstract_inverted_index.has | 16 |
| abstract_inverted_index.how | 110 |
| abstract_inverted_index.new | 119 |
| abstract_inverted_index.the | 28, 90, 115 |
| abstract_inverted_index.two | 69 |
| abstract_inverted_index.way | 5 |
| abstract_inverted_index.SAKE | 46, 103 |
| abstract_inverted_index.four | 72 |
| abstract_inverted_index.full | 11 |
| abstract_inverted_index.more | 127 |
| abstract_inverted_index.such | 83 |
| abstract_inverted_index.that | 55 |
| abstract_inverted_index.work | 15 |
| abstract_inverted_index.LALMs | 70, 125 |
| abstract_inverted_index.Large | 39 |
| abstract_inverted_index.SAKE, | 27 |
| abstract_inverted_index.along | 71 |
| abstract_inverted_index.edit, | 91 |
| abstract_inverted_index.edits | 93, 99 |
| abstract_inverted_index.first | 29 |
| abstract_inverted_index.model | 8 |
| abstract_inverted_index.prior | 14 |
| abstract_inverted_index.seven | 65 |
| abstract_inverted_index.study | 109 |
| abstract_inverted_index.types | 54 |
| abstract_inverted_index.under | 100 |
| abstract_inverted_index.Models | 41 |
| abstract_inverted_index.Unlike | 43 |
| abstract_inverted_index.almost | 18 |
| abstract_inverted_index.beyond | 57 |
| abstract_inverted_index.offers | 2 |
| abstract_inverted_index.update | 7 |
| abstract_inverted_index.visual | 23, 61 |
| abstract_inverted_index.Results | 80 |
| abstract_inverted_index.diverse | 128 |
| abstract_inverted_index.editing | 1, 34, 66, 112 |
| abstract_inverted_index.extends | 113 |
| abstract_inverted_index.factual | 44 |
| abstract_inverted_index.methods | 67 |
| abstract_inverted_index.opening | 118 |
| abstract_inverted_index.several | 48 |
| abstract_inverted_index.targets | 47 |
| abstract_inverted_index.textual | 21, 59 |
| abstract_inverted_index.without | 10 |
| abstract_inverted_index.(LALMs). | 42 |
| abstract_inverted_index.abstract | 49 |
| abstract_inverted_index.adapting | 124 |
| abstract_inverted_index.auditory | 35, 50, 116 |
| abstract_inverted_index.designed | 32 |
| abstract_inverted_index.domains. | 62 |
| abstract_inverted_index.provides | 104 |
| abstract_inverted_index.updates, | 45 |
| abstract_inverted_index.updates. | 102 |
| abstract_inverted_index.Knowledge | 0 |
| abstract_inverted_index.attribute | 36 |
| abstract_inverted_index.benchmark | 30, 64 |
| abstract_inverted_index.capturing | 52 |
| abstract_inverted_index.efficient | 4 |
| abstract_inverted_index.framework | 107 |
| abstract_inverted_index.highlight | 81 |
| abstract_inverted_index.introduce | 26 |
| abstract_inverted_index.knowledge | 9, 37, 53, 87, 111 |
| abstract_inverted_index.locality, | 77 |
| abstract_inverted_index.unrelated | 88 |
| abstract_inverted_index.audio/text | 76 |
| abstract_inverted_index.challenges | 82 |
| abstract_inverted_index.directions | 120 |
| abstract_inverted_index.multimodal | 95 |
| abstract_inverted_index.preserving | 85 |
| abstract_inverted_index.principled | 106 |
| abstract_inverted_index.real-world | 129 |
| abstract_inverted_index.reasoning, | 96 |
| abstract_inverted_index.scenarios. | 130 |
| abstract_inverted_index.sequential | 101 |
| abstract_inverted_index.attributes, | 51 |
| abstract_inverted_index.dimensions: | 73 |
| abstract_inverted_index.exclusively | 19 |
| abstract_inverted_index.generality, | 75 |
| abstract_inverted_index.maintaining | 98, 122 |
| abstract_inverted_index.modalities, | 117 |
| abstract_inverted_index.modalities. | 24 |
| abstract_inverted_index.retraining, | 12 |
| abstract_inverted_index.concentrated | 17 |
| abstract_inverted_index.conventional | 58 |
| abstract_inverted_index.generalizing | 92 |
| abstract_inverted_index.portability. | 79 |
| abstract_inverted_index.reliability, | 74 |
| abstract_inverted_index.specifically | 31 |
| abstract_inverted_index.Audio-Language | 40 |
| abstract_inverted_index.intra-attribute | 86 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 11 |
| citation_normalized_percentile | |
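
The `abstract_inverted_index.*` rows above map each token to the zero-based positions where it occurs in the abstract. A minimal sketch of how the plain-text abstract can be rebuilt from such a mapping follows. The function name `rebuild_abstract` and the truncated `sample` dictionary are illustrative assumptions; the sample keeps only positions 0 to 12 of the rows shown in the table.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild text from a token -> positions mapping."""
    by_position: dict[int, str] = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Join tokens in ascending position order with single spaces.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Truncated sample: only positions 0..12 from the table are kept, so that
# the output is a contiguous prefix of the abstract.
sample = {
    "Knowledge": [0],
    "editing": [1],
    "offers": [2],
    "an": [3],
    "efficient": [4],
    "way": [5],
    "to": [6],
    "update": [7],
    "model": [8],
    "knowledge": [9],
    "without": [10],
    "full": [11],
    "retraining,": [12],
}

prefix = rebuild_abstract(sample)
print(prefix)
```

Passing the complete mapping instead of the truncated sample would reproduce the full abstract, with punctuation preserved because it is stored as part of each token.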