HyperINF: Unleashing the HyperPower of the Schulz's Method for Data Influence Estimation
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2410.05090
Influence functions provide a principled method to assess the contribution of individual training samples to a specific target. Yet their high computational cost limits their application to large-scale models and datasets. Existing methods for influence function approximation have significantly reduced the computational overhead. However, they mostly suffer from inaccurate estimation because the underlying algorithms lack strong convergence guarantees. The family of hyperpower methods is well known for its rigorous convergence guarantees on matrix inverse approximation, but the matrix multiplications it requires can incur intractable memory and computation costs on large-scale models. We propose HyperINF, an efficient and accurate influence function approximation method that leverages the hyperpower method, specifically Schulz's iterative algorithm. To handle the computation-intensive matrix multiplication, we incorporate the generalized Fisher information (GFIM) as a low-rank approximation of the Hessian matrix, which reduces the memory and computation overhead to constant costs independent of rank on LoRA-tuned models. We first demonstrate the superior accuracy and stability of HyperINF compared to other baselines through a synthetic convergence simulation for matrix inversion. We further validate the efficacy of HyperINF through extensive real-world data attribution tasks, including mislabeled data detection and data selection for LLM and VLM fine-tuning. On LoRA-tuned models, HyperINF achieves superior downstream performance with minimal memory and computational overhead, while other baselines suffer from significant degradation. Our codebase is available at https://github.com/Blackzxy/HyperINF.
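
To make the matrix-inversion step mentioned in the abstract concrete, here is a minimal NumPy sketch of Schulz's iteration, X_{k+1} = X_k (2I - A X_k), which is the second-order member of the hyperpower family. It is not taken from the HyperINF codebase: the function name `schulz_inverse`, the parameter `num_iters`, the demo matrix, and the choice of the classical initialisation X_0 = A^T / (||A||_1 ||A||_inf) are all assumptions made for illustration, and the GFIM low-rank step is not shown.

```python
import numpy as np


def schulz_inverse(A, num_iters=20):
    """Approximate the inverse of a nonsingular square matrix A.

    Illustrative sketch only; names and defaults are hypothetical.
    """
    n = A.shape[0]
    identity = np.eye(n)
    # Classical starting point: scaling A^T by the product of the 1-norm
    # and the infinity-norm keeps the spectral radius of (I - A X_0)
    # below 1 for nonsingular A, which is what the iteration needs.
    X = A.T / (np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))
    for _ in range(num_iters):
        # Schulz update: the residual I - A X is squared at every step.
        X = X @ (2.0 * identity - A @ X)
    return X


# Small symmetric positive definite demo matrix (made up for this example).
A_demo = np.array([[4.0, 1.0, 0.0],
                   [1.0, 3.0, 1.0],
                   [0.0, 1.0, 2.0]])
X_demo = schulz_inverse(A_demo)
# Frobenius norm of the residual I - A X; it should be close to zero.
print(np.linalg.norm(np.eye(3) - A_demo @ X_demo))
```

Each step costs two dense matrix products, which illustrates why the abstract describes the multiplication as the bottleneck on large models.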
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2410.05090, https://arxiv.org/pdf/2410.05090
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4403324069
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4403324069 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2410.05090 (Digital Object Identifier)
- Title: HyperINF: Unleashing the HyperPower of the Schulz's Method for Data Influence Estimation (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-10-07 (full publication date if available)
- Authors: Xinyu Zhou, Simin Fan, Martin Jaggi (list of authors in order)
- Landing page: https://arxiv.org/abs/2410.05090 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2410.05090 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2410.05090 (direct OA link when available)
- Concepts: Estimation, Computer science, Economics, Management (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4403324069 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2410.05090 |
| ids.doi | https://doi.org/10.48550/arxiv.2410.05090 |
| ids.openalex | https://openalex.org/W4403324069 |
| fwci | |
| type | preprint |
| title | HyperINF: Unleashing the HyperPower of the Schulz's Method for Data Influence Estimation |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11512 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9266999959945679 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Anomaly Detection Techniques and Applications |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C96250715 |
| concepts[0].level | 2 |
| concepts[0].score | 0.6903104186058044 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q965330 |
| concepts[0].display_name | Estimation |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.35904693603515625 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C162324750 |
| concepts[2].level | 0 |
| concepts[2].score | 0.21749523282051086 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q8134 |
| concepts[2].display_name | Economics |
| concepts[3].id | https://openalex.org/C187736073 |
| concepts[3].level | 1 |
| concepts[3].score | 0.0 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q2920921 |
| concepts[3].display_name | Management |
| keywords[0].id | https://openalex.org/keywords/estimation |
| keywords[0].score | 0.6903104186058044 |
| keywords[0].display_name | Estimation |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.35904693603515625 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/economics |
| keywords[2].score | 0.21749523282051086 |
| keywords[2].display_name | Economics |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2410.05090 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2410.05090 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2410.05090 |
| locations[1].id | doi:10.48550/arxiv.2410.05090 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2410.05090 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5025993851 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-9443-5256 |
| authorships[0].author.display_name | Xinyu Zhou |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Zhou, Xinyu |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5023045511 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-1490-9413 |
| authorships[1].author.display_name | Simin Fan |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Fan, Simin |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5109022935 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Martin Jaggi |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Jaggi, Martin |
| authorships[2].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2410.05090 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | HyperINF: Unleashing the HyperPower of the Schulz's Method for Data Influence Estimation |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11512 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9266999959945679 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Anomaly Detection Techniques and Applications |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W4396696052, https://openalex.org/W4402327032, https://openalex.org/W2382290278 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2410.05090 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2410.05090 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2410.05090 |
| primary_location.id | pmh:oai:arXiv.org:2410.05090 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2410.05090 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2410.05090 |
| publication_date | 2024-10-07 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 3, 15, 128, 166 |
| abstract_inverted_index.On | 198 |
| abstract_inverted_index.To | 113 |
| abstract_inverted_index.We | 93, 151, 173 |
| abstract_inverted_index.an | 96 |
| abstract_inverted_index.as | 127 |
| abstract_inverted_index.at | 223 |
| abstract_inverted_index.is | 221 |
| abstract_inverted_index.of | 10, 55, 64, 131, 146, 159, 178 |
| abstract_inverted_index.on | 26, 74, 90, 148 |
| abstract_inverted_index.to | 6, 14, 52, 142, 162 |
| abstract_inverted_index.we | 120 |
| abstract_inverted_index.LLM | 194 |
| abstract_inverted_index.Our | 219 |
| abstract_inverted_index.The | 62 |
| abstract_inverted_index.VLM | 196 |
| abstract_inverted_index.and | 29, 87, 98, 139, 157, 190, 195, 209 |
| abstract_inverted_index.are | 67 |
| abstract_inverted_index.can | 83 |
| abstract_inverted_index.due | 51 |
| abstract_inverted_index.for | 34, 69, 170, 193 |
| abstract_inverted_index.the | 8, 41, 53, 60, 79, 106, 116, 122, 132, 137, 154, 176 |
| abstract_inverted_index.Yet, | 18 |
| abstract_inverted_index.data | 183, 188, 191 |
| abstract_inverted_index.deal | 114 |
| abstract_inverted_index.from | 48, 59, 216 |
| abstract_inverted_index.have | 38 |
| abstract_inverted_index.high | 20 |
| abstract_inverted_index.lack | 54 |
| abstract_inverted_index.they | 45 |
| abstract_inverted_index.with | 115, 206 |
| abstract_inverted_index.costs | 22, 89, 144 |
| abstract_inverted_index.first | 152 |
| abstract_inverted_index.limit | 23 |
| abstract_inverted_index.other | 163, 213 |
| abstract_inverted_index.ranks | 147 |
| abstract_inverted_index.their | 19, 24, 70 |
| abstract_inverted_index.which | 104, 135 |
| abstract_inverted_index.while | 78, 212 |
| abstract_inverted_index.(GFIM) | 126 |
| abstract_inverted_index.assess | 7 |
| abstract_inverted_index.family | 63 |
| abstract_inverted_index.fisher | 124 |
| abstract_inverted_index.matrix | 75, 80, 118, 171 |
| abstract_inverted_index.memory | 86, 138, 208 |
| abstract_inverted_index.method | 5, 103 |
| abstract_inverted_index.models | 28 |
| abstract_inverted_index.mostly | 46 |
| abstract_inverted_index.strong | 56 |
| abstract_inverted_index.suffer | 47, 215 |
| abstract_inverted_index.tasks, | 185 |
| abstract_inverted_index.Hessian | 133 |
| abstract_inverted_index.further | 174 |
| abstract_inverted_index.inverse | 76 |
| abstract_inverted_index.involve | 84 |
| abstract_inverted_index.matrix, | 134 |
| abstract_inverted_index.method, | 108 |
| abstract_inverted_index.methods | 32, 66 |
| abstract_inverted_index.minimal | 207 |
| abstract_inverted_index.models, | 200 |
| abstract_inverted_index.models. | 92, 150 |
| abstract_inverted_index.propose | 94 |
| abstract_inverted_index.provide | 2 |
| abstract_inverted_index.reduced | 40 |
| abstract_inverted_index.reduces | 136 |
| abstract_inverted_index.samples | 13 |
| abstract_inverted_index.target. | 17 |
| abstract_inverted_index.through | 165, 180 |
| abstract_inverted_index.Existing | 31 |
| abstract_inverted_index.However, | 44 |
| abstract_inverted_index.HyperINF | 160, 179, 201 |
| abstract_inverted_index.Schulz's | 110 |
| abstract_inverted_index.accuracy | 156 |
| abstract_inverted_index.accurate | 99 |
| abstract_inverted_index.achieves | 202 |
| abstract_inverted_index.codebase | 220 |
| abstract_inverted_index.compared | 161 |
| abstract_inverted_index.constant | 143 |
| abstract_inverted_index.efficacy | 177 |
| abstract_inverted_index.function | 36, 101 |
| abstract_inverted_index.low-rank | 129 |
| abstract_inverted_index.proposed | 33 |
| abstract_inverted_index.rigorous | 71 |
| abstract_inverted_index.specific | 16 |
| abstract_inverted_index.superior | 155, 203 |
| abstract_inverted_index.training | 12 |
| abstract_inverted_index.validate | 175 |
| abstract_inverted_index.HyperINF, | 95 |
| abstract_inverted_index.Influence | 0 |
| abstract_inverted_index.available | 222 |
| abstract_inverted_index.baselines | 164, 214 |
| abstract_inverted_index.datasets. | 30 |
| abstract_inverted_index.detection | 189 |
| abstract_inverted_index.efficient | 97 |
| abstract_inverted_index.extensive | 181 |
| abstract_inverted_index.functions | 1 |
| abstract_inverted_index.including | 186 |
| abstract_inverted_index.influence | 35, 100 |
| abstract_inverted_index.iterative | 111 |
| abstract_inverted_index.leverages | 105 |
| abstract_inverted_index.operation | 82 |
| abstract_inverted_index.overhead, | 211 |
| abstract_inverted_index.overheads | 141 |
| abstract_inverted_index.selection | 192 |
| abstract_inverted_index.stability | 158 |
| abstract_inverted_index.synthetic | 167 |
| abstract_inverted_index.LoRA-tuned | 149, 199 |
| abstract_inverted_index.algorithm. | 61, 112 |
| abstract_inverted_index.downstream | 204 |
| abstract_inverted_index.estimation | 50 |
| abstract_inverted_index.guarantees | 58, 73 |
| abstract_inverted_index.hyperpower | 65, 107 |
| abstract_inverted_index.inaccurate | 49 |
| abstract_inverted_index.individual | 11 |
| abstract_inverted_index.inversion. | 172 |
| abstract_inverted_index.mislabeled | 187 |
| abstract_inverted_index.overheads. | 43 |
| abstract_inverted_index.principled | 4 |
| abstract_inverted_index.real-world | 182 |
| abstract_inverted_index.simulation | 169 |
| abstract_inverted_index.well-known | 68 |
| abstract_inverted_index.attribution | 184 |
| abstract_inverted_index.computation | 88, 140 |
| abstract_inverted_index.convergence | 57, 72, 168 |
| abstract_inverted_index.demonstrate | 153 |
| abstract_inverted_index.generalized | 123 |
| abstract_inverted_index.incorporate | 121 |
| abstract_inverted_index.independent | 145 |
| abstract_inverted_index.information | 125 |
| abstract_inverted_index.intractable | 85 |
| abstract_inverted_index.large-scale | 27, 91 |
| abstract_inverted_index.performance | 205 |
| abstract_inverted_index.significant | 217 |
| abstract_inverted_index.applications | 25 |
| abstract_inverted_index.contribution | 9 |
| abstract_inverted_index.degradation. | 218 |
| abstract_inverted_index.fine-tuning. | 197 |
| abstract_inverted_index.specifically | 109 |
| abstract_inverted_index.approximation | 37, 102, 130 |
| abstract_inverted_index.computational | 21, 42, 210 |
| abstract_inverted_index.significantly | 39 |
| abstract_inverted_index.approximation, | 77 |
| abstract_inverted_index.multiplication | 81 |
| abstract_inverted_index.multiplication, | 119 |
| abstract_inverted_index.computation-intensive | 117 |
| abstract_inverted_index.https://github.com/Blackzxy/HyperINF. | 224 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 3 |
| citation_normalized_percentile | |
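
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each word to the positions where it occurs. The sketch below shows one way to turn such a map back into running text. The function name `rebuild_abstract` is hypothetical, and `sample_index` is a small hand-copied subset of the rows above (positions 0 to 5 only), not the full payload.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {word: [positions]} mapping.

    Illustrative helper; not part of any OpenAlex client library.
    """
    words_by_position = {}
    for word, positions in inverted_index.items():
        for position in positions:
            words_by_position[position] = word
    # Emit the words in ascending position order, separated by spaces.
    ordered = sorted(words_by_position)
    return " ".join(words_by_position[p] for p in ordered)


# Subset of the table rows, restricted to positions 0-5 for brevity.
sample_index = {
    "a": [3],
    "method": [5],
    "provide": [2],
    "functions": [1],
    "Influence": [0],
    "principled": [4],
}
print(rebuild_abstract(sample_index))
# prints "Influence functions provide a principled method"
```

In the table the positions appear as comma-separated strings, so they would need to be split and converted to integers before being passed to a helper like this one.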