Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task-specific Models
2023 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2311.18237
Vision Foundation Models (VFMs) pretrained on massive datasets exhibit impressive performance on various downstream tasks, especially with limited labeled target data. However, due to their high inference compute cost, these models cannot be deployed for many real-world applications. Motivated by this, we ask the following important question, "How can we leverage the knowledge from a large VFM to train a small task-specific model for a new target task with limited labeled training data?", and propose a simple task-oriented knowledge transfer approach as a highly effective solution to this problem. Our experimental results on five target tasks show that the proposed approach outperforms task-agnostic VFM distillation, web-scale CLIP pretraining, supervised ImageNet pretraining, and self-supervised DINO pretraining by up to 11.6%, 22.1%, 13.7%, and 29.8%, respectively. Furthermore, the proposed approach also demonstrates up to 9x, 4x and 15x reduction in pretraining compute cost when compared to task-agnostic VFM distillation, ImageNet pretraining and DINO pretraining, respectively, while outperforming them. We also show that the dataset used for transferring knowledge has a significant effect on the final target task performance, and introduce a retrieval-augmented knowledge transfer strategy that uses web-scale image retrieval to curate effective transfer sets.
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2311.18237, https://arxiv.org/pdf/2311.18237
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4389260862
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4389260862 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2311.18237 (Digital Object Identifier)
- Title: Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task-specific Models (Work title)
- Type: preprint (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2023 (Year of publication)
- Publication date: 2023-11-30 (Full publication date if available)
- Authors: Raviteja Vemulapalli, Hadi Pouransari, Fartash Faghri, Sachin Mehta, Mehrdad Farajtabar, Mohammad Rastegari, Oncel Tuzel (List of authors in order)
- Landing page: https://arxiv.org/abs/2311.18237 (Publisher landing page)
- PDF URL: https://arxiv.org/pdf/2311.18237 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2311.18237 (Direct OA link when available)
- Concepts: Leverage (statistics), Computer science, Inference, Task (project management), Artificial intelligence, Machine learning, Transfer of learning, Distillation, Economics, Management, Organic chemistry, Chemistry (Top concepts (fields/topics) attached by OpenAlex)
- Cited by: 0 (Total citation count in OpenAlex)
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4389260862 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2311.18237 |
| ids.doi | https://doi.org/10.48550/arxiv.2311.18237 |
| ids.openalex | https://openalex.org/W4389260862 |
| fwci | |
| type | preprint |
| title | Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task-specific Models |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11307 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9997000098228455 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Domain Adaptation and Few-Shot Learning |
| topics[1].id | https://openalex.org/T10036 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9991000294685364 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1707 |
| topics[1].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[1].display_name | Advanced Neural Network Applications |
| topics[2].id | https://openalex.org/T10627 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9987000226974487 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1707 |
| topics[2].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[2].display_name | Advanced Image and Video Retrieval Techniques |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C153083717 |
| concepts[0].level | 2 |
| concepts[0].score | 0.802446722984314 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q6535263 |
| concepts[0].display_name | Leverage (statistics) |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.7853727340698242 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C2776214188 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6723388433456421 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q408386 |
| concepts[2].display_name | Inference |
| concepts[3].id | https://openalex.org/C2780451532 |
| concepts[3].level | 2 |
| concepts[3].score | 0.6534526348114014 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q759676 |
| concepts[3].display_name | Task (project management) |
| concepts[4].id | https://openalex.org/C154945302 |
| concepts[4].level | 1 |
| concepts[4].score | 0.623885452747345 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[4].display_name | Artificial intelligence |
| concepts[5].id | https://openalex.org/C119857082 |
| concepts[5].level | 1 |
| concepts[5].score | 0.5921871662139893 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[5].display_name | Machine learning |
| concepts[6].id | https://openalex.org/C150899416 |
| concepts[6].level | 2 |
| concepts[6].score | 0.5454883575439453 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q1820378 |
| concepts[6].display_name | Transfer of learning |
| concepts[7].id | https://openalex.org/C204030448 |
| concepts[7].level | 2 |
| concepts[7].score | 0.46468213200569153 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q101017 |
| concepts[7].display_name | Distillation |
| concepts[8].id | https://openalex.org/C162324750 |
| concepts[8].level | 0 |
| concepts[8].score | 0.0 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q8134 |
| concepts[8].display_name | Economics |
| concepts[9].id | https://openalex.org/C187736073 |
| concepts[9].level | 1 |
| concepts[9].score | 0.0 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q2920921 |
| concepts[9].display_name | Management |
| concepts[10].id | https://openalex.org/C178790620 |
| concepts[10].level | 1 |
| concepts[10].score | 0.0 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q11351 |
| concepts[10].display_name | Organic chemistry |
| concepts[11].id | https://openalex.org/C185592680 |
| concepts[11].level | 0 |
| concepts[11].score | 0.0 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q2329 |
| concepts[11].display_name | Chemistry |
| keywords[0].id | https://openalex.org/keywords/leverage |
| keywords[0].score | 0.802446722984314 |
| keywords[0].display_name | Leverage (statistics) |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.7853727340698242 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/inference |
| keywords[2].score | 0.6723388433456421 |
| keywords[2].display_name | Inference |
| keywords[3].id | https://openalex.org/keywords/task |
| keywords[3].score | 0.6534526348114014 |
| keywords[3].display_name | Task (project management) |
| keywords[4].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[4].score | 0.623885452747345 |
| keywords[4].display_name | Artificial intelligence |
| keywords[5].id | https://openalex.org/keywords/machine-learning |
| keywords[5].score | 0.5921871662139893 |
| keywords[5].display_name | Machine learning |
| keywords[6].id | https://openalex.org/keywords/transfer-of-learning |
| keywords[6].score | 0.5454883575439453 |
| keywords[6].display_name | Transfer of learning |
| keywords[7].id | https://openalex.org/keywords/distillation |
| keywords[7].score | 0.46468213200569153 |
| keywords[7].display_name | Distillation |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2311.18237 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2311.18237 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2311.18237 |
| locations[1].id | doi:10.48550/arxiv.2311.18237 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2311.18237 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5071825172 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-0425-7797 |
| authorships[0].author.display_name | Raviteja Vemulapalli |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Vemulapalli, Raviteja |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5059295598 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Hadi Pouransari |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Pouransari, Hadi |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5036601505 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-5975-5158 |
| authorships[2].author.display_name | Fartash Faghri |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Faghri, Fartash |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5074132108 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-5420-4725 |
| authorships[3].author.display_name | Sachin Mehta |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Mehta, Sachin |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5050499655 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-5510-518X |
| authorships[4].author.display_name | Mehrdad Farajtabar |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Farajtabar, Mehrdad |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5056246621 |
| authorships[5].author.orcid | https://orcid.org/0000-0001-9606-3687 |
| authorships[5].author.display_name | Mohammad Rastegari |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Rastegari, Mohammad |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5028613002 |
| authorships[6].author.orcid | |
| authorships[6].author.display_name | Oncel Tuzel |
| authorships[6].author_position | last |
| authorships[6].raw_author_name | Tuzel, Oncel |
| authorships[6].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2311.18237 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2023-12-02T00:00:00 |
| display_name | Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task-specific Models |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11307 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9997000098228455 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Domain Adaptation and Few-Shot Learning |
| related_works | https://openalex.org/W2055243143, https://openalex.org/W3201126466, https://openalex.org/W3026162553, https://openalex.org/W2787993192, https://openalex.org/W4282827391, https://openalex.org/W2768175398, https://openalex.org/W2344382886, https://openalex.org/W2158269427, https://openalex.org/W4381280689, https://openalex.org/W4379251913 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2311.18237 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2311.18237 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2311.18237 |
| primary_location.id | pmh:oai:arXiv.org:2311.18237 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2311.18237 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2311.18237 |
| publication_date | 2023-11-30 |
| publication_year | 2023 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 54, 59, 64, 75, 82, 167, 178 |
| abstract_inverted_index.4x | 133 |
| abstract_inverted_index.We | 156 |
| abstract_inverted_index.as | 81 |
| abstract_inverted_index.be | 32 |
| abstract_inverted_index.by | 39, 115 |
| abstract_inverted_index.in | 137 |
| abstract_inverted_index.on | 5, 11, 92, 170 |
| abstract_inverted_index.to | 23, 57, 86, 117, 131, 143, 188 |
| abstract_inverted_index.up | 116, 130 |
| abstract_inverted_index.we | 41, 49 |
| abstract_inverted_index.15x | 135 |
| abstract_inverted_index.9x, | 132 |
| abstract_inverted_index.Our | 89 |
| abstract_inverted_index.VFM | 56, 103, 145 |
| abstract_inverted_index.and | 73, 111, 121, 134, 149, 176 |
| abstract_inverted_index.ask | 42 |
| abstract_inverted_index.can | 48 |
| abstract_inverted_index.due | 22 |
| abstract_inverted_index.for | 34, 63, 163 |
| abstract_inverted_index.has | 166 |
| abstract_inverted_index.new | 65 |
| abstract_inverted_index.the | 43, 51, 98, 125, 160, 171 |
| abstract_inverted_index."How | 47 |
| abstract_inverted_index.CLIP | 106 |
| abstract_inverted_index.DINO | 113, 150 |
| abstract_inverted_index.also | 128, 157 |
| abstract_inverted_index.cost | 140 |
| abstract_inverted_index.five | 93 |
| abstract_inverted_index.from | 53 |
| abstract_inverted_index.high | 25 |
| abstract_inverted_index.many | 35 |
| abstract_inverted_index.show | 96, 158 |
| abstract_inverted_index.task | 67, 174 |
| abstract_inverted_index.that | 97, 159, 183 |
| abstract_inverted_index.this | 87 |
| abstract_inverted_index.used | 162 |
| abstract_inverted_index.uses | 184 |
| abstract_inverted_index.when | 141 |
| abstract_inverted_index.with | 16, 68 |
| abstract_inverted_index.cost, | 28 |
| abstract_inverted_index.data. | 20 |
| abstract_inverted_index.final | 172 |
| abstract_inverted_index.image | 186 |
| abstract_inverted_index.large | 55 |
| abstract_inverted_index.model | 62 |
| abstract_inverted_index.sets. | 192 |
| abstract_inverted_index.small | 60 |
| abstract_inverted_index.tasks | 95 |
| abstract_inverted_index.their | 24 |
| abstract_inverted_index.them. | 155 |
| abstract_inverted_index.these | 29 |
| abstract_inverted_index.this, | 40 |
| abstract_inverted_index.train | 58 |
| abstract_inverted_index.while | 153 |
| abstract_inverted_index.(VFMs) | 3 |
| abstract_inverted_index.11.6%, | 118 |
| abstract_inverted_index.13.7%, | 120 |
| abstract_inverted_index.22.1%, | 119 |
| abstract_inverted_index.29.8%, | 122 |
| abstract_inverted_index.Models | 2 |
| abstract_inverted_index.Vision | 0 |
| abstract_inverted_index.cannot | 31 |
| abstract_inverted_index.curate | 189 |
| abstract_inverted_index.effect | 169 |
| abstract_inverted_index.highly | 83 |
| abstract_inverted_index.models | 30 |
| abstract_inverted_index.simple | 76 |
| abstract_inverted_index.target | 19, 66, 94, 173 |
| abstract_inverted_index.tasks, | 14 |
| abstract_inverted_index.compute | 27, 139 |
| abstract_inverted_index.data?", | 72 |
| abstract_inverted_index.dataset | 161 |
| abstract_inverted_index.exhibit | 8 |
| abstract_inverted_index.labeled | 18, 70 |
| abstract_inverted_index.limited | 17, 69 |
| abstract_inverted_index.massive | 6 |
| abstract_inverted_index.propose | 74 |
| abstract_inverted_index.results | 91 |
| abstract_inverted_index.various | 12 |
| abstract_inverted_index.However, | 21 |
| abstract_inverted_index.ImageNet | 109, 147 |
| abstract_inverted_index.approach | 80, 100, 127 |
| abstract_inverted_index.compared | 142 |
| abstract_inverted_index.datasets | 7 |
| abstract_inverted_index.deployed | 33 |
| abstract_inverted_index.leverage | 50 |
| abstract_inverted_index.problem. | 88 |
| abstract_inverted_index.proposed | 99, 126 |
| abstract_inverted_index.solution | 85 |
| abstract_inverted_index.strategy | 182 |
| abstract_inverted_index.training | 71 |
| abstract_inverted_index.transfer | 79, 181, 191 |
| abstract_inverted_index.Motivated | 38 |
| abstract_inverted_index.effective | 84, 190 |
| abstract_inverted_index.following | 44 |
| abstract_inverted_index.important | 45 |
| abstract_inverted_index.inference | 26 |
| abstract_inverted_index.introduce | 177 |
| abstract_inverted_index.knowledge | 52, 78, 165, 180 |
| abstract_inverted_index.question, | 46 |
| abstract_inverted_index.reduction | 136 |
| abstract_inverted_index.retrieval | 187 |
| abstract_inverted_index.web-scale | 105, 185 |
| abstract_inverted_index.Foundation | 1 |
| abstract_inverted_index.downstream | 13 |
| abstract_inverted_index.especially | 15 |
| abstract_inverted_index.impressive | 9 |
| abstract_inverted_index.pretrained | 4 |
| abstract_inverted_index.real-world | 36 |
| abstract_inverted_index.supervised | 108 |
| abstract_inverted_index.outperforms | 101 |
| abstract_inverted_index.performance | 10 |
| abstract_inverted_index.pretraining | 114, 138, 148 |
| abstract_inverted_index.significant | 168 |
| abstract_inverted_index.Furthermore, | 124 |
| abstract_inverted_index.demonstrates | 129 |
| abstract_inverted_index.experimental | 90 |
| abstract_inverted_index.performance, | 175 |
| abstract_inverted_index.pretraining, | 107, 110, 151 |
| abstract_inverted_index.transferring | 164 |
| abstract_inverted_index.applications. | 37 |
| abstract_inverted_index.distillation, | 104, 146 |
| abstract_inverted_index.outperforming | 154 |
| abstract_inverted_index.respectively, | 152 |
| abstract_inverted_index.respectively. | 123 |
| abstract_inverted_index.task-agnostic | 102, 144 |
| abstract_inverted_index.task-oriented | 77 |
| abstract_inverted_index.task-specific | 61 |
| abstract_inverted_index.self-supervised | 112 |
| abstract_inverted_index.retrieval-augmented | 179 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 7 |
| citation_normalized_percentile | |
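
The `abstract_inverted_index.*` rows in the payload above store the abstract as an inverted index: each row maps one token to the word positions where it occurs. The following is a minimal sketch of how such rows could be turned back into running text. The function names `parse_rows` and `rebuild_text` are hypothetical helpers written for this illustration and are not part of any OpenAlex client. The sample rows are copied from the table but cover only the first twelve positions, with the `on` row trimmed to its first two positions.

```python
PREFIX = "abstract_inverted_index."

def parse_rows(rows):
    """Collect token -> positions from '| key | value |' table rows (hypothetical helper)."""
    index = {}
    for row in rows:
        # Drop the outer pipes, then split the key cell from the value cell.
        key, _, value = row.strip().strip("|").partition("|")
        key = key.strip()
        if not key.startswith(PREFIX):
            continue  # not an inverted-index row
        token = key[len(PREFIX):]  # tokens may contain dots, so only strip the prefix
        index[token] = [int(p) for p in value.split(",")]
    return index

def rebuild_text(index):
    """Place every token at each of its positions and join them in order."""
    by_position = {}
    for token, positions in index.items():
        for pos in positions:
            by_position[pos] = token
    return " ".join(by_position[pos] for pos in sorted(by_position))

# Sample rows taken from the payload table (subset; 'on' trimmed to positions 5 and 11).
sample_rows = [
    "| abstract_inverted_index.on | 5, 11 |",
    "| abstract_inverted_index.(VFMs) | 3 |",
    "| abstract_inverted_index.Models | 2 |",
    "| abstract_inverted_index.Vision | 0 |",
    "| abstract_inverted_index.massive | 6 |",
    "| abstract_inverted_index.exhibit | 8 |",
    "| abstract_inverted_index.datasets | 7 |",
    "| abstract_inverted_index.Foundation | 1 |",
    "| abstract_inverted_index.impressive | 9 |",
    "| abstract_inverted_index.pretrained | 4 |",
    "| abstract_inverted_index.performance | 10 |",
    "| language | en |",  # ignored: lacks the inverted-index prefix
]

if __name__ == "__main__":
    print(rebuild_text(parse_rows(sample_rows)))
```

Run on the full set of rows, the same approach should reproduce the abstract shown near the top of this record, since the positions run from 0 to 192 without gaps.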