LLMR: Knowledge Distillation with a Large Language Model-Induced Reward
Dongqi Li, Yongchang Hao, Lili Mou
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2409.12500
Large language models have become increasingly popular and have demonstrated remarkable performance on various natural language processing (NLP) tasks. However, these models are typically computationally expensive and difficult to deploy in resource-constrained environments. In this paper, we propose LLMR, a novel knowledge distillation (KD) method based on a reward function induced from large language models. We conducted experiments on multiple datasets for dialogue generation and summarization. Empirical results demonstrate that our LLMR approach consistently outperforms traditional KD methods across tasks and datasets.
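The abstract does not spell out the training objective, so the following is only a rough, hypothetical sketch of what a reward-weighted distillation loss could look like; the function name, the per-token `rewards`, and the specific weighting are illustrative assumptions, not the paper's actual method.

```python
import math

def reward_weighted_kd_loss(student_logprobs, teacher_logprobs, rewards):
    """Illustrative reward-weighted distillation penalty (NOT the paper's
    exact objective). Each argument is a per-token list:

    student_logprobs / teacher_logprobs: log-probabilities the student
    and teacher assign to the same generated tokens.
    rewards: per-token scores from some LLM-induced reward model.
    """
    assert len(student_logprobs) == len(teacher_logprobs) == len(rewards)
    total = 0.0
    for s, t, r in zip(student_logprobs, teacher_logprobs, rewards):
        # Penalize the student for diverging from the teacher, weighted
        # more heavily on tokens the reward model scores highly.
        total += r * (math.exp(t) * (t - s))
    return total / len(rewards)
```

When the student matches the teacher exactly the penalty is zero, and it grows as the student underweights tokens the reward model favors; the real method would presumably optimize something along these lines with gradient descent.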
Metadata
- Type
- preprint
- Language
- en
- Landing Page
- http://arxiv.org/abs/2409.12500
- https://arxiv.org/pdf/2409.12500
- OA Status
- green
- Related Works
- 10
- OpenAlex ID
- https://openalex.org/W4403747413
- OpenAlex ID
- https://openalex.org/W4403747413 (Canonical identifier for this work in OpenAlex)
- DOI
- https://doi.org/10.48550/arxiv.2409.12500 (Digital Object Identifier)
- Title
- LLMR: Knowledge Distillation with a Large Language Model-Induced Reward (Work title)
- Type
- preprint (OpenAlex work type)
- Language
- en (Primary language)
- Publication year
- 2024 (Year of publication)
- Publication date
- 2024-09-19 (Full publication date if available)
- Authors
- Dongqi Li, Yongchang Hao, Lili Mou (List of authors in order)
- Landing page
- https://arxiv.org/abs/2409.12500 (Publisher landing page)
- PDF URL
- https://arxiv.org/pdf/2409.12500 (Direct link to full text PDF)
- Open access
- Yes (Whether a free full text is available)
- OA status
- green (Open access status per OpenAlex)
- OA URL
- https://arxiv.org/pdf/2409.12500 (Direct OA link when available)
- Concepts
- Distillation, Computer science, Natural language processing, Chemistry, Chromatography (Top concepts attached by OpenAlex)
- Cited by
- 0 (Total citation count in OpenAlex)
- Related works (count)
- 10 (Other works algorithmically related by OpenAlex)
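The fields above mirror the structure of an OpenAlex work record. As a small sketch (the helper name `best_fulltext_url` is mine, not an OpenAlex API), one might resolve a direct full-text link from such a record like this:

```python
def best_fulltext_url(work):
    """Pick a direct PDF link from an OpenAlex-style work record,
    falling back to the landing page, then the generic OA URL."""
    best = work.get("best_oa_location") or {}
    return (
        best.get("pdf_url")
        or best.get("landing_page_url")
        or (work.get("open_access") or {}).get("oa_url")
    )

# Values taken from the metadata above.
work = {
    "best_oa_location": {
        "pdf_url": "https://arxiv.org/pdf/2409.12500",
        "landing_page_url": "http://arxiv.org/abs/2409.12500",
    },
    "open_access": {"oa_url": "https://arxiv.org/pdf/2409.12500"},
}
```

For this record all three fallbacks happen to point at the same arXiv PDF, but for works without a PDF the landing page or `oa_url` would be used instead.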
Full payload
| id | https://openalex.org/W4403747413 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2409.12500 |
| ids.doi | https://doi.org/10.48550/arxiv.2409.12500 |
| ids.openalex | https://openalex.org/W4403747413 |
| fwci | |
| type | preprint |
| title | LLMR: Knowledge Distillation with a Large Language Model-Induced Reward |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10028 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9898999929428101 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Topic Modeling |
| topics[1].id | https://openalex.org/T10181 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9599999785423279 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Natural Language Processing Techniques |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C204030448 |
| concepts[0].level | 2 |
| concepts[0].score | 0.6720160245895386 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q101017 |
| concepts[0].display_name | Distillation |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.5722776651382446 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C204321447 |
| concepts[2].level | 1 |
| concepts[2].score | 0.3806074261665344 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[2].display_name | Natural language processing |
| concepts[3].id | https://openalex.org/C185592680 |
| concepts[3].level | 0 |
| concepts[3].score | 0.20438051223754883 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q2329 |
| concepts[3].display_name | Chemistry |
| concepts[4].id | https://openalex.org/C43617362 |
| concepts[4].level | 1 |
| concepts[4].score | 0.09111392498016357 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q170050 |
| concepts[4].display_name | Chromatography |
| keywords[0].id | https://openalex.org/keywords/distillation |
| keywords[0].score | 0.6720160245895386 |
| keywords[0].display_name | Distillation |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.5722776651382446 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/natural-language-processing |
| keywords[2].score | 0.3806074261665344 |
| keywords[2].display_name | Natural language processing |
| keywords[3].id | https://openalex.org/keywords/chemistry |
| keywords[3].score | 0.20438051223754883 |
| keywords[3].display_name | Chemistry |
| keywords[4].id | https://openalex.org/keywords/chromatography |
| keywords[4].score | 0.09111392498016357 |
| keywords[4].display_name | Chromatography |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2409.12500 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2409.12500 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2409.12500 |
| locations[1].id | doi:10.48550/arxiv.2409.12500 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2409.12500 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5011910976 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Dongqi Li |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Li, Dongheng |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5111588412 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Yongchang Hao |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Hao, Yongchang |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5024821632 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-7753-4295 |
| authorships[2].author.display_name | Lili Mou |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Mou, Lili |
| authorships[2].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2409.12500 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | LLMR: Knowledge Distillation with a Large Language Model-Induced Reward |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10028 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9898999929428101 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Topic Modeling |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W4391913857, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W4396696052 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2409.12500 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2409.12500 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2409.12500 |
| primary_location.id | pmh:oai:arXiv.org:2409.12500 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2409.12500 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2409.12500 |
| publication_date | 2024-09-19 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 39, 47 |
| abstract_inverted_index.In | 33 |
| abstract_inverted_index.KD | 78 |
| abstract_inverted_index.We | 55 |
| abstract_inverted_index.be | 28 |
| abstract_inverted_index.in | 11, 30, 61, 80 |
| abstract_inverted_index.on | 46, 58 |
| abstract_inverted_index.to | 27 |
| abstract_inverted_index.we | 36 |
| abstract_inverted_index.and | 7, 25, 65, 83 |
| abstract_inverted_index.are | 21 |
| abstract_inverted_index.our | 72 |
| abstract_inverted_index.the | 62 |
| abstract_inverted_index.(KD) | 43 |
| abstract_inverted_index.LLMR | 73 |
| abstract_inverted_index.from | 51 |
| abstract_inverted_index.have | 3 |
| abstract_inverted_index.that | 71 |
| abstract_inverted_index.this | 34 |
| abstract_inverted_index.(NLP) | 16 |
| abstract_inverted_index.LLMR, | 38 |
| abstract_inverted_index.Large | 0 |
| abstract_inverted_index.based | 45 |
| abstract_inverted_index.large | 52 |
| abstract_inverted_index.novel | 40 |
| abstract_inverted_index.tasks | 82 |
| abstract_inverted_index.these | 19 |
| abstract_inverted_index.become | 4 |
| abstract_inverted_index.method | 44 |
| abstract_inverted_index.models | 2, 20 |
| abstract_inverted_index.paper, | 35 |
| abstract_inverted_index.reward | 48 |
| abstract_inverted_index.tasks. | 17, 67 |
| abstract_inverted_index.induced | 50 |
| abstract_inverted_index.methods | 79 |
| abstract_inverted_index.models. | 54 |
| abstract_inverted_index.natural | 13 |
| abstract_inverted_index.popular | 6 |
| abstract_inverted_index.propose | 37 |
| abstract_inverted_index.results | 69 |
| abstract_inverted_index.various | 12 |
| abstract_inverted_index.However, | 18 |
| abstract_inverted_index.approach | 74 |
| abstract_inverted_index.datasets | 60 |
| abstract_inverted_index.deployed | 29 |
| abstract_inverted_index.dialogue | 63 |
| abstract_inverted_index.function | 49 |
| abstract_inverted_index.language | 1, 14, 53 |
| abstract_inverted_index.multiple | 59 |
| abstract_inverted_index.Empirical | 68 |
| abstract_inverted_index.conducted | 56 |
| abstract_inverted_index.datasets. | 84 |
| abstract_inverted_index.different | 81 |
| abstract_inverted_index.difficult | 26 |
| abstract_inverted_index.expensive | 24 |
| abstract_inverted_index.knowledge | 41 |
| abstract_inverted_index.typically | 22 |
| abstract_inverted_index.generation | 64 |
| abstract_inverted_index.processing | 15 |
| abstract_inverted_index.remarkable | 9 |
| abstract_inverted_index.demonstrate | 70 |
| abstract_inverted_index.experiments | 57 |
| abstract_inverted_index.outperforms | 76 |
| abstract_inverted_index.performance | 10 |
| abstract_inverted_index.traditional | 77 |
| abstract_inverted_index.consistently | 75 |
| abstract_inverted_index.demonstrated | 8 |
| abstract_inverted_index.distillation | 42 |
| abstract_inverted_index.increasingly | 5 |
| abstract_inverted_index.environments. | 32 |
| abstract_inverted_index.summarization | 66 |
| abstract_inverted_index.computationally | 23 |
| abstract_inverted_index.resource-constrained | 31 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 3 |
| citation_normalized_percentile | |
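The `abstract_inverted_index.*` rows in the payload above encode the abstract as a mapping from each word to the positions where it occurs. The plain text can be reconstructed by sorting words back into position order; a minimal decoder (the function name is mine):

```python
def decode_inverted_abstract(index):
    """Rebuild plain text from an OpenAlex abstract_inverted_index,
    a dict mapping each word to the list of positions it occupies."""
    positions = {}
    for word, spots in index.items():
        for pos in spots:
            positions[pos] = word
    return " ".join(positions[i] for i in sorted(positions))

# A small slice of the index above, for illustration.
sample = {"Large": [0], "language": [1], "models": [2], "have": [3], "become": [4]}
```

Applied to the full index in the payload, this reproduces the abstract shown at the top of the page.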