LLMR: Knowledge Distillation with a Large Language Model-Induced Reward
Dongqi Li, Yongchang Hao, Lili Mou
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2409.12500
Large language models have become increasingly popular and have demonstrated remarkable performance on various natural language processing (NLP) tasks. However, these models are typically computationally expensive and difficult to deploy in resource-constrained environments. In this paper, we propose LLMR, a novel knowledge distillation (KD) method based on a reward function induced from large language models. We conducted experiments on multiple datasets for dialogue generation and summarization. Empirical results demonstrate that our LLMR approach consistently outperforms traditional KD methods across tasks and datasets.
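The abstract does not spell out the training objective, so the following is only a rough, hypothetical sketch of what a reward-weighted distillation loss could look like; the function name, the per-token `rewards`, and the specific weighting are illustrative assumptions, not the paper's actual method.

```python
import math

def reward_weighted_kd_loss(student_logprobs, teacher_logprobs, rewards):
    """Illustrative reward-weighted distillation penalty (NOT the paper's
    exact objective). Each argument is a per-token list:

    student_logprobs / teacher_logprobs: log-probabilities the student
    and teacher assign to the same generated tokens.
    rewards: per-token scores from some LLM-induced reward model.
    """
    assert len(student_logprobs) == len(teacher_logprobs) == len(rewards)
    total = 0.0
    for s, t, r in zip(student_logprobs, teacher_logprobs, rewards):
        # Penalize the student for diverging from the teacher, weighted
        # more heavily on tokens the reward model scores highly.
        total += r * (math.exp(t) * (t - s))
    return total / len(rewards)
```

When the student matches the teacher exactly the penalty is zero, and it grows as the student underweights tokens the reward model favors; the real method would presumably optimize something along these lines with gradient descent.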
Metadata
- Type
- preprint
- Language
- en
- Landing Page
- http://arxiv.org/abs/2409.12500
- https://arxiv.org/pdf/2409.12500
- OA Status
- green
- Related Works
- 10
- OpenAlex ID
- https://openalex.org/W4403747413
- OpenAlex ID
- https://openalex.org/W4403747413 (Canonical identifier for this work in OpenAlex)
- DOI
- https://doi.org/10.48550/arxiv.2409.12500 (Digital Object Identifier)
- Title
- LLMR: Knowledge Distillation with a Large Language Model-Induced Reward (Work title)
- Type
- preprint (OpenAlex work type)
- Language
- en (Primary language)
- Publication year
- 2024 (Year of publication)
- Publication date
- 2024-09-19 (Full publication date if available)
- Authors
- Dongqi Li, Yongchang Hao, Lili Mou (List of authors in order)
- Landing page
- https://arxiv.org/abs/2409.12500 (Publisher landing page)
- PDF URL
- https://arxiv.org/pdf/2409.12500 (Direct link to full text PDF)
- Open access
- Yes (Whether a free full text is available)
- OA status
- green (Open access status per OpenAlex)
- OA URL
- https://arxiv.org/pdf/2409.12500 (Direct OA link when available)
- Concepts
- Distillation, Computer science, Natural language processing, Chemistry, Chromatography (Top concepts attached by OpenAlex)
- Cited by
- 0 (Total citation count in OpenAlex)
- Related works (count)
- 10 (Other works algorithmically related by OpenAlex)
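The fields above mirror the structure of an OpenAlex work record. As a small sketch (the helper name `best_fulltext_url` is mine, not an OpenAlex API), one might resolve a direct full-text link from such a record like this:

```python
def best_fulltext_url(work):
    """Pick a direct PDF link from an OpenAlex-style work record,
    falling back to the landing page, then the generic OA URL."""
    best = work.get("best_oa_location") or {}
    return (
        best.get("pdf_url")
        or best.get("landing_page_url")
        or (work.get("open_access") or {}).get("oa_url")
    )

# Values taken from the metadata above.
work = {
    "best_oa_location": {
        "pdf_url": "https://arxiv.org/pdf/2409.12500",
        "landing_page_url": "http://arxiv.org/abs/2409.12500",
    },
    "open_access": {"oa_url": "https://arxiv.org/pdf/2409.12500"},
}
```

For this record all three fallbacks happen to point at the same arXiv PDF, but for works without a PDF the landing page or `oa_url` would be used instead.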
Full payload
| id | https://openalex.org/W4403747413 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2409.12500 |
| ids.doi | https://doi.org/10.48550/arxiv.2409.12500 |
| ids.openalex | https://openalex.org/W4403747413 |
| fwci | |
| type | preprint |
| title | LLMR: Knowledge Distillation with a Large Language Model-Induced Reward |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10028 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9898999929428101 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Topic Modeling |
| topics[1].id | https://openalex.org/T10181 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9599999785423279 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Natural Language Processing Techniques |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C204030448 |
| concepts[0].level | 2 |
| concepts[0].score | 0.6720160245895386 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q101017 |
| concepts[0].display_name | Distillation |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.5722776651382446 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C204321447 |
| concepts[2].level | 1 |
| concepts[2].score | 0.3806074261665344 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[2].display_name | Natural language processing |
| concepts[3].id | https://openalex.org/C185592680 |
| concepts[3].level | 0 |
| concepts[3].score | 0.20438051223754883 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q2329 |
| concepts[3].display_name | Chemistry |
| concepts[4].id | https://openalex.org/C43617362 |
| concepts[4].level | 1 |
| concepts[4].score | 0.09111392498016357 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q170050 |
| concepts[4].display_name | Chromatography |
| keywords[0].id | https://openalex.org/keywords/distillation |
| keywords[0].score | 0.6720160245895386 |
| keywords[0].display_name | Distillation |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.5722776651382446 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/natural-language-processing |
| keywords[2].score | 0.3806074261665344 |
| keywords[2].display_name | Natural language processing |
| keywords[3].id | https://openalex.org/keywords/chemistry |
| keywords[3].score | 0.20438051223754883 |
| keywords[3].display_name | Chemistry |
| keywords[4].id | https://openalex.org/keywords/chromatography |
| keywords[4].score | 0.09111392498016357 |
| keywords[4].display_name | Chromatography |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2409.12500 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2409.12500 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2409.12500 |
| locations[1].id | doi:10.48550/arxiv.2409.12500 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2409.12500 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5011910976 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Dongqi Li |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Li, Dongheng |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5111588412 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Yongchang Hao |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Hao, Yongchang |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5024821632 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-7753-4295 |
| authorships[2].author.display_name | Lili Mou |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Mou, Lili |
| authorships[2].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2409.12500 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | LLMR: Knowledge Distillation with a Large Language Model-Induced Reward |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10028 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9898999929428101 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Topic Modeling |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W4391913857, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W4396696052 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2409.12500 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2409.12500 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2409.12500 |
| primary_location.id | pmh:oai:arXiv.org:2409.12500 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2409.12500 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2409.12500 |
| publication_date | 2024-09-19 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 39, 47 |
| abstract_inverted_index.In | 33 |
| abstract_inverted_index.KD | 78 |
| abstract_inverted_index.We | 55 |
| abstract_inverted_index.be | 28 |
| abstract_inverted_index.in | 11, 30, 61, 80 |
| abstract_inverted_index.on | 46, 58 |
| abstract_inverted_index.to | 27 |
| abstract_inverted_index.we | 36 |
| abstract_inverted_index.and | 7, 25, 65, 83 |
| abstract_inverted_index.are | 21 |
| abstract_inverted_index.our | 72 |
| abstract_inverted_index.the | 62 |
| abstract_inverted_index.(KD) | 43 |
| abstract_inverted_index.LLMR | 73 |
| abstract_inverted_index.from | 51 |
| abstract_inverted_index.have | 3 |
| abstract_inverted_index.that | 71 |
| abstract_inverted_index.this | 34 |
| abstract_inverted_index.(NLP) | 16 |
| abstract_inverted_index.LLMR, | 38 |
| abstract_inverted_index.Large | 0 |
| abstract_inverted_index.based | 45 |
| abstract_inverted_index.large | 52 |
| abstract_inverted_index.novel | 40 |
| abstract_inverted_index.tasks | 82 |
| abstract_inverted_index.these | 19 |
| abstract_inverted_index.become | 4 |
| abstract_inverted_index.method | 44 |
| abstract_inverted_index.models | 2, 20 |
| abstract_inverted_index.paper, | 35 |
| abstract_inverted_index.reward | 48 |
| abstract_inverted_index.tasks. | 17, 67 |
| abstract_inverted_index.induced | 50 |
| abstract_inverted_index.methods | 79 |
| abstract_inverted_index.models. | 54 |
| abstract_inverted_index.natural | 13 |
| abstract_inverted_index.popular | 6 |
| abstract_inverted_index.propose | 37 |
| abstract_inverted_index.results | 69 |
| abstract_inverted_index.various | 12 |
| abstract_inverted_index.However, | 18 |
| abstract_inverted_index.approach | 74 |
| abstract_inverted_index.datasets | 60 |
| abstract_inverted_index.deployed | 29 |
| abstract_inverted_index.dialogue | 63 |
| abstract_inverted_index.function | 49 |
| abstract_inverted_index.language | 1, 14, 53 |
| abstract_inverted_index.multiple | 59 |
| abstract_inverted_index.Empirical | 68 |
| abstract_inverted_index.conducted | 56 |
| abstract_inverted_index.datasets. | 84 |
| abstract_inverted_index.different | 81 |
| abstract_inverted_index.difficult | 26 |
| abstract_inverted_index.expensive | 24 |
| abstract_inverted_index.knowledge | 41 |
| abstract_inverted_index.typically | 22 |
| abstract_inverted_index.generation | 64 |
| abstract_inverted_index.processing | 15 |
| abstract_inverted_index.remarkable | 9 |
| abstract_inverted_index.demonstrate | 70 |
| abstract_inverted_index.experiments | 57 |
| abstract_inverted_index.outperforms | 76 |
| abstract_inverted_index.performance | 10 |
| abstract_inverted_index.traditional | 77 |
| abstract_inverted_index.consistently | 75 |
| abstract_inverted_index.demonstrated | 8 |
| abstract_inverted_index.distillation | 42 |
| abstract_inverted_index.increasingly | 5 |
| abstract_inverted_index.environments. | 32 |
| abstract_inverted_index.summarization | 66 |
| abstract_inverted_index.computationally | 23 |
| abstract_inverted_index.resource-constrained | 31 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 3 |
| citation_normalized_percentile | |
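The `abstract_inverted_index.*` rows in the payload above encode the abstract as a mapping from each word to the positions where it occurs. The plain text can be reconstructed by sorting words back into position order; a minimal decoder (the function name is mine):

```python
def decode_inverted_abstract(index):
    """Rebuild plain text from an OpenAlex abstract_inverted_index,
    a dict mapping each word to the list of positions it occupies."""
    positions = {}
    for word, spots in index.items():
        for pos in spots:
            positions[pos] = word
    return " ".join(positions[i] for i in sorted(positions))

# A small slice of the index above, for illustration.
sample = {"Large": [0], "language": [1], "models": [2], "have": [3], "become": [4]}
```

Applied to the full index in the payload, this reproduces the abstract shown at the top of the page.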