Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2412.07762
The modern paradigm in machine learning involves pre-training on diverse data, followed by task-specific fine-tuning. In reinforcement learning (RL), this translates to learning via offline RL on a diverse historical dataset, followed by rapid online RL fine-tuning using interaction data. Most RL fine-tuning methods require continued training on offline data for stability and performance. However, this is undesirable because training on diverse offline data is slow and expensive for large datasets and, in principle, also limits the performance improvement possible because of constraints or pessimism on offline data. In this paper, we show that retaining offline data is unnecessary as long as we use a properly designed online RL approach for fine-tuning offline RL initializations. To build this approach, we start by analyzing the role of retaining offline data in online fine-tuning. We find that continued training on offline data is mostly useful for preventing a sudden divergence in the value function at the onset of fine-tuning, caused by a distribution mismatch between the offline data and online rollouts. This divergence typically results in unlearning and forgetting the benefits of offline pre-training. Our approach, Warm-start RL (WSRL), mitigates the catastrophic forgetting of pre-trained initializations using a very simple idea. WSRL employs a warmup phase that seeds the online RL run with a very small number of rollouts from the pre-trained policy to do fast online RL. The data collected during warmup helps "recalibrate" the offline Q-function to the online distribution, allowing us to completely discard offline data without destabilizing the online RL fine-tuning. We show that WSRL is able to fine-tune without retaining any offline data, and that it learns faster and attains higher performance than existing algorithms, irrespective of whether they retain offline data or not.
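The warmup idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `ToyEnv`, `wsrl_finetune`, and the `update` callback are hypothetical names, the environment is a toy, and the choice to run no gradient updates during warmup is an assumption that the abstract does not spell out. The point it shows is structural: the replay buffer starts empty (no offline data is loaded), the first few transitions come from the frozen pre-trained policy, and only afterwards does ordinary online RL take over.

```python
import random


class ToyEnv:
    """Hypothetical 1-D environment, used only to make the sketch runnable."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.state = 0.0

    def reset(self):
        self.state = self.rng.uniform(-1.0, 1.0)
        return self.state

    def step(self, action):
        # Reward is higher when the action is close to the current state.
        reward = -abs(action - self.state)
        self.state = self.rng.uniform(-1.0, 1.0)
        return self.state, reward


def wsrl_finetune(env, frozen_policy, online_policy, update,
                  warmup_steps, online_steps, batch_size, seed=0):
    """Warm-start fine-tuning loop (sketch under the assumptions stated above).

    frozen_policy: the pre-trained policy, used unchanged during warmup.
    online_policy: the policy being fine-tuned (initialized from pre-training).
    update:        callback that performs one online RL update on a batch.
    """
    rng = random.Random(seed)
    buffer = []  # starts empty: the offline dataset is NOT retained
    updates = 0
    state = env.reset()

    for t in range(warmup_steps + online_steps):
        in_warmup = t < warmup_steps
        policy = frozen_policy if in_warmup else online_policy
        action = policy(state)
        next_state, reward = env.step(action)
        buffer.append((state, action, reward, next_state))
        state = next_state

        # Assumption: no updates during warmup; the warmup data only seeds
        # the buffer so that the first updates see on-distribution samples.
        if not in_warmup:
            batch = rng.sample(buffer, min(batch_size, len(buffer)))
            update(batch)
            updates += 1

    return buffer, updates
```

In a real setting, `update` would be one step of an off-the-shelf online actor-critic method applied to the pre-trained policy and Q-function; here it is left as a callback so the sketch stays independent of any particular RL library.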
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2412.07762 (PDF: https://arxiv.org/pdf/2412.07762)
- OA Status: green
- Cited By: 1
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4405281200
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4405281200 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2412.07762 (Digital Object Identifier)
- Title: Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-12-10 (full publication date if available)
- Authors: Zhiyuan Zhou, Anjie Peng, Qiyang Li, Sergey Levine, Aviral Kumar (list of authors in order)
- Landing page: https://arxiv.org/abs/2412.07762 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2412.07762 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2412.07762 (direct OA link when available)
- Concepts: Reinforcement learning, Reinforcement, Computer science, Online and offline, Online learning, Artificial intelligence, World Wide Web, Operating system, Materials science, Composite material (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 1 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 1 (per-year citation counts, last 5 years)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4405281200 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2412.07762 |
| ids.doi | https://doi.org/10.48550/arxiv.2412.07762 |
| ids.openalex | https://openalex.org/W4405281200 |
| fwci | |
| type | preprint |
| title | Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11437 |
| topics[0].field.id | https://openalex.org/fields/14 |
| topics[0].field.display_name | Business, Management and Accounting |
| topics[0].score | 0.6171000003814697 |
| topics[0].domain.id | https://openalex.org/domains/2 |
| topics[0].domain.display_name | Social Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1408 |
| topics[0].subfield.display_name | Strategy and Management |
| topics[0].display_name | Digital Platforms and Economics |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C97541855 |
| concepts[0].level | 2 |
| concepts[0].score | 0.8037155270576477 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q830687 |
| concepts[0].display_name | Reinforcement learning |
| concepts[1].id | https://openalex.org/C67203356 |
| concepts[1].level | 2 |
| concepts[1].score | 0.6742357015609741 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q1321905 |
| concepts[1].display_name | Reinforcement |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.5829175114631653 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C2780102126 |
| concepts[3].level | 2 |
| concepts[3].score | 0.5171843767166138 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q10928179 |
| concepts[3].display_name | Online and offline |
| concepts[4].id | https://openalex.org/C2986087404 |
| concepts[4].level | 2 |
| concepts[4].score | 0.4221288859844208 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q15946010 |
| concepts[4].display_name | Online learning |
| concepts[5].id | https://openalex.org/C154945302 |
| concepts[5].level | 1 |
| concepts[5].score | 0.33751755952835083 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[5].display_name | Artificial intelligence |
| concepts[6].id | https://openalex.org/C136764020 |
| concepts[6].level | 1 |
| concepts[6].score | 0.1664975881576538 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q466 |
| concepts[6].display_name | World Wide Web |
| concepts[7].id | https://openalex.org/C111919701 |
| concepts[7].level | 1 |
| concepts[7].score | 0.14008843898773193 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q9135 |
| concepts[7].display_name | Operating system |
| concepts[8].id | https://openalex.org/C192562407 |
| concepts[8].level | 0 |
| concepts[8].score | 0.1320594847202301 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q228736 |
| concepts[8].display_name | Materials science |
| concepts[9].id | https://openalex.org/C159985019 |
| concepts[9].level | 1 |
| concepts[9].score | 0.07698237895965576 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q181790 |
| concepts[9].display_name | Composite material |
| keywords[0].id | https://openalex.org/keywords/reinforcement-learning |
| keywords[0].score | 0.8037155270576477 |
| keywords[0].display_name | Reinforcement learning |
| keywords[1].id | https://openalex.org/keywords/reinforcement |
| keywords[1].score | 0.6742357015609741 |
| keywords[1].display_name | Reinforcement |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.5829175114631653 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/online-and-offline |
| keywords[3].score | 0.5171843767166138 |
| keywords[3].display_name | Online and offline |
| keywords[4].id | https://openalex.org/keywords/online-learning |
| keywords[4].score | 0.4221288859844208 |
| keywords[4].display_name | Online learning |
| keywords[5].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[5].score | 0.33751755952835083 |
| keywords[5].display_name | Artificial intelligence |
| keywords[6].id | https://openalex.org/keywords/world-wide-web |
| keywords[6].score | 0.1664975881576538 |
| keywords[6].display_name | World Wide Web |
| keywords[7].id | https://openalex.org/keywords/operating-system |
| keywords[7].score | 0.14008843898773193 |
| keywords[7].display_name | Operating system |
| keywords[8].id | https://openalex.org/keywords/materials-science |
| keywords[8].score | 0.1320594847202301 |
| keywords[8].display_name | Materials science |
| keywords[9].id | https://openalex.org/keywords/composite-material |
| keywords[9].score | 0.07698237895965576 |
| keywords[9].display_name | Composite material |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2412.07762 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2412.07762 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2412.07762 |
| locations[1].id | doi:10.48550/arxiv.2412.07762 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2412.07762 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5101728868 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-0258-484X |
| authorships[0].author.display_name | Zhiyuan Zhou |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Zhou, Zhiyuan |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5014092332 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-9287-7536 |
| authorships[1].author.display_name | Anjie Peng |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Peng, Andy |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5069290071 |
| authorships[2].author.orcid | https://orcid.org/0000-0003-2214-9968 |
| authorships[2].author.display_name | Qiyang Li |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Li, Qiyang |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5026322200 |
| authorships[3].author.orcid | https://orcid.org/0000-0001-6764-2743 |
| authorships[3].author.display_name | Sergey Levine |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Levine, Sergey |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5102493293 |
| authorships[4].author.orcid | |
| authorships[4].author.display_name | Aviral Kumar |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Kumar, Aviral |
| authorships[4].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2412.07762 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11437 |
| primary_topic.field.id | https://openalex.org/fields/14 |
| primary_topic.field.display_name | Business, Management and Accounting |
| primary_topic.score | 0.6171000003814697 |
| primary_topic.domain.id | https://openalex.org/domains/2 |
| primary_topic.domain.display_name | Social Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1408 |
| primary_topic.subfield.display_name | Strategy and Management |
| primary_topic.display_name | Digital Platforms and Economics |
| related_works | https://openalex.org/W4310083477, https://openalex.org/W2328553770, https://openalex.org/W2920061524, https://openalex.org/W1977959518, https://openalex.org/W2038908348, https://openalex.org/W2107890255, https://openalex.org/W2106552856, https://openalex.org/W2145821588, https://openalex.org/W4225619808, https://openalex.org/W3207447243 |
| cited_by_count | 1 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2412.07762 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2412.07762 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2412.07762 |
| primary_location.id | pmh:oai:arXiv.org:2412.07762 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2412.07762 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2412.07762 |
| publication_date | 2024-12-10 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 27, 104, 144, 158, 194, 200, 210 |
| abstract_inverted_index.In | 15, 88 |
| abstract_inverted_index.RL | 25, 35, 41, 107, 112, 184, 207, 250 |
| abstract_inverted_index.To | 114 |
| abstract_inverted_index.We | 131, 252 |
| abstract_inverted_index.as | 99, 101 |
| abstract_inverted_index.at | 151 |
| abstract_inverted_index.by | 12, 32, 120, 157 |
| abstract_inverted_index.do | 221 |
| abstract_inverted_index.in | 3, 72, 128, 147, 172 |
| abstract_inverted_index.is | 56, 64, 97, 139, 256, 266 |
| abstract_inverted_index.of | 81, 124, 154, 178, 190, 214, 279 |
| abstract_inverted_index.on | 8, 26, 47, 60, 85, 136 |
| abstract_inverted_index.or | 83, 285 |
| abstract_inverted_index.to | 21, 220, 235, 241, 258, 268 |
| abstract_inverted_index.us | 240 |
| abstract_inverted_index.we | 91, 102, 118 |
| abstract_inverted_index.Our | 181 |
| abstract_inverted_index.RL. | 224 |
| abstract_inverted_index.The | 0, 225 |
| abstract_inverted_index.and | 52, 66, 71, 165, 174, 265, 271 |
| abstract_inverted_index.any | 262 |
| abstract_inverted_index.for | 50, 68, 109, 142 |
| abstract_inverted_index.run | 208 |
| abstract_inverted_index.the | 76, 122, 148, 152, 162, 176, 187, 205, 217, 232, 236, 248 |
| abstract_inverted_index.use | 103 |
| abstract_inverted_index.via | 23 |
| abstract_inverted_index.Most | 40 |
| abstract_inverted_index.This | 168 |
| abstract_inverted_index.WSRL | 198, 255 |
| abstract_inverted_index.able | 257, 267 |
| abstract_inverted_index.also | 74 |
| abstract_inverted_index.data | 49, 63, 96, 127, 138, 164, 226, 245, 284 |
| abstract_inverted_index.fast | 222 |
| abstract_inverted_index.find | 132 |
| abstract_inverted_index.from | 216 |
| abstract_inverted_index.long | 100 |
| abstract_inverted_index.not. | 286 |
| abstract_inverted_index.role | 123 |
| abstract_inverted_index.show | 92, 253 |
| abstract_inverted_index.slow | 65 |
| abstract_inverted_index.than | 275 |
| abstract_inverted_index.that | 93, 133, 203, 254 |
| abstract_inverted_index.they | 281 |
| abstract_inverted_index.this | 19, 55, 89, 116 |
| abstract_inverted_index.very | 195, 211 |
| abstract_inverted_index.with | 209 |
| abstract_inverted_index.(RL), | 18 |
| abstract_inverted_index.build | 115 |
| abstract_inverted_index.data, | 10, 264 |
| abstract_inverted_index.data. | 39, 87 |
| abstract_inverted_index.helps | 230 |
| abstract_inverted_index.idea. | 197 |
| abstract_inverted_index.large | 69 |
| abstract_inverted_index.learn | 269 |
| abstract_inverted_index.limit | 75 |
| abstract_inverted_index.onset | 153 |
| abstract_inverted_index.phase | 202 |
| abstract_inverted_index.rapid | 33 |
| abstract_inverted_index.seeds | 204 |
| abstract_inverted_index.small | 212 |
| abstract_inverted_index.start | 119 |
| abstract_inverted_index.using | 37, 193 |
| abstract_inverted_index.value | 149 |
| abstract_inverted_index.caused | 156 |
| abstract_inverted_index.during | 228 |
| abstract_inverted_index.faster | 270 |
| abstract_inverted_index.higher | 273 |
| abstract_inverted_index.modern | 1 |
| abstract_inverted_index.mostly | 140 |
| abstract_inverted_index.number | 213 |
| abstract_inverted_index.online | 34, 106, 129, 166, 206, 223, 237, 249 |
| abstract_inverted_index.paper, | 90 |
| abstract_inverted_index.policy | 219 |
| abstract_inverted_index.retain | 282 |
| abstract_inverted_index.simple | 196 |
| abstract_inverted_index.sudden | 145 |
| abstract_inverted_index.useful | 141 |
| abstract_inverted_index.warmup | 201, 229 |
| abstract_inverted_index.(WSRL), | 185 |
| abstract_inverted_index.attains | 272 |
| abstract_inverted_index.because | 58, 80 |
| abstract_inverted_index.between | 161 |
| abstract_inverted_index.discard | 243 |
| abstract_inverted_index.diverse | 9, 28, 61 |
| abstract_inverted_index.employs | 199 |
| abstract_inverted_index.machine | 4 |
| abstract_inverted_index.methods | 43 |
| abstract_inverted_index.offline | 24, 48, 62, 86, 95, 111, 126, 137, 163, 179, 233, 244, 263, 283 |
| abstract_inverted_index.require | 44 |
| abstract_inverted_index.results | 171 |
| abstract_inverted_index.whether | 280 |
| abstract_inverted_index.without | 246, 260 |
| abstract_inverted_index.However, | 54 |
| abstract_inverted_index.allowing | 239 |
| abstract_inverted_index.approach | 108 |
| abstract_inverted_index.benefits | 177 |
| abstract_inverted_index.dataset, | 30 |
| abstract_inverted_index.existing | 276 |
| abstract_inverted_index.followed | 11, 31 |
| abstract_inverted_index.function | 150 |
| abstract_inverted_index.involves | 6 |
| abstract_inverted_index.learning | 5, 17, 22 |
| abstract_inverted_index.mismatch | 160 |
| abstract_inverted_index.paradigm | 2 |
| abstract_inverted_index.possible | 79 |
| abstract_inverted_index.rollouts | 215 |
| abstract_inverted_index.training | 46, 59, 135 |
| abstract_inverted_index.analyzing | 121 |
| abstract_inverted_index.approach, | 117, 182 |
| abstract_inverted_index.collected | 227 |
| abstract_inverted_index.continued | 45, 134 |
| abstract_inverted_index.datasets, | 70 |
| abstract_inverted_index.expensive | 67 |
| abstract_inverted_index.fine-tune | 259 |
| abstract_inverted_index.mitigates | 186 |
| abstract_inverted_index.pessimism | 84 |
| abstract_inverted_index.retaining | 94, 125, 261 |
| abstract_inverted_index.rollouts. | 167 |
| abstract_inverted_index.stability | 51 |
| abstract_inverted_index.typically | 170 |
| abstract_inverted_index.Q-function | 234 |
| abstract_inverted_index.Warm-start | 183 |
| abstract_inverted_index.algorithms | 277 |
| abstract_inverted_index.completely | 242 |
| abstract_inverted_index.divergence | 146, 169 |
| abstract_inverted_index.forgetting | 175, 189 |
| abstract_inverted_index.historical | 29 |
| abstract_inverted_index.preventing | 143 |
| abstract_inverted_index.principle, | 73 |
| abstract_inverted_index.translates | 20 |
| abstract_inverted_index.unlearning | 173 |
| abstract_inverted_index.constraints | 82 |
| abstract_inverted_index.fine-tuning | 36, 42, 110 |
| abstract_inverted_index.improvement | 78 |
| abstract_inverted_index.interaction | 38 |
| abstract_inverted_index.performance | 77, 274 |
| abstract_inverted_index.pre-trained | 191, 218 |
| abstract_inverted_index.undesirable | 57 |
| abstract_inverted_index.unnecessary | 98 |
| abstract_inverted_index.catastrophic | 188 |
| abstract_inverted_index.distribution | 159 |
| abstract_inverted_index.fine-tuning, | 155 |
| abstract_inverted_index.fine-tuning. | 14, 130, 251 |
| abstract_inverted_index.irrespective | 278 |
| abstract_inverted_index.performance. | 53 |
| abstract_inverted_index.pre-training | 7 |
| abstract_inverted_index.destabilizing | 247 |
| abstract_inverted_index.distribution, | 238 |
| abstract_inverted_index.pre-training. | 180 |
| abstract_inverted_index.reinforcement | 16 |
| abstract_inverted_index.task-specific | 13 |
| abstract_inverted_index.``recalibrate'' | 231 |
| abstract_inverted_index.initializations | 192 |
| abstract_inverted_index.initializations. | 113 |
| abstract_inverted_index.properly-designed | 105 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile |