DHGRPO: Domain-Induced, Hierarchical Group Relative Policy Optimization
2025 · Open Access · DOI: https://doi.org/10.5281/zenodo.16786367
DHGRPO (Domain-Induced Hierarchical Group Relative Policy Optimization) is a mathematically grounded extension of Group Relative Policy Optimization (GRPO) that mitigates group-level failure modes in preference-based fine-tuning of large language models. The method integrates: (i) robust per-prompt normalization via median and median absolute deviation (MAD) to suppress outlier influence, (ii) a Domain-Induced Factor (DIF) for trust gating based on long-term reward stability, (iii) a Domain-Optimism Parameter (DOP) for recency-weighted learning emphasis, and (iv) a bounded reward amplifier with optional magnitude matching to preserve update scale. We present a stepwise derivation from the exact policy gradient to the GRPO surrogate and its DHGRPO refinement, a controlled simulation framework with hyperparameter sweeps demonstrating consistent proxy improvements, and actionable implementation recommendations for real-world deployment in large-scale preference optimization.
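Component (i) can be illustrated with a minimal sketch that contrasts the mean/standard-deviation normalization used for GRPO's group-relative advantages with a median/MAD variant. This is not the authors' implementation: the function names, the `eps` stabilizer, and the `1.4826` consistency constant (which makes the MAD comparable to a standard deviation under Gaussian data) are assumptions introduced here, and the DIF, DOP, and amplifier terms are omitted because the abstract does not define them.

```python
from statistics import fmean, median, pstdev


def mean_std_advantages(rewards, eps=1e-8):
    """GRPO-style group normalization: centre on the mean, scale by the std."""
    mu = fmean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def median_mad_advantages(rewards, eps=1e-8, scale=1.4826):
    """Robust variant (assumed form): centre on the median, scale by the MAD.

    `scale` and `eps` are illustrative choices, not values from the paper.
    """
    med = median(rewards)
    mad = median([abs(r - med) for r in rewards])
    return [(r - med) / (scale * mad + eps) for r in rewards]


if __name__ == "__main__":
    # One prompt, five sampled responses; the last reward is an outlier.
    group = [0.1, 0.2, 0.3, 0.4, 10.0]
    print("mean/std  :", [round(a, 3) for a in mean_std_advantages(group)])
    print("median/MAD:", [round(a, 3) for a in median_mad_advantages(group)])
```

With mean/std normalization the single outlier inflates both statistics, so the four ordinary responses collapse to nearly identical advantages. The median and MAD are barely moved by the outlier, so the ordinary responses stay well separated. The outlier's own normalized score is still large under median/MAD, which suggests why some form of bounding on the resulting signal would be needed in practice.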
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2501.12948, https://arxiv.org/pdf/2501.12948
- OA Status: green
- Cited By: 411
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4406779522
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4406779522 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.5281/zenodo.16786367 (Digital Object Identifier)
- Title: DHGRPO: Domain-Induced, Hierarchical Group Relative Policy Optimization (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-08-09 (full publication date if available)
- Authors (in order): DeepSeek-AI, Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, Xiaokang Zhang, Xingkai Yu, Yu Wu, Zhenhua Wu, Zhibin Gou, Zhihong Shao, Zhuoshu Li, Ziyi Gao, Aixin Liu, Bing Xue, Bingxuan Wang, Bowen Wu, Bei Feng, Chengda Lu, Chenggang Zhao, Chengqi Deng, Chenyu Zhang, Chong Ruan, Damai Dai, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fengze Dai, Fuli Luo, Guangbo Hao, Guan-Ting Chen, Guowei Li, Hongjun Zhang, Han Bao, Hanwei Xu, Haocheng Wang, Honghui Ding, Huajian Xin, Huazuo Gao, Hui Qu, Hui Li, Jianzhong Guo, Jiashi Li, Jiawei Wang, Jingchang Chen, Jingyang Yuan, Junjie Qiu, Junlong Li, Jiali Cai, Jiaqi Ni, Jian Liang, Jing Chen, Kai Dong, Kai Hu, Kaige Gao, Kang Guan, Kexin Huang, Kuai Yu, Lean Wang, Lecong Zhang, Liang Zhao, Litong Wang, Liyue Zhang, Lei Xu, L. Xia, Mingchuan Zhang, Minghua Zhang, Minghui Tang, Meng Li, Miaojun Wang, Mingming Li, Ning Tian, Panpan Huang, Peng Zhang, Qiancheng Wang, Qinyu Chen, Qiushi Du, Ruiqi Ge, Ruisong Zhang, Rui‐Le Pan, Runji Wang, R. J. Chen, Rong Jin, Ruyi Chen, Shanghao Lu, Shangyan Zhou, Shanhuang Chen, Shengfeng Ye, Shiyu Wang, Shuiping Yu, Shunfeng Zhou, Shuting Pan, Sansan Li
- Landing page: https://arxiv.org/abs/2501.12948 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2501.12948 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2501.12948 (direct OA link when available)
- Concepts: Reinforcement learning, Reinforcement, Computer science, Cognitive science, Artificial intelligence, Psychology, Social psychology (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 411 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 408, 2024: 3 (per-year citation counts, last 5 years)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4406779522 |
|---|---|
| doi | https://doi.org/10.5281/zenodo.16786367 |
| ids.doi | https://doi.org/10.5281/zenodo.16786367 |
| ids.openalex | https://openalex.org/W4406779522 |
| fwci | |
| type | preprint |
| title | DHGRPO: Domain-Induced, Hierarchical Group Relative Policy Optimization |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10260 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.8632000088691711 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1710 |
| topics[0].subfield.display_name | Information Systems |
| topics[0].display_name | Software Engineering Research |
| topics[1].id | https://openalex.org/T11652 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.8411999940872192 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Imbalanced Data Classification Techniques |
| topics[2].id | https://openalex.org/T14351 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.8392999768257141 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Statistical and Computational Modeling |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C97541855 |
| concepts[0].level | 2 |
| concepts[0].score | 0.6568659543991089 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q830687 |
| concepts[0].display_name | Reinforcement learning |
| concepts[1].id | https://openalex.org/C67203356 |
| concepts[1].level | 2 |
| concepts[1].score | 0.5590121746063232 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q1321905 |
| concepts[1].display_name | Reinforcement |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.37950843572616577 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C188147891 |
| concepts[3].level | 1 |
| concepts[3].score | 0.3306683301925659 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q147638 |
| concepts[3].display_name | Cognitive science |
| concepts[4].id | https://openalex.org/C154945302 |
| concepts[4].level | 1 |
| concepts[4].score | 0.3106316924095154 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[4].display_name | Artificial intelligence |
| concepts[5].id | https://openalex.org/C15744967 |
| concepts[5].level | 0 |
| concepts[5].score | 0.2911536693572998 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q9418 |
| concepts[5].display_name | Psychology |
| concepts[6].id | https://openalex.org/C77805123 |
| concepts[6].level | 1 |
| concepts[6].score | 0.199873685836792 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q161272 |
| concepts[6].display_name | Social psychology |
| keywords[0].id | https://openalex.org/keywords/reinforcement-learning |
| keywords[0].score | 0.6568659543991089 |
| keywords[0].display_name | Reinforcement learning |
| keywords[1].id | https://openalex.org/keywords/reinforcement |
| keywords[1].score | 0.5590121746063232 |
| keywords[1].display_name | Reinforcement |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.37950843572616577 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/cognitive-science |
| keywords[3].score | 0.3306683301925659 |
| keywords[3].display_name | Cognitive science |
| keywords[4].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[4].score | 0.3106316924095154 |
| keywords[4].display_name | Artificial intelligence |
| keywords[5].id | https://openalex.org/keywords/psychology |
| keywords[5].score | 0.2911536693572998 |
| keywords[5].display_name | Psychology |
| keywords[6].id | https://openalex.org/keywords/social-psychology |
| keywords[6].score | 0.199873685836792 |
| keywords[6].display_name | Social psychology |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2501.12948 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2501.12948 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2501.12948 |
| locations[1].id | doi:10.48550/arxiv.2501.12948 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2501.12948 |
| locations[2].id | doi:10.4230/oasics.icpec.2025.4 |
| locations[2].is_oa | True |
| locations[2].source.id | https://openalex.org/S7407052059 |
| locations[2].source.type | repository |
| locations[2].source.is_oa | False |
| locations[2].source.issn_l | |
| locations[2].source.is_core | False |
| locations[2].source.is_in_doaj | False |
| locations[2].source.display_name | Dagstuhl Research Online Publication Server |
| locations[2].source.host_organization | |
| locations[2].source.host_organization_name | |
| locations[2].license | cc-by |
| locations[2].pdf_url | |
| locations[2].version | |
| locations[2].raw_type | |
| locations[2].license_id | https://openalex.org/licenses/cc-by |
| locations[2].is_accepted | False |
| locations[2].is_published | |
| locations[2].raw_source_name | |
| locations[2].landing_page_url | https://doi.org/10.4230/oasics.icpec.2025.4 |
| locations[3].id | doi:10.5281/zenodo.16786367 |
| locations[3].is_oa | True |
| locations[3].source.id | https://openalex.org/S4306400562 |
| locations[3].source.issn | |
| locations[3].source.type | repository |
| locations[3].source.is_oa | True |
| locations[3].source.issn_l | |
| locations[3].source.is_core | False |
| locations[3].source.is_in_doaj | False |
| locations[3].source.display_name | Zenodo (CERN European Organization for Nuclear Research) |
| locations[3].source.host_organization | https://openalex.org/I67311998 |
| locations[3].source.host_organization_name | European Organization for Nuclear Research |
| locations[3].source.host_organization_lineage | https://openalex.org/I67311998 |
| locations[3].license | cc-by |
| locations[3].pdf_url | |
| locations[3].version | |
| locations[3].raw_type | article |
| locations[3].license_id | https://openalex.org/licenses/cc-by |
| locations[3].is_accepted | False |
| locations[3].is_published | |
| locations[3].raw_source_name | |
| locations[3].landing_page_url | https://doi.org/10.5281/zenodo.16786367 |
| locations[4].id | doi:10.5281/zenodo.16786368 |
| locations[4].is_oa | True |
| locations[4].source.id | https://openalex.org/S4306400562 |
| locations[4].source.issn | |
| locations[4].source.type | repository |
| locations[4].source.is_oa | True |
| locations[4].source.issn_l | |
| locations[4].source.is_core | False |
| locations[4].source.is_in_doaj | False |
| locations[4].source.display_name | Zenodo (CERN European Organization for Nuclear Research) |
| locations[4].source.host_organization | https://openalex.org/I67311998 |
| locations[4].source.host_organization_name | European Organization for Nuclear Research |
| locations[4].source.host_organization_lineage | https://openalex.org/I67311998 |
| locations[4].license | cc-by |
| locations[4].pdf_url | |
| locations[4].version | |
| locations[4].raw_type | article |
| locations[4].license_id | https://openalex.org/licenses/cc-by |
| locations[4].is_accepted | False |
| locations[4].is_published | |
| locations[4].raw_source_name | |
| locations[4].landing_page_url | https://doi.org/10.5281/zenodo.16786368 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5093670579 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | DeepSeek-AI |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | DeepSeek-AI |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5060364305 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Daya Guo |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Guo, Daya |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5110130182 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Dejian Yang |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Yang, Dejian |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5100615599 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-1985-1126 |
| authorships[3].author.display_name | Haowei Zhang |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Zhang, Haowei |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5063711526 |
| authorships[4].author.orcid | |
| authorships[4].author.display_name | Junxiao Song |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Song, Junxiao |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5115605110 |
| authorships[5].author.orcid | https://orcid.org/0009-0003-0410-2194 |
| authorships[5].author.display_name | Ruoyu Zhang |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Zhang, Ruoyu |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5023594937 |
| authorships[6].author.orcid | https://orcid.org/0000-0002-3876-2284 |
| authorships[6].author.display_name | Runxin Xu |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Xu, Runxin |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5018225836 |
| authorships[7].author.orcid | https://orcid.org/0000-0002-5566-4491 |
| authorships[7].author.display_name | Qihao Zhu |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Zhu, Qihao |
| authorships[7].is_corresponding | False |
| authorships[8].author.id | https://openalex.org/A5057030499 |
| authorships[8].author.orcid | https://orcid.org/0000-0002-0035-9078 |
| authorships[8].author.display_name | Shirong Ma |
| authorships[8].author_position | middle |
| authorships[8].raw_author_name | Ma, Shirong |
| authorships[8].is_corresponding | False |
| authorships[9].author.id | https://openalex.org/A5101800668 |
| authorships[9].author.orcid | https://orcid.org/0000-0001-9121-4327 |
| authorships[9].author.display_name | Peiyi Wang |
| authorships[9].author_position | middle |
| authorships[9].raw_author_name | Wang, Peiyi |
| authorships[9].is_corresponding | False |
| authorships[10].author.id | https://openalex.org/A5068799172 |
| authorships[10].author.orcid | https://orcid.org/0000-0002-2733-0451 |
| authorships[10].author.display_name | Xiao Bi |
| authorships[10].author_position | middle |
| authorships[10].raw_author_name | Bi, Xiao |
| authorships[10].is_corresponding | False |
| authorships[11].author.id | https://openalex.org/A5048566877 |
| authorships[11].author.orcid | https://orcid.org/0009-0005-4550-6261 |
| authorships[11].author.display_name | Xiaokang Zhang |
| authorships[11].author_position | middle |
| authorships[11].raw_author_name | Zhang, Xiaokang |
| authorships[11].is_corresponding | False |
| authorships[12].author.id | https://openalex.org/A5100608388 |
| authorships[12].author.orcid | https://orcid.org/0000-0002-7422-7406 |
| authorships[12].author.display_name | Xingkai Yu |
| authorships[12].author_position | middle |
| authorships[12].raw_author_name | Yu, Xingkai |
| authorships[12].is_corresponding | False |
| authorships[13].author.id | https://openalex.org/A5100861568 |
| authorships[13].author.orcid | https://orcid.org/0009-0002-4233-6461 |
| authorships[13].author.display_name | Yu Wu |
| authorships[13].author_position | middle |
| authorships[13].raw_author_name | Wu, Yu |
| authorships[13].is_corresponding | False |
| authorships[14].author.id | https://openalex.org/A5011674202 |
| authorships[14].author.orcid | https://orcid.org/0000-0002-5527-0310 |
| authorships[14].author.display_name | Zhenhua Wu |
| authorships[14].author_position | middle |
| authorships[14].raw_author_name | Wu, Z. F. |
| authorships[14].is_corresponding | False |
| authorships[15].author.id | https://openalex.org/A5094077400 |
| authorships[15].author.orcid | |
| authorships[15].author.display_name | Zhibin Gou |
| authorships[15].author_position | middle |
| authorships[15].raw_author_name | Gou, Zhibin |
| authorships[15].is_corresponding | False |
| authorships[16].author.id | https://openalex.org/A5103100933 |
| authorships[16].author.orcid | https://orcid.org/0000-0001-7290-5958 |
| authorships[16].author.display_name | Zhihong Shao |
| authorships[16].author_position | middle |
| authorships[16].raw_author_name | Shao, Zhihong |
| authorships[16].is_corresponding | False |
| authorships[17].author.id | https://openalex.org/A5082206666 |
| authorships[17].author.orcid | https://orcid.org/0000-0002-2168-7986 |
| authorships[17].author.display_name | Zhuoshu Li |
| authorships[17].author_position | middle |
| authorships[17].raw_author_name | Li, Zhuoshu |
| authorships[17].is_corresponding | False |
| authorships[18].author.id | https://openalex.org/A5006532556 |
| authorships[18].author.orcid | https://orcid.org/0000-0003-2868-8663 |
| authorships[18].author.display_name | Ziyi Gao |
| authorships[18].author_position | middle |
| authorships[18].raw_author_name | Gao, Ziyi |
| authorships[18].is_corresponding | False |
| authorships[19].author.id | https://openalex.org/A5015078816 |
| authorships[19].author.orcid | https://orcid.org/0000-0001-7152-511X |
| authorships[19].author.display_name | Aixin Liu |
| authorships[19].author_position | middle |
| authorships[19].raw_author_name | Liu, Aixin |
| authorships[19].is_corresponding | False |
| authorships[20].author.id | https://openalex.org/A5077569089 |
| authorships[20].author.orcid | https://orcid.org/0000-0002-4865-8026 |
| authorships[20].author.display_name | Bing Xue |
| authorships[20].author_position | middle |
| authorships[20].raw_author_name | Xue, Bing |
| authorships[20].is_corresponding | False |
| authorships[21].author.id | https://openalex.org/A5008187090 |
| authorships[21].author.orcid | |
| authorships[21].author.display_name | Bingxuan Wang |
| authorships[21].author_position | middle |
| authorships[21].raw_author_name | Wang, Bingxuan |
| authorships[21].is_corresponding | False |
| authorships[22].author.id | https://openalex.org/A5103178710 |
| authorships[22].author.orcid | https://orcid.org/0009-0005-8791-9689 |
| authorships[22].author.display_name | Bowen Wu |
| authorships[22].author_position | middle |
| authorships[22].raw_author_name | Wu, Bochao |
| authorships[22].is_corresponding | False |
| authorships[23].author.id | https://openalex.org/A5016665033 |
| authorships[23].author.orcid | |
| authorships[23].author.display_name | Bei Feng |
| authorships[23].author_position | middle |
| authorships[23].raw_author_name | Feng, Bei |
| authorships[23].is_corresponding | False |
| authorships[24].author.id | https://openalex.org/A5090328003 |
| authorships[24].author.orcid | https://orcid.org/0000-0002-9452-4053 |
| authorships[24].author.display_name | Chengda Lu |
| authorships[24].author_position | middle |
| authorships[24].raw_author_name | Lu, Chengda |
| authorships[24].is_corresponding | False |
| authorships[25].author.id | https://openalex.org/A5079470006 |
| authorships[25].author.orcid | |
| authorships[25].author.display_name | Chenggang Zhao |
| authorships[25].author_position | middle |
| authorships[25].raw_author_name | Zhao, Chenggang |
| authorships[25].is_corresponding | False |
| authorships[26].author.id | https://openalex.org/A5084839131 |
| authorships[26].author.orcid | |
| authorships[26].author.display_name | Chengqi Deng |
| authorships[26].author_position | middle |
| authorships[26].raw_author_name | Deng, Chengqi |
| authorships[26].is_corresponding | False |
| authorships[27].author.id | https://openalex.org/A5100377485 |
| authorships[27].author.orcid | https://orcid.org/0000-0002-7137-8954 |
| authorships[27].author.display_name | Chenyu Zhang |
| authorships[27].author_position | middle |
| authorships[27].raw_author_name | Zhang, Chenyu |
| authorships[27].is_corresponding | False |
| authorships[28].author.id | https://openalex.org/A5016017027 |
| authorships[28].author.orcid | |
| authorships[28].author.display_name | Chong Ruan |
| authorships[28].author_position | middle |
| authorships[28].raw_author_name | Ruan, Chong |
| authorships[28].is_corresponding | False |
| authorships[29].author.id | https://openalex.org/A5020456783 |
| authorships[29].author.orcid | |
| authorships[29].author.display_name | Damai Dai |
| authorships[29].author_position | middle |
| authorships[29].raw_author_name | Dai, Damai |
| authorships[29].is_corresponding | False |
| authorships[30].author.id | https://openalex.org/A5085346370 |
| authorships[30].author.orcid | https://orcid.org/0000-0003-3631-6253 |
| authorships[30].author.display_name | Deli Chen |
| authorships[30].author_position | middle |
| authorships[30].raw_author_name | Chen, Deli |
| authorships[30].is_corresponding | False |
| authorships[31].author.id | https://openalex.org/A5043755032 |
| authorships[31].author.orcid | |
| authorships[31].author.display_name | Dongjie Ji |
| authorships[31].author_position | middle |
| authorships[31].raw_author_name | Ji, Dongjie |
| authorships[31].is_corresponding | False |
| authorships[32].author.id | https://openalex.org/A5113091614 |
| authorships[32].author.orcid | |
| authorships[32].author.display_name | Erhang Li |
| authorships[32].author_position | middle |
| authorships[32].raw_author_name | Li, Erhang |
| authorships[32].is_corresponding | False |
| authorships[33].author.id | https://openalex.org/A5109672590 |
| authorships[33].author.orcid | |
| authorships[33].author.display_name | Fangyun Lin |
| authorships[33].author_position | middle |
| authorships[33].raw_author_name | Lin, Fangyun |
| authorships[33].is_corresponding | False |
| authorships[34].author.id | https://openalex.org/A5029983760 |
| authorships[34].author.orcid | https://orcid.org/0000-0002-8539-0356 |
| authorships[34].author.display_name | Fengze Dai |
| authorships[34].author_position | middle |
| authorships[34].raw_author_name | Dai, Fucong |
| authorships[34].is_corresponding | False |
| authorships[35].author.id | https://openalex.org/A5000793347 |
| authorships[35].author.orcid | https://orcid.org/0000-0001-5372-4772 |
| authorships[35].author.display_name | Fuli Luo |
| authorships[35].author_position | middle |
| authorships[35].raw_author_name | Luo, Fuli |
| authorships[35].is_corresponding | False |
| authorships[36].author.id | https://openalex.org/A5064813063 |
| authorships[36].author.orcid | https://orcid.org/0000-0002-5930-5453 |
| authorships[36].author.display_name | Guangbo Hao |
| authorships[36].author_position | middle |
| authorships[36].raw_author_name | Hao, Guangbo |
| authorships[36].is_corresponding | False |
| authorships[37].author.id | https://openalex.org/A5085396184 |
| authorships[37].author.orcid | https://orcid.org/0000-0002-0414-1412 |
| authorships[37].author.display_name | Guan-Ting Chen |
| authorships[37].author_position | middle |
| authorships[37].raw_author_name | Chen, Guanting |
| authorships[37].is_corresponding | False |
| authorships[38].author.id | https://openalex.org/A5100773163 |
| authorships[38].author.orcid | https://orcid.org/0000-0002-2286-3056 |
| authorships[38].author.display_name | Guowei Li |
| authorships[38].author_position | middle |
| authorships[38].raw_author_name | Li, Guowei |
| authorships[38].is_corresponding | False |
| authorships[39].author.id | https://openalex.org/A5100330900 |
| authorships[39].author.orcid | https://orcid.org/0000-0002-4714-8809 |
| authorships[39].author.display_name | Hongjun Zhang |
| authorships[39].author_position | middle |
| authorships[39].raw_author_name | Zhang, H. |
| authorships[39].is_corresponding | False |
| authorships[40].author.id | https://openalex.org/A5113417549 |
| authorships[40].author.orcid | |
| authorships[40].author.display_name | Han Bao |
| authorships[40].author_position | middle |
| authorships[40].raw_author_name | Bao, Han |
| authorships[40].is_corresponding | False |
| authorships[41].author.id | https://openalex.org/A5016073978 |
| authorships[41].author.orcid | https://orcid.org/0009-0001-6301-6339 |
| authorships[41].author.display_name | Hanwei Xu |
| authorships[41].author_position | middle |
| authorships[41].raw_author_name | Xu, Hanwei |
| authorships[41].is_corresponding | False |
| authorships[42].author.id | https://openalex.org/A5034335736 |
| authorships[42].author.orcid | https://orcid.org/0000-0002-5033-0230 |
| authorships[42].author.display_name | Haocheng Wang |
| authorships[42].author_position | middle |
| authorships[42].raw_author_name | Wang, Haocheng |
| authorships[42].is_corresponding | False |
| authorships[43].author.id | https://openalex.org/A5066575939 |
| authorships[43].author.orcid | |
| authorships[43].author.display_name | Honghui Ding |
| authorships[43].author_position | middle |
| authorships[43].raw_author_name | Ding, Honghui |
| authorships[43].is_corresponding | False |
| authorships[44].author.id | https://openalex.org/A5111038333 |
| authorships[44].author.orcid | |
| authorships[44].author.display_name | Huajian Xin |
| authorships[44].author_position | middle |
| authorships[44].raw_author_name | Xin, Huajian |
| authorships[44].is_corresponding | False |
| authorships[45].author.id | https://openalex.org/A5102602497 |
| authorships[45].author.orcid | |
| authorships[45].author.display_name | Huazuo Gao |
| authorships[45].author_position | middle |
| authorships[45].raw_author_name | Gao, Huazuo |
| authorships[45].is_corresponding | False |
| authorships[46].author.id | https://openalex.org/A5075888304 |
| authorships[46].author.orcid | https://orcid.org/0000-0002-7610-6413 |
| authorships[46].author.display_name | Hui Qu |
| authorships[46].author_position | middle |
| authorships[46].raw_author_name | Qu, Hui |
| authorships[46].is_corresponding | False |
| authorships[47].author.id | https://openalex.org/A5117779795 |
| authorships[47].author.orcid | https://orcid.org/0000-0002-7031-8562 |
| authorships[47].author.display_name | Hui Li |
| authorships[47].author_position | middle |
| authorships[47].raw_author_name | Li, Hui |
| authorships[47].is_corresponding | False |
| authorships[48].author.id | https://openalex.org/A5082534819 |
| authorships[48].author.orcid | https://orcid.org/0000-0003-2144-1764 |
| authorships[48].author.display_name | Jianzhong Guo |
| authorships[48].author_position | middle |
| authorships[48].raw_author_name | Guo, Jianzhong |
| authorships[48].is_corresponding | False |
| authorships[49].author.id | https://openalex.org/A5104127081 |
| authorships[49].author.orcid | |
| authorships[49].author.display_name | Jiashi Li |
| authorships[49].author_position | middle |
| authorships[49].raw_author_name | Li, Jiashi |
| authorships[49].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2501.12948 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-01-24T00:00:00 |
| display_name | DHGRPO: Domain-Induced, Hierarchical Group Relative Policy Optimization |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10260 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.8632000088691711 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1710 |
| primary_topic.subfield.display_name | Information Systems |
| primary_topic.display_name | Software Engineering Research |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W4310083477, https://openalex.org/W2328553770, https://openalex.org/W2920061524, https://openalex.org/W1977959518, https://openalex.org/W2038908348, https://openalex.org/W2107890255, https://openalex.org/W2106552856 |
| cited_by_count | 411 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 408 |
| counts_by_year[1].year | 2024 |
| counts_by_year[1].cited_by_count | 3 |
| locations_count | 5 |
| best_oa_location.id | pmh:oai:arXiv.org:2501.12948 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2501.12948 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2501.12948 |
| primary_location.id | pmh:oai:arXiv.org:2501.12948 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2501.12948 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2501.12948 |
| publication_date | 2025-08-09 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 8, 49, 62, 72, 86, 102 |
| abstract_inverted_index.We | 84 |
| abstract_inverted_index.in | 23, 120 |
| abstract_inverted_index.is | 7 |
| abstract_inverted_index.of | 12, 26 |
| abstract_inverted_index.on | 57 |
| abstract_inverted_index.to | 44, 80, 94 |
| abstract_inverted_index.(i) | 33 |
| abstract_inverted_index.The | 30 |
| abstract_inverted_index.and | 39, 70, 98, 113 |
| abstract_inverted_index.for | 53, 66, 117 |
| abstract_inverted_index.its | 99 |
| abstract_inverted_index.the | 90, 95 |
| abstract_inverted_index.via | 37 |
| abstract_inverted_index.(ii) | 48 |
| abstract_inverted_index.(iv) | 71 |
| abstract_inverted_index.GRPO | 96 |
| abstract_inverted_index.from | 89 |
| abstract_inverted_index.that | 18 |
| abstract_inverted_index.with | 76, 106 |
| abstract_inverted_index.(DIF) | 52 |
| abstract_inverted_index.(DOP) | 65 |
| abstract_inverted_index.(MAD) | 43 |
| abstract_inverted_index.(iii) | 61 |
| abstract_inverted_index.Group | 3, 13 |
| abstract_inverted_index.based | 56 |
| abstract_inverted_index.exact | 91 |
| abstract_inverted_index.large | 27 |
| abstract_inverted_index.modes | 22 |
| abstract_inverted_index.proxy | 111 |
| abstract_inverted_index.trust | 54 |
| abstract_inverted_index.(GRPO) | 17 |
| abstract_inverted_index.DHGRPO | 0, 100 |
| abstract_inverted_index.Factor | 51 |
| abstract_inverted_index.Policy | 5, 15 |
| abstract_inverted_index.gating | 55 |
| abstract_inverted_index.median | 38, 40 |
| abstract_inverted_index.method | 31 |
| abstract_inverted_index.policy | 92 |
| abstract_inverted_index.reward | 59, 74 |
| abstract_inverted_index.robust | 34 |
| abstract_inverted_index.scale. | 83 |
| abstract_inverted_index.sweeps | 108 |
| abstract_inverted_index.update | 82 |
| abstract_inverted_index.bounded | 73 |
| abstract_inverted_index.failure | 21 |
| abstract_inverted_index.models. | 29 |
| abstract_inverted_index.outlier | 46 |
| abstract_inverted_index.present | 85 |
| abstract_inverted_index.Relative | 4, 14 |
| abstract_inverted_index.absolute | 41 |
| abstract_inverted_index.gradient | 93 |
| abstract_inverted_index.grounded | 10 |
| abstract_inverted_index.language | 28 |
| abstract_inverted_index.learning | 68 |
| abstract_inverted_index.matching | 79 |
| abstract_inverted_index.optional | 77 |
| abstract_inverted_index.preserve | 81 |
| abstract_inverted_index.stepwise | 87 |
| abstract_inverted_index.suppress | 45 |
| abstract_inverted_index.Parameter | 64 |
| abstract_inverted_index.amplifier | 75 |
| abstract_inverted_index.deviation | 42 |
| abstract_inverted_index.emphasis, | 69 |
| abstract_inverted_index.extension | 11 |
| abstract_inverted_index.framework | 105 |
| abstract_inverted_index.long-term | 58 |
| abstract_inverted_index.magnitude | 78 |
| abstract_inverted_index.mitigates | 19 |
| abstract_inverted_index.surrogate | 97 |
| abstract_inverted_index.actionable | 114 |
| abstract_inverted_index.consistent | 110 |
| abstract_inverted_index.controlled | 103 |
| abstract_inverted_index.deployment | 119 |
| abstract_inverted_index.derivation | 88 |
| abstract_inverted_index.influence, | 47 |
| abstract_inverted_index.per-prompt | 35 |
| abstract_inverted_index.preference | 122 |
| abstract_inverted_index.real-world | 118 |
| abstract_inverted_index.simulation | 104 |
| abstract_inverted_index.stability, | 60 |
| abstract_inverted_index.fine-tuning | 25 |
| abstract_inverted_index.group-level | 20 |
| abstract_inverted_index.integrates: | 32 |
| abstract_inverted_index.large-scale | 121 |
| abstract_inverted_index.refinement, | 101 |
| abstract_inverted_index.Hierarchical | 2 |
| abstract_inverted_index.Optimization | 16 |
| abstract_inverted_index.Optimization) | 6 |
| abstract_inverted_index.demonstrating | 109 |
| abstract_inverted_index.improvements, | 112 |
| abstract_inverted_index.normalization | 36 |
| abstract_inverted_index.optimization. | 123 |
| abstract_inverted_index.Domain-Induced | 50 |
| abstract_inverted_index.hyperparameter | 107 |
| abstract_inverted_index.implementation | 115 |
| abstract_inverted_index.mathematically | 9 |
| abstract_inverted_index.(Domain-Induced | 1 |
| abstract_inverted_index.Domain-Optimism | 63 |
| abstract_inverted_index.recommendations | 116 |
| abstract_inverted_index.preference-based | 24 |
| abstract_inverted_index.recency-weighted | 67 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 200 |
| citation_normalized_percentile | |
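
The `abstract_inverted_index.*` rows above store the abstract as a mapping from each token to the word positions where it occurs, so the running text can be recovered by sorting tokens by position. The following is a minimal sketch of that reconstruction, not an official client. The function name `rebuild_abstract` and the `excerpt` dictionary are illustrative assumptions; the excerpt keeps only positions 0 to 11 from the rows above, so later occurrences of tokens such as `Group` and `a` are left out on purpose.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild running text from a token -> positions mapping."""
    # Flatten the mapping into (position, token) pairs.
    positioned = [
        (position, token)
        for token, positions in inverted_index.items()
        for position in positions
    ]
    # Sorting by position restores the original word order.
    positioned.sort()
    return " ".join(token for _, token in positioned)


# Hypothetical excerpt: positions 0-11 only, copied from the table rows.
excerpt = {
    "a": [8],
    "is": [7],
    "Group": [3],
    "DHGRPO": [0],
    "Policy": [5],
    "Relative": [4],
    "grounded": [10],
    "extension": [11],
    "Hierarchical": [2],
    "Optimization)": [6],
    "mathematically": [9],
    "(Domain-Induced": [1],
}

opening = rebuild_abstract(excerpt)
print(opening)
```

Passing the full set of `abstract_inverted_index` rows instead of the excerpt should yield the complete abstract, assuming every position from 0 to 123 is present exactly once.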