Zhi-You Hong
RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning
Reinforcement learning (RL) has recently emerged as a compelling approach for enhancing the reasoning capabilities of large language models (LLMs), where an LLM generator serves as a policy guided by a verifier (reward model). However, cur…
Efficient multiple unmanned aerial vehicle-assisted data collection strategy in power infrastructure construction
Efficient data collection and sharing play a crucial role in power infrastructure construction. However, in an outdoor remote area, the data collection efficiency is reduced because of the sparse distribution of base stations (BSs). Unmann…