arXiv (Cornell University)
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent
November 2024 • Xingwu Sun, Yanfeng Chen, Yiqing Huang, Ruobing Xie, Jiaqi Zhu, Kai Zhang, Shuaipeng Li, Zhen Yang, J. L. Han, Shu Xiao-bo, Jiahao Bu, Z C Chen, X. P…
In this paper, we introduce Hunyuan-Large, which is currently the largest open-source Transformer-based Mixture of Experts (MoE) model, with a total of 389 billion parameters and 52 billion activated parameters, supporting a context length of up to 256K tokens. We conduct a thorough evaluation of Hunyuan-Large's superior performance across various benchmarks, including language understanding and generation, logical reasoning, mathematical problem-solving, coding, long-context, and aggregated tasks, where it outperforms LLama3.1-70…
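The abstract's gap between total and activated parameters comes from MoE routing: every expert's weights count toward the 389 billion total, but each token only passes through the experts its router selects (plus any always-on shared experts), so the per-token activated count is far smaller. Below is a minimal sketch of that accounting; the layer sizes, expert count, top-k value, and the moe_param_counts helper are illustrative assumptions, not Hunyuan-Large's actual configuration.

```python
# Minimal sketch (illustrative numbers only): why an MoE model's "activated"
# parameter count is much smaller than its total parameter count.

def ffn_params(d_model: int, d_ff: int) -> int:
    """Parameters of one feed-forward expert (two weight matrices, biases omitted)."""
    return 2 * d_model * d_ff

def moe_param_counts(d_model: int, d_ff: int, n_experts: int, top_k: int,
                     n_shared: int = 0) -> tuple[int, int]:
    """Return (total, activated) parameters for a single MoE feed-forward block.

    Every expert contributes to the total, but each token only runs through
    the shared experts plus its top_k routed experts.
    """
    per_expert = ffn_params(d_model, d_ff)
    total = (n_experts + n_shared) * per_expert
    activated = (top_k + n_shared) * per_expert
    return total, activated

if __name__ == "__main__":
    # Hypothetical sizes chosen only to show the ratio, not the paper's real ones.
    total, activated = moe_param_counts(d_model=4096, d_ff=16384,
                                        n_experts=16, top_k=1, n_shared=1)
    print(f"total MoE params per layer:     {total:,}")
    print(f"activated MoE params per token: {activated:,}")
```

With these placeholder settings the activated count is only a small fraction of the total for the MoE block, which is the same qualitative effect behind the abstract's 52 billion activated versus 389 billion total parameters.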
Computer Science