arXiv (Cornell University)
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent
November 2024 • Xingwu Sun, Yanfeng Chen, Yiqing Huang, Ruobing Xie, Jiaqi Zhu, Kai Zhang, Shuaipeng Li, Zhen Yang, J. L. Han, Shu Xiao-bo, Jiahao Bu, Z C Chen, X. P…
In this paper, we introduce Hunyuan-Large, which is currently the largest open-source Transformer-based Mixture of Experts (MoE) model, with a total of 389 billion parameters and 52 billion activated parameters, supporting a context length of up to 256K tokens. We conduct a thorough evaluation of Hunyuan-Large's superior performance across various benchmarks, including language understanding and generation, logical reasoning, mathematical problem-solving, coding, long-context, and aggregated tasks, where it outperforms LLama3.1-70…
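The abstract's gap between total and activated parameters comes from MoE routing: every expert's weights count toward the 389 billion total, but each token only passes through the experts its router selects (plus any always-on shared experts), so the per-token activated count is far smaller. Below is a minimal sketch of that accounting; the layer sizes, expert count, top-k value, and the moe_param_counts helper are illustrative assumptions, not Hunyuan-Large's actual configuration.

```python
# Minimal sketch (illustrative numbers only): why an MoE model's "activated"
# parameter count is much smaller than its total parameter count.

def ffn_params(d_model: int, d_ff: int) -> int:
    """Parameters of one feed-forward expert (two weight matrices, biases omitted)."""
    return 2 * d_model * d_ff

def moe_param_counts(d_model: int, d_ff: int, n_experts: int, top_k: int,
                     n_shared: int = 0) -> tuple[int, int]:
    """Return (total, activated) parameters for a single MoE feed-forward block.

    Every expert contributes to the total, but each token only runs through
    the shared experts plus its top_k routed experts.
    """
    per_expert = ffn_params(d_model, d_ff)
    total = (n_experts + n_shared) * per_expert
    activated = (top_k + n_shared) * per_expert
    return total, activated

if __name__ == "__main__":
    # Hypothetical sizes chosen only to show the ratio, not the paper's real ones.
    total, activated = moe_param_counts(d_model=4096, d_ff=16384,
                                        n_experts=16, top_k=1, n_shared=1)
    print(f"total MoE params per layer:     {total:,}")
    print(f"activated MoE params per token: {activated:,}")
```

With these placeholder settings the activated count is only a small fraction of the total for the MoE block, which is the same qualitative effect behind the abstract's 52 billion activated versus 389 billion total parameters.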
Computer Science