Search Results for author: Lingjun Li

Found 2 papers, 2 papers with code

Yuan 2.0-M32: Mixture of Experts with Attention Router

1 code implementation • 28 May 2024 • Shaohua Wu, Jiangang Luo, Xi Chen, Lingjun Li, Xudong Zhao, Tong Yu, Chao Wang, Yue Wang, Fei Wang, Weixu Qiao, Houbo He, Zeru Zhang, Zeyu Sun, Junxiong Mao, Chong Shen

Yuan 2.0-M32, with a similar base architecture as Yuan 2.0-2B, uses a mixture-of-experts architecture with 32 experts, of which 2 are active.
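As a rough illustration of the routing scheme the abstract describes (32 experts, 2 active per token), the sketch below shows a generic top-2 mixture-of-experts layer in PyTorch. Note this uses a plain linear gate for routing; it is not the paper's Attention Router, whose exact formulation is given in the paper, and all layer names and sizes here are assumptions for illustration only.

```python
# Minimal sketch of top-2 expert routing, assuming a simple linear gate
# (NOT the paper's Attention Router) and illustrative layer sizes.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, num_experts=32, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts, bias=False)  # hypothetical gate
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (batch, seq, d_model)
        scores = self.gate(x)                                # (batch, seq, num_experts)
        topk_scores, topk_idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(topk_scores, dim=-1)             # renormalize over the 2 active experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            idx = topk_idx[..., slot]                        # expert chosen for this slot
            w = weights[..., slot].unsqueeze(-1)             # (batch, seq, 1)
            for e, expert in enumerate(self.experts):
                mask = (idx == e)                            # tokens routed to expert e
                if mask.any():
                    out[mask] += w[mask] * expert(x[mask])
        return out

# Example: route a small batch of token embeddings through the layer.
moe = Top2MoE()
tokens = torch.randn(2, 16, 512)
print(moe(tokens).shape)  # torch.Size([2, 16, 512])
```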

YUAN 2.0: A Large Language Model with Localized Filtering-based Attention

2 code implementations • 27 Nov 2023 • Shaohua Wu, Xudong Zhao, Shenling Wang, Jiangang Luo, Lingjun Li, Xi Chen, Bing Zhao, Wei Wang, Tong Yu, Rongguo Zhang, Jiahua Zhang, Chao Wang

In this work, we develop and release Yuan 2.0, a series of large language models with parameters ranging from 2.1 billion to 102.6 billion.

Code Generation, Language Modelling, +2
