Search Results for author: Haoran Que

Found 3 papers, 1 papers with code

D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models

no code implementations3 Jun 2024 Haoran Que, Jiaheng Liu, Ge Zhang, Chenchen Zhang, Xingwei Qu, Yinghao Ma, Feiyu Duan, Zhiqi Bai, Jiakai Wang, Yuanxing Zhang, Xu Tan, Jie Fu, Wenbo Su, Jiamang Wang, Lin Qu, Bo Zheng

To address the limitations of existing methods, inspired by the Scaling Law for performance prediction, we propose to investigate the Scaling Law of the Domain-specific Continual Pre-Training (D-CPT Law) to decide the optimal mixture ratio with acceptable training costs for LLMs of different sizes.

Math

E^2-LLM: Efficient and Extreme Length Extension of Large Language Models

no code implementations13 Jan 2024 Jiaheng Liu, Zhiqi Bai, Yuanxing Zhang, Chenchen Zhang, Yu Zhang, Ge Zhang, Jiakai Wang, Haoran Que, Yukang Chen, Wenbo Su, Tiezheng Ge, Jie Fu, Wenhu Chen, Bo Zheng

Typically, training LLMs with long context sizes is computationally expensive, requiring extensive training hours and GPU resources.

4k Position

RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models

1 code implementation1 Oct 2023 Zekun Moore Wang, Zhongyuan Peng, Haoran Que, Jiaheng Liu, Wangchunshu Zhou, Yuhan Wu, Hongcheng Guo, Ruitong Gan, Zehao Ni, Jian Yang, Man Zhang, Zhaoxiang Zhang, Wanli Ouyang, Ke Xu, Stephen W. Huang, Jie Fu, Junran Peng

The advent of Large Language Models (LLMs) has paved the way for complex tasks such as role-playing, which enhances user interactions by enabling models to imitate various characters.

Benchmarking

Cannot find the paper you are looking for? You can Submit a new open access paper.