Direct Multi-Token Decoding, by Xuan Luo, Weizhi Wang, Xifeng Yan
arXiv:2510.11958, 2025 [arxiv]
Train a Unified Multimodal Data Quality Classifier with Synthetic Data, by Weizhi Wang, Rongmei Lin, Shiyang Li, Colin Lockard, Ritesh Sarkhel, Sanket Lokegaonkar, Jingbo Shang, Xifeng Yan, Nasser Zalmout, Xian Li
EMNLP'25 (Findings of EMNLP 2025) [pdf]
Adaptive Layer-skipping in Pre-trained LLMs, by X. Luo, W. Wang, X. Yan
COLM'25 (Conference on Language Modeling), arXiv:2503.23798v2, 2025 [arxiv]
Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources, by Weizhi Wang, Yu Tian, Linjie Yang, Heng Wang, Xifeng Yan
COLM'25 (Conference on Language Modeling), arXiv:2504.00595v2, 2025 [arxiv]
Language Models Augmented with Decoupled Memory, by W. Wang, L. Dong, H. Cheng, X. Liu, X. Yan, J. Gao, F. Wei
NeurIPS'23 (The Thirty-seventh Annual Conference on Neural Information Processing Systems), 2023 [arxiv]
Visually-Augmented Language Modeling, by W. Wang, L. Dong, H. Cheng, H. Song, X. Liu, X. Yan, J. Gao, F. Wei
ICLR'23 (Proceedings of the International Conference on Learning Representations) [pdf]
Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting, by S. Li, X. Jin, Y. Xuan, X. Zhou, W. Chen, Y.-X. Wang, X. Yan
NeurIPS'19 (The Thirty-third Annual Conference on Neural Information Processing Systems) [pdf]