News


7 April 2026
Congratulations to my interns Zhezheng Hao and Hong Wang! Our papers ReCreate: Reasoning and Creating Domain Agents Driven by Experience and Rethinking Entropy Interventions in RLVR: An Entropy Change Perspective have been accepted to the ACL'26 main conference.

7 April 2026
Congratulations to my interns Yuyan Zhou and Jianqing Zhang! Three papers have been accepted to ACL'26 Findings.

26 January 2026
Congratulations to my interns Hong Wang and Zhezheng Hao! Our paper Scheduling Your LLM Reinforcement Learning with Reasoning Trees has been accepted to ICLR'26.

8 November 2025
Congratulations to my intern Jianqing Zhang! Our paper AP2O-Coder: Adaptively Progressive Preference Optimization for Reducing Compilation and Runtime Errors in LLM-Generated Code has been accepted to AAAI'26.

1 August 2025
Congratulations to my intern Jinke Li! Our paper UniSVG: A Unified Dataset for Vector Graphic Understanding and Generation with Multimodal Large Language Models has been accepted to ACMMM'25.

2 June 2025
Congratulations to my intern Tianyu Guo! Our paper EFIM: Efficient Serving of LLMs for Infilling Tasks with Improved KV Cache Reuse has been accepted to EURO-PAR'25.

22 August 2023
I began working at Tencent as a senior researcher.

15 July 2022
I started my first job at the International Digital Economy Academy (IDEA) as an algorithm engineer.

15 April 2021
One full paper, AutoDebias: Learning to Debias for Recommendation, has been accepted to SIGIR'21.

16 January 2021
One full paper, On the Equivalence of Decoupled Graph Convolution Network and Label Propagation, has been accepted to WWW'21.

Hande Dong 

LLM R&D Lead, CodeBuddy/WorkBuddy

Tencent, Shenzhen, China

Email: donghd66 AT gmail.com
Google Scholar   • GitHub

I am the LLM R&D Lead of Tencent CodeBuddy/WorkBuddy, where I lead the model research and development team. My expertise centers on large language models, spanning pre-training, post-training, reinforcement learning, and LLM agents. Currently, I focus on leveraging the vast experience data generated by widely deployed agent applications to enhance model capabilities.

Work Experience

LLM R&D Lead, Tencent, 2026.04-present
Working Area: large language model, code intelligence, AI agent, etc.
Responsibilities: leading the model research and development team of CodeBuddy/WorkBuddy.
Product: CodeBuddy, WorkBuddy.
Senior Researcher, Tencent, 2023.08-2026.04
Working Area: large language model, code intelligence, RAG, code agent, etc.
Responsibilities: model research and development of CodeBuddy.
Product: CodeBuddy.
Algorithm Engineer, International Digital Economy Academy, 2022.07-2023.08
Working Area: code understanding and generation, pretrained language model, large language model.

Education

University of Science and Technology of China (USTC)
Master in School of Information Science and Technology, 2019.09 - 2022.06
Advisor: Prof. Xiangnan He
University of Science and Technology of China (USTC)
Bachelor in School of Physical Sciences, 2015.08 - 2019.06
Chung-Yao Chao Talent Program in Applied Physics

Selected Publications


Rethinking Entropy Interventions in RLVR: An Entropy Change Perspective
Zhezheng Hao, Hong Wang, Haoyang Liu, Jian Luo, Jiarui Yu, Hande Dong, Qiang Lin, Can Wang, Jiawei Chen
ACL 2026   • arXiv   • Codes   • Corresponding author
ReCreate: Reasoning and Creating Domain Agents Driven by Experience
Zhezheng Hao, Hong Wang, Jian Luo, Jianqing Zhang, Yuyan Zhou, Qiang Lin, Can Wang, Hande Dong, Jiawei Chen
ACL 2026   • arXiv   • Codes   • Corresponding author
LEPO: Latent Reasoning Policy Optimization for Large Language Models
Yuyan Zhou, Jiarui Yu, Hande Dong, Zhezheng Hao, Hong Wang, Jianqing Zhang, Qiang Lin
ACL 2026 Findings   • Corresponding author
Scheduling Your LLM Reinforcement Learning with Reasoning Trees
Hong Wang, Zhezheng Hao, Jian Luo, Chenxing Wei, Yao Shu, Lei Liu, Qiang Lin, Hande Dong, Jiawei Chen
ICLR 2026   • arXiv   • Codes   • Corresponding author
AP2O: Correcting LLM-Generated Code Errors Type by Type Like Humans via Adaptive Progressive Preference Optimization
Jianqing Zhang, Wei Xia, Hande Dong, Qiang Lin, Jian Cao
AAAI 2026   • arXiv   • Codes
UniSVG: A Unified Dataset for Vector Graphic Understanding and Generation with Multimodal Large Language Models
Jinke Li, Jiarui Yu, Chenxing Wei, Hande Dong, Qiang Lin, Liangjing Yang, Zhicai Wang, Yanbin Hao
ACMMM 2025 dataset track   • arXiv   • Codes   • Corresponding author
EFIM: Efficient Serving of LLMs for Infilling Tasks with Improved KV Cache Reuse
Tianyu Guo, Hande Dong, Yichong Leng, Feng Liu, Cheater Lin, Nong Xiao, Xianwei Zhang
EURO-PAR 2025   • arXiv   • Codes   • Corresponding author
Improving Code Search with Hard Negative Sampling Based on Fine-tuning
Hande Dong, Jiayi Lin, Yanlin Wang, Yichong Leng, Jiawei Chen, Yutao Xie
APSEC 2024   • arXiv   • Codes
Survey of Code Search based on Deep Learning
Yutao Xie, Jiayi Lin, Hande Dong, Lei Zhang, Zhonghai Wu
ACM Transactions on Software Engineering and Methodology   • arXiv
Bias and Debias in Recommender System: A Survey and Future Directions
Jiawei Chen, Hande Dong, Xiang Wang, Fuli Feng, Meng Wang, Xiangnan He
ACM Transactions on Information Systems   • arXiv
AutoDebias: Learning to Debias for Recommendation
Jiawei Chen, Hande Dong, Yang Qiu, Xiangnan He, Xin Xin, Liang Chen, Guli Lin, Keping Yang
SIGIR 2021   • arXiv   • Codes   • Slides   • Co-first author
On the Equivalence of Decoupled Graph Convolution Network and Label Propagation
Hande Dong, Jiawei Chen, Fuli Feng, Xiangnan He, Shuxian Bi, Zhaolin Ding, Peng Cui
WWW 2021   • arXiv   • Codes   • Slides

Selected Honors & Awards

GM Lightning Award, 2025.12, Tencent, 2%
Outstanding Mentor for New Employees, 2024, Tencent
Outstanding Contributor, 2024.12, Tencent, 10%
Outstanding Contributor, 2024.06, Tencent, 10%
Outstanding Graduate, 2022.06, University of Science and Technology of China, 10%
Outstanding Graduate in Character and Academics, 2019.06, Anhui Province, 2%
Outstanding Graduate, 2019.06, University of Science and Technology of China, 10%
Outstanding Student Leader, 2018.05, University of Science and Technology of China