Code2Math: Can Your Code Agent Effectively Evolve Math Problems Through Exploration?

Published as an arXiv preprint, 2026

Co-first Author

As large language models advance their mathematical capabilities toward the International Mathematical Olympiad (IMO) level, the scarcity of challenging, high-quality problems for training and evaluation has become a significant bottleneck. We investigate whether code agents can autonomously evolve existing math problems into more complex variations, and we introduce a multi-agent framework that performs this evolution while validating that the generated problems are solvable and harder than the originals. Our experiments demonstrate that, given sufficient test-time exploration, code agents can synthesize new, solvable problems that are structurally distinct from and more challenging than the originals.

[Code]

Recommended citation: Dadi Guo*, Yuejin Xie*, Qingyu Liu, Jiayu Liu, Zhiyuan Fan, Qihan Ren, Shuai Shao, Tianyi Zhou, Dongrui Liu, Yi R. Fung. (2026). "Code2Math: Can Your Code Agent Effectively Evolve Math Problems Through Exploration?" arXiv preprint arXiv:2603.03202. https://arxiv.org/abs/2603.03202