A Multi-Agent Conversational Bandit Approach to Online Evaluation and Selection of User-Aligned LLM Responses

Published in AAAI 2026

Recommended citation: Xiangxiang Dai, Yuejin Xie, Maoli Liu, Xuchuang Wang, Zhuohua Li, Huanyu Wang, John C.S. Lui. (2026). "A Multi-Agent Conversational Bandit Approach to Online Evaluation and Selection of User-Aligned LLM Responses." Proceedings of the AAAI Conference on Artificial Intelligence, 40(44), 37323-37331. https://doi.org/10.1609/aaai.v40i44.41064
