ATBench: A Diverse and Realistic Agent Trajectory Benchmark for Safety Evaluation and Diagnosis

Published as an arXiv preprint, 2026

Co-first Author

ATBench is a trajectory-level benchmark for evaluating agent safety in realistic long-horizon interactions. It organizes agentic risk by source, failure mode, and real-world harm, enabling taxonomy-stratified diagnosis of safety failures across frontier models, open-source models, and guard systems.

[Dataset]

Recommended citation: Yu Li*, Haoyu Luo*, Yuejin Xie*, Yuqian Fu, Zhonghao Yang, Shuai Shao, Qihan Ren, Wanying Qu, Yanwei Fu, Yujiu Yang, Jing Shao, Xia Hu, Dongrui Liu. (2026). "ATBench: A Diverse and Realistic Agent Trajectory Benchmark for Safety Evaluation and Diagnosis." arXiv preprint arXiv:2604.02022. https://arxiv.org/abs/2604.02022