Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Posts

portfolio

publications

Multi-Agent Conversational Online Learning for Adaptive LLM Response Identification

Published in arXiv preprint, 2025

We develop a novel bandit algorithm for rapidly identifying user preferences to improve LLM responses.

Recommended citation: Xiangxiang Dai, Yuejin Xie, Maoli Liu, Xuchuang Wang, Zhuohua Li, Huanyu Wang, John C.S. Lui. (2025). "Multi-Agent Conversational Online Learning for Adaptive LLM Response Identification." arXiv preprint arXiv:2501.01849. https://arxiv.org/abs/2501.01849

Towards Evaluating Proactive Risk Awareness of Multimodal Language Models

Published in NeurIPS 2025 D&B Track, 2025

We propose a benchmark for evaluating proactive risk awareness in multimodal language models.

Recommended citation: Youliang Yuan, Wenxiang Jiao, Yuejin Xie, Chihao Shen, Menghan Tian, Wenxuan Wang, Jen-tse Huang, Pinjia He. (2025). "Towards Evaluating Proactive Risk Awareness of Multimodal Language Models." NeurIPS 2025 Datasets and Benchmarks Track. https://arxiv.org/abs/2505.17455

ToolSafety: A Comprehensive Dataset for Enhancing Safety in LLM-Based Agent Tool Invocations

Published in EMNLP 2025, 2025

We introduce ToolSafety, a safety fine-tuning dataset containing 5,668 direct harm samples, 4,311 indirect harm samples, and 4,311 multi-step samples to address safety vulnerabilities in tool-using AI systems.

Recommended citation: Yuejin Xie, Youliang Yuan, Wenxuan Wang, Fan Mo, Jianmin Guo, Pinjia He. (2025). "ToolSafety: A Comprehensive Dataset for Enhancing Safety in LLM-Based Agent Tool Invocations." Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP). https://aclanthology.org/2025.emnlp-main.714/

AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security

Published in arXiv preprint, 2026

We propose AgentDoG, a diagnostic guardrail framework that provides fine-grained and contextual monitoring across agent trajectories, diagnosing root causes of unsafe actions.

Recommended citation: Dongrui Liu, ..., Yuejin Xie, et al. (2026). "AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security." arXiv preprint arXiv:2601.18491. https://arxiv.org/abs/2601.18491

Code2Math: Can Your Code Agent Effectively Evolve Math Problems Through Exploration?

Published in arXiv preprint, 2026

We propose a multi-agent framework that leverages code agents to autonomously evolve existing math problems into more complex variants while validating solvability and increased difficulty.

Recommended citation: Dadi Guo*, Yuejin Xie*, Qingyu Liu, Jiayu Liu, Zhiyuan Fan, Qihan Ren, Shuai Shao, Tianyi Zhou, Dongrui Liu, Yi R. Fung. (2026). "Code2Math: Can Your Code Agent Effectively Evolve Math Problems Through Exploration?" arXiv preprint arXiv:2603.03202. https://arxiv.org/abs/2603.03202

Yuejin Xie

Pages

Page Not Found

About me

Archive Layout with Content

Posts by Category

Posts by Collection

CV

Markdown

Page not in menu

Page Archive

Portfolio

Publications

Sitemap

Posts by Tags

Talk map

Talks and presentations

Teaching

Terms and Privacy Policy

Blog posts

Jupyter notebook markdown generator

Posts

portfolio

publications

talks

teaching