Transparency in Agentic AI: A Survey of Interpretability, Explainability, and Governance
DOI: https://doi.org/10.31224/6451

Abstract
Agentic AI systems built on large language models plan, use tools, and maintain memory over multiple steps; their risks and responsibilities therefore depend on an execution trajectory rather than a single output. Despite progress, work on transparency for such systems remains scattered: most explainability and interpretability research still targets static or single-step model outputs, while surveys of Agentic AI emphasize planning, tools, and memory, giving limited attention to transparency and oversight. The literature insufficiently addresses what should be made transparent and recorded during an agent’s lifecycle, and how those records can be verified. Addressing this need, this survey offers a transparency-focused analysis that connects interpretability, explainability, and governance for Agentic AI systems from design to deployment. It synthesizes methods for agent artifacts, including plans, tool interactions, memory events, and coordination signals, and relates them to assurance needs such as faithfulness, auditability, compliance, robustness, and equity. The paper consolidates evaluation practices and highlights gaps, especially in trajectory-level accountability, tool-mediated provenance, and multi-agent coordination transparency. It proposes the Minimal Explanation Packet, a standardized outcome artifact that bundles key lifecycle evidence into an audit-ready record. The survey serves as a reference for researchers and practitioners to compare approaches consistently, design evaluations, and report transparency evidence.
License
Copyright (c) 2026 Shaina Raza, Ahmed Radwan, Sindhuja Chaduvula, Mahshid Alinoori, Christos Emmanouilidis

This work is licensed under a Creative Commons Attribution 4.0 International License.