Abstract
The escalating complexity of sixth-generation (6G) networks demands unprecedented levels of autonomy beyond the capabilities of traditional optimization-based and current AI-based resource management approaches. While agentic AI has emerged as a promising paradigm for autonomous RAN, current frameworks provide sophisticated reasoning capabilities but lack mechanisms for empirical validation and self-improvement. This article identifies simulation-in-the-loop validation as a critical enabler for truly autonomous networks, where AI agents can empirically verify decisions and learn from outcomes. We present the first reflection-driven self-optimization framework that integrates agentic AI with high-fidelity network simulation in a closed-loop architecture. Our system orchestrates four specialized agents, including scenario, solver, simulation, and reflector agents, working in concert to transform agentic AI into a self-correcting system capable of escaping local optima, recognizing implic