Preprint / Version 1

Agentic AI for Computer Vision: A Review

Authors

  • Anwaar Ulhaq, CQ University, Australia

DOI:

https://doi.org/10.31224/5832

Keywords:

Agentic AI, Vision agents

Abstract

Conventional computer vision models operate as passive systems that produce a single output and stop, whereas agentic AI for computer vision represents a shift toward autonomous visual decision-making. In such systems, visual input is used to plan actions, decide next steps, and refine results through feedback, which allows the model to select or design the optimal processing steps and improve its output over time. This review maps current research on agentic approaches in computer vision and examines how autonomy is implemented in practice. The search covered standard databases, including IEEE Xplore, ACM Digital Library, Scopus, Web of Science, arXiv, and OpenReview. Only studies whose final task was a computer vision outcome were included; papers that used agents for non-visual tasks were excluded. Screening, selection, and data charting followed the PRISMA extension for Scoping Reviews (PRISMA-ScR) guidelines. A scoping review was chosen over a systematic review because the field is new and preprints dominate the evidence base. Each included study was examined for its autonomy design, visual modality, datasets, and evaluation method. The results indicate that agentic approaches are promising for tasks that involve multiple steps, self-correction, or interaction with an environment. This review clarifies the emerging landscape of agentic AI for computer vision and identifies research gaps, including the need for stronger definitions of autonomy, shared testing environments, and reproducible evaluation methods.

Posted

2025-11-20