Integrating Low-Level Image Processing with Generative AI and Intelligent Agents
The rapid advancement of Generative AI and foundation models is projected to reshape intelligent media processing in the near future. While these technologies give low-level image processing superior perceptual quality, addressing challenges such as universality, semantic consistency, and automatic human-computer interaction (HCI) remains critical. The field has increasingly turned to generative foundation models to tackle these challenges, exploiting new architectures, e.g., diffusion models, and the reasoning capabilities of, e.g., Large Language Models (LLMs) and Multi-modal LLMs. This tutorial introduces recent progress in Generative AI and Intelligent Agents within visual systems, covering definitions and methodologies, followed by applications in image compression, restoration, and quality assessment. It examines recent research directions, e.g., diffusion-based enhancement, agent-driven compression, and interactive quality evaluation, and closes with a discussion of future research opportunities, focusing on universal frameworks and multi-agent cooperation.