Multi-modal scene understanding theory and application

Object detection is a fundamental research topic in computer vision and has made remarkable progress in recent years. It aims to simultaneously locate objects of interest and recognize their categories. However, natural images usually contain objects of various categories and sizes, often with semantic confusion among them, which makes it challenging to locate these objects accurately. This lecture will discuss recent research directions and hot topics in object detection. First, a novel detection method is introduced that employs a set of growing cross lines as the object representation. An object is flexibly represented by cross lines in different combinations, which helps enhance the discriminability of object features for classification and locate object boundaries accurately. Then, by treating object regression as a classification problem, a new strategy is introduced to predict accurate object locations; it applies different penalties to different samples and avoids the gradient explosion problem caused by samples with large errors. Furthermore, this talk will discuss how to address the class label noise problem by employing different losses to describe the different roles of noisy class labels and thereby enhance learning. Finally, the future challenges and opportunities of object detection will be summarized.

Hongliang Li
