Multi-modal scene understanding theory and application
Object detection is a fundamental research topic in computer vision and has made remarkable progress in recent years. It aims to simultaneously locate objects of interest and recognize their categories. However, natural images usually contain objects of various categories and sizes, often with semantic confusion, which makes it challenging to locate them accurately. This lecture will discuss recent research directions and hot topics in object detection. First, a novel detection method is introduced that employs a set of growing cross lines as the object representation. An object is flexibly represented as cross lines in different combinations, which helps to enhance the discriminability of object features for classification and to find object boundaries accurately. Then, by treating object regression as a classification problem, a new strategy is introduced to predict accurate object locations; it applies different penalties to different samples and avoids the gradient explosion problem caused by samples with large errors. Furthermore, this talk will discuss how to tackle the class label noise problem by employing different losses to describe the different roles of noisy class labels and thereby enhance learning. Finally, the challenges and opportunities of object detection in the future will be summarized.