A Gibbs-sampler-based topic model for image annotation is presented that takes into account the interaction between visual geometric context and the related topics. Most existing topic models for scene annotation rely on segmentation-based algorithms. However, topic models that use segmentation alone can produce erroneous results when annotating real-life scene pictures. Our algorithm therefore uses peaks of the image surface instead of segmented regions. Existing approaches use the SIFT algorithm and treat the peaks as round blob features. In this paper, the peaks are treated as anisotropic blob features, which model low-level visual elements more precisely. To make better use of visual features, our model considers not only the visual codeword but also the influence of visual properties, such as orientation, width, length and color, on topic formation. The basic idea rests on the assumption that different topics produce distinct visual appearances, and that differences in visual appearance help to distinguish topics. During the learning stage, each topic is associated with a set of distributions over visual properties, which describe the appearance of the topic. By considering more geometric properties, the model reduces topic uncertainty and learns the images better. Tested on the Corel5K, SAIAPR-TC12 and Espgame100k datasets, our method performs moderately better than some state-of-the-art methods.