Image recognition is one of the classic tasks in computer vision. Built on big data and advanced hardware, deep learning has achieved high recognition accuracy; however, it often performs poorly when only a small number of labeled samples is available. Few-shot learning has therefore become a key technology for addressing this problem. Its learning paradigm differs from that of conventional deep learning: it aims to learn a universal representation from multiple training categories that can be used for recognition in new categories. Each few-shot training instance consists of a small group of labeled images (the support set) and an unlabeled sample to be recognized, and the goal is for the model to recognize new categories well. To achieve this, the model needs to extract representative and highly generalizable features that enable the correct recognition of new-category samples. To address the problem that a small sample space cannot adequately describe the semantic features of a dataset, we propose the attention mechanism and earth mover's distance for few-shot learning (AMEMD-FSL) method. First, we fuse an attention mechanism (AM) into the deep network to help the model extract semantically richer features. Then we use the earth mover's distance (EMD) metric to compute the distance between samples, enabling better classification. Finally, we combine a deep residual network with AMEMD to perform few-shot learning. We validate our algorithm on the Caltech-UCSD Birds-200-2011 dataset and on the public few-shot benchmark mini-ImageNet, which was introduced by the DeepMind team. The experimental results demonstrate that the proposed method is an effective end-to-end approach to few-shot image classification.
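As a rough illustration of the two components the abstract names, the following PyTorch sketch pairs a squeeze-and-excitation style channel-attention module with an approximation of the earth mover's distance between local feature sets. The module structure, the hyper-parameters (reduction factor, regularization strength, iteration count), and the use of entropic Sinkhorn iterations in place of an exact EMD solver are illustrative assumptions, not the authors' AMEMD-FSL implementation.

```python
# Minimal sketch: attention-weighted local features + EMD-style matching cost.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention over a CNN feature map."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map, e.g. from a ResNet backbone.
        weights = self.fc(x.mean(dim=(2, 3)))      # (B, C) channel weights
        return x * weights[:, :, None, None]       # re-weighted feature map


def emd_distance(support: torch.Tensor, query: torch.Tensor,
                 iters: int = 50, eps: float = 0.05) -> torch.Tensor:
    """Approximate EMD between two sets of local descriptors.

    support: (N, C), query: (M, C) local features (flattened H*W positions).
    Entropic Sinkhorn iterations stand in for an exact optimal-transport solver.
    """
    cost = 1.0 - F.normalize(support, dim=1) @ F.normalize(query, dim=1).t()  # (N, M) cosine cost
    mu = torch.full((support.size(0),), 1.0 / support.size(0))   # uniform source weights
    nu = torch.full((query.size(0),), 1.0 / query.size(0))       # uniform target weights
    K = torch.exp(-cost / eps)
    u = torch.ones_like(mu)
    for _ in range(iters):                                        # Sinkhorn updates
        v = nu / (K.t() @ u)
        u = mu / (K @ v)
    transport = u[:, None] * K * v[None, :]                       # transport plan
    return (transport * cost).sum()                               # matching cost


if __name__ == "__main__":
    attn = ChannelAttention(channels=64)
    feat_s = attn(torch.randn(1, 64, 5, 5)).flatten(2).squeeze(0).t()  # (25, 64) support descriptors
    feat_q = attn(torch.randn(1, 64, 5, 5)).flatten(2).squeeze(0).t()  # (25, 64) query descriptors
    print(emd_distance(feat_s, feat_q))  # smaller cost = more similar images
```

In an episodic evaluation, a query image would be assigned to the support class whose features yield the smallest matching cost.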
Keywords: Machine learning, Education and training, Deep learning, Data modeling, Feature extraction, Image classification, Distance measurement