Human action recognition has been an active topic in computer vision, addressed by both handcrafted and deep learning approaches. In the handcrafted approach, the extracted features are encoded to reduce their size. Among the state-of-the-art approaches is encoding these visual features using a Gaussian mixture model. However, the size of the codebook is an issue in terms of computational complexity, especially for large-scale data, which requires encoding with a large codebook. In this paper, we introduce the use of different optimizers to reduce the codebook size while boosting its accuracy. To illustrate the performance, we first use improved dense trajectories (IDT) to extract the handcrafted features. This is followed by encoding the descriptors with a Fisher kernel-based codebook built on the Gaussian mixture model. Next, a support vector machine is used to classify the action categories. We then use and compare five different stochastic gradient descent optimization techniques to modify the number of Gaussian components. In this manner, we are able to select the discriminative foreground features (as represented by the final number of Gaussian components) and omit the background features. Finally, to show the performance improvement of the proposed method, we apply this technique to two datasets, UCF101 and HMDB51.
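To make the role of the codebook size concrete, the following is a minimal sketch of Fisher vector encoding over a diagonal-covariance Gaussian mixture model. It follows the commonly used improved Fisher vector formulation (gradients with respect to means and standard deviations, then power and L2 normalization) and is not necessarily the exact encoding used in this paper. The function name `fisher_vector` and its parameters are hypothetical, and the GMM parameters are assumed to have been fitted beforehand.

```python
import numpy as np

def fisher_vector(X, weights, means, variances):
    """Encode N local descriptors as one Fisher vector (illustrative sketch).

    X:         (N, D) local descriptors, e.g. trajectory features
    weights:   (K,)   GMM mixing weights (assumed already fitted)
    means:     (K, D) GMM component means
    variances: (K, D) GMM diagonal variances
    Returns a vector of length 2 * K * D.
    """
    X = np.asarray(X, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    N = X.shape[0]
    sigma = np.sqrt(variances)

    # Standardized differences between each descriptor and each component.
    diff = (X[:, None, :] - means[None, :, :]) / sigma[None, :, :]  # (N, K, D)

    # Log-density of each descriptor under each diagonal Gaussian.
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)      # (K,)
    log_prob = -0.5 * (np.sum(diff ** 2, axis=2) + log_norm[None, :])

    # Soft assignments (posteriors), computed stably in log space.
    log_post = log_prob + np.log(weights)[None, :]
    log_post -= log_post.max(axis=1, keepdims=True)
    gamma = np.exp(log_post)
    gamma /= gamma.sum(axis=1, keepdims=True)                       # (N, K)

    # Gradients with respect to means and standard deviations.
    g_mu = np.einsum("nk,nkd->kd", gamma, diff)
    g_mu /= N * np.sqrt(weights)[:, None]
    g_sigma = np.einsum("nk,nkd->kd", gamma, diff ** 2 - 1.0)
    g_sigma /= N * np.sqrt(2.0 * weights)[:, None]

    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])

    # Power normalization followed by L2 normalization.
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv
```

Under this formulation the encoded vector has length 2KD for K Gaussian components and D-dimensional descriptors, so halving the number of components halves the input size seen by the classifier. This is the cost that reducing the codebook size is meant to address.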