Advances, Systems and Applications

Table 3 Accuracy of the proposed model compared with existing approaches to sports action recognition

From: Recognizing sports activities from video frames using deformable convolution and adaptive multiscale features

| Reference | Accuracy (UCF sport) | Accuracy (YouTube) | Accuracy (UCF50) | Architecture |
|---|---|---|---|---|
| Ullah et al. [13] |  | 96.21% | 96.40% | Optimized deep autoencoder and CNN |
| Liu et al. [61] | - | 89.7% | 93.20% | Hierarchical clustering multi-task learning |
| Sadanand et al. [62] | 95.00% | - | 57.9% | High-level representation |
| Tu et al. [63] | 97.53% | - | - | Multi-stream CNN |
| Afza et al. [64] | 99.30% | 94.50% | - | Features fusion and weighted entropy-variances |
| Muhammad et al. [65] | 99.10% | 98.30% | - | Attention based LSTM network with dilated CNN |
| Meng et al. [66] | 93.20% | 89.70% | - | Spatial-temporal convolutional neural network and LSTM |
| Gammulle et al. [67] | 92.20% | 89.20% | - | Two stream LSTM |
| Ijjina et al. [68] | 98.90% | 94.60% | - | Hybrid deep neural network |
| Zhou et al. [69] | 98.75% | 97.60% | - | Density clustering and context-guided Bi-LSTM |
| Xiong et al. [70] | - | - | 96.71% | Two-Stream 3D Dilated Neural Network |
| Zhang et al. [71] | - | - | 60.40% | LSTM and fully-connected LSTM with different attentions |
| Dai et al. [72] | 98.90% | 96.90% | - | Two-stream attention-based LSTM |
| Proposed model | 97.84% | 98.90% | 97.75% | Deformable convolution and adaptive multiscale features |