A Review of Multi-modal Human Motion Recognition Based on Deep Learning

Ye Li; Yifan Pan; Xinhui Wu

Authors

Ye Li, Shenyang Normal University
Yifan Pan, Shenyang Normal University
Xinhui Wu, Shenyang Normal University

Keywords:

Human motion recognition, Computer vision, Multi-modal, Deep learning

Abstract

Human motion recognition is an active research topic in computer vision with a wide range of applications, including biometrics, intelligent surveillance, and human-computer interaction. In vision-based human motion recognition, the main input modalities are RGB images, depth images, and skeleton data. Each modality captures information that is likely to be complementary to the others; for example, some modalities capture global information while others capture local details of an action. Intuitively, fusing data from multiple modalities can therefore improve recognition accuracy. A further challenge is how to model and exploit spatiotemporal information correctly. Focusing on the feature extraction methods used for human action recognition in video, this paper reviews traditional hand-crafted feature extraction methods in terms of global and local feature extraction, and describes in detail the feature learning models commonly used in deep-learning-based feature extraction. Finally, the paper summarizes the opportunities and challenges in the field of motion recognition and outlines possible directions for future research.
