Abstract:
Human action recognition has a wide range of real-world applications. The people and objects involved in an action are its subjects. Existing action recognition frameworks cannot describe the subjects participating in an action or the interactions between them. To address this problem, an action recognition method based on modeling human-object relationships and spatio-temporal relationships is proposed. The method uses a spatio-temporal human-object graph to describe an action: the nodes of the graph represent the spatio-temporal states of the subjects, and the edges represent the interactions between subjects. The spatio-temporal human-object graph is refined by reasoning with graph convolution. The method was evaluated on HMDB51 and UCF101, where it obtained results of 77.73% and 96.59%, respectively.
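
As an illustration of the graph reasoning step, the following is a minimal sketch of one graph convolution layer applied to a toy human-object graph. It is not the implementation used in this work: the symmetric normalization with self-loops, the ReLU activation, the toy adjacency matrix, the feature values, and all function names are assumptions chosen for illustration only.

```python
import math

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square, non-negative adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node keeps part of its own state.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def graph_conv(adj, features, weight):
    """One layer: ReLU(A_norm @ H @ W)."""
    a_norm = normalize_adjacency(adj)
    out = matmul(matmul(a_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in out]

# Hypothetical toy graph (assumption, not taken from the paper):
# node 0 = person at frame t, node 1 = object at frame t,
# node 2 = the same person at frame t+1.
# Edge 0-1 stands for a human-object interaction, edge 0-2 for a temporal link.
adj = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
features = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]
weight = [
    [1.0, 0.0],
    [0.0, 1.0],
]  # identity weights, so the output shows pure neighbor aggregation

updated = graph_conv(adj, features, weight)
for node_id, state in enumerate(updated):
    print(node_id, state)
```

With identity weights, each updated node state is a normalized mixture of its own features and those of its neighbors, which is the sense in which edges carry interaction information between subjects in this sketch.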