
Understanding motion plays an important role in cross-media analysis and multi-knowledge representation learning. A research group led by Hehe Fan studied the problems of recognizing and predicting physical motion using deep neural networks (DNNs), in particular convolutional neural networks and recurrent neural networks. The scientists developed and tested a deep learning approach based on relative position change encoded as a series of vectors, and found that their method outperformed existing motion modeling frameworks.
In physics, motion is a relative change in position over time. To eliminate object and background factors, the scientists focused on an ideal scenario in which a point moves in a two-dimensional (2D) plane. Two tasks were used to assess the ability of DNN architectures to model motion: motion recognition and motion prediction. As a result, a vector network (VecNet) was developed to model relative position change. The scientists' key innovation was to encode motion separately from position.
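The idea of describing a moving point by its relative position change can be illustrated with a few lines of plain Python. This is only a minimal sketch under stated assumptions, not the researchers' VecNet: the function names `to_displacements` and `from_displacements` and the sample tracks are hypothetical, and the encoding is simple subtraction rather than a learned network. It shows that two tracks that differ only in where they start produce the same sequence of vectors.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def to_displacements(points: List[Point]) -> List[Point]:
    """Encode a 2D track as per-step relative position changes (dx, dy)."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(points, points[1:])]


def from_displacements(start: Point, deltas: List[Point]) -> List[Point]:
    """Rebuild absolute positions by moving the point along each vector."""
    path = [start]
    for dx, dy in deltas:
        x, y = path[-1]
        path.append((x + dx, y + dy))
    return path


# Hypothetical sample data: the same motion from two different starting positions.
track_a = [(0, 0), (1, 0), (2, 1), (2, 3)]
track_b = [(x + 5, y - 2) for x, y in track_a]

deltas_a = to_displacements(track_a)
deltas_b = to_displacements(track_b)

# The vector encoding ignores absolute position, so both tracks match.
print(deltas_a == deltas_b)
# The original track is recovered once a starting position is supplied.
print(from_displacements(track_b[0], deltas_a) == track_b)
```

Because the vectors carry no absolute position, anything built on top of them sees only the motion itself, which is the property the ideal point-on-a-plane scenario is meant to isolate.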
The group's research was published in the journal Intelligent Computing.
The study focuses on motion analysis. Motion recognition aims to identify different types of motion from a series of observations. It can be regarded as a prerequisite for action recognition, because action recognition can be divided into object recognition and motion recognition. For example, to recognize the action “open the door”, a DNN must recognize both the object “door” and the motion “open”. Otherwise, the model could not distinguish “open the door” from “open the window”, or “open the door” from “close the door”. Motion prediction aims to predict future position changes after observing part of the motion, that is, the motion context. It can be regarded as a prerequisite for video prediction.
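A toy example can make the recognition task more concrete. The sketch below is an illustration only: it uses a hand-written rule (the sign of the summed cross product of consecutive displacement vectors) as a stand-in for a trained network, and the function name `turning_direction` and the labels are hypothetical. It shows that the type of motion can be read from relative position changes alone, regardless of where the point starts.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def turning_direction(points: List[Point]) -> str:
    """Label a 2D track by its net turning direction (hand-written rule)."""
    deltas = [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(points, points[1:])]
    # z-component of the cross product of each pair of consecutive vectors:
    # positive means a left (counterclockwise) turn, negative a right turn.
    turn = sum(ax * by - ay * bx for (ax, ay), (bx, by) in zip(deltas, deltas[1:]))
    if turn > 0:
        return "counterclockwise"
    if turn < 0:
        return "clockwise"
    return "no net turn"


# Hypothetical sample tracks.
ccw = [(0, 0), (1, 0), (1, 1), (0, 1)]
cw = list(reversed(ccw))
shifted_ccw = [(x + 10, y + 10) for x, y in ccw]
line = [(0, 0), (1, 1), (2, 2)]

for track in (ccw, cw, shifted_ccw, line):
    print(turning_direction(track))
```

A learned model replaces the fixed rule with parameters fitted to data, but the input it needs is the same: how the position changes from one observation to the next.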
VecNet represents short-term motion as a vector. It can also move a point to the position given by that vector representation. To capture motion over longer time spans, a long short-term memory (LSTM) network was used to aggregate or predict the vector representations over time. The resulting VecNet + LSTM method handles both recognition and prediction effectively, indicating that modeling relative position change is necessary for motion recognition and facilitates motion prediction.
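The encode, aggregate and move steps described above can be sketched in plain Python. This is a deliberately simplified illustration, not the published method: plain subtraction stands in for VecNet's learned encoding, and the average observed displacement stands in for the LSTM. The names `encode`, `move` and `predict` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def encode(points: List[Point]) -> List[Point]:
    """Stand-in for the encoder: one (dx, dy) vector per short-term step."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(points, points[1:])]


def move(point: Point, vector: Point) -> Point:
    """Move a point to the position given by a displacement vector."""
    return (point[0] + vector[0], point[1] + vector[1])


def predict(context: List[Point], steps: int) -> List[Point]:
    """Predict future positions from an observed motion context.

    Assumption: the mean displacement replaces the LSTM as the aggregator,
    so this only extrapolates motion at constant velocity.
    """
    deltas = encode(context)
    mean = (
        sum(dx for dx, _ in deltas) / len(deltas),
        sum(dy for _, dy in deltas) / len(deltas),
    )
    future = []
    current = context[-1]
    for _ in range(steps):
        current = move(current, mean)
        future.append(current)
    return future


# Hypothetical motion context: a point moving one unit right and two units up per step.
context = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]
print(predict(context, steps=2))
```

A recurrent model would take the place of the averaging step so that curved or accelerating motion can be followed as well, while the surrounding encode and move steps keep the same shape.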
Action recognition is closely related to motion recognition, because every action involves motion. Since there is currently no single, unambiguous DNN architecture for action recognition, the researchers compared and studied a subset of models covering most of the field.
In motion recognition tests, the VecNet + LSTM approach scored higher at modeling relative position change than six other popular DNN architectures drawn from video research. Some of those architectures were simply less accurate, and some were completely unsuitable for the motion modeling task.
For example, compared with the ConvLSTM method, the new method was more accurate, required less training time and did not lose accuracy as quickly when making additional predictions.
The experiments showed that the VecNet + LSTM method is effective for both motion recognition and motion prediction. They confirm that using relative position change considerably improves motion modeling. Combined with appearance or image-processing methods, the proposed motion modeling method could be used for general video understanding, which can be studied in the future.
