D3.2: ML Detector to analyse crowd movements in videos

Executive Summary

This deliverable describes the work carried out during Period 1, Period 2 and part of Period 3 in Work Package 3 (WP3) of the CrowdDNA project towards developing a new crowd simulator algorithm tailored to model both macro- and micro-level crowd characteristics. As a reminder, the overall objective of WP3 is to deliver a new method for crowd motion analysis that can estimate the intensity of physical interactions between individuals from observed macroscopic crowd motion features. Our approach is based on Machine Learning. We are considering two competing methods to perform this mapping: i) training a generator that is based on a set of features we have pre-determined, or ii) training a detector that would itself identify the most relevant features to extract from the crowd motion. Training will be based on the datasets generated in WP1. The second method, the detector, is the main focus of D3.2, whose goal is to develop a detector to analyse crowd movements in videos.

To this end, two partners, UCL and UL, have developed new algorithms and models to detect crowd behaviours from videos. The approach focuses on fine-grained, high-density crowd behaviours and is designed to work with vastly different data types and qualities, captured under common in-the-wild data collection settings, so as to best accommodate real-world high-density crowd data. In particular, UCL and UL have focused on videos of high-density crowds in which each person occupies only a few pixels and the footage is captured by distant cameras, so the data contains substantial noise.

Figure 1 depicts the different components of the proposed detection methods, alongside the CrowdDNA partner that led each development.

Figure 1. Components of CrowdDNA simulator.

All in all, high-density crowd behaviours are complex, and the crowd observatories operate under different settings. To detect crowd behaviours across these settings, we have explored data of various qualities, ranging from footage in which the full or partial body of each individual is still observable to footage in which detailed individual body motions cannot be identified.