MTCNET: Multi-task Learning Paradigm for Crowd Count Estimation

Machine Learning 2025-04-16 v1 Artificial Intelligence Computer Vision and Pattern Recognition Machine Learning

Abstract

We propose MTCNet (Multi-Task Crowd Network), a deep neural network architecture based on the Multi-Task Learning (MTL) paradigm, for crowd density and count estimation. Crowd count estimation is challenging because of non-uniform scale variations and the arbitrary perspective of each individual image. The proposed model learns two related tasks: Crowd Density Estimation is the main task and Crowd-Count Group Classification is the auxiliary task. The auxiliary task helps capture scale-related information that improves the performance of the main task. The main-task model comprises two blocks: a VGG-16 front-end for feature extraction and a dilated Convolutional Neural Network (CNN) for density map generation. The auxiliary-task model shares the same front-end as the main task and follows it with a CNN classifier. Without any data augmentation, our proposed network achieves 5.8% and 14.9% lower Mean Absolute Error (MAE) than state-of-the-art methods on the ShanghaiTech dataset. Our model also achieves 10.5% lower MAE on the UCF_CC_50 dataset.
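To make the quantities in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the usual convention that a crowd count is the sum of the predicted density map, and it shows how a count could be mapped to a count group and how a relative MAE reduction such as "5.8% lower" is computed. The function names, the group boundaries, and all numbers are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def count_from_density(density_map):
    """Estimated crowd count: the sum of all values in a 2-D density map."""
    return sum(sum(row) for row in density_map)


def count_group(count, boundaries):
    """Index of the count group that `count` falls into.

    `boundaries` is a sorted list of group edges (hypothetical values);
    a count equal to an edge is assigned to the higher group.
    """
    return bisect_right(boundaries, count)


def mean_absolute_error(predicted, actual):
    """MAE between predicted and ground-truth counts over a set of images."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def relative_mae_reduction(baseline_mae, model_mae):
    """Percentage by which `model_mae` is lower than `baseline_mae`."""
    return 100.0 * (baseline_mae - model_mae) / baseline_mae


# Toy 2x2 density map (illustrative values only).
toy_density = [[0.5, 0.25], [0.25, 1.0]]
toy_count = count_from_density(toy_density)

# Hypothetical edges that split counts into three groups: <10, 10-99, >=100.
toy_boundaries = [10, 100]
toy_group = count_group(toy_count, toy_boundaries)

# MAE over two hypothetical images, and a hypothetical baseline comparison.
toy_mae = mean_absolute_error([10, 20], [12, 17])
toy_reduction = relative_mae_reduction(100.0, 94.2)
```

In this sketch a model whose MAE is 94.2 against a baseline MAE of 100.0 has a relative reduction of about 5.8%, which is the sense in which the percentages above are reported.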

Cite

@article{arxiv.1908.08652,
  title  = {MTCNET: Multi-task Learning Paradigm for Crowd Count Estimation},
  author = {Abhay Kumar and Nishant Jain and Suraj Tripathi and Chirag Singh and Kamal Krishna},
  journal= {arXiv preprint arXiv:1908.08652},
  year   = {2019}
}

Comments

5 pages, 3 figures, Accepted in IEEE AVSS 2019
