We present an end-to-end deep network for fine-grained visual categorization called Collaborative Convolutional Network (CoCoNet). The network uses a collaborative layer after the convolutional layers to represent an image as an optimal weighted collaboration of features learned from the training samples as a whole rather than one at a time. This gives CoCoNet more power to encode the fine-grained nature of the data with limited samples. We perform a detailed study of the performance with 1-stage and 2-stage transfer learning. The ablation study shows that the proposed method consistently outperforms its constituent parts. CoCoNet also outperforms a few state-of-the-art competing methods. Experiments were performed on the fine-grained bird species classification problem as a representative example, but the method may be applied to other similar tasks. We also introduce a new public dataset for fine-grained species recognition, that of Indian endemic birds, and report initial results on it.
@article{arxiv.1901.09886,
title = {CoCoNet: A Collaborative Convolutional Network},
author = {Tapabrata Chakraborti and Brendan McCane and Steven Mills and Umapada Pal},
journal = {arXiv preprint arXiv:1901.09886},
year = {2020}
}