We present an end-to-end deep network for fine-grained visual categorization called Collaborative Convolutional Network (CoCoNet). The network uses a collaborative layer after the convolutional layers to represent an image as an optimal weighted collaboration of features learned from training samples as a whole rather than one at a time. This gives CoCoNet more power to encode the fine-grained nature of the data with limited samples. We perform a detailed study of the performance with 1-stage and 2-stage transfer learning. The ablation study shows that the proposed method outperforms its constituent parts consistently. CoCoNet also outperforms few state-of-the-art competing methods. Experiments have been performed on the fine-grained bird species classification problem as a representative example, but the method may be applied to other similar tasks. We also introduce a new public dataset for fine-grained species recognition, that of Indian endemic birds and have reported initial results on it.