Related papers: CSIFT Based Locality-constrained Linear Coding for…
Image representation and classification are two fundamental tasks towards multimedia content retrieval and understanding. The idea that shape and texture information (e.g. edge or orientation) are the key features for visual representation…
While convolution and self-attention are extensively used in learned image compression (LIC) for transform coding, this paper proposes an alternative called Contextual Clustering based LIC (CLIC) which primarily relies on clustering…
We propose an action classification algorithm which uses Locality-constrained Linear Coding (LLC) to capture discriminative information of human body variations in each spatiotemporal subsequence of a video sequence. Our proposed method…
In recent years, visual sensors have been quickly improving, notably targeting richer acquisitions of the light present in a visual scene. In this context, the so-called lenslet light field (LLF) cameras are able to go beyond the…
Image identification is one of the most challenging tasks in different areas of computer vision. Scale-invariant feature transform is an algorithm to detect and describe local features in images to further use them as an image matching…
Image colorization achieves more and more realistic results with the increasing computation power of recent deep learning techniques. It becomes more difficult to identify the fake colorized images by human eyes. In this work, we propose a…
Single-sample face recognition is one of the most challenging problems in face recognition. We propose a novel algorithm to address this problem based on a sparse representation based classification (SRC) framework. The new algorithm is…
We present a novel means of describing local image appearances using binary strings. Binary descriptors have drawn increasing interest in recent years due to their speed and low memory footprint. A known shortcoming of these representations…
Local binary pattern (LBP) as a kind of local feature has shown its simplicity, easy implementation and strong discriminating power in image recognition. Although some LBP variants are specifically investigated for color image recognition,…
Contrastive Language-Image Pre-training (CLIP) has been a celebrated method for training vision encoders to generate image/text representations facilitating various applications. Recently, CLIP has been widely adopted as the vision backbone…
Visual place recognition is essential for vision-based robot localization and SLAM. Despite the tremendous progress made in recent years, place recognition in changing environments remains challenging. A promising approach to cope with…
Fine-grained categories are more difficulty distinguished than generic categories due to the similarity of inter-class and the diversity of intra-class. Therefore, the fine-grained visual categorization (FGVC) is considered as one of…
Recently sparse coding have been highly successful in image classification mainly due to its capability of incorporating the sparsity of image representation. In this paper, we propose an improved sparse coding model based on linear spatial…
In this work we propose a new CNN+LSTM architecture for camera pose regression for indoor and outdoor scenes. CNNs allow us to learn suitable feature representations for localization that are robust against motion blur and illumination…
Sparse coding has been proposed as a theory of visual cortex and as an unsupervised algorithm for learning representations. We show empirically with the MNIST dataset that sparse codes can be very sensitive to image distortions, a behavior…
In this paper, we proposed a novel pipeline for image-level classification in the hyperspectral images. By doing this, we show that the discriminative spectral information at image-level features lead to significantly improved performance…
Good local features improve the robustness of many 3D re-localization and multi-view reconstruction pipelines. The problem is that viewing angle and distance severely impact the recognizability of a local feature. Attempts to improve…
We present a method for visual object classification using only a single feature, transformed color SIFT with a variant of Spatial Pyramid Matching (SPM) that we called Sliding Spatial Pyramid Matching (SSPM), trained with an ensemble of…
Fine-grained image classification is to recognize hundreds of subcategories in each basic-level category. Existing methods employ discriminative localization to find the key distinctions among subcategories. However, they generally have two…
This paper proposes a neural rendering approach that represents a scene as "compressed light-field tokens (CLiFTs)", retaining rich appearance and geometric information of a scene. CLiFT enables compute-efficient rendering by compressed…