Related papers: Structuring Quantitative Image Analysis with Objec…
Shifting objects in an image and merging many images after appropriate shifting are used in several engineering and scientific applications that require complex perception development. A method is presented here which could…
The pixels in an image, and the objects, scenes, and actions that they compose, determine whether an image will be memorable or forgettable. While memorability varies by image, it is largely independent of an individual observer. Observer…
Inspired by human categorization, object property reasoning involves identifying and recognizing low-level details and higher-level abstractions. While current visual question answering (VQA) studies consider multiple object properties,…
We propose an end-to-end learning framework for segmenting generic objects in both images and videos. Given a novel image or video, our approach produces a pixel-level mask for all "object-like" regions, even for object categories never…
Treating images as data has become increasingly popular in political science. While existing classifiers for images reach high levels of accuracy, it is difficult to systematically assess the visual features on which they base their…
We have seen significant leapfrog advancement in machine learning in recent decades. The central idea of machine learnability lies in constructing learning algorithms that learn from good data. The availability of more data being made…
In object recognition applications, object images usually appear with different quality levels. Practically, it is very important to indicate object image qualities for better application performance, e.g. filtering out low-quality object…
Significant performance gains in deep learning coupled with the exponential growth of image and video data on the Internet have resulted in the recent emergence of automated image captioning systems. Ensuring scalability of automated image…
Image quality always plays a vital role in increasing the object recognition or classification rate. A good-quality image gives a better recognition or classification rate than an unprocessed noisy image. It is more difficult to extract…
This article introduces a benchmark designed to evaluate the capabilities of multimodal models in analyzing and interpreting images. The benchmark focuses on seven key visual aspects: main object, additional objects, background, detail,…
Automated frame analysis of political communication is a popular task in computational social science that is used to study how authors select aspects of a topic to frame its reception. So far, such studies have been narrow, in that they…
Object detection in high-resolution satellite imagery is emerging as a scalable alternative to on-the-ground survey data collection in many environmental and socioeconomic monitoring applications. However, performing object detection over…
Object detection or localization is an incremental step in progression from coarse to fine digital image inference. It not only provides the classes of the image objects, but also provides the location of the image objects which have been…
The style of an image plays a significant role in how it is viewed, but style has received little attention in computer vision research. We describe an approach to predicting style of images, and perform a thorough evaluation of different…
When humans describe images they tend to use combinations of nouns and adjectives, corresponding to objects and their associated attributes respectively. To generate such a description automatically, one needs to model objects, attributes…
Generative adversarial networks conditioned on textual image descriptions are capable of generating realistic-looking images. However, current methods still struggle to generate images based on complex image captions from a heterogeneous…
Image recognition and quality assessment are two important viewing tasks, while potentially following different visual mechanisms. This paper investigates if the two tasks can be performed in a multitask learning manner. A sequential…
We consider detecting objects in an image by iteratively selecting from a set of arbitrarily shaped candidate regions. Our generic approach, which we term visual chunking, reasons about the locations of multiple object instances in an image…
Image data provide unique information about political events, actors, and their interactions which are difficult to measure from or not available in text data. This article introduces a new class of automated methods based on computer…
A common use of language is to refer to visually present objects. Modelling it in computers requires modelling the link between language and perception. The "words as classifiers" model of grounded semantics views words as classifiers of…