Related papers: Enhancing Compositional Generalization via Composi…
Learning from a few examples is a challenging task for machine learning. While recent progress has been made for this problem, most of the existing methods ignore the compositionality in visual concept representation (e.g. objects are built…
Scene Graph Generation (SGG) aims to detect all the visual relation triplets $<$\texttt{sub}, \texttt{pred}, \texttt{obj}$>$ in a given image. With the emergence of various advanced techniques for better utilizing both the intrinsic and…
Compositional generalization is the capability of a model to understand novel compositions composed of seen concepts. There are multiple levels of novel compositions including phrase-phrase level, phrase-word level, and word-word level.…
How to usefully encode compositional task structure has long been a core challenge in AI. Recent work in chain of thought prompting has shown that for very large neural language models (LMs), explicitly demonstrating the inferential steps…
Foundation models like CLIP allow zero-shot transfer on various tasks without additional training data. Yet, the zero-shot performance is less competitive than a fully supervised one. Thus, to enhance the performance, fine-tuning and…
Compositional generalization is a basic mechanism in human language learning, which current neural networks struggle with. A recently proposed Disentangled sequence-to-sequence model (Dangle) shows promising generalization capability by…
Neural network models often generalize poorly to mismatched domains or distributions. In NLP, this issue arises in particular when models are expected to generalize compositionally, that is, to novel combinations of familiar words and…
Computer vision is difficult, partly because the desired mathematical function connecting input and output data is often complex, fuzzy and thus hard to learn. Coarse-to-fine (C2F) learning is a promising direction, but it remains unclear…
Despite recent advancements in deep learning, deep neural networks continue to suffer from performance degradation when applied to new data that differs from training data. Test-time adaptation (TTA) aims to address this challenge by…
Domain Generalization (DG) is essentially a sub-branch of out-of-distribution generalization, which trains models from multiple source domains and generalizes to unseen target domains. Recently, some domain generalization algorithms have…
Ensemble learning is a well established body of methods for machine learning to enhance predictive performance by combining multiple algorithms/models. Combinatorial Fusion Analysis (CFA) has provided method and practice for combining…
The deep neural networks (DNNs) have freed the synthetic aperture radar automatic target recognition (SAR ATR) from expertise-based feature designing and demonstrated superiority over conventional solutions. There has been shown the unique…
Compositional generalization refers to correctly interpret novel combinations of known primitives, which remains a major challenge. Existing approaches often rely on supervised fine-tuning, which encourages models to imitate target outputs.…
Dual-encoder Vision-Language Models (VLMs) such as CLIP are often characterized as bag-of-words systems due to their poor performance on compositional benchmarks. We argue that this limitation may stem less from deficient representations…
Classifier-Free Guidance (CFG) is a critical technique for enhancing the sample quality of visual generative models. However, in autoregressive (AR) multi-modal generation, CFG introduces design inconsistencies between language and visual…
Domain generalization (DG) task aims to learn a robust model from source domains that could handle the out-of-distribution (OOD) issue. In order to improve the generalization ability of the model in unseen domains, increasing the diversity…
Adversarial training has been widely acknowledged as the most effective method to improve the adversarial robustness against adversarial examples for Deep Neural Networks (DNNs). So far, most existing works focus on enhancing the overall…
Compositional generalization-a key open challenge in modern machine learning-requires models to predict unknown combinations of known concepts. However, assessing compositional generalization remains a fundamental challenge due to the lack…
Domain generalization (DG) aims to learn a robust model from source domains that generalize well on unseen target domains. Recent studies focus on generating novel domain samples or features to diversify distributions complementary to…
Inferring objects and their relationships from an image in the form of a scene graph is useful in many applications at the intersection of vision and language. We consider a challenging problem of compositional generalization that emerges…