Related papers

Related papers: MIREncoder: Multi-modal IR-based Pretrained Embedd…

200 papers

Growing heterogeneity and configurability in HPC architectures has made auto-tuning applications and runtime parameters on these systems very complex. Users are presented with a multitude of options to configure parameters. In addition to…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-04-28 Akash Dutta, Jordi Alcaraz, Ali TehraniJamsaz, Eduardo Cesar, Anna Sikora, Ali Jannesari

Neural program embeddings have demonstrated considerable promise in a range of program analysis tasks, including clone identification, program repair, code completion, and program synthesis. However, most existing methods generate neural…

Software Engineering · Computer Science 2022-04-21 Zongjie Li, Pingchuan Ma, Huaijin Wang, Shuai Wang, Qiyi Tang, Sen Nie, Shi Wu

Multimodal representation learning has demonstrated remarkable potential in enabling models to process and integrate diverse data modalities, such as text and images, for improved understanding and performance. While the medical domain can…

Computer Vision and Pattern Recognition · Computer Science 2025-03-04 Shuvendu Roy, Franklin Ogidi, Ali Etemad, Elham Dolatabadi, Arash Afkanpour

Code understanding and generation have fast become some of the most popular applications of language models (LMs). Nonetheless, research on multilingual aspects of Code-LMs (i.e., LMs for code generation) such as cross-lingual transfer…

Artificial Intelligence · Computer Science 2024-04-16 Indraneil Paul, Goran Glavaš, Iryna Gurevych

Multimodal pre-training has propelled great advancement in vision-and-language research. These large-scale pre-trained models, although successful, fatefully suffer from slow inference speed due to enormous computation cost mainly from…

Computation and Language · Computer Science 2021-04-13 Siqi Sun, Yen-Chun Chen, Linjie Li, Shuohang Wang, Yuwei Fang, Jingjing Liu

Large-scale multi-modal deep learning models have revolutionized domains such as healthcare, highlighting the importance of computational power. However, in resource-constrained regions like Low and Middle-Income Countries (LMICs), limited…

We present a new pre-training strategy called M$^{3}$3D ($\underline{M}$ulti-$\underline{M}$odal $\underline{M}$asked $\underline{3D}$) built based on Multi-modal masked autoencoders that can leverage 3D priors and learned cross-modal…

Computer Vision and Pattern Recognition · Computer Science 2023-09-28 Muhammad Abdullah Jamal, Omid Mohareri

Embedded systems have proliferated in various consumer and industrial applications with the evolution of Cyber-Physical Systems and the Internet of Things. These systems are subjected to stringent constraints so that embedded software must…

Recently, Multi-modal Large Language Models (MLLMs) have shown remarkable effectiveness for multi-modal tasks due to their abilities to generate and understand cross-modal data. However, processing long sequences of visual tokens extracted…

Computer Vision and Pattern Recognition · Computer Science 2025-04-11 Haicheng Wang, Zhemeng Yu, Gabriele Spadaro, Chen Ju, Victor Quétu, Shuai Xiao, Enzo Tartaglione

Multimodal Large Language Models (MLLMs) have shown immense promise in universal multimodal retrieval, which aims to find relevant items of various modalities for a given query. But their practical application is often hindered by the…

Computer Vision and Pattern Recognition · Computer Science 2026-02-06 Qi Li, Yanzhe Zhao, Yongxin Zhou, Yameng Wang, Yandong Yang, Yuanjia Zhou, Jue Wang, Zuojian Wang, Jinxiang Liu

Multimodal Recommender Systems aim to improve recommendation accuracy by integrating heterogeneous content, such as images and textual metadata. While effective, it remains unclear whether their gains stem from true multimodal understanding…

Information Retrieval · Computer Science 2025-08-07 Claudio Pomo, Matteo Attimonelli, Danilo Danese, Fedelucio Narducci, Tommaso Di Noia

Due to the huge amount of parameters, fine-tuning of pretrained language models (PLMs) is prone to overfitting in the low resource scenarios. In this work, we present a novel method that operates on the hidden representations of a PLM to…

Computation and Language · Computer Science 2023-05-29 Linlin Liu, Xingxuan Li, Megh Thakkar, Xin Li, Shafiq Joty, Luo Si, Lidong Bing

Code completion is one of the most useful features in the Integrated Development Environments (IDEs), which can accelerate software development by suggesting the next probable token based on the contextual code in real-time. Recent studies…

Software Engineering · Computer Science 2021-01-01 Fang Liu, Ge Li, Yunfei Zhao, Zhi Jin

Large Language Models (LLMs) have demonstrated remarkable efficacy in text embedding, yet current adaptation methods like LoRA face significant bottlenecks in computational efficiency and cross-architecture transferability. Whenever a new…

Computation and Language · Computer Science 2026-05-28 Yu-Che Tsai, Kuan-Yu Chen, Yuan-Hao Chen, Yu-Han Chang, Ching-Yu Tsai, Yu-Hsiang Chuang, Shou-De Lin

Fine-tuning Large Language Models (LLMs) with multimodal encoders on modality-specific data expands the modalities that LLMs can handle, leading to the formation of Multimodal LLMs (MLLMs). However, this paradigm heavily relies on…

Computation and Language · Computer Science 2025-05-26 Junlin Li, Guodong DU, Jing Li, Sim Kuan Goh, Wenya Wang, Yequan Wang, Fangming Liu, Ho-Kin Tang, Saleh Alharbi, Daojing He, Min Zhang

Multimodal language models (MLMs) integrate visual and textual information by coupling a vision encoder with a large language model through the specific adapter. While existing approaches commonly rely on a single pre-trained vision…

Computer Vision and Pattern Recognition · Computer Science 2025-02-24 Matvey Skripkin, Elizaveta Goncharova, Dmitrii Tarasov, Andrey Kuznetsov

Recent multimodal large language models (MLLMs) increasingly integrate multiple vision encoders to improve performance on various benchmarks, assuming that diverse pretraining objectives yield complementary visual signals. However, we show…

Computer Vision and Pattern Recognition · Computer Science 2026-02-16 Yizhou Wang, Song Mao, Yang Chen, Yufan Shen, Yinqiao Yan, Pinlong Cai, Ding Wang, Guohang Yan, Zhi Yu, Xuming Hu, Botian Shi

In recent times, the standard practice for developing MLLMs is to feed features from vision encoder(s) into the LLM and train with natural language supervision. This approach often causes models to lean towards language comprehension and…

Computer Vision and Pattern Recognition · Computer Science 2025-10-20 Jitesh Jain, Zhengyuan Yang, Humphrey Shi, Jianfeng Gao, Jianwei Yang

Large Language Models (LLMs) have demonstrated remarkable capabilities across a variety of software engineering and coding tasks. However, their application in the domain of code and compiler optimization remains underexplored. Training…

Programming Languages · Computer Science 2024-07-04 Chris Cummins, Volker Seeker, Dejan Grubisic, Baptiste Roziere, Jonas Gehring, Gabriel Synnaeve, Hugh Leather

Multimodal Machine Translation (MMT) aims to improve translation quality by leveraging auxiliary modalities such as images alongside textual input. While recent advances in large-scale pre-trained language and vision models have…

Computation and Language · Computer Science 2025-04-28 Zhuang Yu, Shiliang Sun, Jing Zhao, Tengfei Song, Hao Yang