Related papers: Import2vec - Learning Embeddings for Software Libr…

JEFL: Joint Embedding of Formal Proof Libraries

The heterogeneous nature of the logical foundations used in different interactive proof assistant libraries has rendered discovery of similar mathematical concepts among them difficult. In this paper, we compare a previously proposed…

Logic in Computer Science · Computer Science 2021-07-22 Qingxiang Wang, Cezary Kaliszyk

A Literature Study of Embeddings on Source Code

Natural language processing has improved tremendously after the success of word embedding techniques such as word2vec. Recently, the same idea has been applied on source code with encouraging results. In this survey, we aim to collect and…

Machine Learning · Computer Science 2019-04-08 Zimin Chen, Martin Monperrus

Learning Blended, Precise Semantic Program Embeddings

Learning neural program embeddings is key to utilizing deep neural networks in program languages research: precise and efficient program representations enable the application of deep models to a wide range of program analysis tasks.…

Software Engineering · Computer Science 2019-07-12 Ke Wang, Zhendong Su

Learning Library Cell Representations in Vector Space

We propose Lib2Vec, a novel self-supervised framework to efficiently learn meaningful vector representations of library cells, enabling ML models to capture essential cell semantics. The framework comprises three key components: (1) an…

Machine Learning · Computer Science 2025-04-01 Rongjian Liang, Yi-Chen Lu, Wen-Hao Liu, Haoxing Ren

Class Vectors: Embedding representation of Document Classes

Distributed representations of words and paragraphs as semantic embeddings in high dimensional data are used across a number of Natural Language Understanding tasks such as retrieval, translation, and classification. In this work, we…

Computation and Language · Computer Science 2015-08-04 Devendra Singh Sachan, Shailesh Kumar

Embedding Word Similarity with Neural Machine Translation

Neural language models learn word representations, or embeddings, that capture rich linguistic and conceptual information. Here we investigate the embeddings learned by neural machine translation models, a recently-developed class of neural…

Computation and Language · Computer Science 2015-04-06 Felix Hill, Kyunghyun Cho, Sebastien Jean, Coline Devin, Yoshua Bengio

Utility of General and Specific Word Embeddings for Classifying Translational Stages of Research

Conventional text classification models make a bag-of-words assumption reducing text into word occurrence counts per document. Recent algorithms such as word2vec are capable of learning semantic meaning and similarity between words in an…

Computation and Language · Computer Science 2018-07-11 Vincent Major, Alisa Surkis, Yindalon Aphinyanaphongs

On the Embeddings of Variables in Recurrent Neural Networks for Source Code

Source code processing heavily relies on the methods widely used in natural language processing (NLP), but involves specifics that need to be taken into account to achieve higher quality. An example of this specificity is that the semantics…

Software Engineering · Computer Science 2021-04-28 Nadezhda Chirkova

From Word Vectors to Multimodal Embeddings: Techniques, Applications, and Future Directions For Large Language Models

Word embeddings and language models have transformed natural language processing (NLP) by facilitating the representation of linguistic elements in continuous vector spaces. This review visits foundational concepts such as the…

Computation and Language · Computer Science 2025-12-03 Charles Zhang, Benji Peng, Xintian Sun, Qian Niu, Junyu Liu, Keyu Chen, Ming Li, Pohsun Feng, Ziqian Bi, Ming Liu, Yichao Zhang, Xinyuan Song, Cheng Fei, Caitlyn Heqi Yin, Lawrence KQ Yan, Hongyang He, Tianyang Wang

Can Network Embedding of Distributional Thesaurus be Combined with Word Vectors for Better Representation?

Distributed representations of words learned from text have proved to be successful in various natural language processing tasks in recent times. While some methods represent words as vectors computed from text using predictive model…

Computation and Language · Computer Science 2018-02-20 Abhik Jana, Pawan Goyal

Imparting Interpretability to Word Embeddings while Preserving Semantic Structure

As a ubiquitous method in natural language processing, word embeddings are extensively employed to map semantic properties of words into a dense vector representation. They capture semantic and syntactic relations among words but the…

Computation and Language · Computer Science 2020-07-03 Lutfi Kerem Senel, Ihsan Utlu, Furkan Şahinuç, Haldun M. Ozaktas, Aykut Koç

Code Vectors: Understanding Programs Through Embedded Abstracted Symbolic Traces

With the rise of machine learning, there is a great deal of interest in treating programs as data to be fed to learning algorithms. However, programs do not start off in a form that is immediately amenable to most off-the-shelf learning…

Software Engineering · Computer Science 2018-08-21 Jordan Henkel, Shuvendu K. Lahiri, Ben Liblit, Thomas Reps

An Effective Approach to Embedding Source Code by Combining Large Language and Sentence Embedding Models

The advent of large language models (LLMs) has significantly advanced artificial intelligence (AI) in software engineering (SE), with source code embeddings playing a crucial role in tasks such as source code clone detection and source code…

Software Engineering · Computer Science 2025-06-04 Zixiang Xian, Chenhui Cui, Rubing Huang, Chunrong Fang, Zhenyu Chen

Req2Lib: A Semantic Neural Model for Software Library Recommendation

Third-party libraries are crucial to the development of software projects. To get suitable libraries, developers need to search through millions of libraries by filtering, evaluating, and comparing. The vast number of libraries places a…

Software Engineering · Computer Science 2020-05-26 Zhensu Sun, Yan Liu, Ziming Cheng, Chen Yang, Pengyu Che

Consistent Alignment of Word Embedding Models

Word embedding models offer continuous vector representations that can capture rich contextual semantics based on their word co-occurrence patterns. While these word vectors can provide very effective features used in many NLP tasks such as…

Computation and Language · Computer Science 2017-02-27 Cem Safak Sahin, Rajmonda S. Caceres, Brandon Oselio, William M. Campbell

Learning Meta-Embeddings by Using Ensembles of Embedding Sets

Word embeddings (distributed representations of words) in deep learning are beneficial for many tasks in natural language processing (NLP). However, different embedding sets vary greatly in quality and characteristics of the captured…

Computation and Language · Computer Science 2015-12-31 Wenpeng Yin, Hinrich Schütze

A Comprehensive Empirical Evaluation of Existing Word Embedding Approaches

Vector-based word representations help countless Natural Language Processing (NLP) tasks capture the language's semantic and syntactic regularities. In this paper, we present the characteristics of existing word embedding approaches and…

Computation and Language · Computer Science 2024-03-05 Obaidullah Zaland, Muhammad Abulaish, Mohd. Fazil

Analyzing Structures in the Semantic Vector Space: A Framework for Decomposing Word Embeddings

Word embeddings are rich word representations, which in combination with deep neural networks, lead to large performance gains for many NLP tasks. However, word embeddings are represented by dense, real-valued vectors and they are therefore…

Computation and Language · Computer Science 2019-12-24 Andreas Hanselowski, Iryna Gurevych

Table2Vec: Neural Word and Entity Embeddings for Table Population and Retrieval

Tables contain valuable knowledge in a structured form. We employ neural language modeling approaches to embed tabular data into vector spaces. Specifically, we consider different table elements, such as caption, column headings, and cells,…

Information Retrieval · Computer Science 2019-06-04 Li Deng, Shuo Zhang, Krisztian Balog

Reproducing and learning new algebraic operations on word embeddings using genetic programming

Word-vector representations associate a high-dimensional real vector with every word from a corpus. Recently, neural-network based methods have been proposed for learning this representation from large corpora. This type of word-to-vector…

Computation and Language · Computer Science 2017-02-21 Roberto Santana