English
Related papers

Related papers: SCELMo: Source Code Embeddings from Language Model…

200 papers

Embedding from Language Models (ELMo) has shown to be effective for improving many natural language processing (NLP) tasks, and ELMo takes character information to compose word representation to train language models.However, the character…

Computation and Language · Computer Science 2019-09-19 Jiangtong Li , Hai Zhao , Zuchao Li , Wei Bi , Xiaojiang Liu

Contextual embeddings, such as ELMo and BERT, move beyond global word representations like Word2Vec and achieve ground-breaking performance on a wide range of natural language processing tasks. Contextual embeddings assign each word a…

Computation and Language · Computer Science 2020-04-14 Qi Liu , Matt J. Kusner , Phil Blunsom

Word embeddings and language models have transformed natural language processing (NLP) by facilitating the representation of linguistic elements in continuous vector spaces. This review visits foundational concepts such as the…

The advent of large language models (LLMs) has significantly advanced artificial intelligence (AI) in software engineering (SE), with source code embeddings playing a crucial role in tasks such as source code clone detection and source code…

Software Engineering · Computer Science 2025-06-04 Zixiang Xian , Chenhui Cui , Rubing Huang , Chunrong Fang , Zhenyu Chen

We present models for embedding words in the context of surrounding words. Such models, which we refer to as token embeddings, represent the characteristics of a word that are specific to a given context, such as word sense, syntactic…

Computation and Language · Computer Science 2017-06-13 Lifu Tu , Kevin Gimpel , Karen Livescu

Recent results show that deep neural networks using contextual embeddings significantly outperform non-contextual embeddings on a majority of text classification task. We offer precomputed embeddings from popular contextual ELMo model for…

Computation and Language · Computer Science 2022-06-01 Matej Ulčar , Marko Robnik-Šikonja

Text embeddings have become an essential part of a variety of language applications. However, methods for interpreting, exploring and reversing embedding spaces are limited, reducing transparency and precluding potentially valuable…

Computation and Language · Computer Science 2026-01-27 Brian Ondov , Chia-Hsuan Chang , Yujia Zhou , Mauro Giuffrè , Hua Xu

Word embeddings such as ELMo have recently been shown to model word semantics with greater efficacy through contextualized learning on large-scale language corpora, resulting in significant improvement in state of the art across many…

Computation and Language · Computer Science 2019-09-11 Shao-Yen Tseng , Panayiotis Georgiou , Shrikanth Narayanan

Neural program embeddings have shown much promise recently for a variety of program analysis tasks, including program synthesis, program repair, fault localization, etc. However, most existing program embeddings are based on syntactic…

Artificial Intelligence · Computer Science 2018-07-03 Ke Wang , Rishabh Singh , Zhendong Su

Contextualized word embeddings derived from pre-trained language models (LMs) show significant improvements on downstream NLP tasks. Pre-training on domain-specific corpora, such as biomedical articles, further improves their performance.…

Computation and Language · Computer Science 2019-04-05 Qiao Jin , Bhuwan Dhingra , William W. Cohen , Xinghua Lu

Despite the fast developmental pace of new sentence embedding methods, it is still challenging to find comprehensive evaluations of these different techniques. In the past years, we saw significant improvements in the field of sentence…

Computation and Language · Computer Science 2018-06-19 Christian S. Perone , Roberto Silveira , Thomas S. Paula

Natural language processing has improved tremendously after the success of word embedding techniques such as word2vec. Recently, the same idea has been applied on source code with encouraging results. In this survey, we aim to collect and…

Machine Learning · Computer Science 2019-04-08 Zimin Chen , Martin Monperrus

General embeddings like word2vec, GloVe and ELMo have shown a lot of success in natural language tasks. The embeddings are typically extracted from models that are built on general tasks such as skip-gram models and natural language…

Computation and Language · Computer Science 2020-11-03 Aparna Khare , Srinivas Parthasarathy , Shiva Sundaram

Large language models (LLMs) have revolutionized natural language processing by achieving state-of-the-art performance across various tasks. Recently, their effectiveness as embedding models has gained attention, marking a paradigm shift…

Computation and Language · Computer Science 2025-07-28 Chongyang Tao , Tao Shen , Shen Gao , Junshuo Zhang , Zhen Li , Kai Hua , Wenpeng Hu , Zhengwei Tao , Shuai Ma

Recent advances in generative AI have been largely driven by large language models (LLMs), deep neural networks that operate over discrete units called tokens. To represent text, the vast majority of LLMs use words or word fragments as the…

Contextualized representation models such as ELMo (Peters et al., 2018a) and BERT (Devlin et al., 2018) have recently achieved state-of-the-art results on a diverse array of downstream NLP tasks. Building on recent token-level probing work,…

Contextualized word embedding models, such as ELMo, generate meaningful representations of words and their context. These models have been shown to have a great impact on downstream applications. However, in many cases, the contextualized…

Computation and Language · Computer Science 2019-09-27 Weijia Shi , Muhao Chen , Pei Zhou , Kai-Wei Chang

The ability of machine learning models to store input information in hidden layer vector embeddings, analogous to the concept of `memory', is widely employed but not well characterized. We find that language model embeddings typically…

Computation and Language · Computer Science 2026-05-20 Benjamin L. Badger

Although models using contextual word embeddings have achieved state-of-the-art results on a host of NLP tasks, little is known about exactly what information these embeddings encode about the context words that they are understood to…

Computation and Language · Computer Science 2020-05-06 Josef Klafka , Allyson Ettinger

Static bug localization techniques that locate bugs at method granularity have gained much attention from both researchers and practitioners. For a static method-level bug localization technique, a key but challenging step is to fully…

Software Engineering · Computer Science 2021-09-09 Weiqin Zou , Enming Li , Chunrong Fang
‹ Prev 1 2 3 10 Next ›