English
Related papers

Related papers: Language embeddings that preserve staging and safe…

200 papers

In the large language model (LLM) revolution, embedding is a key component of various systems, such as retrieving knowledge or memories for LLMs or building content moderation filters. As such cases span from English to other natural or…

Computation and Language · Computer Science 2025-05-23 Xin Zhang , Zehan Li , Yanzhao Zhang , Dingkun Long , Pengjun Xie , Meishan Zhang , Min Zhang

When creating a new domain-specific language (DSL) it is common to embed it as a part of a flexible host language, rather than creating it entirely from scratch. The semantics of an embedded DSL (EDSL) is either given directly as a set of…

Programming Languages · Computer Science 2016-12-06 Piotr Danilewski , Philipp Slusallek

Word embeddings are powerful representations that form the foundation of many natural language processing architectures, both in English and in other languages. To gain further insight into word embeddings, we explore their stability (e.g.,…

Computation and Language · Computer Science 2021-09-13 Laura Burdick , Jonathan K. Kummerfeld , Rada Mihalcea

Language models now provide an interface to express and often solve general problems in natural language, yet their ultimate computational capabilities remain a major topic of scientific debate. Unlike a formal computer, a language model is…

Computation and Language · Computer Science 2026-02-11 Alex Lewandowski , Marlos C. Machado , Dale Schuurmans

Continuous embeddings of tokens in computer programs have been used to support a variety of software development tools, including readability, code search, and program repair. Contextual embeddings are common in natural language processing…

Software Engineering · Computer Science 2020-04-29 Rafael - Michael Karampatsis , Charles Sutton

Contextual adaptation in token embeddings plays a central role in determining how well language models maintain coherence and retain semantic relationships over extended text sequences. Static embeddings often impose constraints on lexical…

Computation and Language · Computer Science 2025-03-27 Koinis Vassilis , Godfrey Milbourne , Harriet Featherstone , Xanthe Peverell , Yorick Bletchley , Zachary Montford

We give a universal kernel that renders all the regular languages linearly separable. We are not able to compute this kernel efficiently and conjecture that it is intractable, but we do have an efficient $\eps$-approximation.

Machine Learning · Computer Science 2007-12-07 Leonid , Kontorovich

We study when a programming language can emulate programs written in that same language without delegating the guest program back to the host evaluator or compiler. We call this property emulation-completeness. The central observation is…

Programming Languages · Computer Science 2026-04-20 Gregory Morse , Tamás Kozsik

Almost every programming language's syntax includes a notion of binder and corresponding bound occurrences, along with the accompanying notions of $\alpha$-equivalence, capture-avoiding substitution, typing contexts, runtime environments,…

Programming Languages · Computer Science 2021-10-13 Guillaume Allais , Robert Atkey , James Chapman , Conor McBride , James McKinna

A broad class of software engineering problems can be generalized as the "total recall problem". This short paper claims that identifying and exploring total recall language processing problems in software engineering is an important task…

Software Engineering · Computer Science 2018-11-13 Zhe Yu , Tim Menzies

Recent work has shown that a model's input word embeddings can serve as effective control variables for steering its behavior toward outputs that satisfy desired properties. However, this has only been demonstrated for pretrained…

Computation and Language · Computer Science 2026-04-30 Baturay Saglam , Dionysis Kalogerias

The ability of machine learning models to store input information in hidden layer vector embeddings, analogous to the concept of `memory', is widely employed but not well characterized. We find that language model embeddings typically…

Computation and Language · Computer Science 2026-05-20 Benjamin L. Badger

Word embeddings -- distributed representations of words -- in deep learning are beneficial for many tasks in natural language processing (NLP). However, different embedding sets vary greatly in quality and characteristics of the captured…

Computation and Language · Computer Science 2015-12-31 Wenpeng Yin , Hinrich Schütze

Classic grammars and regular expressions can be used for a variety of purposes, including parsing, intent detection, and matching. However, the comparisons are performed at a structural level, with constituent elements (words or characters)…

Computation and Language · Computer Science 2018-08-16 David Wingate , William Myers , Nancy Fulda , Tyler Etchart

Large pre-trained language models have recently been expanded and applied to programming language tasks with great success, often through further pre-training of a strictly-natural language model--where training sequences typically contain…

Computation and Language · Computer Science 2024-02-13 Fenia Christopoulou , Guchun Zhang , Gerasimos Lampouras

Researchers have recently suggested that models share common representations. In our work, we find numerous geometric similarities across the token embeddings of large language models. First, we find ``global'' similarities: token…

Computation and Language · Computer Science 2025-07-16 Andrew Lee , Melanie Weber , Fernanda Viégas , Martin Wattenberg

The increasing prevalence of Large Language Models (LMs) in critical applications highlights the need for controlled language generation strategies that are not only computationally efficient but that also enjoy performance guarantees. To…

Computation and Language · Computer Science 2026-03-16 Emily Cheng , Carmen Amo Alonso

Runtime efficiency and termination are crucial properties in the studies of program verification. Instead of dealing with these issues in an ad hoc manner, it would be useful to develop a robust framework in which such properties are…

Programming Languages · Computer Science 2026-04-06 Weijun Chen , Yuxi Fu , Huan Long

Haskell is a popular choice for hosting deeply embedded languages. A recurring challenge for these embeddings is how to seamlessly integrate user defined algebraic data types. In particular, one important, convenient, and expressive feature…

Programming Languages · Computer Science 2022-08-01 Trevor L. McDonell , Joshua D. Meredith , Gabriele Keller

Structured embedding transformations offer a promising approach for enhancing the efficiency and coherence of language model inference. The introduction of Structural Embedding Projection (SEP) provides a mechanism for refining token…

Computation and Language · Computer Science 2025-08-11 Vincent Enoasmo , Cedric Featherstonehaugh , Xavier Konstantinopoulos , Zacharias Huntington
‹ Prev 1 2 3 10 Next ›