Related papers: Efficient Solution of Language Equations Using Par…
Synthesizing programs from examples requires searching over a vast, combinatorial space of possible programs. In this search process, a key challenge is representing the behavior of a partially written program before it can be executed, to…
We propose new static word embeddings optimised for sentence semantic representation. We first extract word embeddings from a pre-trained Sentence Transformer, and improve them with sentence-level principal component analysis, followed by…
Automatically generating high-quality step-by-step solutions to math word problems has many applications in education. Recently, combining large language models (LLMs) with external tools to perform complex reasoning and calculation has…
The mathematical representation of semantics is a key issue for Natural Language Processing (NLP). A lot of research has been devoted to finding ways of representing the semantics of individual words in vector spaces. Distributional…
Efficient methods for storing and querying are critical for scaling high-order n-gram language models to large corpora. We propose a language model based on compressed suffix trees, a representation that is highly compact and can be easily…
This paper presents an efficient approach to image segmentation that approximates the piecewise-smooth (PS) functional in [12] with explicit solutions. By rendering some rational constraints on the initial conditions and the final solutions…
We present an unsupervised approach for discovering semantic representations of mathematical equations. Equations are challenging to analyze because each is unique, or nearly unique. Our method, which we call equation embeddings, finds good…
Optimization is a key task in a number of applications. When the set of feasible solutions under consideration is of combinatorial nature and described in an implicit way as a set of constraints, optimization is typically NP-hard.…
Solving partial differential equations (PDEs) with highly oscillatory solutions on complex domains remains a challenging and important problem. High-frequency oscillations and intricate geometries often result in prohibitively expensive…
Reasoning about real-life events is a unifying challenge in AI and NLP that has profound utility in a variety of domains, while fallacy in high-stake applications could be catastrophic. Able to work with diverse text in these domains, large…
In the field of natural language processing (NLP), continuous vector representations are crucial for capturing the semantic meanings of individual words. Yet, when it comes to the representations of sets of words, the conventional…
Distributed representations of meaning are a natural way to encode covariance relationships between words and phrases in NLP. By overcoming data sparsity problems, as well as providing information about semantic relatedness which is not…
Simple weighted averaging of word vectors often yields effective representations for sentences which outperform sophisticated seq2seq neural models in many tasks. While it is desirable to use the same method to represent documents as well,…
As the first step in automated natural language processing, representing words and sentences is of central importance and has attracted significant research attention. Different approaches, from the early one-hot and bag-of-words…
Rule-based machine translation is more data efficient than the big data-based machine translation approaches, making it appropriate for languages with low bilingual corpus resources -- i.e., minority languages. However, the rule-based…
I present the most fundamental features of an implemented system designed to manipulate representations of regular languages. The system is structured into two layers, allowing regular languages to be represented in an increasingly compact,…
Finite-state transducers give efficient representations of many Natural Language phenomena. They allow to account for complex lexicon restrictions encountered, without involving the use of a large set of complex rules difficult to analyze.…
This work is concerned with the propagation of uncertainty across coupled domain problems with high-dimensional random inputs. A stochastic model reduction approach based on low-rank separated representations is proposed for the partitioned…
One major deficiency of most semantic representation techniques is that they usually model a word type as a single point in the semantic space, hence conflating all the meanings that the word can have. Addressing this issue by learning…
Most recent neural semi-supervised learning algorithms rely on adding small perturbation to either the input vectors or their representations. These methods have been successful on computer vision tasks as the images form a continuous…