Related papers: Sentence Compression via DC Programming Approach
Sentence compression is an important problem in natural language processing with wide applications in text summarization, search engine and human-AI interaction system etc. In this paper, we design a hybrid extractive sentence compression…
A popular approach to sentence compression is to formulate the task as a constrained optimization problem and solve it with integer linear programming (ILP) tools. Unfortunately, dependence on ILP may make the compressor prohibitively slow,…
This paper presents a tree-to-tree transduction method for sentence compression. Our model is based on synchronous tree substitution grammar, a formalism that allows local distortion of the tree topology and can thus naturally capture…
We consider the problem of using sentence compression techniques to facilitate query-focused multi-document summarization. We present a sentence-compression-based framework for the task, and design a series of learning-based compression…
Vector averaging remains one of the most popular sentence embedding methods in spite of its obvious disregard for syntactic structure. While more complex sequential or convolutional networks potentially yield superior classification…
We address the minimization of a smooth objective function under an $\ell_0$-constraint and simple convex constraints. When the problem has no constraints except the $\ell_0$-constraint, some efficient algorithms are available; for example,…
Large language models (LLMs) have triggered a new stream of research focusing on compressing the context length to reduce the computational cost while ensuring the retention of helpful information for LLMs to answer the given question.…
We propose a special-purpose class of compression algorithms for efficient compression of Prolog programs. It is a dictionary-based compression method, specially designed for the compression of Prolog code, and therefore we name it PCA…
Sparse optimization refers to an optimization problem involving the zero-norm in objective or constraints. In this paper, nonconvex approximation approaches for sparse optimization have been studied with a unifying point of view in DC…
In this paper, we study possible extensions of the main ideas and methods of constrained DC optimization to the case of nonlinear semidefinite programming problems and more general nonlinear and nonsmooth cone constrained optimization…
We develop two penalty based difference of convex (DC) algorithms for solving chance constrained programs. First, leveraging a rank-based DC decomposition of the chance constraint, we propose a proximal penalty based DC algorithm in the…
Despite the recent success of Large Language Models (LLMs), it remains challenging to feed LLMs with long prompts due to the fixed size of LLM inputs. As a remedy, prompt compression becomes a promising solution by removing redundant tokens…
In this paper, we propose a novel deep coherence model (DCM) using a convolutional neural network architecture to capture the text coherence. The text coherence problem is investigated with a new perspective of learning sentence…
Large Language Models (LLMs) have shown outstanding performance across a variety of tasks, partly due to advanced prompting techniques. However, these techniques often require lengthy prompts, which increase computational costs and can…
In this paper, we propose Dynamic Compressive Transformer (DCT), a transformer-based framework for modeling the unbounded sequence. In contrast to the previous baselines which append every sentence representation to memory, conditionally…
Minimizing the difference of two submodular (DS) functions is a problem that naturally occurs in various machine learning problems. Although it is well known that a DS problem can be equivalently formulated as the minimization of the…
Traditional sentence embedding methods employ token-level contrastive learning on non-generative pre-trained models. Recently, there have emerged embedding methods based on generative large language models (LLMs). These methods either rely…
Deep learning models have become state of the art for natural language processing (NLP) tasks, however deploying these models in production system poses significant memory constraints. Existing compression methods are either lossy or…
We propose `Dracula', a new framework for unsupervised feature selection from sequential data such as text. Dracula learns a dictionary of $n$-grams that efficiently compresses a given corpus and recursively compresses its own dictionary;…
In this paper, we study the convergence rate of the DCA (Difference-of-Convex Algorithm), also known as the convex-concave procedure, with two different termination criteria that are suitable for smooth and nonsmooth decompositions…