Related papers: Learning from Uncurated Regular Expressions

Low-rank Dictionary Learning for Unsupervised Feature Selection

There exist many high-dimensional data in real-world applications such as biology, computer vision, and social networks. Feature selection approaches are devised to confront with high-dimensional data challenges with the aim of efficient…

Machine Learning · Computer Science 2021-06-22 Mohsen Ghassemi Parsa, Hadi Zare, Mehdi Ghatee

Unsupervised Expressive Rules Provide Explainability and Assist Human Experts Grasping New Domains

Approaching new data can be quite deterrent; you do not know how your categories of interest are realized in it, commonly, there is no labeled data at hand, and the performance of domain adaptation methods is unsatisfactory. Aiming to…

Computation and Language · Computer Science 2020-10-20 Eyal Shnarch, Leshem Choshen, Guy Moshkowich, Noam Slonim, Ranit Aharonov

On the difficulty of a distributional semantics of spoken language

In the domain of unsupervised learning most work on speech has focused on discovering low-level constructs such as phoneme inventories or word-like units. In contrast, for written language, where there is a large body of work on…

Computation and Language · Computer Science 2018-10-29 Grzegorz Chrupała, Lieke Gelderloos, Ákos Kádár, Afra Alishahi

Efficient Attribute Unlearning: Towards Selective Removal of Input Attributes from Feature Representations

Recently, the enactment of privacy regulations has promoted the rise of the machine unlearning paradigm. Existing studies of machine unlearning mainly focus on sample-wise unlearning, such that a learnt model will not expose user's privacy…

Machine Learning · Computer Science 2022-04-19 Tao Guo, Song Guo, Jiewei Zhang, Wenchao Xu, Junxiao Wang

Unsupervised Language Acquisition

This thesis presents a computational theory of unsupervised language acquisition, precisely defining procedures for learning language from ordinary spoken or written utterances, with no explicit help from a teacher. The theory is based…

cmp-lg · Computer Science 2008-02-03 Carl de Marcken

Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features

The recent tremendous success of unsupervised word embeddings in a multitude of applications raises the obvious question if similar methods could be derived to improve embeddings (i.e. semantic representations) of word sequences as well. We…

Computation and Language · Computer Science 2018-12-31 Matteo Pagliardini, Prakhar Gupta, Martin Jaggi

A New Sentence Extraction Strategy for Unsupervised Extractive Summarization Methods

In recent years, text summarization methods have attracted much attention again thanks to the researches on neural network models. Most of the current text summarization methods based on neural network models are supervised methods which…

Computation and Language · Computer Science 2024-01-25 Dehao Tao, Yingzhu Xiong, Zhongliang Yang, Yongfeng Huang

Three Concrete Challenges and Two Hopes for the Safety of Unsupervised Elicitation

To steer language models towards truthful outputs on tasks which are beyond human capability, previous work has suggested training models on easy tasks to steer them on harder ones (easy-to-hard generalization), or using unsupervised…

Machine Learning · Computer Science 2026-02-25 Callum Canavan, Aditya Shrivastava, Allison Qi, Jonathan Michala, Fabien Roger

Neural Generation of Regular Expressions from Natural Language with Minimal Domain Knowledge

This paper explores the task of translating natural language queries into regular expressions which embody their meaning. In contrast to prior work, the proposed neural model does not utilize domain-specific crafting, learning to translate…

Computation and Language · Computer Science 2016-08-11 Nicholas Locascio, Karthik Narasimhan, Eduardo DeLeon, Nate Kushman, Regina Barzilay

Bayesian Inference of Regular Expressions from Human-Generated Example Strings

In programming by example, users "write" programs by generating a small number of input-output examples and asking the computer to synthesize consistent programs. We consider a challenging problem in this domain: learning regular…

Artificial Intelligence · Computer Science 2018-09-28 Long Ouyang

Learning Purified Feature Representations from Task-irrelevant Labels

Learning an empirically effective model with generalization using limited data is a challenging task for deep neural networks. In this paper, we propose a novel learning framework called PurifiedLearning to exploit task-irrelevant features…

Machine Learning · Computer Science 2022-07-12 Yinghui Li, Chen Wang, Yangning Li, Hai-Tao Zheng, Ying Shen

Neural Summarization by Extracting Sentences and Words

Traditional approaches to extractive summarization rely heavily on human-engineered features. In this work we propose a data-driven approach based on neural networks and continuous sentence features. We develop a general framework for…

Computation and Language · Computer Science 2016-07-04 Jianpeng Cheng, Mirella Lapata

Refining Language Models with Compositional Explanations

Pre-trained language models have been successful on text classification tasks, but are prone to learning spurious correlations from biased datasets, and are thus vulnerable when making inferences in a new domain. Prior work reveals such…

Computation and Language · Computer Science 2022-01-03 Huihan Yao, Ying Chen, Qinyuan Ye, Xisen Jin, Xiang Ren

Unsupervised Data Selection for Supervised Learning

Recent research put a big effort in the development of deep learning architectures and optimizers obtaining impressive results in areas ranging from vision to language processing. However little attention has been addressed to the need of a…

Computer Vision and Pattern Recognition · Computer Science 2018-12-20 Gabriele Valvano, Andrea Leo, Daniele Della Latta, Nicola Martini, Gianmarco Santini, Dante Chiappino, Emiliano Ricciardi

Representation Learning for Weakly Supervised Relation Extraction

Recent years have seen rapid development in Information Extraction, as well as its subtask, Relation Extraction. Relation Extraction is able to detect semantic relations between entities in sentences. Currently, many efficient approaches…

Computation and Language · Computer Science 2024-03-19 Zhuang Li

A Preliminary Empirical Study on Prompt-based Unsupervised Keyphrase Extraction

Pre-trained large language models can perform natural language processing downstream tasks by conditioning on human-designed prompts. However, a prompt-based approach often requires "prompt engineering" to design different prompts,…

Computation and Language · Computer Science 2024-05-28 Mingyang Song, Yi Feng, Liping Jing

Acoustic data-driven lexicon learning based on a greedy pronunciation selection framework

Speech recognition systems for irregularly-spelled languages like English normally require hand-written pronunciations. In this paper, we describe a system for automatically obtaining pronunciations of words for which pronunciations are not…

Computation and Language · Computer Science 2017-06-13 Xiaohui Zhang, Vimal Manohar, Daniel Povey, Sanjeev Khudanpur

A Theory of Feature Learning

Feature Learning aims to extract relevant information contained in data sets in an automated fashion. It is the driving force behind the current deep learning trend, a set of methods that have had widespread empirical success. What is lacking…

Machine Learning · Statistics 2015-04-02 Brendan van Rooyen, Robert C. Williamson

PatternRank: Leveraging Pretrained Language Models and Part of Speech for Unsupervised Keyphrase Extraction

Keyphrase extraction is the process of automatically selecting a small set of most relevant phrases from a given text. Supervised keyphrase extraction approaches need large amounts of labeled training data and perform poorly outside the…

Computation and Language · Computer Science 2023-01-03 Tim Schopf, Simon Klimek, Florian Matthes

Data Extraction via Semantic Regular Expression Synthesis

Many data extraction tasks of practical relevance require not only syntactic pattern matching but also semantic reasoning about the content of the underlying text. While regular expressions are very well suited for tasks that require only…

Programming Languages · Computer Science 2023-08-28 Qiaochu Chen, Arko Banerjee, Çağatay Demiralp, Greg Durrett, Isil Dillig