Related papers: Computing Word Classes Using Spectral Clustering

200 papers

Brown clustering is a hard, hierarchical, bottom-up clustering of words in a vocabulary. Words are assigned to clusters based on their usage pattern in a given corpus. The resulting clusters and hierarchical structure can be used in…

Computation and Language · Computer Science 2016-08-05 Manuel R. Ciosici

We analyze here a particular kind of linguistic network where vertices represent words and edges stand for syntactic relationships between words. The statistical properties of these networks have been recently studied and various features…

Statistical Mechanics · Physics 2007-05-23 Ramon Ferrer i Cancho, Andrea Capocci, Guido Caldarelli

Spectral clustering is known as a powerful technique in unsupervised data analysis. The vast majority of approaches to spectral clustering are driven by a single modality, leaving the rich information in multi-modal representations…

Computer Vision and Pattern Recognition · Computer Science 2026-03-17 Bo Peng, Yuanwei Hu, Bo Liu, Ling Chen, Jie Lu, Zhen Fang

Clustering Text has been an important problem in the domain of Natural Language Processing. While there are techniques to cluster text based on using conventional clustering techniques on top of contextual or non-contextual vector space…

Computation and Language · Computer Science 2022-01-11 Lovedeep Singh

Clustering data objects into homogeneous groups is one of the most important tasks in data mining. Spectral clustering is arguably one of the most important algorithms for clustering, as it is appealing for its theoretical soundness and is…

Machine Learning · Statistics 2024-03-12 Dylan Soemitro, Jeova Farias Sales Rocha Neto

Dictionary learning and sparse coding have been widely studied as mechanisms for unsupervised feature learning. Unsupervised learning could bring enormous benefit to the processing of hyperspectral images and to other remote sensing data…

Image and Video Processing · Electrical Eng. & Systems 2022-02-03 Joshua Bruton, Hairong Wang

Speaker clustering is the task of differentiating speakers in a recording. In a way, the aim is to answer "who spoke when" in audio recordings. A common method used in industry is feature extraction directly from the recording thanks to…

Sound · Computer Science 2018-03-23 Maxime Jumelle, Taqiyeddine Sakmeche

The unsupervised text clustering is one of the major tasks in natural language processing (NLP) and remains a difficult and complex problem. Conventional methods generally treat this task using separated steps, including text…

Computation and Language · Computer Science 2019-03-25 Jie Zhou, Xingyi Cheng, Jinchao Zhang

Clustering is the problem of separating a set of objects into groups (called clusters) so that objects within the same cluster are more similar to each other than to those in different clusters. Spectral clustering is a now well-known…

Machine Learning · Computer Science 2012-11-16 B. Cung, T. Jin, J. Ramirez, A. Thompson, C. Boutsidis, D. Needell

Text clustering serves as a fundamental technique for organizing and interpreting unstructured textual data, particularly in contexts where manual annotation is prohibitively costly. With the rapid advancement of Large Language Models…

Computation and Language · Computer Science 2025-10-08 Chen Huang , Guoxiu He

This paper (cmp-lg/yymmnnn) has been accepted for publication in the student session of EACL-95. It outlines ongoing work using statistical and unsupervised neural network methods for clustering words in untagged corpora. Such approaches…

cmp-lg · Computer Science 2008-02-03 Christopher C. Huckle

The objective functions used in spectral clustering are usually composed of two terms: i) a term that minimizes the local quadratic variation of the cluster assignments on the graph and; ii) a term that balances the clustering partition and…

Machine Learning · Computer Science 2022-11-29 Filippo Maria Bianchi

Text classification is a task of automatic classification of text into one of the predefined categories. The problem of text classification has been widely studied in different communities like natural language processing, data mining and…

Computation and Language · Computer Science 2014-06-24 Reshma Prasad , Mary Priya Sebastian

We study supervised learning problems using clustering constraints to impose structure on either features or samples, seeking to help both prediction and interpretation. The problem of clustering features arises naturally in text…

Machine Learning · Computer Science 2016-09-20 Vincent Roulet, Fajwel Fogel, Alexandre d'Aspremont, Francis Bach

Considering that words with different characteristic in the text have different importance for classification, grouping them together separately can strengthen the semantic expression of each part. Thus we propose a new text representation…

Computation and Language · Computer Science 2019-06-19 Xiaoye Tan, Rui Yan, Chongyang Tao, Mingrui Wu

Distributional text clustering delivers semantically informative representations and captures the relevance between each word and semantic clustering centroids. We extend the neural text clustering approach to text classification tasks by…

Computation and Language · Computer Science 2020-11-25 Yekun Chai, Haidong Zhang, Shuo Jin

Spectral clustering is a powerful technique for clustering high-dimensional data, utilizing graph-based representations to detect complex, non-linear structures and non-convex clusters. The construction of a similarity graph is essential…

Machine Learning · Computer Science 2025-01-27 Kamal Berahmand, Farid Saberi-Movahed, Razieh Sheikhpour, Yuefeng Li, Mahdi Jalili

We describe and experimentally evaluate a method for automatically clustering words according to their distribution in particular syntactic contexts. Deterministic annealing is used to find lowest distortion sets of clusters. As the…

cmp-lg · Computer Science 2008-02-03 Fernando Pereira, Naftali Tishby, Lillian Lee

Sentence embedding methods offer a powerful approach for working with short textual constructs or sequences of words. By representing sentences as dense numerical vectors, many natural language processing (NLP) applications have improved…

Computation and Language · Computer Science 2021-10-05 Yuan An, Alexander Kalinowski, Jane Greenberg

Spectral clustering is one of the most prominent clustering approaches. The distance-based similarity is the most widely used method for spectral clustering. However, people have already noticed that this is not suitable for multi-scale…

Machine Learning · Computer Science 2020-09-11 Hengrui Wang, Yubo Zhang, Mingzhi Chen, Tong Yang