Related papers: Learning Order Forest for Qualitative-Attribute Da…

200 papers

Clustering is a popular machine learning technique for data mining that can process and analyze datasets to automatically reveal sample distribution patterns. Since the ubiquitous categorical data naturally lack a well-defined metric space…

Machine Learning · Computer Science 2025-09-01 Yiqun Zhang, Mingjie Zhao, Hong Jia, Yang Lu, Mengke Li, Yiu-ming Cheung

Categorical attributes with qualitative values are ubiquitous in cluster analysis of real datasets. Unlike the Euclidean distance of numerical attributes, the categorical attributes lack well-defined relationships of their possible values…

Machine Learning · Computer Science 2025-11-13 Mingjie Zhao, Zhanpei Huang, Yang Lu, Mengke Li, Yiqun Zhang, Weifeng Su, Yiu-ming Cheung

Random forests are a machine learning method used to automatically classify datasets and consist of a multitude of decision trees. While these random forests often have higher performance and generalize better than a single decision tree,…

Machine Learning · Computer Science 2025-07-31 Max Sondag, Christofer Meinecke, Dennis Collaris, Tatiana von Landesberger, Stef van den Elzen

We introduce a cluster evaluation technique called Tree Index. Our Tree Index algorithm aims at describing the structural information of the clustering rather than the quantitative format of cluster-quality indexes (where the representation…

Machine Learning · Computer Science 2020-03-25 A. H. Beg, Md Zahidul Islam, Vladimir Estivill-Castro

Hierarchical clustering is a class of algorithms that seeks to build a hierarchy of clusters. It has been the dominant approach to constructing embedded classification schemes since it outputs dendrograms, which capture the hierarchical…

Machine Learning · Statistics 2018-08-28 Xiaofei Ma, Satya Dhavala

Clustering is an unsupervised machine learning methodology in which unlabeled elements/objects are grouped together with the aim of constructing well-established clusters whose elements are classified according to their similarity. The…

Machine Learning · Statistics 2023-10-20 Dimitrios Saligkaras, Vasileios E. Papageorgiou

Tree structures appear in many fields of the life sciences, including phylogenetics, developmental biology and nucleic acid structures. Trees can be used to represent RNA secondary structures, which directly relate to the function of…

Machine Learning · Computer Science 2026-01-22 Pengyu Liu, Mariel Vázquez, Nataša Jonoska

Datasets composed of numerical and categorical attributes (also called mixed data hereinafter) are common in real clustering tasks. Differing from numerical attributes that indicate tendencies between two concepts (e.g., high and low…

Machine Learning · Computer Science 2026-03-06 Yiqun Zhang, Mingjie Zhao, Yizhou Chen, Yang Lu, Yiu-ming Cheung

Clustering is a fundamental learning task widely used as a first step in data analysis. For example, biologists use cluster assignments to analyze genome sequences, medical records, or images. Since downstream analysis is typically…

Machine Learning · Computer Science 2024-06-11 Jonathan Svirsky, Ofir Lindenbaum

The wealth of data being gathered about humans and their surroundings drives new machine learning applications in various fields. Consequently, more and more often, classifiers are trained using not only numerical data but also complex data…

Machine Learning · Computer Science 2022-04-13 Maciej Piernik, Dariusz Brzezinski, Pawel Zawadzki

Data clustering, the task of grouping observations according to their similarity, is a key component of unsupervised learning -- with real world applications in diverse fields such as biology, medicine, and social science. Often in these…

Machine Learning · Computer Science 2023-09-20 Anne Sophie Riis Damstrup, Sofie Tosti Madsen, Michele Coscia

Clustering is an underspecified task: there are no universal criteria for what makes a good clustering. This is especially true for relational data, where similarity can be based on the features of individuals, the relationships between…

Machine Learning · Statistics 2017-09-29 Sebastijan Dumancic, Hendrik Blockeel

With the rapid development of online social media, online shopping sites and cyber-physical systems, heterogeneous information networks have become increasingly popular and content-rich over time. In many cases, such networks contain…

Databases · Computer Science 2012-02-01 Yizhou Sun, Charu C. Aggarwal, Jiawei Han

Distributional text clustering delivers semantically informative representations and captures the relevance between each word and semantic clustering centroids. We extend the neural text clustering approach to text classification tasks by…

Computation and Language · Computer Science 2020-11-25 Yekun Chai, Haidong Zhang, Shuo Jin

In this paper, we address an issue of finding explainable clusters of class-uniform data in labelled datasets. The issue falls into the domain of interpretable supervised clustering. Unlike traditional clustering, supervised clustering aims…

Machine Learning · Computer Science 2023-07-18 Natallia Kokash, Leonid Makhnist

We investigate active learning by pairwise similarity over the leaves of trees originating from hierarchical clustering procedures. In the realizable setting, we provide a full characterization of the number of queries needed to achieve…

Machine Learning · Computer Science 2019-10-15 Fabio Vitale, Anand Rajagopalan, Claudio Gentile

Clustering is an important part of many modern data analysis pipelines, including network analysis and data retrieval. There are many different clustering algorithms developed by various communities, and it is often not clear which…

Machine Learning · Computer Science 2019-10-04 Maria-Florina Balcan, Travis Dick, Manuel Lang

An appropriate distance metric is crucial for categorical data clustering, as the distance between categorical data cannot be directly calculated. However, the distances between attribute values usually vary in different clusters induced by…

Machine Learning · Computer Science 2026-03-09 Taixi Chen, Yiu-ming Cheung, Yiqun Zhang

We consider the problem of personalized federated learning when there are known cluster structures within users. An intuitive approach would be to regularize the parameters so that users in the same cluster share similar model weights. The…

Machine Learning · Computer Science 2022-04-29 Boxiang Lyu, Filip Hanzely, Mladen Kolar

Predictive clustering trees (PCTs) are a well established generalization of standard decision trees, which can be used to solve a variety of predictive modeling tasks, including structured output prediction. Combining them into ensembles…

Machine Learning · Computer Science 2020-11-06 Tomaž Stepišnik, Dragi Kocev