Related papers: Statistical Industry Classification

Semi-supervised clustering methods

Cluster analysis methods seek to partition a data set into homogeneous subgroups. They are useful in a wide variety of applications, including document processing and modern genetics. Conventional clustering methods are unsupervised, meaning…

Methodology · Statistics 2014-07-11 Eric Bair

Spectral Clustering of Categorical and Mixed-type Data via Extra Graph Nodes

Clustering data objects into homogeneous groups is one of the most important tasks in data mining. Spectral clustering is arguably one of the most important algorithms for clustering, as it is appealing for its theoretical soundness and is…

Machine Learning · Statistics 2024-03-12 Dylan Soemitro, Jeova Farias Sales Rocha Neto

Partitioning Clustering algorithms for handling numerical and categorical data: a review

Clustering is widely used in different fields such as biology, psychology, and economics. Most traditional clustering algorithms are limited to handling datasets that contain either numeric or categorical attributes. However, datasets with…

Databases · Computer Science 2019-07-03 Trupti M. Kodinariya, Dr. Prashant R. Makwana

Relation between Financial Market Structure and the Real Economy: Comparison between Clustering Methods

We quantify the amount of information filtered by different hierarchical clustering methods on correlations between stock returns comparing it with the underlying industrial activity structure. Specifically, we apply, for the first time to…

Statistical Finance · Quantitative Finance 2023-07-19 Nicolo Musmeci, Tomaso Aste, Tiziana Di Matteo

Comparison three methods of clustering: k-means, spectral clustering and hierarchical clustering

A comparison of three kinds of clustering, finding and calculating the cost function and loss function for each. The error rate of the clustering methods and how to calculate the error percentage are always among the important factors for evaluating the…

Machine Learning · Computer Science 2014-11-14 Kamran Kowsari

A Short Survey on Data Clustering Algorithms

With rapidly increasing data, clustering algorithms are important tools for data analytics in modern research. They have been successfully applied to a wide range of domains; for instance, bioinformatics, speech recognition, and financial…

Data Structures and Algorithms · Computer Science 2015-12-01 Ka-Chun Wong

Clustering by Constructing Hyper-Planes

As a kind of basic machine learning method, clustering algorithms group data points into different categories based on their similarity or distribution. We present a clustering algorithm by finding hyper-planes to distinguish the data…

Computer Vision and Pattern Recognition · Computer Science 2020-04-28 Luhong Diao, Jinying Gao, Manman Deng

Predictive K-means with local models

Supervised classification can be effective for prediction but sometimes weak on interpretability or explainability (XAI). Clustering, on the other hand, tends to isolate categories or profiles that can be meaningful but there is no…

Machine Learning · Computer Science 2021-04-27 Vincent Lemaire, Oumaima Alaoui Ismaili, Antoine Cornuéjols, Dominique Gay

Categorical data clustering: 25 years beyond K-modes

The clustering of categorical data is a common and important task in computer science, offering profound implications across a spectrum of applications. Unlike purely numerical data, categorical data often lack inherent ordering as in…

Machine Learning · Computer Science 2025-01-28 Tai Dinh, Wong Hauchi, Philippe Fournier-Viger, Daniil Lisik, Minh-Quyet Ha, Hieu-Chi Dam, Van-Nam Huynh

Hierarchical Qualitative Clustering: clustering mixed datasets with critical qualitative information

Clustering can be used to extract insights from data or to verify some of the assumptions held by the domain experts, namely data segmentation. In the literature, few methods can be applied in clustering qualitative values using the context…

Machine Learning · Computer Science 2020-07-07 Diogo Seca, João Mendes-Moreira, Tiago Mendes-Neves, Ricardo Sousa

Clustering -- Basic concepts and methods

We review clustering as an analysis tool and the underlying concepts from an introductory perspective. What is clustering and how can clusterings be realised programmatically? How can data be represented and prepared for a clustering task?…

Machine Learning · Computer Science 2022-12-05 Jan-Oliver Felix Kapp-Joswig, Bettina G. Keller

Practical Introduction to Clustering Data

Data clustering is an approach to seeking structure in sets of complex data, i.e., sets of "objects". The main objective is to identify groups of objects which are similar to each other, e.g., for classification. Here, an introduction to…

Data Analysis, Statistics and Probability · Physics 2016-02-17 Alexander K. Hartmann

A Rapid Review of Clustering Algorithms

Clustering algorithms aim to organize data into groups or clusters based on the inherent patterns and similarities within the data. They play an important role in today's life, such as in marketing and e-commerce, healthcare, data…

Machine Learning · Computer Science 2024-01-17 Hui Yin, Amir Aryani, Stephen Petrie, Aishwarya Nambissan, Aland Astudillo, Shengyuan Cao

Quartile Clustering: A quartile based technique for Generating Meaningful Clusters

Clustering is one of the main tasks in exploratory data analysis and descriptive statistics, where the main objective is partitioning observations into groups. Clustering has a broad range of applications in varied domains like climate,…

Databases · Computer Science 2012-03-20 Saptarsi Goswami, Amlan Chakrabarti

Clustering Mixed Numeric and Categorical Data: A Cluster Ensemble Approach

Clustering is a widely used technique in data mining applications for discovering patterns in underlying data. Most traditional clustering algorithms are limited to handling datasets that contain either numeric or categorical attributes.…

Artificial Intelligence · Computer Science 2007-05-23 Zengyou He, Xiaofei Xu, Shengchun Deng

Determining Optimal Number of k-Clusters based on Predefined Level-of-Similarity

This paper proposes a centroid-based clustering algorithm which is capable of clustering data-points with n-features, without having to specify the number of clusters to be formed. The core logic behind the algorithm is a similarity…

Machine Learning · Computer Science 2020-10-08 Rabindra Lamsal, Shubham Katiyar

A Multimodal Embedding-Based Approach to Industry Classification in Financial Markets

Industry classification schemes provide a taxonomy for segmenting companies based on their business activities. They are relied upon in industry and academia as an integral component of many types of financial and economic analysis.…

Statistical Finance · Quantitative Finance 2022-11-14 Rian Dolphin, Barry Smyth, Ruihai Dong

An Overview on Clustering Methods

Clustering is a common technique for statistical data analysis, which is used in many fields, including machine learning, data mining, pattern recognition, image analysis and bioinformatics. Clustering is the process of grouping similar…

Data Structures and Algorithms · Computer Science 2012-05-08 T. Soni Madhulatha

Open Source Fundamental Industry Classification

We provide complete source code for building a fundamental industry classification based on publicly available and freely downloadable data. We compare various fundamental industry classifications by running a horserace of short-horizon…

General Finance · Quantitative Finance 2017-12-25 Zura Kakushadze, Willie Yu

A penalized criterion for selecting the number of clusters for K-medians

Clustering is a common unsupervised machine learning technique for grouping data points based on similar features. We focus here on unsupervised clustering for contaminated data, i.e., in the case where K-medians should be…

Statistics Theory · Mathematics 2024-02-28 Antoine Godichon-Baggioni, Sobihan Surendran