Related papers: Classification under Streaming Emerging New Classe…
When performing data classification over a stream of continuously occurring instances, a key challenge is to develop an open-world classifier that anticipates instances from an unknown class. Studies addressing this problem, typically…
Decision forests, including random forests and gradient boosting trees, remain the leading machine learning methods for many real-world data problems, especially on tabular data. However, most of the current implementations only operate in…
Encrypted network traffic Classification tackles the problem from different approaches and with different goals. One of the common approaches is using Machine learning or Deep Learning-based solutions on a fixed number of classes, leading…
Graph convolutional network (GCN) provides a powerful means for graph-based semi-supervised tasks. However, as a localized first-order approximation of spectral graph convolution, the classic GCN can not take full advantage of unlabeled…
Stream mining poses unique challenges to machine learning: predictive models are required to be scalable, incrementally trainable, must remain bounded in size (even when the data stream is arbitrarily long), and be nonparametric in order to…
One of the key challenges of performing label prediction over a data stream concerns with the emergence of instances belonging to unobserved class labels over time. Previously, this problem has been addressed by detecting such instances and…
Various modifications of decision trees have been extensively used during the past years due to their high efficiency and interpretability. Tree node splitting based on relevant feature selection is a key step of decision tree learning, at…
Semi-supervised semantic segmentation learns from small amounts of labelled images and large amounts of unlabelled images, which has witnessed impressive progress with the recent advance of deep neural networks. However, it often suffers…
The non-stationary nature of data streams strongly challenges traditional machine learning techniques. Although some solutions have been proposed to extend traditional machine learning techniques for handling data streams, these approaches…
Seeding then expanding is a commonly used scheme to discover overlapping communities in a network. Most seeding methods are either too complex to scale to large networks or too simple to select high-quality seeds, and the non-principled…
Unlabelled data appear in many domains and are particularly relevant to streaming applications, where even though data is abundant, labelled data is rare. To address the learning problems associated with such data, one can ignore the…
Anomaly detection is being regarded as an unsupervised learning task as anomalies stem from adversarial or unlikely events with unknown distributions. However, the predictive performance of purely unsupervised anomaly detection often fails…
Distribution regression refers to the supervised learning problem where labels are only available for groups of inputs instead of individual inputs. In this paper, we develop a rigorous mathematical framework for distribution regression…
We tackle the problem of discovering novel classes in an image collection given labelled examples of other classes. This setting is similar to semi-supervised learning, but significantly harder because there are no labelled examples for the…
Big Data streams are being generated in a faster, bigger, and more commonplace. In this scenario, Hoeffding Trees are an established method for classification. Several extensions exist, including high-performing ensemble setups such as…
Computing universal distributed representations of sentences is a fundamental task in natural language processing. We propose ConsSent, a simple yet surprisingly powerful unsupervised method to learn such representations by enforcing…
In (\cite{zhang2014nonlinear,zhang2014nonlinear2}), we have viewed machine learning as a coding and dimensionality reduction problem, and further proposed a simple unsupervised dimensionality reduction method, entitled deep distributed…
In collaborative learning, learners coordinate to enhance each of their learning performances. From the perspective of any learner, a critical challenge is to filter out unqualified collaborators. We propose a framework named meta…
One of the challenges in deploying a machine learning model is that the model's performance degrades as the operating environment changes. To maintain the performance, streaming active learning is used, in which the model is retrained by…
Sequential sensor data is generated in a wide variety of practical applications. A fundamental challenge involves learning effective classifiers for such sequential data. While deep learning has led to impressive performance gains in recent…