Related papers: Efficient Subspace Search in Data Streams

200 papers

Big data problems frequently require processing datasets in a streaming fashion, either because all data are available at once but collectively are larger than available memory or because the data intrinsically arrive one data point at a…

Computation · Statistics 2018-08-08 Andrea Giovannucci, Victor Minden, Cengiz Pehlevan, Dmitri B. Chklovskii

The growing popularity of dynamic applications such as social networks provides a promising way to detect valuable information in real time. Efficient analysis over high-speed data from dynamic applications is of great significance. Data…

Databases · Computer Science 2018-09-05 Youhuan Li, Lei Zou, M. Tamer Ozsu, Dongyan Zhao

This paper presents a novel high speed clustering scheme for high dimensional data streams. Data stream clustering has gained importance in different applications, for example, in network monitoring, intrusion detection, and real-time…

Databases · Computer Science 2015-10-13 Irshad Ahmed, Irfan Ahmed, Waseem Shahzad

Given a stream of entries in a multi-aspect data setting i.e., entries having multiple dimensions, how can we detect anomalous activities in an unsupervised manner? For example, in the intrusion detection setting, existing work seeks to…

Machine Learning · Computer Science 2021-06-09 Siddharth Bhatia, Arjit Jain, Pan Li, Ritesh Kumar, Bryan Hooi

In an era of ubiquitous large-scale streaming data, the availability of data far exceeds the capacity of expert human analysts. In many settings, such data is either discarded or stored unprocessed in datacenters. This paper proposes a…

Machine Learning · Statistics 2016-09-13 Xin Jiang, Rebecca Willett

Anomaly detection is critical for finding suspicious behavior in innumerable systems. We need to detect anomalies in real-time, i.e. determine if an incoming entity is anomalous or not, as soon as we receive it, to minimize the effects of…

Machine Learning · Computer Science 2023-01-31 Siddharth Bhatia

Finding patterns in large highly connected datasets is critical for value discovery in business development and scientific research. This work focuses on the problem of subgraph matching on streaming graphs, which provides utility in a…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-06-22 Bibek Bhattarai, Howie Huang

Data streams (streaming data) consist of transiently observed, evolving in time, multidimensional data sequences that challenge our computational and/or inferential capabilities. In this paper we propose user friendly approaches for robust…

Applications · Statistics 2015-01-20 Daniel Kosiorowski

We discuss the problem of extending data mining approaches to cases in which data points arise in the form of individual graphs. Being able to find the intrinsic low-dimensionality in ensembles of graphs can be useful in a variety of…

Social and Information Networks · Computer Science 2016-12-12 Karthikeyan Rajendran, Assimakis A. Kattis, Alexander Holiday, Risi Kondor, Ioannis G. Kevrekidis

We study high-dimensional robust statistics tasks in the streaming model. A recent line of work obtained computationally efficient algorithms for a range of high-dimensional robust estimation tasks. Unfortunately, all previous algorithms…

Data Structures and Algorithms · Computer Science 2023-05-04 Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia, Thanasis Pittas

We analyze the dynamics of streaming stochastic gradient descent (SGD) in the high-dimensional limit when applied to generalized linear models and multi-index models (e.g. logistic regression, phase retrieval) with general data-covariance.…

Optimization and Control · Mathematics 2023-08-21 Elizabeth Collins-Woodfin, Courtney Paquette, Elliot Paquette, Inbar Seroussi

Similarity matching and join of time series data streams has gained a lot of relevance in today's world that has large streaming data. This process finds wide scale application in the areas of location tracking, sensor networks, object…

Databases · Computer Science 2013-12-11 R H Vishwanath, T V Samartha, K C Srikantaiah, K R Venugopal, L M Patnaik

We study the space complexity of solving the bias-regularized SVM problem in the streaming model. This is a classic supervised learning problem that has drawn lots of attention, including for developing fast algorithms for solving the…

Data Structures and Algorithms · Computer Science 2020-07-08 Alexandr Andoni, Collin Burns, Yi Li, Sepideh Mahabadi, David P. Woodruff

High-dimensional streaming data are becoming increasingly ubiquitous in many fields. They often lie in multiple low-dimensional subspaces, and the manifold structures may change abruptly on the time scale due to pattern shift or occurrence…

Machine Learning · Statistics 2022-04-13 Ruiyu Xu, Jianguo Wu, Xiaowei Yue, Yongxiang Li

Our work focuses on anomaly detection in cyber-physical systems. Prior literature has three limitations: (1) Failing to capture long-delayed patterns in system anomalies; (2) Ignoring dynamic changes in sensor connections; (3) The curse of…

Machine Learning · Computer Science 2023-02-28 Ehtesamul Azim, Dongjie Wang, Yanjie Fu

We present a framework for supervised subspace tracking, when there are two time series $x_t$ and $y_t$, one being the high-dimensional predictors and the other being the response variables and the subspace tracking needs to take into…

Machine Learning · Computer Science 2015-09-02 Yao Xie, Ruiyang Song, Hanjun Dai, Qingbin Li, Le Song

Similarity search is the task of retrieving data items that are similar to a given query. In this paper, we introduce the time-sensitive notion of similarity search over endless data-streams (SSDS), which takes into account data quality and…

Information Retrieval · Computer Science 2017-08-08 Naama Kraus, David Carmel, Idit Keidar

Given a stream of heterogeneous graphs containing different types of nodes and edges, how can we spot anomalous ones in real-time while consuming bounded memory? This problem is motivated by and generalizes from its application in security…

Social and Information Networks · Computer Science 2016-02-23 Emaad A. Manzoor, Sadegh Momeni, Venkat N. Venkatakrishnan, Leman Akoglu

Community detection is a fundamental task in graph analysis, with methods often relying on fitting models like the Stochastic Block Model (SBM) to observed networks. While many algorithms can accurately estimate SBM parameters when the…

Machine Learning · Statistics 2025-06-05 Leonardo Martins Bianco, Christine Keribin, Zacharie Naulet

Emerging applications of machine learning in numerous areas involve continuous gathering of and learning from streams of data. Real-time incorporation of streaming data into the learned models is essential for improved inference in these…

Machine Learning · Computer Science 2020-12-01 Matthew Nokleby, Haroon Raja, Waheed U. Bajwa