Related papers: Network Sampling Based on NN Representatives

Network Sampling: An Overview and Comparative Analysis

Network sampling is a crucial technique for analyzing large or partially observable networks. However, the effectiveness of different sampling methods can vary significantly depending on the context. In this study, we empirically compare…

Social and Information Networks · Computer Science 2025-05-05 Quoc Chuong Nguyen

Graph Sampling Approach for Reducing Computational Complexity of Large-Scale Social Network

Online social network services provide a platform for human social interactions. Nowadays, many kinds of online interactions generate large-scale social network data. Network analysis helps to mine knowledge and pattern from the…

Social and Information Networks · Computer Science 2021-02-19 Andry Alamsyah, Yahya Peranginangin, Intan Muchtadi-Alamsyah, Budi Rahardjo, Kuspriyanto

Edge sampling using network local information

Edge sampling is an important topic in network analysis. It provides a natural way to reduce network size while retaining desired features of the original network. Sampling methods that only use local information are common in practice as…

Statistics Theory · Mathematics 2020-08-12 Can M. Le

Practical Characterization of Large Networks Using Neighborhood Information

Characterizing large online social networks (OSNs) through node querying is a challenging task. OSNs often impose severe constraints on the query rate, hence limiting the sample size to a small fraction of the total network. Various ad-hoc…

Social and Information Networks · Computer Science 2013-11-14 Pinghui Wang, Bruno Ribeiro, Junzhou Zhao, John C. S. Lui, Don Towsley, Xiaohong Guan

Space-Efficient Sampling from Social Activity Streams

In order to efficiently study the characteristics of network domains and support development of network systems (e.g. algorithms, protocols that operate on networks), it is often necessary to sample a representative subgraph from a large…

Social and Information Networks · Computer Science 2012-06-22 Nesreen K. Ahmed, Jennifer Neville, Ramana Kompella

Adaptive Sampling Towards Fast Graph Representation Learning

Graph Convolutional Networks (GCNs) have become a crucial tool on learning representations of graph vertices. The main challenge of adapting GCNs on large-scale graphs is the scalability issue that it incurs heavy cost both in computation…

Computer Vision and Pattern Recognition · Computer Science 2018-11-20 Wenbing Huang, Tong Zhang, Yu Rong, Junzhou Huang

Knowledge Acquisition from Social Platforms Based on Network Distributions Fitting

The uniqueness of online social networks makes it possible to implement new methods that increase the quality and effectiveness of research processes. While surveys are one of the most important tools for research, the representativeness of…

Social and Information Networks · Computer Science 2015-05-13 Jarosław Jankowski, Radosław Michalski, Piotr Bródka, Przemysław Kazienko, Sonja Utz

How Many Samples are Needed to Estimate a Convolutional or Recurrent Neural Network?

It is widely believed that the practical success of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) owes to the fact that CNNs and RNNs use a more compact parametric representation than their Fully-Connected Neural…

Machine Learning · Statistics 2019-07-02 Simon S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Aarti Singh

Distributed Nearest Neighbor Classification

Nearest neighbor is a popular nonparametric method for classification and regression with many appealing properties. In the big data era, the sheer volume and spatial/temporal disparity of big data may prohibit centrally processing and…

Statistics Theory · Mathematics 2018-12-13 Jiexin Duan, Xingye Qiao, Guang Cheng

Sampling for Approximate Bipartite Network Projection

Bipartite networks manifest as a stream of edges that represent transactions, e.g., purchases by retail customers. Many machine learning applications employ neighborhood-based measures to characterize the similarity among the nodes, such as…

Social and Information Networks · Computer Science 2018-05-09 Nesreen K. Ahmed, Nick Duffield, Liangzhen Xia

Measuring Fundamental Properties of Real-World Complex Networks

Complex networks, modeled as large graphs, received much attention during these last years. However, data on such networks is only available through intricate measurement procedures. Until recently, most studies assumed that these…

Networking and Internet Architecture · Computer Science 2007-05-23 Matthieu Latapy, Clemence Magnien

Compressing network populations with modal networks reveals structural diversity

Analyzing relational data consisting of multiple samples or layers involves critical challenges: How many networks are required to capture the variety of structures in the data? And what are the structures of these representative networks?…

Physics and Society · Physics 2023-06-26 Alec Kirkley, Alexis Rojas, Martin Rosvall, Jean-Gabriel Young

Distance Queries from Sampled Data: Accurate and Efficient

Distance queries are a basic tool in data analysis. They are used for detection and localization of change for the purpose of anomaly detection, monitoring, or planning. Distance queries are particularly useful when data sets such as…

Data Structures and Algorithms · Computer Science 2015-03-20 Edith Cohen

Topology-based Representative Datasets to Reduce Neural Network Training Resources

One of the main drawbacks of the practical use of neural networks is the long time required in the training process. Such a training process consists of an iterative change of parameters trying to minimize a loss function. These changes are…

Machine Learning · Computer Science 2024-03-14 Rocio Gonzalez-Diaz, Miguel A. Gutiérrez-Naranjo, Eduardo Paluzo-Hidalgo

On the Question of Effective Sample Size in Network Modeling: An Asymptotic Inquiry

The modeling and analysis of networks and network data has seen an explosion of interest in recent years and represents an exciting direction for potential growth in statistics. Despite the already substantial amount of work done in this…

Statistics Theory · Mathematics 2015-08-06 Pavel N. Krivitsky, Eric D. Kolaczyk

Distributed Adaptive Nearest Neighbor Classifier: Algorithm and Theory

When data is of an extraordinarily large size or physically stored in different locations, the distributed nearest neighbor (NN) classifier is an attractive tool for classification. We propose a novel distributed adaptive NN classifier for…

Machine Learning · Statistics 2023-06-06 Ruiqi Liu, Ganggang Xu, Zuofeng Shang

Scalable Sampling for High Utility Patterns

Discovering valuable insights from data through meaningful associations is a crucial task. However, it becomes challenging when trying to identify representative patterns in quantitative databases, especially with large datasets, as…

Databases · Computer Science 2024-10-31 Lamine Diop, Marc Plantevit

Respondent-Driven Sampling: An Assessment of Current Methodology

Respondent-Driven Sampling (RDS) employs a variant of a link-tracing network sampling strategy to collect data from hard-to-reach populations. By tracing the links in the underlying social network, the process exploits the social structure…

Applications · Statistics 2009-04-14 Krista J. Gile, Mark S. Handcock

Sampling promotes community structure in social and information networks

Any network studied in the literature is inevitably just a sampled representative of its real-world analogue. Additionally, network sampling is lately often applied to large networks to allow for their faster and more efficient analysis.…

Social and Information Networks · Computer Science 2015-04-14 Neli Blagus, Lovro Šubelj, Gregor Weiss, Marko Bajec

Predictive Subsampling for Scalable Inference in Networks

Network datasets appear across a wide range of scientific fields, including biology, physics, and the social sciences. To enable data-driven discoveries from these networks, statistical inference techniques like estimation and hypothesis…

Methodology · Statistics 2026-02-19 Arpan Kumar, Minh Tang, Srijan Sengupta