Related papers: A Sampling-based Framework for Hypothesis Testing …

200 papers

Motivated by gene set enrichment analysis, we investigate the problem of combined hypothesis testing on a graph. We introduce a general framework to effectively use the structural information of the underlying graph when testing…

Methodology · Statistics · 2016-10-26 · Shulei Wang, Ming Yuan

Hypothesis testing for graphs has been an important tool in applied research fields for more than two decades, and still remains a challenging problem as one often needs to draw inference from few replicates of large graphs. Recent studies…

Machine Learning · Statistics · 2018-12-03 · Debarghya Ghoshdastidar, Ulrike von Luxburg

Topological data analysis involves the statistical characterization of the shape of data. Persistent homology is a primary tool of topological data analysis, which can be used to analyze topological features and perform statistical…

Methodology · Statistics · 2023-03-01 · Chul Moon, Nicole A. Lazar

The graph based approach to multiple testing is an intuitive method that enables a study team to represent clearly, through a directed graph, its priorities for hierarchical testing of multiple hypotheses, and for propagating the available…

Methodology · Statistics · 2025-01-07 · Cyrus Mehta, Ajoy Mukhopadhyay, Martin Posch

Estimating characteristics of large graphs via sampling is a vital part of the study of complex networks. Current sampling methods such as (independent) random vertex and random walks are useful but have drawbacks. Random vertex sampling…

Data Structures and Algorithms · Computer Science · 2010-09-08 · Bruno Ribeiro, Don Towsley

We consider the multiple hypothesis testing (MHT) problem over the joint domain formed by a graph and a measure space. On each sample point of this joint domain, we assign a hypothesis test and a corresponding $p$-value. The goal is to make…

Signal Processing · Electrical Eng. & Systems · 2025-06-05 · Xingchao Jian, Martin Gölz, Feng Ji, Wee Peng Tay, Abdelhak M. Zoubir

Sampling is a standard approach in big-graph analytics; the goal is to efficiently estimate the graph properties by consulting a sample of the whole population. A perfect sample is assumed to mirror every property of the whole population.…

Social and Information Networks · Computer Science · 2014-03-18 · Nesreen K. Ahmed, Nick Duffield, Jennifer Neville, Ramana Kompella

Graph sampling is a technique to pick a subset of vertices and/or edges from original graph. It has a wide spectrum of applications, e.g. survey hidden population in sociology [54], visualize social graph [29], scale down Internet AS graph…

Social and Information Networks · Computer Science · 2013-08-28 · Pili Hu, Wing Cheong Lau

Network datasets appear across a wide range of scientific fields, including biology, physics, and the social sciences. To enable data-driven discoveries from these networks, statistical inference techniques like estimation and hypothesis…

Methodology · Statistics · 2026-02-19 · Arpan Kumar, Minh Tang, Srijan Sengupta

In clinical trials, hypotheses are frequently organized into hierarchically ordered families, requiring specialized testing strategies that account for these structured relationships. Existing gatekeeping methods, including serial, parallel,…

Methodology · Statistics · 2026-04-14 · Zhiying Qiu, Li Yu, Wenge Guo

Graph-based tests are a class of non-parametric two-sample tests useful for analyzing high-dimensional data. The test statistics are constructed from similarity graphs (such as K-minimum spanning tree), and consequently, their performance…

Methodology · Statistics · 2025-06-23 · Yichuan Bai, Lynna Chu

High-dimensional feature selection is a central problem in a variety of application domains such as machine learning, image analysis, and genomics. In this paper, we propose graph-based tests as a useful basis for feature selection. We…

Methodology · Statistics · 2024-08-13 · Swarnadip Ghosh, Somabha Mukherjee, Divyansh Agarwal, Yichen He, Mingzhi Song, Xuejiao Pei

We develop a new sampling method to estimate eigenvector centrality on incomplete networks. Our goal is to estimate this global centrality measure having at disposal a limited amount of data. This is the case in many real-world scenarios…

Social and Information Networks · Computer Science · 2020-10-29 · Nicolò Ruggeri, Caterina De Bacco

As large graph datasets become increasingly common across many fields, sampling is often needed to reduce the graphs into manageable sizes. This procedure raises critical questions about representativeness as no sample can capture the…

Social and Information Networks · Computer Science · 2025-02-25 · Alan Zhu, Jiaqi Ma, Qiaozhu Mei

A two-sample hypothesis test is a statistical procedure used to determine whether the distributions generating two samples are identical. We consider the two-sample testing problem in a new scenario where the sample measurements (or sample…

Machine Learning · Computer Science · 2024-07-01 · Weizhi Li, Prad Kadambi, Pouria Saidi, Karthikeyan Natesan Ramamurthy, Gautam Dasarathy, Visar Berisha

Sparse exchangeable graphs on $\mathbb{R}_+$, and the associated graphex framework for sparse graphs, generalize exchangeable graphs on $\mathbb{N}$, and the associated graphon framework for dense graphs. We develop the graphex framework as…

Statistics Theory · Mathematics · 2016-11-04 · Victor Veitch, Daniel M. Roy

Two-sample tests utilizing a similarity graph on observations are useful for high-dimensional and non-Euclidean data due to their flexibility and good performance under a wide range of alternatives. Existing works mainly focused on sparse…

Statistics Theory · Mathematics · 2023-11-14 · Yejiong Zhu, Hao Chen

Rejecting the null hypothesis in two-sample testing is a fundamental tool for scientific discovery. Yet, aside from concluding that two samples do not come from the same probability distribution, it is often of interest to characterize how…

Statistics Theory · Mathematics · 2021-09-08 · Boris Landa, Rihao Qu, Joseph Chang, Yuval Kluger

High dimensional hypothesis test deals with models in which the number of parameters is significantly larger than the sample size. Existing literature develops a variety of individual tests. Some of them are sensitive to the dense and small…

Statistics Theory · Mathematics · 2018-08-09 · Cheng Zhou, Xinsheng Zhang, Wenxin Zhou, Han Liu

Graphlets are induced subgraph patterns and have been frequently applied to characterize the local topology structures of graphs across various domains, e.g., online social networks (OSNs) and biological networks. Discovering and computing…

Social and Information Networks · Computer Science · 2016-10-19 · Xiaowei Chen, Yongkun Li, Pinghui Wang, John C. S. Lui