Related papers: MIRAGE: An Iterative MapReduce based Frequent Subgr…
Given a labeled graph, the frequent-subgraph mining (FSM) problem asks to find all the $k$-vertex subgraphs that appear with frequency greater than a given threshold. FSM has numerous applications ranging from biology to network science, as…
To effectively leverage user-specific data, retrieval augmented generation (RAG) is employed in multimodal large language model (MLLM) applications. However, conventional retrieval approaches often suffer from limited retrieval accuracy.…
Frequent Subgraph Mining (FSM) is the key task in many graph mining and machine learning applications. Numerous systems have been proposed for FSM in the past decade. Although these systems show good performance for small patterns (with no…
Large reasoning models (LRMs) have shown significant progress in test-time scaling through chain-of-thought prompting. Current approaches like search-o1 integrate retrieval augmented generation (RAG) into multi-step reasoning processes but…
Recently, graph mining approaches have become very popular, especially in domains such as bioinformatics, chemoinformatics and social networks. In this scope, one of the most challenging tasks is frequent subgraph discovery. This task has…
Finding frequently occurring subgraph patterns or network motifs in neural architectures is crucial for optimizing efficiency, accelerating design, and uncovering structural insights. However, as the subgraph size increases,…
Frequent Subgraph Mining (FSM) is the process of identifying common subgraph patterns that surpass a predefined frequency threshold. While FSM is widely applicable in fields like bioinformatics, chemical analysis, and social network anomaly…
Identifying frequent subgraphs, also called network motifs, is crucial in analyzing and predicting properties of real-world networks. However, finding large commonly-occurring motifs remains a challenging problem not only due to its NP-hard…
Nowadays, frequent pattern mining (FPM) on large graphs receives increasing attention, since it is crucial to a variety of applications, e.g., social analysis. Informally, the FPM problem is defined as finding all the patterns in a large…
We introduce Mirage, the first multi-level superoptimizer for tensor programs. A key idea in Mirage is $\mu$Graphs, a uniform representation of tensor programs at the kernel, thread block, and thread levels of the GPU compute hierarchy.…
Retrieval-Augmented Generation (RAG) has gained prominence as an effective method for enhancing the generative capabilities of Large Language Models (LLMs) through the incorporation of external knowledge. However, the evaluation of RAG…
Large multimodal models (LMMs) have achieved high performance in vision-language tasks involving a single image, but they struggle when presented with a collection of multiple images (Multiple Image Question Answering scenario). These tasks,…
While building machine learning models, feature selection (FS) stands out as an essential preprocessing step used to handle the uncertainty and vagueness in the data. Recently, the minimum Redundancy and Maximum Relevance (mRMR) approach…
Rationale discovery is defined as finding a subset of the input data that maximally supports the prediction of downstream tasks. In the context of graph machine learning, graph rationale is defined to locate the critical subgraph in the…
Mining labeled subgraphs is a popular research task in data mining because of its potential application in many different scientific domains. All the existing methods for this task explicitly or implicitly solve the subgraph isomorphism task…
Hypergraphs serve as an effective tool widely adopted to characterize higher-order interactions in complex systems. The most intuitive and commonly used mathematical instrument for representing a hypergraph is the incidence matrix, in which…
To be useful for downstream applications, vision decoding models that are trained to reconstruct seen images from human brain activity must be able to generalize to internally generated visual representations, i.e., mental images. In an…
The mining of frequent subgraphs from labeled graph data has been studied extensively. Furthermore, much attention has recently been paid to frequent pattern mining from graph sequences. A method, called GTRACE, has been proposed to mine…
Streaming graphs are drawing increasing attention in both academic and industrial communities as many graphs in real applications evolve over time. Continuous subgraph matching (abbreviated as CSM) aims to report the incremental matches of a…
Recently there has been a surge of interest in designing graph embedding methods. Few, if any, can scale to a large-sized graph with millions of nodes due to both computational complexity and memory requirements. In this paper, we relax…