计算机科学
Many combinatorial designs ask for equal distribution of given symbols across the entries of a matrix. The paramount examples are Latin squares, where each symbol from $\{1,\dots,n\}$ appears once per row and column of an $n\times n$…
Inference-time scaling is a promising paradigm to improve generative models, especially when outputs must satisfy structural constraints or optimize downstream rewards. We consider Masked Diffusion Model (MDM) and introduce MDM-VGB, a…
Understanding how performance scales jointly with model size and data is a central problem in modern machine learning. Existing theoretical works on scaling laws typically describe generalization as a function of data or compute, often in…
Causal representation learning for time series has developed strong identifiability results in discrete-time latent causal models, but identifiability in continuous-time latent stochastic differential equation (SDE) models remains largely…
Temporal link prediction is usually evaluated by predictive performance on unseen edges, but in probabilistic temporal graphs this criterion can conflate model error with irreducible uncertainty. We study this issue by characterising an…
Physics-informed neural networks (PINNs) have recently emerged as a promising framework for addressing the Calder\'on inverse problem from limited boundary data. In this work, we revisit neural Calder\'on inversion by introducing multiscale…
Last-iterate convergence and generalization guarantees in first-order convex learning hinge on the monotonicity of the update operator. While linear averaging preserves the monotonicity of gradient updates, this property is often violated…
Strategic behaviour in queueing systems has been studied extensively in the behavioural queueing literature, but almost exclusively for systems that admit closed-form expressions for the cost or utility experienced by a strategic user.…
Search and recommender systems have produced highly relevant search results. A natural next step in the development of such systems in e-commerce is to rerank these results to increase the platform's revenue from paid promotion products.…
Insurance fraud remains costly and operationally difficult, particularly in call-centre workflows where many customer interactions begin at FNOL. While recent fraud detection methods mainly rely on structured data, text, or images, repeated…
Symbolic music alignment links notes in a symbolic performance to their counterparts in a score. While existing alignment encoding formats provide unique correspondences between these notes, there are various musical practices and forms…
Guessing Random Additive Noise Decoding (GRAND) performs decoding by sequentially guessing channel error patterns (EPs). Ordered Reliability Bits GRAND (ORBGRAND) is a notable instance suitable for efficient implementation, as it schedules…
Insurance fraud imposes substantial financial losses and operational inefficiencies, raising premiums and impacting trust among legitimate policyholders. Early detection at FNOL remains a persistent challenge. Existing approaches rely…
Benchmarks of machine learning models often include many datasets, making evaluation expensive. For efficiency, it is preferable to perform evaluations on small, representative datasets instead. The selection of such subsets typically…
Long-form audio exhibits an inherent hierarchy: fine-grained events form sub-activities, which in turn constitute higher-level activities. Prior work often models these levels separately, leading to cross-level inconsistencies and requiring…
AI agents are promising tools that can act as flexible behavioral nudges to enhance human cooperation in addressing large-scale societal problems. However, evidence on whether AI agents can effectively boost cooperation remains mixed. We…
Protein language models are standard priors for biological sequence generation, but steering them toward explicit distributional design targets remains largely unexplored. We study a constrained protein generation problem in which sequences…
The widespread collection of fine-grained location data by commercial data brokers creates a re-identification risk that is not widely recognised by the public. While prior research has established that mobility traces are highly unique and…
When researchers need a mathematical model for a research problem, they face a fragmented landscape: relevant formulas, quantities, assumptions, and model variants are scattered across publications and domain-specific conventions. The…
Journal recommendation is an important task in scholarly information systems. Existing approaches typically rely on supervised learning models, manually engineered features, or historical interaction data, which may limit their…