Related papers: Separating Principles Below WKL0
The separation between two theorems in reverse mathematics is usually done by constructing a Turing ideal satisfying a theorem P and avoiding the solutions to a fixed instance of a theorem Q. Lerman, Solomon and Towsner introduced a forcing…
A function is diagonally non-computable (d.n.c.) if it diagonalizes against the universal partial computable function. D.n.c. functions play a central role in algorithmic randomness and reverse mathematics. Flood and Towsner asked for which…
We propose a new method for constructing Turing ideals satisfying principles of reverse mathematics below the Chain-Antichain Principle (CAC). Using this method, we are able to prove several new separations in the presence of Weak Konig's…
Mathematical theorem proving is an important testbed for large language models' deep and abstract reasoning capability. This paper focuses on improving LLMs' ability to write proofs in formal languages that permit automated proof…
Table Question Answering (TQA) aims to answer natural language questions over structured tables. Large Language Models (LLMs) enable promising solutions to this problem, with operator-centric solutions that generate table manipulation…
We combine constrained literals for model representation with key concepts from first-order superposition and propositional conflict-driven clause learning (CDCL) to create the new calculus Non-Redundant Clause Learning (NRCL) deciding the…
It remains an open question whether LLMs can acquire or generalize genuinely new reasoning strategies, beyond the sharpened skills encoded in their parameters during pre-training or post-training. To attempt to answer this debate, we…
We study L\"owenheim-Skolem and Omitting Types theorems in Transition Algebra, a logical system obtained by enhancing many sorted first-order logic with features from dynamic logic. The sentences we consider include compositions, unions,…
LLMs show remarkable emergent abilities, such as inferring concepts from presumably out-of-distribution prompts, known as in-context learning. Though this success is often attributed to the Transformer architecture, our systematic…
This paper proposes algorithms for learning two-level Boolean rules in Conjunctive Normal Form (CNF, i.e. AND-of-ORs) or Disjunctive Normal Form (DNF, i.e. OR-of-ANDs) as a type of human-interpretable classification model, aiming for a…
Large language models (LLMs) have achieved remarkable performance on Natural Language Processing (NLP) tasks, but they are hindered by high computational costs and memory requirements. Ternarization, an extreme form of quantization, offers…
Learning nonparametric systems of Ordinary Differential Equations (ODEs) dot x = f(t,x) from noisy data is an emerging machine learning topic. We use the well-developed theory of Reproducing Kernel Hilbert Spaces (RKHS) to define candidates…
State-of-the-art NLP methods achieve human-like performance on many tasks, but make errors nevertheless. Characterizing these errors in easily interpretable terms gives insight into whether a classifier is prone to making systematic errors,…
We introduce Constr-DRKM, a deep kernel method for the unsupervised learning of disentangled data representations. We propose augmenting the original deep restricted kernel machine formulation for kernel PCA by orthogonality constraints on…
We introduce two notions of effective reducibility for set-theoretical statements, based on computability with Ordinal Turing Machines (OTMs), one of which resembles Turing reducibility while the other is modelled after Weihrauch…
No natural principle is currently known to be strictly between the arithmetic comprehension axiom (ACA) and Ramsey's theorem for pairs (RT^2_2) in reverse mathematics. The tree theorem for pairs (TT^2_2) is however a good candidate. The…
Knowledge distillation (KD) has been a predominant method for compressing Large Language Models (LLMs). In this paper, we first revisit KD and Low-Rank Adaption (LoRA) and demonstrate that they follow the same paradigm. Inspired by this…
Large language models (LLMs) have exhibited remarkable few-shot learning capabilities and unified the paradigm of NLP tasks through the in-context learning (ICL) technique. Despite the success of ICL, the quality of the exemplar…
To generate data from trained diffusion models, most inference algorithms, such as DDPM, DDIM, and other variants, rely on discretizing the reverse SDEs or their equivalent ODEs. In this paper, we view such approaches as decomposing the…
Restricted kernel machines (RKMs) have demonstrated a significant impact in enhancing generalization ability in the field of machine learning. Recent studies have introduced various methods within the RKM framework, combining kernel…