Related papers: LULU operators for functions of continuous argumen…
The LULU operators, well known in the nonlinear multiresolution analysis of sequences, are extended to functions defined on a continuous domain, namely a real interval $\Omega\subseteq\mathbb{R}$. Similar to their discrete counterparts, for…
The LULU operators for sequences are extended to multi-dimensional arrays via the morphological concept of connection in a way which preserves their essential properties, e.g. they are separators and form a four element fully ordered…
In this paper we consider higher order Schr\"odinger operators $$\mathcal L u=Lu+Vu,$$ where $L$ denotes a fourth order operator and $V\geq 0$ a suitable potential. We initiate our analysis by considering the constant coefficients…
Exponential Linear Units (ELUs) are a useful rectifier for constructing deep learning architectures, as they may speed up and otherwise improve learning by virtue of not having vanishing gradients and by having mean activations near zero.…
This paper analyzes representations of continuous piecewise linear functions with infinite width, finite cost shallow neural networks using the rectified linear unit (ReLU) as an activation function. Through its integral representation, a…
Pseudodifferential operators of several variables are formal Laurent series in the formal inverses of $\partial_1, \ldots, \partial_n$ with $\partial_i = \partial/\partial x_i$ for $1 \leq i \leq n$. As in the single variable case, Lax equations can be constructed…
In recent years, functional neural networks have been proposed and studied in order to approximate nonlinear continuous functionals defined on $L^p([-1, 1]^s)$ for integers $s\ge1$ and $1\le p<\infty$. However, their theoretical properties…
Length generalization remains a persistent challenge for neural networks: recurrent models tend to suffer from positional biases, while transformers are constrained by fixed computational depth. Regular languages provide a frequently used…
We propose the Moderate Adaptive Linear Unit (MoLU), a novel activation function for deep neural networks, defined analytically as $f(x) = x \times (1+\tanh(x))/2$. MoLU combines mathematical elegance with empirical effectiveness, exhibiting…
In this note, we present a characterization of semistable unitary operators on $L^2(\mathbb{R})$, under the assumption that the operator is (i) translation-invariant, (ii) symmetric, and (iii) locally uniformly continuous (LUC) under…
We give estimates for the convolution product of an arbitrary number of endlessly continuable functions. This allows us to deal with nonlinear operations for the corresponding resurgent series, e.g. substitution into a convergent power…
Incremental processing allows interactive systems to respond based on partial inputs, which is a desirable property e.g. in dialogue agents. The currently popular Transformer architecture inherently processes sequences as a whole,…
In this paper, we study a class of convolution operators on the space of distributions that enlarge the well-studied class of passive operators. In this larger class, we are able to associate, to each operator, a holomorphic function in the…
Recurrent neural networks such as the GRU and LSTM found wide adoption in natural language processing and achieve state-of-the-art results for many tasks. These models are characterized by a memory state that can be written to and read from…
The nonlinearity of activation functions used in deep learning models is crucial for the success of predictive models. There are several commonly used simple nonlinear functions, including Rectified Linear Unit (ReLU) and Leaky-ReLU…
Despite their prevalence in neural networks we still lack a thorough theoretical characterization of ReLU layers. This paper aims to further our understanding of ReLU layers by studying how the activation function ReLU interacts with the…
Successive linear transforms followed by nonlinear "activation" functions can approximate nonlinear functions to arbitrary precision given sufficient layers. The number of necessary layers depends, in part, on the nature of the…
In this paper we develop the calculus of pseudo-differential operators corresponding to the quantizations of the form $$ Au(x)=\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}e^{i(x-y)\cdot\xi}\sigma(x+\tau(y-x),\xi)u(y)dyd\xi, $$ where…
Neural networks can learn to represent and manipulate numerical information, but they seldom generalize well outside of the range of numerical values encountered during training. To encourage more systematic numerical extrapolation, we…
In this work, we present Lexical Unit Analysis (LUA), a framework for general sequence segmentation tasks. Given a natural language sentence, LUA scores all the valid segmentation candidates and utilizes dynamic programming (DP) to extract…