Related papers: ConvMath: A Convolutional Sequence Network for Mat…

Translating Math Formula Images to LaTeX Sequences Using Deep Neural Networks with Sequence-level Training

In this paper we propose a deep neural network model with an encoder-decoder architecture that translates images of math formulas into their LaTeX markup sequences. The encoder is a convolutional neural network (CNN) that transforms images…

Machine Learning · Computer Science 2019-09-11 Zelun Wang , Jyh-Charn Liu

Dual Convolutional LSTM Network for Referring Image Segmentation

We consider referring image segmentation. It is a problem at the intersection of computer vision and natural language understanding. Given an input image and a referring expression in the form of a natural language sentence, the goal is to…

Computer Vision and Pattern Recognition · Computer Science 2020-02-03 Linwei Ye , Zhi Liu , Yang Wang

Sequence to Sequence Learning for Optical Character Recognition

We propose an end-to-end recurrent encoder-decoder based sequence learning approach for printed text Optical Character Recognition (OCR). In contrast to present day existing state-of-art OCR solution which uses connectionist temporal…

Computer Vision and Pattern Recognition · Computer Science 2015-12-29 Devendra Kumar Sahu , Mohak Sukhwani

Multi-Scale Attention with Dense Encoder for Handwritten Mathematical Expression Recognition

Handwritten mathematical expression recognition is a challenging problem due to the complicated two-dimensional structures, ambiguous handwriting input and variant scales of handwritten math symbols. To settle this problem, we utilize the…

Computer Vision and Pattern Recognition · Computer Science 2018-02-01 Jianshu Zhang , Jun Du , Lirong Dai

Convolutional Sequence to Sequence Learning

The prevalent approach to sequence to sequence learning maps an input sequence to a variable length output sequence via recurrent neural networks. We introduce an architecture based entirely on convolutional neural networks. Compared to…

Computation and Language · Computer Science 2017-07-26 Jonas Gehring , Michael Auli , David Grangier , Denis Yarats , Yann N. Dauphin

A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction

We improve automatic correction of grammatical, orthographic, and collocation errors in text using a multilayer convolutional encoder-decoder neural network. The network is initialized with embeddings that make use of character N-gram…

Computation and Language · Computer Science 2018-01-29 Shamil Chollampatt , Hwee Tou Ng

VideoMAC: Video Masked Autoencoders Meet ConvNets

Recently, the advancement of self-supervised learning techniques, like masked autoencoders (MAE), has greatly influenced visual representation learning for images and videos. Nevertheless, it is worth noting that the predominant approaches…

Computer Vision and Pattern Recognition · Computer Science 2024-03-01 Gensheng Pei , Tao Chen , Xiruo Jiang , Huafeng Liu , Zeren Sun , Yazhou Yao

ConvTransSeg: A Multi-resolution Convolution-Transformer Network for Medical Image Segmentation

Convolutional neural networks (CNNs) achieved the state-of-the-art performance in medical image segmentation due to their ability to extract highly complex feature representations. However, it is argued in recent studies that traditional…

Computer Vision and Pattern Recognition · Computer Science 2025-03-31 Zhendi Gong , Andrew P. French , Guoping Qiu , Xin Chen

ConvMAE: Masked Convolution Meets Masked Autoencoders

Vision Transformers (ViT) become widely-adopted architectures for various vision tasks. Masked auto-encoding for feature pretraining and multi-scale hybrid convolution-transformer architectures can further unleash the potentials of ViT,…

Computer Vision and Pattern Recognition · Computer Science 2022-05-20 Peng Gao , Teli Ma , Hongsheng Li , Ziyi Lin , Jifeng Dai , Yu Qiao

Recognizing Handwritten Mathematical Expressions as LaTex Sequences Using a Multiscale Robust Neural Network

In this paper, a robust multiscale neural network is proposed to recognize handwritten mathematical expressions and output LaTeX sequences, which can effectively and correctly focus on where each step of output should be concerned and has a…

Computer Vision and Pattern Recognition · Computer Science 2020-03-03 Hongyu Wang , Guangcun Shan

Modelling, Visualising and Summarising Documents with a Single Convolutional Neural Network

Capturing the compositional process which maps the meaning of words to that of documents is a central challenge for researchers in Natural Language Processing and Information Retrieval. We introduce a model that is able to represent the…

Computation and Language · Computer Science 2014-06-17 Misha Denil , Alban Demiraj , Nal Kalchbrenner , Phil Blunsom , Nando de Freitas

A Mathematical Theory of Deep Convolutional Neural Networks for Feature Extraction

Deep convolutional neural networks have led to breakthrough results in numerous practical machine learning tasks such as classification of images in the ImageNet data set, control-policy-learning to play Atari games or the board game Go,…

Information Theory · Computer Science 2017-10-25 Thomas Wiatowski , Helmut Bölcskei

Scene Text Recognition with Sliding Convolutional Character Models

Scene text recognition has attracted great interests from the computer vision and pattern recognition community in recent years. State-of-the-art methods use concolutional neural networks (CNNs), recurrent neural networks with long…

Computer Vision and Pattern Recognition · Computer Science 2017-09-07 Fei Yin , Yi-Chao Wu , Xu-Yao Zhang , Cheng-Lin Liu

Deconvolutional Latent-Variable Model for Text Sequence Matching

A latent-variable model is introduced for text matching, inferring sentence representations by jointly optimizing generative and discriminative objectives. To alleviate typical optimization challenges in latent-variable models for text, we…

Computation and Language · Computer Science 2017-11-23 Dinghan Shen , Yizhe Zhang , Ricardo Henao , Qinliang Su , Lawrence Carin

Convolutional Character Networks

Recent progress has been made on developing a unified framework for joint text detection and recognition in natural images, but existing joint models were mostly built on two-stage framework by involving ROI pooling, which can degrade the…

Computer Vision and Pattern Recognition · Computer Science 2019-10-18 Linjie Xing , Zhi Tian , Weilin Huang , Matthew R. Scott

Convolutional Prompting meets Language Models for Continual Learning

Continual Learning (CL) enables machine learning models to learn from continuously shifting new training data in absence of data from old tasks. Recently, pretrained vision transformers combined with prompt tuning have shown promise for…

Computer Vision and Pattern Recognition · Computer Science 2024-04-01 Anurag Roy , Riddhiman Moulick , Vinay K. Verma , Saptarshi Ghosh , Abir Das

Robust Encoder-Decoder Learning Framework towards Offline Handwritten Mathematical Expression Recognition Based on Multi-Scale Deep Neural Network

Offline handwritten mathematical expression recognition is a challenging task, because handwritten mathematical expressions mainly have two problems in the process of recognition. On one hand, it is how to correctly recognize different…

Computer Vision and Pattern Recognition · Computer Science 2020-05-29 Guangcun Shan , Hongyu Wang , Wei Liang

A Hybrid Vision Transformer Approach for Mathematical Expression Recognition

One of the crucial challenges taken in document analysis is mathematical expression recognition. Unlike text recognition which only focuses on one-dimensional structure images, mathematical expression recognition is a much more complicated…

Computer Vision and Pattern Recognition · Computer Science 2026-03-10 Anh Duy Le , Van Linh Pham , Vinh Loi Ly , Nam Quan Nguyen , Huu Thang Nguyen , Tuan Anh Tran

CNN Features off-the-shelf: an Astounding Baseline for Recognition

Recent results indicate that the generic descriptors extracted from the convolutional neural networks are very powerful. This paper adds to the mounting evidence that this is indeed the case. We report on a series of experiments conducted…

Computer Vision and Pattern Recognition · Computer Science 2014-05-13 Ali Sharif Razavian , Hossein Azizpour , Josephine Sullivan , Stefan Carlsson

ConvTransformer: A Convolutional Transformer Network for Video Frame Synthesis

Deep Convolutional Neural Networks (CNNs) are powerful models that have achieved excellent performance on difficult computer vision tasks. Although CNNs perform well whenever large labeled training samples are available, they work badly on…

Computer Vision and Pattern Recognition · Computer Science 2021-06-03 Zhouyong Liu , Shun Luo , Wubin Li , Jingben Lu , Yufan Wu , Shilei Sun , Chunguo Li , Luxi Yang