Related papers: Web Table Classification based on Visual Features

200 papers

Information extraction from semi-structured webpages provides valuable long-tailed facts for augmenting knowledge graphs. Relational Web tables are a critical component containing additional entities and attributes of rich and diverse…

Information Retrieval · Computer Science 2021-02-19 Daheng Wang , Prashant Shiralkar , Colin Lockard , Binxuan Huang , Xin Luna Dong , Meng Jiang

The abundance of data on the Internet facilitates the improvement of extraction and processing tools. The trend in open data publishing encourages the adoption of structured formats like CSV and RDF. However, there is still a…

Information Retrieval · Computer Science 2016-08-08 Mikhail Galkin , Dmitry Mouromtsev , Sören Auer

Visual attributes play an essential role in real applications based on image retrieval. For instance, the extraction of attributes from images allows an eCommerce search engine to produce retrieval results with higher precision. The…

Computer Vision and Pattern Recognition · Computer Science 2021-04-02 Andres Baloian , Nils Murrugarra-Llerena , Jose M. Saavedra

Recent deep learning approaches in table detection achieved outstanding performance and proved to be effective in identifying document layouts. Currently, available table detection benchmarks have many limitations, including the lack of…

Computer Vision and Pattern Recognition · Computer Science 2023-12-01 Mrinal Haloi , Shashank Shekhar , Nikhil Fande , Siddhant Swaroop Dash , Sanjay G

Most of the previous methods for table recognition rely on training datasets containing many richly annotated table images. Detailed table image annotation, e.g., cell or text bounding box annotation, however, is costly and often…

Computer Vision and Pattern Recognition · Computer Science 2023-03-16 Nam Tuan Ly , Atsuhiro Takasu , Phuc Nguyen , Hideaki Takeda

Understanding the connections between unstructured text and semi-structured tables is an important yet neglected problem in natural language processing. In this work, we focus on content-based table retrieval. Given a query, the task is to…

Computation and Language · Computer Science 2017-06-09 Zhao Yan , Duyu Tang , Nan Duan , Junwei Bao , Yuanhua Lv , Ming Zhou , Zhoujun Li

Tables on the Web contain a vast amount of knowledge in a structured form. To tap into this valuable resource, we address the problem of table retrieval: answering an information need with a ranked list of tables. We investigate this…

Information Retrieval · Computer Science 2021-05-14 Shuo Zhang , Krisztian Balog

In this paper, we present a methodology and the corresponding Python library for the classification of webpages. Our method retrieves a fixed number of images from a given webpage, and based on them classifies the webpage into a set of…

Computer Vision and Pattern Recognition · Computer Science 2019-12-19 Leonardo Espinosa Leal , Kaj-Mikael Björk , Amaury Lendasse , Anton Akusok

The volume of data generated by internet and social networks is increasing every day, and there is a clear need for efficient ways of extracting useful information from them. As those data can take different forms, it is important to use…

Machine Learning · Statistics 2017-05-25 Bertrand Lebichot , Marco Saerens

High-quality Web tables are rich sources of information that can be used to populate Knowledge Graphs (KG). The focus of this paper is an evaluation of methods for table-to-class annotation, which is a sub-task of Table Interpretation (TI).…

Machine Learning · Computer Science 2021-10-29 Aneta Koleva , Martin Ringsquandl , Mitchell Joblin , Volker Tresp

Many real world systems need to operate on heterogeneous information networks that consist of numerous interacting components of different types. Examples include systems that perform data analysis on biological information networks; social…

Artificial Intelligence · Computer Science 2017-07-26 Parisa Kordjamshidi , Sameer Singh , Daniel Khashabi , Christos Christodoulopoulos , Mark Sammons , Saurabh Sinha , Dan Roth

Web page categorization is one of the challenging tasks in the world of ever-increasing web technologies. There are many ways to categorize web pages based on different approaches and features. This paper proposes a new dimension in…

Neural and Evolutionary Computing · Computer Science 2010-09-28 S. M. Kamruzzaman

The existing image feature extraction methods are primarily based on the content and structure information of images, and rarely consider the contextual semantic information. Regarding some types of images such as scenes and objects, the…

Computer Vision and Pattern Recognition · Computer Science 2020-01-23 Chiranjibi Sitaula , Yong Xiang , Anish Basnet , Sunil Aryal , Xuequan Lu

Document structure analysis, such as zone segmentation and table recognition, is a complex problem in document processing and is an active area of research. The recent success of deep learning in solving various computer vision and machine…

Computer Vision and Pattern Recognition · Computer Science 2019-07-04 Shah Rukh Qasim , Hassan Mahmood , Faisal Shafait

The first phase of table recognition is to detect the tabular area in a document. Subsequently, the tabular structures are recognized in the second phase in order to extract information from the respective cells. Table detection and…

Computer Vision and Pattern Recognition · Computer Science 2021-05-11 Khurram Azeem Hashmi , Marcus Liwicki , Didier Stricker , Muhammad Adnan Afzal , Muhammad Ahtsham Afzal , Muhammad Zeshan Afzal

When applying learning to rank algorithms to Web search, a large number of features are usually designed to capture the relevance signals. Most of these features are computed based on the extracted textual elements, link analysis, and user…

Information Retrieval · Computer Science 2017-10-20 Yixing Fan , Jiafeng Guo , Yanyan Lan , Jun Xu , Liang Pang , Xueqi Cheng

The abundant semi-structured data on the Web, such as HTML-based tables and lists, provide commercial search engines a rich information source for question answering (QA). Different from plain text passages in Web documents, Web tables and…

Computation and Language · Computer Science 2020-10-15 Xingyao Zhang , Linjun Shou , Jian Pei , Ming Gong , Lijie Wen , Daxin Jiang

Identification of tree species plays a key role in forestry-related tasks like forest conservation, disease diagnosis and plant production. There has been a debate regarding the part of the tree to be used for differentiation, whether it…

Computer Vision and Pattern Recognition · Computer Science 2022-10-18 Sahil Faizal

Convolutional neural network models are widely used in image classification tasks. However, the running time of such models is so long that they do not conform to the strict real-time requirements of mobile devices. In order to optimize…

Computer Vision and Pattern Recognition · Computer Science 2019-06-06 Yuntao Liu , Yong Dou , Ruochun Jin , Rongchun Li

We propose a novel high-performance and interpretable canonical deep tabular data learning architecture, TabNet. TabNet uses sequential attention to choose which features to reason from at each decision step, enabling interpretability and…

Machine Learning · Computer Science 2020-12-10 Sercan O. Arik , Tomas Pfister