Related papers: Web Data Knowledge Extraction

Web Data Extraction, Applications and Techniques: A Survey

Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and…

Information Retrieval · Computer Science 2017-03-07 Emilio Ferrara, Pasquale De Meo, Giacomo Fiumara, Robert Baumgartner

An Agent based Approach towards Metadata Extraction, Modelling and Information Retrieval over the Web

Web development is a challenging research area for its creativity and complexity. The key challenge raised in web technology development is the presentation of data in a machine-readable and processable format to take…

Artificial Intelligence · Computer Science 2010-08-10 Zeeshan Ahmed, Detlef Gerhard

Learning from Web: Review of Approaches

Knowledge discovery is defined as non-trivial extraction of implicit, previously unknown and potentially useful information from given data. Knowledge extraction from web documents deals with unstructured, free-format documents whose number…

Neural and Evolutionary Computing · Computer Science 2007-05-23 Vitaly Schetinin

Extracting Event-Centric Document Collections from Large-Scale Web Archives

Web archives are typically very broad in scope and extremely large in scale. This makes data analysis appear daunting, especially for non-computer scientists. These collections constitute an increasingly important source for researchers in…

Digital Libraries · Computer Science 2017-07-31 Gerhard Gossen, Elena Demidova, Thomas Risse

A Methodology to Extract Social Network from the Web Snippet

The Web has been chosen as a basic infrastructure to gain the social structure information, through the social network extraction, from all over the world. However, most of the web documents are unstructured and lack semantics. Moreover,…

Social and Information Networks · Computer Science 2012-11-27 Mahyuddin K. M. Nasution, Shahrul Azman Noah

New Methods for Metadata Extraction from Scientific Literature

Within the past few decades we have witnessed a digital revolution, which moved scholarly communication to electronic media and also resulted in a substantial increase in its volume. Nowadays keeping track of the latest scientific…

Digital Libraries · Computer Science 2017-10-30 Dominika Tkaczyk

A Model for Personalized Keyword Extraction from Web Pages using Segmentation

The World Wide Web caters to the needs of billions of users in heterogeneous groups. Each user accessing the World Wide Web might have his/her own specific interest and would expect the web to respond to the specific requirements. The…

Information Retrieval · Computer Science 2017-11-22 K. S. Kuppusamy, G. Aghila

Deep Learning based Key Information Extraction from Business Documents: Systematic Literature Review

Extracting key information from documents represents a large portion of business workloads and therefore offers a high potential for efficiency improvements and process automation. With recent advances in Deep Learning, a plethora of Deep…

Information Retrieval · Computer Science 2025-07-21 Alexander Michael Rombach, Peter Fettke

Instantly Deployable Expert Knowledge - Networks of Knowledge Engines

Knowledge and information are becoming the primary resources of the emerging information society. To exploit the potential of available expert knowledge, comprehension and application skills (i.e. expert competences) are necessary. The…

Information Retrieval · Computer Science 2018-11-08 Bernhard Bergmair, Thomas Buchegger, Johann Hoffelner, Gerald Schatz, Siegfried Silber, Johannes Klinglmayr

Social Network Extraction: Superficial Method and Information Retrieval

Social network has become one of the themes of government issues, mainly dealing with the chaos. The use of web is steadily gaining ground in these issues. However, most of the web documents are unstructured and lack semantics. In this…

Information Retrieval · Computer Science 2016-01-13 Mahyuddin K. M. Nasution, Shahrul Azman Mohd. Noah, Saidah Saad

Information Extraction from Scientific Literature for Method Recommendation

As a research community grows, more and more papers are published each year. As a result there is increasing demand for improved methods for finding relevant papers, automatically understanding the key ideas and recommending potential…

Information Retrieval · Computer Science 2019-01-03 Yi Luan

Metaknowledge Extraction Based on Multi-Modal Documents

The triple-based knowledge in large-scale knowledge bases is most likely lacking in structural logic and problematic of conducting knowledge hierarchy. In this paper, we introduce the concept of metaknowledge to knowledge engineering…

Computer Vision and Pattern Recognition · Computer Science 2021-02-08 Shukan Liu, Ruilin Xu, Boying Geng, Qiao Sun, Li Duan, Yiming Liu

Web Content Extraction - a Meta-Analysis of its Past and Thoughts on its Future

In this paper, we present a meta-analysis of several Web content extraction algorithms, and make recommendations for the future of content extraction on the Web. First, we find that nearly all Web content extractors do not consider a very…

Information Retrieval · Computer Science 2015-08-19 Tim Weninger, Rodrigo Palacios, Valter Crescenzi, Thomas Gottron, Paolo Merialdo

Open Domain Knowledge Extraction for Knowledge Graphs

The quality of a knowledge graph directly impacts the quality of downstream applications (e.g. the number of answerable questions using the graph). One ongoing challenge when building a knowledge graph is to ensure completeness and…

Computation and Language · Computer Science 2023-12-18 Kun Qian, Anton Belyi, Fei Wu, Samira Khorshidi, Azadeh Nikfarjam, Rahul Khot, Yisi Sang, Katherine Luna, Xianqi Chu, Eric Choi, Yash Govind, Chloe Seivwright, Yiwen Sun, Ahmed Fakhry, Theo Rekatsinas, Ihab Ilyas, Xiaoguang Qi, Yunyao Li

New Datasets and a Benchmark of Document Network Embedding Methods for Scientific Expert Finding

The scientific literature is growing faster than ever. Finding an expert in a particular scientific domain has never been as hard as today because of the increasing amount of publications and because of the ever growing diversity of…

Information Retrieval · Computer Science 2020-04-09 Robin Brochier, Antoine Gourru, Adrien Guille, Julien Velcin

Knowledge Graph Extension by Entity Type Recognition

Knowledge graphs have emerged as a sophisticated advancement and refinement of semantic networks, and their deployment is one of the critical methodologies in contemporary artificial intelligence. The construction of knowledge graphs is a…

Artificial Intelligence · Computer Science 2024-05-07 Daqian Shi

A Web Scale Entity Extraction System

Understanding the semantic meaning of content on the web through the lens of entities and concepts has many practical advantages. However, when building large-scale entity extraction systems, practitioners are facing unique challenges…

Computation and Language · Computer Science 2021-10-04 Xuanting Cai, Quanbin Ma, Pan Li, Jianyu Liu, Qi Zeng, Zhengkan Yang, Pushkar Tripathi

Scale Up Event Extraction Learning via Automatic Training Data Generation

The task of event extraction has long been investigated in a supervised learning paradigm, which is bound by the number and the quality of the training instances. Existing training data must be manually generated through a combination of…

Computation and Language · Computer Science 2017-12-12 Ying Zeng, Yansong Feng, Rong Ma, Zheng Wang, Rui Yan, Chongde Shi, Dongyan Zhao

LLM-Based Information Extraction to Support Scientific Literature Research and Publication Workflows

The increasing volume of scholarly publications requires advanced tools for efficient knowledge discovery and management. This paper introduces ongoing work on a system using Large Language Models (LLMs) for the semantic extraction of key…

Digital Libraries · Computer Science 2025-10-07 Samy Ateia, Udo Kruschwitz, Melanie Scholz, Agnes Koschmider, Moayad Almohaishi

Extracting Procedural Knowledge from Technical Documents

Procedures are an important knowledge component of documents that can be leveraged by cognitive assistants for automation, question-answering or driving a conversation. It is a challenging problem to parse big dense documents like product…

Artificial Intelligence · Computer Science 2020-10-21 Shivali Agarwal, Shubham Atreja, Vikas Agarwal