Related papers: Grain - A Java Analysis Framework for Total Data R…
Data selection methods, such as active learning and core-set selection, are useful tools for improving the data efficiency of deep learning models on large-scale datasets. However, recent deep learning models have moved forward from…
Graph neural networks (GNNs) have shown significant success in learning graph representations. However, recent studies reveal that GNNs often fail to outperform simple MLPs on heterophilous graph tasks, where connected nodes may differ in…
The notion of complex systems is common to many domains, from Biology to Economy, Computer Science, Physics, etc. Often, these systems are made of sets of entities moving in an evolving environment. One of their major characteristics is the…
BEANS is a new web-based tool, easy to install and maintain, for storing and analysing massive amounts of data in a distributed way. It provides a clear interface for querying, filtering, aggregating, and plotting data from an…
Federated learning claims to enable collaborative model training among multiple clients while preserving data privacy by transmitting gradient updates instead of the actual client data. However, recent studies have shown that client privacy is still at…
Data transformation correctness is a fundamental challenge in data engineering: how can we verify that pipelines produce correct results before executing on production data? Existing practice relies on iterative testing over materialized…
Knowing the precise format of a program's input is a necessary prerequisite for systematic testing. Given a program and a small set of sample inputs, we (1) track the data flow of inputs to aggregate input fragments that share the same data…
Cereal grains are a vital part of human diets and are important commodities for people's livelihood and international trade. Grain Appearance Inspection (GAI) serves as one of the crucial steps for the determination of grain quality and…
JaTeCS is an open source Java library that supports research on automatic text categorization and other related problems, such as ordinal regression and quantification, which are of special interest in opinion mining applications. It covers…
aTrain is an open-source and offline tool for transcribing audio data in multiple languages with CPU and NVIDIA GPU support. It is specifically designed for researchers using qualitative data generated from various forms of speech…
Rice is one of the most widely cultivated crops globally and has been developed into numerous varieties. The quality of rice during cultivation is primarily determined by its cultivar and characteristics. Traditionally, rice classification…
Ray tracing has typically been known as a graphics rendering method capable of producing highly realistic computer-generated imagery and visual effects. More recently the performance improvements in Graphics Processing Units (GPUs) have…
In the past, the features of a user interface were limited by those available in the existing graphical widgets it used. Now, improvements in processor speed have fostered the emergence of interpreted languages, in which the appropriate…
Grain boundaries govern many properties of polycrystalline materials, including the vast majority of engineering materials. Evolutionary algorithms can be applied to predict the grain boundary structures in different systems. However, the…
The present paper introduces the open-source Java Event Tracer (JETracer) framework for real-time tracing of GUI events within applications based on the AWT, Swing or SWT graphical toolkits. Our framework provides a common event model for…
This report introduces Juno, a modular Python package for optical design and simulation. Juno consists of a complete library that includes a graphical user interface to design and visualise arbitrary optical elements, set up wave…
Agricultural research has profited from technical advances such as automation and data mining. Today, data mining is used in a vast range of areas, and many off-the-shelf data mining system products and domain-specific data mining application soft…
With the increasing physical event rate and number of electronic channels, the traditional readout scheme faces the challenge of improving readout speed, which is constrained by the limited bandwidth of the crate backplane. In this paper, a high-speed data…
Fast, incremental evolution of physics instrumentation raises the question of efficient software abstraction and transferability of algorithms across similar technologies. This contribution aims to provide an answer by introducing Track…
High energy physics experiments including those at the Tevatron and the upcoming LHC require analysis of large data sets which are best handled by distributed computation. We present the design and development of a distributed data analysis…