Related papers: RenderCore -- a new WebGPU-based rendering engine …
Interactive dynamic simulators are an accelerator for developing novel robotic control algorithms and complex systems involving humans and robots. In user training and synthetic data generation applications, high-fidelity visualizations…
All modern web browsers - Internet Explorer, Firefox, Chrome, Opera, and Safari - have a core rendering engine written in C++. This language choice was made because it affords the systems programmer complete control of the underlying…
Edge computing enables data processing closer to the source, significantly reducing latency, an essential requirement for real-time vision-based analytics such as object detection in surveillance and smart city environments. However, these…
Recent reasoning-first models (e.g., OpenAI o1, DeepSeek R1) have spurred a resurgence of interest in RLVR. Nevertheless, advances are dominated by mathematics (e.g., AIME), with competitive-programming code generation underexplored and…
A new event visualization tool for the ZEUS experiment is nearing completion, which will provide the functionality required by the new detector components implemented during the recently achieved HERA luminosity upgrade. The new design is…
Event-stream representation is the first step for many computer vision tasks using event cameras. It converts the asynchronous event-streams into a formatted structure so that conventional machine learning models can be applied easily.…
This work tackles the challenging task of achieving real-time novel view synthesis for reflective surfaces across various scenes. Existing real-time rendering methods, especially those based on meshes, often have subpar performance in…
Existing prompt-optimization techniques rely on local signals to update behavior, often neglecting broader and recurring patterns across tasks, leading to poor generalization; they further rely on full-prompt rewrites or unstructured…
Complex Event Recognition (CER) systems are a prominent technology for finding user-defined query patterns over large data streams in real time. CER query evaluation is known to be computationally challenging, since it requires maintaining…
Constructing controllable visual data is a major bottleneck for image editing and multimodal understanding. Useful supervision is rarely produced by a single rendering pass; instead it emerges through iterative generation, inspection,…
Streaming video large language models (LLMs) are increasingly used for real-time multimodal tasks such as video captioning, question answering, conversational agents, and augmented reality. However, these models face fundamental memory and…
The reconstruction of event-level information, such as the direction or energy of a neutrino interacting in IceCube DeepCore, is a crucial ingredient to many physics analyses. Algorithms to extract this high level information from the…
We have recently seen tremendous progress in neural rendering (NR) advances, i.e., NeRF, for photo-real free-view synthesis. Yet, as a local technique based on a single computer/GPU, even the best-engineered Instant-NGP or i-NGP cannot…
Moving Object Detection (MOD) is a critical vision task for successfully achieving safe autonomous driving. Despite plausible results of deep learning methods, most existing approaches are only frame-based and may fail to reach reasonable…
Deploying multiple machine learning models on resource-constrained robotic platforms for different perception tasks often results in redundant computations, large memory footprints, and complex integration challenges. In response, this work…
This paper introduces MovieCORE, a novel video question answering (VQA) dataset designed to probe deeper cognitive understanding of movie content. Unlike existing datasets that focus on surface-level comprehension, MovieCORE emphasizes…
Visual Question Answering with Natural Language Explanation (VQA-NLE) task is challenging due to its high demand for reasoning-based inference. Recent VQA-NLE studies focus on enhancing model networks to amplify the model's reasoning…
Visual Emotion Analysis (VEA) aims at finding out how people feel emotionally towards different visual stimuli, which has attracted great attention recently with the prevalence of sharing images on social networks. Since human emotion…
Interest in many-core architectures applied to real time selections is growing in High Energy Physics (HEP) experiments. In this paper we describe performance measurements of many-core devices when applied to a typical HEP online task: the…
Building on the affective dream-replay reinforcement learning framework of CosmoCore, we introduce CosmoCore-Evo, an extension that incorporates evolutionary algorithms to enhance adaptability and novelty in code generation tasks. Inspired…