Related papers: Refinement Module based on Parse Graph for Human P…
In this paper, we concern on the bottom-up paradigm in multi-person pose estimation (MPPE). Most previous bottom-up methods try to consider the relation of instances to identify different body parts during the post processing, while…
Recent advancements in retrieval-augmented generation (RAG) have enhanced large language models in question answering by integrating external knowledge. However, challenges persist in achieving global understanding and aligning responses…
Graph Neural Networks (GNNs) have been widely adopted for Protein Representation Learning (PRL), as residue interaction networks can be naturally represented as graphs. Current GNN-based PRL methods typically rely on single-perspective…
This paper studies the task of estimating the 3D human poses of multiple persons from multiple calibrated camera views. Following the top-down paradigm, we decompose the task into two stages, i.e. person localization and pose estimation.…
In this research, we address the challenge faced by existing deep learning-based human mesh reconstruction methods in balancing accuracy and computational efficiency. These methods typically prioritize accuracy, resulting in large network…
Recommendation systems aim to provide users with relevant suggestions, but often lack interpretability and fail to capture higher-level semantic relationships between user behaviors and profiles. In this paper, we propose a novel approach…
Person re-identification has achieved great progress with deep convolutional neural networks. However, most previous methods focus on learning individual appearance feature embedding, and it is hard for the models to handle difficult…
Grounding referring expressions aims to locate in an image an object referred to by a natural language expression. The linguistic structure of a referring expression provides a layout of reasoning over the visual contents, and it is often…
Adaptive mesh refinement is central to the efficient solution of partial differential equations (PDEs) via the finite element method (FEM). Classical $r$-adaptivity optimizes vertex positions but requires solving expensive auxiliary PDEs…
Recent 2D-to-3D human pose estimation works tend to utilize the graph structure formed by the topology of the human skeleton. However, we argue that this skeletal topology is too sparse to reflect the body structure and suffer from serious…
Over the past two decades, recommendation systems (RSs) have used machine learning (ML) solutions to recommend items, e.g., movies, books, and restaurants, to clients of a business or an online platform. Recipe recommendation, however, has…
The reasoning segmentation task involves segmenting objects within an image by interpreting implicit user instructions, which may encompass subtleties such as contextual cues and open-world knowledge. Despite significant advancements made…
Estimating 3D full-body pose from sparse sensor data is a pivotal technique employed for the reconstruction of realistic human motions in Augmented Reality and Virtual Reality. However, translating sparse sensor signals into comprehensive…
Multi-person pose estimation in the wild is challenging. Although state-of-the-art human detectors have demonstrated good performance, small errors in localization and recognition are inevitable. These errors can cause failures for a…
Marker-free human pose estimation (HPE) has found increasing applications in various fields. Current HPE suffers from occasional errors in keypoint recognition and random fluctuation in keypoint trajectories when analyzing kinematic human…
Exponential Random Graph Models (ERGMs) are an inferential model for analysing statistical networks. Recent development in ERGMs uses hierarchical Bayesian setup to jointly model a group of networks, which is called a multiple-network…
Reconstructing multi-human body mesh from a single monocular image is an important but challenging computer vision problem. In addition to the individual body mesh models, we need to estimate relative 3D positions among subjects to generate…
We revisit millimeter-wave (mmWave) human pose estimation (HPE) from a signal preprocessing perspective. A single mmWave frame provides structured dimensions that map directly to human geometry and motion: range, angle, and Doppler,…
Human pose estimation (HPE) has received increasing attention recently due to its wide application in motion analysis, virtual reality, healthcare, etc. However, it suffers from the lack of labeled diverse real-world datasets due to the…
The reasoning-based pose estimation (RPE) benchmark has emerged as a widely adopted evaluation standard for pose-aware multimodal large language models (MLLMs). Despite its significance, we identified critical reproducibility and…