Related papers: Predicting Heart Activity from Speech using Data-d…
A non-invasive method for monitoring heart activity can help reduce deaths caused by heart disorders such as stroke, arrhythmia and heart attack. The human voice can be considered biometric data that can be used for…
To date, research on sensor-equipped mobile devices has primarily focused on the purely supervised task of human activity recognition (walking, running, etc.), demonstrating limited success in inferring high-level health outcomes from…
Heart rate is an important vital sign used in the diagnosis of many medical conditions. Conventionally, heart rate is measured using a medical device such as a pulse oximeter. Physiological parameters such as heart rate bear a correlation to…
In their everyday life, the speech recognition performance of human listeners is influenced by diverse factors, such as the acoustic environment, the talker and listener positions, possibly impaired hearing, and optional hearing devices.…
Communicative gestures and speech acoustics are tightly linked. Our objective is to predict the timing of gestures from the acoustics. That is, we want to predict when a certain gesture occurs. We develop a model based on a recurrent…
Self-supervised speech representation learning has recently been a prosperous research topic. Many algorithms have been proposed for learning useful representations from large-scale unlabeled data, and their applications to a wide range of…
Most prior work in dialogue modeling has focused on written conversations, largely because of the available data sets. However, written dialogues are not sufficient to fully capture the nature of spoken conversations as well as the potential speech…
Learning to produce contact-rich, dynamic behaviors from raw sensory data has been a longstanding challenge in robotics. Prominent approaches primarily focus on using visual or tactile sensing, where unfortunately one fails to capture…
Monitoring exercise intensity is critical for safe and effective physical activity, particularly for individuals with cardiovascular disease, where overexertion can pose serious risks. Although physiological measures such as heart rate are…
Despite known differences between reading and listening in the brain, recent work has shown that text-based language models predict both text-evoked and speech-evoked brain activity to an impressive degree. This poses the question of what…
Wearable devices such as smartwatches are becoming increasingly popular tools for objectively monitoring physical activity in free-living conditions. To date, research has primarily focused on the purely supervised task of human activity…
The relationship between brain structure and function has been probed using a variety of approaches, but how the underlying structural connectivity of the human brain drives behavior is far from understood. To investigate the effect of…
Stress is a major threat to well-being that manifests in a variety of physiological and mental symptoms. Utilising speech samples collected while the subject is undergoing an induced stress episode has recently shown promising results for…
Speech and sensor time series data both encode information in the time and frequency domains, such as spectral powers and waveform shapelets. We show that speech foundation models learn representations that generalize beyond the speech…
While there has been significant progress towards modelling coherence in written discourse, the work in modelling spoken discourse coherence has been quite limited. Unlike the coherence in text, coherence in spoken discourse is also…
Emotions play a central role in human communication, shaping trust, engagement, and social interaction. As artificial intelligence systems powered by large language models become increasingly integrated into everyday life, enabling them to…
Conventional methods for diagnosing Social Anxiety Disorder (SAD), such as clinical interviews and self-reported questionnaires, often face accessibility barriers and subjective biases, underscoring the need for objective physiological…
Conversational assistants are increasingly popular across diverse real-world applications, highlighting the need for advanced multimodal speech modeling. Speech, as a natural mode of communication, encodes rich user-specific characteristics…
There has been a surge of interest in leveraging speech as a marker of health for a wide spectrum of conditions. The underlying premise is that any neurological, mental, or physical deficits that impact speech production can be objectively…
This work analyzes the efficacy of verbal and nonverbal features of group conversation for the task of automatic prediction of group task performance. We describe a new publicly available survival task dataset that was collected and…