PhysFlow: Frequency Decoupled with Dual-Field Rectified Flow for Remote Photoplethysmography

2026-06-22 • Computer Vision and Pattern Recognition

AI summary

The authors present PhysFlow, a new method to measure heartbeat signals from face videos without touching the person. Unlike previous methods that mix different parts of the signal together, PhysFlow separates the heartbeat signal into two parts and learns them separately to avoid confusion from things like lighting changes or head movements. This approach helps make the heartbeat signal clearer and more reliable, even when conditions are difficult. The authors tested PhysFlow on several datasets and found it works better than existing methods at estimating heart rate and reconstructing the heartbeat waveforms.

Remote Photoplethysmography (rPPG), pulse estimation, facial videos, signal decomposition, frequency decoupling, rectified flow, heart rate estimation, ordinary differential equations (ODE), deep learning, physiological signal processing

Authors

Zixu Li, Jianjun Qian, Hang Shao, Lei Luo, Jian Yang

Abstract

Remote Photoplethysmography (rPPG) enables contactless pulse estimation from facial videos, serving as a vital tool for health monitoring. However, current deep learning methods often struggle under complex disturbances, particularly varying illumination, facial expressions, and unconstrained head movements. In such scenarios, subtle physiological signals are easily dominated by external interference, making the recovered rPPG waveform unstable and unreliable. One important reason is that most existing methods directly model the rPPG signal in a unified manner, where different signal components are coupled during reconstruction. This makes it difficult to preserve weak pulse-related variations when strong disturbance-induced changes are present. To address this challenge, we propose PhysFlow, a frequency-decoupled dual-field rectified flow framework tailored for robust rPPG estimation. Specifically, the ground-truth rPPG signal is decomposed into trend and amplitude components, which are used as separate supervisory targets. Based on the extracted facial features, PhysFlow learns two component-specific conditional velocity fields to model the two components separately. This design reduces mutual interference between different components and improves the robustness of rPPG reconstruction under complex disturbances. Moreover, the rectified flow formulation enables efficient waveform reconstruction with only a few ordinary differential equation (ODE) integration steps. Extensive experiments on multiple benchmark datasets demonstrate that PhysFlow outperforms state-of-the-art methods in both heart-rate estimation and rPPG waveform reconstruction across diverse challenging scenarios.
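
The two ideas in the abstract, splitting the target waveform into two components and reconstructing each with a few ODE steps, can be illustrated with a toy sketch. This is not the authors' implementation. The abstract does not say how the trend and amplitude components are computed, so the moving-average split below (`decompose`, with a residual standing in for the second component) is an assumption. The velocity fields are oracle closures rather than learned networks conditioned on facial features. The names `decompose`, `rf_training_pair`, and `euler_sample`, the 30 fps sampling rate, and the synthetic signal are all hypothetical.

```python
import numpy as np


def decompose(signal, window=31):
    """Split a 1-D signal into a smooth trend and a residual.

    Assumption: a moving-average low-pass filter stands in for the paper's
    (unspecified here) frequency decomposition. `window` must be odd.
    """
    kernel = np.ones(window) / window
    padded = np.pad(signal, window // 2, mode="edge")
    trend = np.convolve(padded, kernel, mode="valid")
    return trend, signal - trend


def rf_training_pair(x0, x1, t):
    """Standard rectified-flow training point and regression target.

    x_t lies on the straight line from noise x0 to data x1; a velocity
    network would be trained to predict x1 - x0 at (x_t, t).
    """
    x_t = (1.0 - t) * x0 + t * x1
    return x_t, x1 - x0


def euler_sample(velocity, x0, steps=4):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = x0.copy()
    dt = 1.0 / steps
    for k in range(steps):
        x = x + dt * velocity(x, k * dt)
    return x


# Synthetic pulse-like waveform: 1.2 Hz oscillation plus a slow drift.
rng = np.random.default_rng(0)
n = 300
time = np.arange(n) / 30.0  # assumed 30 fps
ppg = np.sin(2 * np.pi * 1.2 * time) + 0.5 * time / time[-1]

trend, residual = decompose(ppg)

# One noise sample per component, each transported by its own field.
noise_trend = rng.standard_normal(n)
noise_resid = rng.standard_normal(n)

# Oracle velocity fields: the ideal straight-line velocities that trained
# networks would approximate. They ignore (x, t) because the path is linear.
v_trend = lambda x, t: trend - noise_trend
v_resid = lambda x, t: residual - noise_resid

recon = euler_sample(v_trend, noise_trend) + euler_sample(v_resid, noise_resid)
```

With perfectly straight paths, Euler integration recovers each component regardless of the step count, which is the intuition behind few-step sampling. A learned field is only approximately straight, so the number of steps then trades accuracy against cost.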
