PerBite: A Curated Diagnostic Workflow for Bite-Aware Food Volume Estimation
2026-06-01 • Computer Vision and Pattern Recognition
AI summary
The authors tested whether 3D digital models (meshes) of food can reliably estimate how much food was eaten. They used a method to create detailed 3D food shapes from pictures, scaled them using the plate size, and measured volume changes before and after eating. Their approach ranked best in a competition and showed reasonable accuracy in estimating the volume of food present and consumed, though some errors remained. They emphasize that different steps—like making accurate shapes, scaling correctly, and handling the 3D models—need separate checks when using this for diet tracking.
Keywords: 3D reconstruction, food volume estimation, mesh processing, Chamfer distance, watertight mesh, ICP alignment, mean absolute percentage error (MAPE), dietary assessment, scale calibration, continuous 3D reconstruction
Authors
Ahmad AlMughrabi, Farid Al-Areqi, David Fernández Gómez, Umair Haroon, Marc Bolaños, Ricardo Marques, Petia Radeva
Abstract
Can a visually plausible food mesh be trusted to estimate the volume of consumed food? PerBite investigates this question using selected paired before- and after-consumption states from the MetaFood CVPR 2026 Continuous 3D Reconstruction While Eating Challenge. The submitted workflow follows a curated reconstruction protocol: SAM 3 segments the food and plate regions; Hunyuan3D/SAM 3D generates a dimensionless food mesh; the plate diameter provides the metric scale; the plate geometry is removed in Blender; and the remaining mesh is hole-filled, made watertight, and integrated to estimate volume. MoGe-2 is used only as an auxiliary cue for initial dish-diameter estimation when direct plate measurement is uncertain; it is not the primary scale source for the reported challenge result. PerBite ranks first, with an average Chamfer distance of 8.31 across 34 meshes using rigid ICP without scale correction. On 17 before/after pairs, it achieves 33.87% state-level volume MAPE and zero monotonicity violations, while consumed-volume MAPE remains 53.74%. The results show that surface reconstruction, metric scale, controlled mesh cleanup, watertight volume integration, and physical depletion consistency should be evaluated separately for dietary assessment. Source code and evaluation scripts will be available at https://github.com/GCVCG/PerBite-CVPR-MetaFood-2026.
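The scale, integrate, and compare steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the plain-list mesh representation, and the choice of the signed-tetrahedron (divergence theorem) formula for volume integration are all assumptions made for illustration. The sketch assumes the mesh is already watertight with consistently oriented triangles, and that the metric scale factor is the ratio of the measured plate diameter to the plate diameter in mesh units.

```python
# Illustrative sketch only; names and formulas are assumptions, not the PerBite code.

def scale_to_metric(vertices, mesh_plate_diameter, real_plate_diameter):
    """Rescale a dimensionless mesh so the plate matches its measured diameter."""
    s = real_plate_diameter / mesh_plate_diameter
    return [(x * s, y * s, z * s) for x, y, z in vertices]


def mesh_volume(vertices, faces):
    """Volume of a watertight, consistently oriented triangle mesh.

    Sums signed tetrahedra formed by each triangle and the origin.
    """
    total = 0.0
    for i, j, k in faces:
        ax, ay, az = vertices[i]
        bx, by, bz = vertices[j]
        cx, cy, cz = vertices[k]
        # Scalar triple product a . (b x c) equals 6x the signed tetrahedron volume.
        total += (ax * (by * cz - bz * cy)
                  - ay * (bx * cz - bz * cx)
                  + az * (bx * cy - by * cx))
    return abs(total) / 6.0


def mape(predicted, reference):
    """Mean absolute percentage error, in percent."""
    errors = [abs(p - r) / abs(r) for p, r in zip(predicted, reference)]
    return 100.0 * sum(errors) / len(errors)


def monotonicity_violations(before_volumes, after_volumes):
    """Count pairs where the after-eating volume exceeds the before volume."""
    return sum(1 for b, a in zip(before_volumes, after_volumes) if a > b)


# Toy example: a unit cube standing in for a cleaned food mesh.
CUBE_VERTICES = [
    (0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
    (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1),
]
CUBE_FACES = [
    (0, 2, 1), (0, 3, 2),  # bottom
    (4, 5, 6), (4, 6, 7),  # top
    (0, 1, 5), (0, 5, 4),  # front
    (3, 7, 6), (3, 6, 2),  # back
    (0, 4, 7), (0, 7, 3),  # left
    (1, 2, 6), (1, 6, 5),  # right
]

# Hypothetical numbers: the plate spans 10 mesh units and measures 20 cm.
metric_vertices = scale_to_metric(CUBE_VERTICES, 10.0, 20.0)
metric_volume = mesh_volume(metric_vertices, CUBE_FACES)  # in cm^3
```

Because volume grows with the cube of the scale factor, a relative error in the plate diameter is amplified roughly threefold in the estimated volume, which is one reason to evaluate metric scale separately from surface quality. Consumed volume for a pair would then be the before-state volume minus the after-state volume, with `mape` applied either to the individual states or to those differences.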