LUMINA-26: Low-Light Understanding for Modeling and Interpreting Night-time Actions

2026-06-22 | Computer Vision and Pattern Recognition

AI summary

The authors introduce LUMINA-26, a video dataset of human actions filmed under naturally occurring low-light conditions, to help models recognize human activity at night. They also propose Illumi-Net, an illumination-adaptive mixture-of-experts network that uses video-level illumination cues to guide enhancement and spatio-temporal feature extraction. The approach outperforms previous methods on the existing ELLAR low-light dataset and establishes a strong baseline on LUMINA-26. Together, the dataset and baseline give future research on low-light action recognition a practical benchmark.

Keywords: low-light human action recognition, dataset, illumination-adaptive, mixture-of-experts network, transformer, spatio-temporal feature extraction, video-level illumination cues, benchmark, class distribution, motion ambiguity
Authors
Aman Kumar Pandey, Anil Singh Parihar
Abstract
Low-light human action recognition remains a challenging problem due to poor illumination, amplified noise, motion ambiguity, and diverse real-world scenes. Existing low-light datasets often lack sufficient action diversity, capture realism, or balanced class distribution, limiting the development of robust models. To address this, we introduce LUMINA-26: Low-Light Understanding for Modeling and Interpreting Night-time Actions, comprising 6,784 clips across 26 action classes, recorded from 22 subjects across 20 indoor and outdoor locations under naturally occurring low-light conditions. We also propose Illumi-Net: An Illumination-Adaptive Mixture-of-Experts Network, which leverages video-level illumination cues to guide adaptive enhancement and transformer-based spatio-temporal feature extraction, with expert-conditioned decision fusion. Our method surpasses previous state-of-the-art performance on ELLAR (Top-1: 55.13%, Top-5: 78.87%) and establishes a strong baseline on LUMINA-26 (Top-1: 75.95%, Top-5: 93.58%), offering a practical benchmark for future low-light action recognition research.
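
The abstract does not give Illumi-Net's internals, so the following is only a minimal sketch of the general idea it names: a video-level illumination cue weights several experts, and their class predictions are fused. Everything here is an assumption made for illustration and is not the authors' implementation: the cue is taken to be mean luminance, the gate is a softmax over distance to hypothetical per-expert brightness centers (`expert_centers`, `temperature`), and fusion is a weighted average of per-expert class probabilities.

```python
import math


def mean_luminance(frames):
    """Illustrative video-level cue: mean pixel intensity over all frames.

    `frames` is a list of 2-D grayscale frames with values in [0, 1].
    """
    total, count = 0.0, 0
    for frame in frames:
        for row in frame:
            total += sum(row)
            count += len(row)
    return total / count


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def gate_weights(luminance, expert_centers, temperature=0.05):
    """Hypothetical gate: experts whose brightness center is closer to the
    observed luminance receive a larger weight."""
    scores = [-abs(luminance - c) / temperature for c in expert_centers]
    return softmax(scores)


def fuse(expert_probs, weights):
    """Decision fusion as a weighted average of per-expert class probabilities."""
    n_classes = len(expert_probs[0])
    return [
        sum(w * probs[k] for w, probs in zip(weights, expert_probs))
        for k in range(n_classes)
    ]


# Toy usage: two dark 2x2 frames, three experts, three action classes.
frames = [[[0.1, 0.1], [0.1, 0.1]], [[0.1, 0.1], [0.1, 0.1]]]
expert_centers = [0.05, 0.2, 0.5]  # assumed: very dark, dim, moderately lit
expert_probs = [
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.1, 0.1, 0.8],
]

lum = mean_luminance(frames)
weights = gate_weights(lum, expert_centers)
fused = fuse(expert_probs, weights)
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

In this toy setup the darkest expert dominates the fused prediction because the clip's mean luminance lies closest to its assumed center. A learned gate would replace the fixed centers, and the experts would be full enhancement and transformer branches rather than fixed probability vectors.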