Hist2Style: Histogram-Guided Stylization with Bilateral Grids
2026-06-01 • Computer Vision and Pattern Recognition
AI summary
The authors present Hist2Style, a method for changing the colors and tones of photos while keeping details intact. Unlike large image models, which are slow, can hallucinate content, and are hard to control, their approach runs in real time, even on high-resolution images. They train a small network to mimic a large image editing model and condition it on a color histogram of the style target, so users can adjust the look by editing that color distribution. Because the edits are restricted to locally affine color transforms, the method preserves the original photo's structure and avoids hallucinated content.
photorealistic style transfer, bilateral grid, locally affine transforms, color histogram, real-time image editing, high-resolution images, image stylization, language-vision models, content preservation, interactive color adjustment
Authors
Dekel Galor, Adam Pikielny, Zhoutong Zhang, Ke Wang, Laura Waller, Jiawen Chen, Ilya Chugunov
Abstract
Photorealistic style transfer aims to match the color and tone of an input image to that of a style target while preserving the content and details of the original scene. Although existing large image models can facilitate these kinds of appearance edits, their high computational demands, potential for hallucinations, and limited user control make them unsuitable for high-resolution, real-time workflows. We introduce Hist2Style, a bilateral-grid formulation for fast, edge-aware stylization that preserves visual fidelity by constraining operations to locally affine transforms in bilateral space. Our model distills a large image editing model into a lightweight network by training on a large supervised corpus generated with language and vision-language models, targeting spatially varying color edits. The network conditions on a histogram-based embedding of the style target to provide an interpretable interface for adjusting the output style by modifying the target color distribution. Overall, Hist2Style maintains content structure by construction, avoids hallucinations, and supports real-time, high-resolution photorealistic stylization with interactive user-controllable color and tone adjustments.
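The two ingredients named in the abstract, a histogram description of the style target and locally affine transforms stored in a bilateral grid, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `color_histogram` and `slice_and_apply`, the per-channel histogram with 8 bins, the luminance guide, the 3x4 affine layout, and the grid resolution are all assumptions chosen for illustration, and the network that would map the histogram embedding to grid coefficients is omitted entirely.

```python
import numpy as np


def color_histogram(image, bins=8):
    """Per-channel histogram of an RGB image in [0, 1], normalized per channel.

    Hypothetical stand-in for a histogram-based style descriptor.
    """
    counts = np.stack([
        np.histogram(image[..., c], bins=bins, range=(0.0, 1.0))[0]
        for c in range(3)
    ]).astype(np.float64)
    return counts / counts.sum(axis=1, keepdims=True)  # shape (3, bins)


def slice_and_apply(grid, image):
    """Slice a bilateral grid of affine color transforms and apply it per pixel.

    grid:  (gh, gw, gd, 3, 4) array; each cell holds a 3x4 affine matrix.
    image: (h, w, 3) RGB array with values in [0, 1].
    """
    gh, gw, gd = grid.shape[:3]
    h, w, _ = image.shape

    # Assumed guide: Rec. 601 luminance selects the position along the depth axis.
    guide = np.clip(image @ np.array([0.299, 0.587, 0.114]), 0.0, 1.0)

    # Continuous grid coordinates for every pixel.
    ys = np.broadcast_to(np.linspace(0, gh - 1, h)[:, None], (h, w))
    xs = np.broadcast_to(np.linspace(0, gw - 1, w)[None, :], (h, w))
    zs = guide * (gd - 1)

    y0, x0, z0 = (np.floor(v).astype(int) for v in (ys, xs, zs))
    fy, fx, fz = ys - y0, xs - x0, zs - z0

    # Trilinear interpolation of the affine coefficients over the 8 neighbors.
    coeffs = np.zeros((h, w, 3, 4))
    for dy in (0, 1):
        for dx in (0, 1):
            for dz in (0, 1):
                weight = ((fy if dy else 1 - fy)
                          * (fx if dx else 1 - fx)
                          * (fz if dz else 1 - fz))
                yi = np.minimum(y0 + dy, gh - 1)
                xi = np.minimum(x0 + dx, gw - 1)
                zi = np.minimum(z0 + dz, gd - 1)
                coeffs += weight[..., None, None] * grid[yi, xi, zi]

    # Apply each pixel's 3x4 affine transform to its homogeneous color [r, g, b, 1].
    homogeneous = np.concatenate([image, np.ones((h, w, 1))], axis=-1)
    return np.einsum("hwij,hwj->hwi", coeffs, homogeneous)


# Usage: a grid of identity transforms leaves the image unchanged.
rng = np.random.default_rng(0)
image = rng.random((32, 48, 3))
identity_grid = np.zeros((4, 4, 8, 3, 4))
identity_grid[..., :3] = np.eye(3)
stylized = slice_and_apply(identity_grid, image)
style_descriptor = color_histogram(image)
```

Because every output pixel is an affine function of the corresponding input pixel, with coefficients that vary smoothly in space and luminance, an operator of this form can recolor and retone the image but has no mechanism for synthesizing new content, which is the sense in which such a formulation preserves structure by construction.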