1 August 2021 A differentiable VMAF proxy as a loss function for video noise reduction
Abstract
Traditional metrics for evaluating video quality do not fully capture the nuances of the Human Visual System (HVS), but they are simple to use when quantitatively optimizing parameters for enhancement or restoration. Modern Full-Reference Perceptual Visual Quality Metrics (PVQMs), such as the Video Multi-Method Assessment Fusion (VMAF) function, are more robust than traditional metrics with respect to the HVS, but they are generally complex and non-differentiable. This lack of differentiability means that they cannot readily be used as objectives when optimizing for enhancement or restoration. In this paper we formulate a perceptually motivated restoration framework for video. We apply this framework to denoising by training a spatio-temporal denoiser deep convolutional neural network (DCNN). We design DCNNs as differentiable proxies for both a spatial and a temporal version of VMAF. These proxies form part of the proposed loss function used to update the weights of the spatio-temporal DCNNs. Combining the proxies with traditional losses, we propose a perceptually motivated loss function for video. Our results show that using the perceptual loss function as a fine-tuning step yields a higher VMAF score and a lower PSNR than the spatio-temporal network trained with the traditional mean squared error loss. Using the perceptual loss function for the entirety of training yields a lower VMAF and PSNR, but visibly less noise in the output.
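As a rough illustration of how a proxy score and a traditional loss might be combined, the sketch below blends mean squared error with a term derived from a quality score on VMAF's 0 to 100 scale. The abstract does not give the form of the proposed loss, so the weighted sum, the `weight` parameter, the normalization by `max_score`, and the names `perceptual_loss` and `toy_proxy` are all assumptions made for illustration. In particular, `toy_proxy` is a hypothetical stand-in for a trained proxy network and does not compute VMAF. The PSNR helper uses the standard definition in terms of MSE.

```python
import math
from typing import Callable, Sequence

Frame = Sequence[float]  # a flattened frame with pixel values in [0, 1]


def mse(reference: Frame, restored: Frame) -> float:
    """Mean squared error between two equally sized frames."""
    if len(reference) != len(restored):
        raise ValueError("frames must have the same number of pixels")
    return sum((r - d) ** 2 for r, d in zip(reference, restored)) / len(reference)


def psnr(reference: Frame, restored: Frame, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical frames."""
    err = mse(reference, restored)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)


def perceptual_loss(
    reference: Frame,
    restored: Frame,
    proxy: Callable[[Frame, Frame], float],
    weight: float = 0.5,      # assumed blend factor, not taken from the paper
    max_score: float = 100.0,  # VMAF scores lie in the range 0 to 100
) -> float:
    """Assumed form: weight * MSE + (1 - weight) * (1 - proxy / max_score)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    # A higher proxy score means better quality, so it is turned into a penalty.
    quality_penalty = 1.0 - proxy(reference, restored) / max_score
    return weight * mse(reference, restored) + (1.0 - weight) * quality_penalty


def toy_proxy(reference: Frame, restored: Frame) -> float:
    """Hypothetical stand-in for a trained proxy network; this is NOT VMAF."""
    return 100.0 / (1.0 + 100.0 * mse(reference, restored))


clean = [0.2, 0.4, 0.6, 0.8]
noisy = [0.25, 0.35, 0.65, 0.75]
print(round(psnr(clean, noisy), 2))
print(round(perceptual_loss(clean, noisy, toy_proxy), 5))
```

In an actual training setup the proxy would be a network evaluated inside an automatic differentiation framework, so that gradients of the loss can flow back to the denoiser weights. The pure-Python version here only shows how the terms might be composed.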
© (2021) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Darren Ramsook, Anil Kokaram, Noel O'Connor, Neil Birkbeck, Yeping Su, and Balu Adsumilli "A differentiable VMAF proxy as a loss function for video noise reduction", Proc. SPIE 11842, Applications of Digital Image Processing XLIV, 118420X (1 August 2021); https://doi.org/10.1117/12.2594164
KEYWORDS
Video
Denoising
Composites
Network architectures
Image quality
Motion models
Neural networks