CoordFlow: Coordinate Flow for Pixel-wise Neural Video Representation
Abstract
In the field of video compression, the pursuit for better quality at lower bit rates remains a long-lasting goal. Recent developments have demonstrated the potential of Implicit Neural Representation (INR) as a promising alternative to traditional transform-based methodologies. Video INRs can be roughly divided into frame-wise and pixel-wise methods according to the structure the network outputs. While the pixel-based methods are better for upsampling and parallelization, frame-wise methods demonstrated better performance. We introduce CoordFlow, a novel pixel-wise INR for video compression. It yields state-of-the-art results compared to other pixel-wise INRs and on-par performance compared to leading frame-wise techniques. The method is based on the separation of the visual information into visually consistent layers, each represented by a dedicated network that compensates for the layer's motion. When integrated, a byproduct is an unsupervised segmentation of video sequence. Objects motion trajectories are implicitly utilized to compensate for visual-temporal redundancies. Additionally, the proposed method provides inherent video upsampling, stabilization, inpainting, and denoising capabilities.
- Publication:
-
arXiv e-prints
- Pub Date:
- January 2025
- DOI:
- arXiv:
- arXiv:2501.00975
- Bibcode:
- 2025arXiv250100975S
- Keywords:
-
- Computer Science - Computer Vision and Pattern Recognition;
- Computer Science - Machine Learning