Video super-resolution based on a spatio-temporal matching network
Abstract
Deep spatio-temporal neural networks have shown promising performance for video super-resolution (VSR) in recent years. However, most of them heavily rely on accuracy motion estimations. In this paper, we propose a novel spatio-temporal matching network (STMN) for video super-resolution, which works on the wavelet domain to reduce dependence on motion estimations. Specifically, our STMN consists of three major components: a temporal fusion wavelet network (TFWN), a non-local matching network (NLMN), and a global wavelet domain residual connection (GWDRC). TFWN adaptively extracts temporal fusion wavelet maps via three 3d convolutional layers and a discrete wavelet transform (DWT) decomposition layer. The extracted temporal fusion wavelet maps are rich in spatial information and knowledge of different frequencies from consecutive frames, which are feed to NLMN for learning deep wavelet representations. NLMN integrates super-resolution and denoising into a unified module by pyramidally stacking non-local matching residual blocks (NLMRB). At last, GWDRC reconstructs the super-resolved frames from the deep wavelet representations by using global wavelet domain residual information. Consequently, our STMN can efficiently enhance reconstruction quality by capturing different frequencies wavelet representations in consecutive frames, and does not require any motion compensation. Extensive experiments conducted on benchmark datasets demonstrate the effectiveness of our method compared with state-of-the-art methods.
- Publication:
-
Pattern Recognition
- Pub Date:
- February 2021
- DOI:
- Bibcode:
- 2021PatRe.11007619Z
- Keywords:
-
- Deep matching;
- Wavelet domain;
- Non-local matching;
- Residual learning;
- Video super-resolution