Fast 2D Convolutions and Cross-Correlations Using Scalable Architectures
Abstract
This manuscript describes fast, scalable architectures and associated algorithms for computing convolutions and cross-correlations. The basic idea is to map 2D convolutions and cross-correlations to a collection of 1D convolutions and cross-correlations in the transform domain. This is accomplished using the Discrete Periodic Radon Transform (DPRT) for general kernels and SVD-LU decompositions for low-rank kernels. The architectures are scalable and can be fitted into modern FPGA and Zynq-SOC devices. Depending on the types of available resources, 2D convolutions and cross-correlations of $P\times P$ blocks can be computed in anywhere from $O(P)$ to $O(P^2)$ clock cycles, so there is a trade-off between performance and the number and types of resources required. We provide implementations of the proposed architectures on modern programmable devices (Virtex-7 and Zynq-SOC). Based on the amounts and types of required resources, we show that the proposed approaches significantly outperform current methods.
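The mapping from one 2D convolution to a collection of 1D convolutions can be illustrated in software. The following is a minimal NumPy sketch, not the hardware architecture of the paper: it assumes a common definition of the DPRT for prime $N$, namely $R(m,d)=\sum_i f(i,\langle d+mi\rangle_N)$ for $m=0,\dots,N-1$ and $R(N,d)=\sum_j f(d,j)$, and relies on the property that, under this definition, the DPRT of a 2D circular convolution equals the 1D circular convolutions of the individual DPRTs along each of the $N+1$ directions. The function names (`dprt`, `idprt`, `circ_conv_1d`, `conv2d_via_dprt`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def dprt(f):
    """Forward DPRT of an N x N array (N prime, assumed). Returns (N+1) x N."""
    n = f.shape[0]
    rows = np.arange(n)
    r = np.empty((n + 1, n), dtype=f.dtype)
    for m in range(n):
        for d in range(n):
            # Sum along the wrapped line j = (d + m*i) mod N.
            r[m, d] = f[rows, (d + m * rows) % n].sum()
    r[n] = f.sum(axis=1)  # extra direction: sums over each row
    return r

def idprt(r):
    """Inverse DPRT for prime N; r has shape (N+1) x N."""
    n = r.shape[1]
    total = r[n].sum()  # sum of all samples of the original array
    ms = np.arange(n)
    f = np.empty((n, n), dtype=float)
    for i in range(n):
        for j in range(n):
            f[i, j] = (r[ms, (j - ms * i) % n].sum() - total + r[n, i]) / n
    return f

def circ_conv_1d(a, b):
    """Direct 1D circular convolution of two length-N vectors."""
    n = len(a)
    k = np.arange(n)
    return np.array([np.dot(a, b[(d - k) % n]) for d in range(n)])

def conv2d_via_dprt(f, g):
    """2D circular convolution computed as N+1 independent 1D convolutions."""
    rf, rg = dprt(f), dprt(g)
    rh = np.stack([circ_conv_1d(rf[m], rg[m]) for m in range(rf.shape[0])])
    return idprt(rh)

# Linear convolution of two 3 x 3 blocks: zero-pad to a prime N >= 3 + 3 - 1.
image = np.zeros((5, 5), dtype=np.int64)
kernel = np.zeros((5, 5), dtype=np.int64)
image[:3, :3] = np.arange(9).reshape(3, 3)
kernel[:3, :3] = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]])
result = conv2d_via_dprt(image, kernel)
```

Each of the $N+1$ 1D convolutions is independent of the others, so they can in principle be evaluated in parallel. Zero-padding both blocks to a prime size of at least $2P-1$ turns the circular result into the linear convolution.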
- Publication:
- IEEE Transactions on Image Processing
- Pub Date:
- May 2017
- DOI:
- 10.1109/TIP.2017.2678799
- arXiv:
- arXiv:2112.13150
- Bibcode:
- 2017ITIP...26.2230C
- Keywords:
- Computer Science - Hardware Architecture;
- Computer Science - Computer Vision and Pattern Recognition;
- Computer Science - Distributed, Parallel, and Cluster Computing;
- Electrical Engineering and Systems Science - Image and Video Processing;
- Electrical Engineering and Systems Science - Signal Processing
- E-Print:
- The paper develops the fastest known methods for computing 2D convolutions in hardware