A Truly Sparse and General Implementation of Gradient-Based Synaptic Plasticity
Abstract
Online synaptic plasticity rules derived from gradient descent achieve high accuracy on a wide range of practical tasks. However, their software implementation often requires tediously hand-derived gradients or using gradient backpropagation which sacrifices the online capability of the rules. In this work, we present a custom automatic differentiation (AD) pipeline for sparse and online implementation of gradient-based synaptic plasticity rules that generalizes to arbitrary neuron models. Our work combines the programming ease of backpropagation-type methods for forward AD while being memory-efficient. To achieve this, we exploit the advantageous compute and memory scaling of online synaptic plasticity by providing an inherently sparse implementation of AD where expensive tensor contractions are replaced with simple element-wise multiplications if the tensors are diagonal. Gradient-based synaptic plasticity rules such as eligibility propagation (e-prop) have exactly this property and thus profit immensely from this feature. We demonstrate the alignment of our gradients with respect to gradient backpropagation on an synthetic task where e-prop gradients are exact, as well as audio speech classification benchmarks. We demonstrate how memory utilization scales with network size without dependence on the sequence length, as expected from forward AD methods.
- Publication:
-
arXiv e-prints
- Pub Date:
- January 2025
- arXiv:
- arXiv:2501.11407
- Bibcode:
- 2025arXiv250111407L
- Keywords:
-
- Computer Science - Neural and Evolutionary Computing;
- Computer Science - Artificial Intelligence;
- Computer Science - Machine Learning
- E-Print:
- 8 pages, 7 figures