Big Atomics
Abstract
In this paper, we give theoretically and practically efficient implementations of Big Atomics, i.e., $k$-word linearizable registers that support the load, store, and compare-and-swap (CAS) operations. While modern hardware supports $k = 1$ and sometimes $k = 2$ (e.g., double-width compare-and-swap in x86), our implementations support arbitrary $k$. Big Atomics are useful in many applications, including atomic manipulation of tuples, version lists, and implementing load-linked/store-conditional (LL/SC). We design fast, lock-free implementations of big atomics based on a novel fast-path-slow-path approach we develop. We then use them to develop an efficient concurrent hash table, as evidence of their utility. We experimentally validate the approach by comparing a variety of implementations of big atomics under a variety of workloads (thread counts, load/store ratios, contention, oversubscription, and number of atomics). The experiments compare two of our lock-free variants with C++ std::atomic, a lock-based version, a version using sequence locks, and an indirect version. The results show that our approach is close to the fastest under all conditions and far outperforms others under oversubscription. We also compare our big atomics based concurrent hash table to a variety of other state-of-the-art hash tables that support arbitrary length keys and values, including implementations from Intel's TBB, Facebook's Folly, libcuckoo, and a recent release from Boost. The results show that our approach of using big atomics in the design of hash tables is a promising direction.
- Publication:
-
arXiv e-prints
- Pub Date:
- January 2025
- DOI:
- arXiv:
- arXiv:2501.07503
- Bibcode:
- 2025arXiv250107503A
- Keywords:
-
- Computer Science - Distributed;
- Parallel;
- and Cluster Computing;
- Computer Science - Data Structures and Algorithms