From Raw Data to Structural Semantics: Trade-offs among Distortion, Rate, and Inference Accuracy

doi:10.48550/arXiv.2412.19825

This work explores the advantages of using persistence diagrams (PDs), topological signatures of raw point cloud data, in a point-to-point communication setting. A PD is a form of structural semantics in the sense that it carries information about the shape and structure of the data. Instead of transmitting raw data, the transmitter communicates its PD semantics, and the receiver carries out inference using the received semantics. We propose novel qualitative definitions for the distortion and rate of PD semantics and quantitatively characterize the trade-offs among distortion, rate, and inference accuracy. Simulations demonstrate that, unlike raw data or autoencoder (AE)-based latent representations, PD semantics leads to more effective use of transmission channels, more degrees of freedom for incorporating error detection and correction capabilities, and improved robustness to channel imperfections. For instance, over a binary symmetric channel with nonzero crossover probability, the minimum rate required for Bose-Chaudhuri-Hocquenghem (BCH)-coded PD semantics to achieve an inference accuracy above 80% is approximately 15 times lower than the rate required for coded AE-latent representations. Moreover, the results suggest that the gains of PD semantics are even more pronounced when compared with the rate requirements of raw data.
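The pipeline described in the abstract (compute a PD from a point cloud, send it over a binary symmetric channel, and recover it at the receiver) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes only 0-dimensional features of a Vietoris-Rips filtration, for which every birth is 0 and the finite deaths are the merge distances found by single-linkage clustering. It assumes a plain uniform quantizer, and it omits BCH coding entirely, so the channel is uncoded. The function names `h0_persistence`, `quantize`, `dequantize`, and `bsc`, and the parameters `n_bits` and `v_max`, are hypothetical and chosen for illustration.

```python
import math
import random


def h0_persistence(points):
    """Finite death times of 0-dimensional features (all births are 0).

    Assumes a Vietoris-Rips filtration in which an edge appears at a scale
    equal to the Euclidean distance between its endpoints.
    """
    n = len(points)
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # two components merge, so one of them dies at scale d
            parent[ri] = rj
            deaths.append(d)
    return deaths  # n - 1 values; the last surviving component never dies


def quantize(values, n_bits, v_max):
    """Uniformly quantize values in [0, v_max] to n_bits each (MSB first)."""
    levels = (1 << n_bits) - 1
    bits = []
    for v in values:
        q = round(min(max(v, 0.0), v_max) / v_max * levels)
        bits.extend((q >> k) & 1 for k in reversed(range(n_bits)))
    return bits


def dequantize(bits, n_bits, v_max):
    """Inverse of quantize, up to quantization error."""
    levels = (1 << n_bits) - 1
    values = []
    for start in range(0, len(bits), n_bits):
        q = 0
        for b in bits[start:start + n_bits]:
            q = (q << 1) | b
        values.append(q / levels * v_max)
    return values


def bsc(bits, p, rng):
    """Binary symmetric channel: flip each bit independently with probability p."""
    return [b ^ (rng.random() < p) for b in bits]


if __name__ == "__main__":
    rng = random.Random(0)
    cloud = [(rng.random(), rng.random()) for _ in range(30)]
    deaths = h0_persistence(cloud)
    tx = quantize(deaths, n_bits=8, v_max=2.0)
    rx = bsc(tx, p=0.01, rng=rng)
    recovered = dequantize(rx, n_bits=8, v_max=2.0)
    print(len(tx), "bits sent;", sum(a != b for a, b in zip(tx, rx)), "bit flips")
```

In this sketch the transmitted payload is 8 bits per finite death time, independent of the ambient dimension of the point cloud, which gives a rough sense of why a PD can be cheaper to send than the raw coordinates. Any rate or accuracy figures would depend on the quantizer, the code, and the inference model, none of which are modeled here.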

Publication: arXiv e-prints

Pub Date: December 2024

DOI: 10.48550/arXiv.2412.19825

arXiv: arXiv:2412.19825

Bibcode: 2024arXiv241219825A

Keywords: Computer Science - Information Theory

E-Print: 13 pages, 8 figures
