Facticity as the amount of self-descriptive information in a data set
Abstract
Using the theory of Kolmogorov complexity, the notion of facticity φ(x) of a string is defined as the amount of self-descriptive information it contains. It is proved that (under reasonable assumptions: the existence of an empty machine and the availability of a faithful index) facticity is definite, i.e. random strings have facticity 0 and for compressible strings 0 < φ(x) < (1/2)|x| + O(1). Consequently, facticity objectively measures the tension in a data set between structural and ad-hoc information. For binary strings there is a so-called facticity threshold that depends on their entropy. Strings with facticity above this threshold have no optimal stochastic model and are essentially computational. The shape of the facticity-versus-entropy plot coincides with the well-known sawtooth curves observed in complex systems. The notion of factic processes is discussed. This approach overcomes problems with earlier proposals to use two-part code to define the meaningfulness or usefulness of a data set.
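Since Kolmogorov complexity is uncomputable, facticity cannot be computed exactly. As a purely illustrative sketch (not the paper's construction), one can use a real compressor such as zlib as a crude stand-in for Kolmogorov complexity and estimate how much of a string's length is accounted for by structure. The proxy below only illustrates the dichotomy the abstract describes: incompressible (random) strings carry essentially no structural information, while regular strings do. It does not reproduce the paper's two-part code, faithful index, or the (1/2)|x| + O(1) bound.

```python
import os
import zlib


def compressed_len(x: bytes) -> int:
    # zlib at maximum compression as a rough, computable
    # stand-in for the (uncomputable) Kolmogorov complexity K(x).
    return len(zlib.compress(x, 9))


def facticity_proxy(x: bytes) -> float:
    # Crude illustrative proxy: the fraction of |x| that compression
    # removes, read as "structural" information. Real facticity is the
    # model part of an optimal two-part code and is not computable.
    saved = max(0, len(x) - compressed_len(x))
    return saved / len(x)


# A random string is incompressible, so the proxy is near 0,
# matching the claim that random strings have facticity 0.
random_x = os.urandom(4096)
print(facticity_proxy(random_x))

# A highly regular string compresses well, so the proxy is large,
# matching the claim that compressible strings have positive facticity.
structured_x = b"ab" * 2048
print(facticity_proxy(structured_x))
```

The choice of zlib is arbitrary; any real compressor gives a similar qualitative picture, and all of them only upper-bound Kolmogorov complexity.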
Publication:
arXiv e-prints
 Pub Date:
 March 2012
 DOI:
 10.48550/arXiv.1203.2245
 arXiv:
 arXiv:1203.2245
 Bibcode:
 2012arXiv1203.2245A
Keywords:
Computer Science - Information Theory
E-Print:
 10 pages, 2 figures