2 code implementations • 24 Jul 2023 • Łukasz Dębowski
The article introduces corrections to Zipf's and Heaps' laws based on systematic models of the proportion of hapaxes, i.e., words that occur once.
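To make the quantity concrete, here is a minimal sketch of how a hapax proportion might be computed from a tokenized text. It assumes the proportion is taken over distinct word types rather than over all tokens, and it assumes naive whitespace tokenization; the function name `hapax_proportion` is hypothetical and not taken from the article.

```python
from collections import Counter


def hapax_proportion(tokens):
    """Return the share of distinct word types that occur exactly once.

    Assumption: the denominator is the number of types (vocabulary size),
    not the number of tokens.
    """
    counts = Counter(tokens)
    if not counts:
        return 0.0
    # A hapax is a type whose frequency in the sample is exactly 1.
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return hapaxes / len(counts)


# Illustrative sample with naive whitespace tokenization (an assumption).
tokens = "the cat sat on the mat and the dog sat too".split()
print(hapax_proportion(tokens))
```

In the sample, 6 of the 8 distinct types occur once, so the proportion is 0.75.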
no code implementations • 17 Feb 2023 • Łukasz Dębowski
It was observed that large language models exhibit a power-law decay of cross entropy with respect to the number of parameters and training tokens.
no code implementations • 27 Sep 2022 • Łukasz Dębowski
It is known that such minimal codes are strongly universal for a strictly positive entropy rate, whereas the number of rules in the minimal grammar constitutes an upper bound for the mutual information of the source.
no code implementations • 25 Nov 2020 • Łukasz Dębowski
Using both results, we show that all (also uncomputable) sources of a finite unifilar order exhibit sub-power-law growth of algorithmic mutual information and of the unifilar order estimator.
Information Theory • MSC: 62M05 (Primary); 60G10, 94A17, 94A29 (Secondary)
no code implementations • 14 Jun 2017 • Łukasz Dębowski
As we discuss, a stationary stochastic process is nonergodic when a random persistent topic can be detected in the infinite random text sampled from the process, whereas we call the process strongly nonergodic when an infinite sequence of independent random bits, called probabilistic facts, is needed to describe this topic completely.
no code implementations • 31 Oct 2013 • Łukasz Dębowski
Hilberg's conjecture about natural language states that the mutual information between two adjacent long blocks of text grows like a power of the block length.
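One common way to write this statement down is sketched below. The notation is an assumption rather than the paper's own: $X_{j}^{k}$ denotes the block of text from position $j$ to $k$, $I(\cdot\,;\cdot)$ is mutual information, and $\beta$ is the power-law exponent, usually taken to lie strictly between 0 and 1.

$$
I\left(X_{1}^{n};\, X_{n+1}^{2n}\right) \propto n^{\beta}, \qquad 0 < \beta < 1 .
$$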
no code implementations • 27 Apr 2013 • Ramon Ferrer-i-Cancho, Łukasz Dębowski, Fermín Moscoso del Prado Martín
We show that constant entropy rate (CER) and two interpretations for uniform information density (UID), full UID and strong UID, are inconsistent with these laws.