← all papers · overview

On the near-tightness of $ḩi łeq 2r$: a general $σ$-ary construction and a binary case via LFSRs

Abstract

In the field of compressed string indexes, recent work has introduced suffixient sets and their corresponding repetitiveness measure $\chi$. In particular, researchers have explored its relationship to other repetitiveness measures, notably $r$, the number of runs in the Burrows--Wheeler Transform (BWT) of a string. Navarro et al. (2025) proved that $\chi \leq 2r$, although empirical results by Cenzato et al. (2024) suggest that this bound is loose, with real data bounding $\chi$ by around $1.13r$ to $1.33r$ when the size of the alphabet is $\sigma = 4$. To better understand this gap, we present two cases for the asymptotic tightness of the $\chi \leq 2r$ bound: a general construction for arbitrary $\sigma$ values, and a binary alphabet case, consisting of de Bruijn sequences constructed by linear-feedback shift registers (LFSRs) from primitive polynomials over $\mathbb{F}_2$. The second is a novel characterization of which de Bruijn sequences achieve the literature run-minimal pattern for the cyclic BWT. Moreover, we show that de Bruijn sequences fail to close the gap for $\sigma \geq 3$.

Related papers

Ranked by semantic similarity — how closely each paper's abstract matches this one (100% = near-identical topic).