Abstract
We study the problem of counting $k$-hypergraphlets, an interesting but surprisingly ignored primitive, with the aim of understanding whether efficient algorithms exist. To this end, we consider color coding, a well-known technique for approximately counting $k$-graphlets in graphs. Our first result is that, on hypergraphs, color coding encounters a quadratic barrier: under the Orthogonal Vector Conjecture, no implementation can run in sub-quadratic time in the input size. We then introduce a simple property, $(\alpha,\beta)$-niceness, that hypergraphs from real-world datasets appear to satisfy for small values of $\alpha$ and $\beta$. Intuitively, an $(\alpha,\beta)$-nice hypergraph can be split into two sub-hypergraphs having respectively rank at most $\alpha$ and degree at most $\beta$. By applying different techniques to each sub-hypergraph and carefully combining the outputs, we show how to run color coding in time $2^{O(k)} \cdot (2^\beta |V| + \alpha^k |E| + \alpha^2 \beta \|H\|)$, where $H=(V,E)$ is the input hypergraph. Afterwards, we can sample colorful $k$-hypergraphlets uniformly in expected $k^{O(k)} \cdot (\beta^2 + \ln |V|)$ time per sample. Experiments on real-world hypergraphs show that our algorithm significantly outperforms the naive quadratic algorithm, sometimes by more than an order of magnitude.