Abstract
arXiv:2512.09786v2
Examples of embedded intelligence include a wide variety of tiny neural networks running on board wireless sensors and actuators, which are expected to continuously perform inference on time series of the data they sense. To meet lifetime and energy-consumption requirements when operating on battery, such hardware is based exclusively on microcontrollers with as little memory as possible, e.g., 128 kB of RAM. In this context, optimizing data flows across neural network layers during inference becomes crucial. In this paper, we introduce a new framework, TinyDéjàVu, and novel algorithms we designed to drastically reduce the RAM budget required for inference with various neural network models for sensor-data time series on typical microcontroller hardware. We publish the implementation of TinyDéjàVu as open source, and we perform reproducible benchmarks on common microcontroller hardware (Arm Cortex-M). We show that, on overlapping sliding-window inputs, TinyDéjàVu can reduce RAM usage by up to 90% at equal compute latency compared to prior work (StreamiNNC).