SSM-Net: Feature Learning for Music Structure Analysis Using a Self-Similarity-Matrix Based Loss

Abstract

In this paper, we propose a new paradigm for learning audio features for Music Structure Analysis (MSA). We train a deep encoder to learn features such that the Self-Similarity-Matrix (SSM) computed from them approximates a ground-truth SSM. This is done by minimizing a loss between the two SSMs. Since this loss is differentiable with respect to its input features, the encoder can be trained in a straightforward way. We demonstrate this training paradigm on the RWC-Pop dataset, using the Area Under the ROC Curve (AUC) as the evaluation measure.
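
The idea of comparing a feature-derived SSM to a ground-truth SSM can be illustrated with a minimal sketch. The abstract does not state which similarity measure or loss is used, so the cosine similarity (rescaled to [0, 1]), the binary cross-entropy loss, and all function names below are assumptions chosen for illustration. The toy feature vectors stand in for the encoder output.

```python
import math

def cosine_ssm(features):
    """Self-similarity matrix of a sequence of feature vectors.

    Assumption: cosine similarity, rescaled from [-1, 1] to [0, 1]
    so that entries can be compared to binary targets.
    """
    norms = [math.sqrt(sum(x * x for x in f)) or 1.0 for f in features]
    unit = [[x / n for x in f] for f, n in zip(features, norms)]
    return [[0.5 * (1.0 + sum(a * b for a, b in zip(u, v))) for v in unit]
            for u in unit]

def ground_truth_ssm(labels):
    """Binary SSM: 1 where two frames share a segment label, else 0."""
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]

def ssm_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy between two SSMs (an assumed loss choice)."""
    total, count = 0.0, 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
            total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
            count += 1
    return total / count

# Toy example: four frames, two segments (hypothetical labels and features).
labels = ["A", "A", "B", "B"]
good = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0], [-1.0, 0.0]]  # matches the labels
bad = [[1.0, 0.0], [-1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]   # contradicts them

target = ground_truth_ssm(labels)
loss_good = ssm_loss(cosine_ssm(good), target)
loss_bad = ssm_loss(cosine_ssm(bad), target)
print(loss_good < loss_bad)
```

Features whose SSM matches the segment structure give a lower loss than features that contradict it. Every step from features to loss is built from differentiable operations, so in an automatic-differentiation framework the same computation would let gradients flow back to the encoder parameters.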
