An Analysis of the GTZAN Music Genre Dataset, 2012

Authors: Bob L. Sturm
Type: Conference paper
Conference: ACM Multimedia 2012
Title: Proceedings of the second international ACM workshop on Music information retrieval with user-centered and multimodal strategies
Year: 2012

Abstract: Most research in automatic music genre recognition has used the dataset assembled by Tzanetakis et al. in 2001. The composition and integrity of this dataset, however, have never been formally analyzed. For the first time, we provide an analysis of its composition, and create a machine-readable index of artist and song titles, identifying nearly all excerpts. We also catalog numerous problems with its integrity, including replications, mislabelings, and distortion.

