Total: 1
At present, neural network-based models, including transformers, struggle to generate memorable and readily comprehensible music from unified and repetitive musical material due to a lack of understanding of musical structure. Consequently, these models are rarely employed by the games industry. It is hypothesised by many scholars that the modelling of musical structure may inform models at a higher level, thereby enhancing the quality of music generation. The aim of this study is to explore the performance of supervised learning methods in the task of structural segmentation, which is the initial step in music structure modelling. An audio game music dataset with 309 structural annotations was created to train the proposed method, which combines convolutional neural networks and recurrent neural networks, achieving performance comparable to the state-of-the-art unsupervised learning methods with fewer training resources.