Researchers at Deezer have developed an AI system capable of associating songs with moods and intensities, as spotted by VentureBeat. The work is described in a newly published paper on arXiv.org titled "Music Mood Detection Based on Audio and Lyrics With Deep Neural Net."
To determine a song's mood, the team considered both the audio signal and the lyrics. They fed audio signals into a neural network, along with models that reconstruct the linguistic contexts of words. To teach the system how to determine the mood of a song, they used the Million Song Dataset (MSD), a collection of metadata for over 1 million contemporary songs. In particular, they used Last.fm's dataset, which assigns identifiers to tracks from over 500,000 unique tags. Many of these tags are mood-related. To train the system, over 14,000 English words from these tags were each given two ratings: one for how negative or positive the word is, and one for how calm or energetic it is.
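As a rough illustration of how word-level ratings could become a mood label for a track, here is a minimal Python sketch. It assumes each track's tags are looked up in a word lexicon and the matching ratings are averaged; the article does not say how the researchers aggregated ratings, so the averaging step, the `WORD_RATINGS` table, its values, and the `track_mood` function are all hypothetical.

```python
from statistics import mean

# Hypothetical lexicon: word -> (valence, arousal), each on a 0-1 scale.
# The values are made up for illustration and are not from the paper.
WORD_RATINGS = {
    "happy": (0.9, 0.6),
    "calm": (0.7, 0.1),
    "angry": (0.1, 0.9),
    "sad": (0.1, 0.2),
}


def track_mood(tags):
    """Average the ratings of every tag found in the lexicon.

    Returns a (valence, arousal) tuple, or None when no tag is rated.
    Averaging is an assumption, not the method described in the paper.
    """
    rated = [WORD_RATINGS[t.lower()] for t in tags if t.lower() in WORD_RATINGS]
    if not rated:
        return None
    valence = mean(v for v, _ in rated)
    arousal = mean(a for _, a in rated)
    return valence, arousal
```

Calling `track_mood(["Happy", "calm", "rock"])` would ignore the unrated tag "rock" and average the other two, giving a fairly positive, fairly calm score.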
The Million Song Dataset contains only metadata for songs, not the songs themselves, so the team matched this information against Deezer's catalog using identifiers like song titles, artist names, and album titles. About 60 percent of the resulting dataset (18,644 tracks) was used to train the AI, with the rest used to validate and further test the system.
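The matching and splitting steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`title`, `artist`, `album`), the exact-match-after-normalization rule, and the even division of the remaining 40 percent between validation and test are all hypothetical, since the article gives only the 60 percent training share.

```python
import random


def normalize(text):
    """Lowercase and collapse whitespace so formatting differences do not block a match."""
    return " ".join(text.lower().split())


def match_key(track):
    """Build a comparison key from the three metadata fields (hypothetical field names)."""
    return tuple(normalize(track[field]) for field in ("title", "artist", "album"))


def match_tracks(msd_rows, catalog_rows):
    """Pair each metadata row with the catalog row that shares its key, if any."""
    catalog_index = {match_key(row): row for row in catalog_rows}
    return [
        (row, catalog_index[match_key(row)])
        for row in msd_rows
        if match_key(row) in catalog_index
    ]


def split_dataset(items, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle reproducibly, then cut into train, validation, and test parts.

    The 60 percent training share comes from the article; the 20/20 split
    of the remainder is an assumption.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test
```

A real pipeline would likely need fuzzier matching (for example, handling "feat." credits or remastered editions), which this sketch deliberately leaves out.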
In the end, the researchers concluded that the AI detected how calm or energetic a song was better than more traditional approaches that did not use AI, and performed about the same when it came to detecting whether a song was positive or negative. "It seems that this gain of performance is the result of the capacity of our model to unveil and use mid-level correlations between audio and lyrics, particularly when it comes to predicting valence," the researchers wrote in the paper.
The paper notes that, to really build on this work, a "database with synchronized lyrics and audio would be of great help to go further." If such a database existed, the team believes it could better account for ambiguity in the mood of tracks, as "in some cases, there can be significant variability between listeners" (people might not always agree on whether a song is positive or negative, for example). Ultimately, the researchers see this sort of work as a way to further explore how music, lyrics, and mood correlate, and to make it possible for deep learning models to sort through large volumes of unlabeled data.
This is far from the first time Deezer has attempted to use AI to sort through music. Last year, it took on a challenge at Sónar festival to answer the question, "When a user is at home, how can we detect the context in which they are listening to music and recommend music accordingly?" Deezer could theoretically use this type of machine learning in the future to automatically sort and catalog music, not just by basic metadata like the artist's name or genre, but by something more nuanced like mood.