Deezer researchers develop an AI system that can detect the musical mood of songs

  • Category: Industry
  • Author: 音樂地圖

      2018/10/27 06:40

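    ◎Researchers at Deezer have developed an AI system that can associate songs with moods and intensities. The work is described in a paper recently published on arXiv.org titled "Music Mood Detection Based on Audio and Lyrics with Deep Neural Net."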

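    ◎To determine a song's musical mood, the team considered both the audio signal and the lyrics. They first fed audio signals into a neural network, together with models that reconstruct the linguistic contexts of words. Then, to teach the system how to determine a song's mood, they used the Million Song Dataset (MSD), a collection of metadata for more than 1 million contemporary songs. In particular, they used the Last.fm dataset, which assigns identifiers to tracks from more than 500,000 unique tags. Many of these tags are mood-related, and more than 14,000 English words from them were rated on two scales, one for how negative or positive a word is and one for how calm or energetic it is. These ratings were used to train the system.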

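    ◎The Million Song Dataset contains only metadata for songs, not the songs themselves, so the team then matched this information to Deezer's catalog using identifiers such as song titles, artist names, and album titles. About 60 percent of the resulting dataset of 18,644 tracks was used to train the AI, and the rest was used to validate and further test the system. In the end, the researchers concluded that the AI was better than traditional approaches that did not use AI at detecting whether a song is calm or energetic, but performed about the same at detecting whether a song is positive or negative.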

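    ◎The paper notes that, to really build on this work, a "database with synchronized lyrics and audio would be of great help to go further." If such a database existed, the team believes it could more precisely capture the ambiguity in a track's mood, since "in some cases, there can be significant variability between listeners" (for example, people may not always agree on whether a song is positive or negative). Ultimately, the researchers see this kind of work as a way to further study how music, lyrics, and mood relate, and as a step toward deep learning models that can sort through and find unlabeled data in high volume.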

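    ◎This is not the first time Deezer has tried to use AI to sort music. Last year, the platform took on a challenge at the Sónar festival to answer the question, "When a user is at home, how can we detect the context in which they are listening to music and recommend music accordingly?" In theory, Deezer could use this type of machine learning in the future to automatically sort and catalog music, not just by basic metadata such as the artist's name or the genre, but also by something more nuanced such as mood.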

    Full text:

    Researchers at Deezer have developed an AI system capable of associating certain songs with moods and intensities, as spotted by VentureBeat. The work is described in a newly published paper on arXiv.org titled "Music Mood Detection Based on Audio and Lyrics with Deep Neural Net."

    To determine a song's musical mood, the team considered both the audio signal and the lyrics. To start, they fed audio signals into a neural network, along with models that reconstruct the linguistic contexts of words. Then, to teach it how to determine the mood of a song, they used the Million Song Dataset (MSD), which is a collection of metadata for over 1 million contemporary songs. In particular, they used Last.fm's dataset, which assigns identifiers to tracks from over 500,000 unique tags. Many of these tags are mood-related, and over 14,000 English words from these tags were given two scale ratings, one for how negative or positive a word is and one for how calm or energetic it is, in order to train the system.
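
    The tag-rating step described above can be sketched roughly as follows. This is a minimal illustration, not the researchers' code: the `WORD_RATINGS` table, its 1-to-9 scale, its numbers, and the choice to average the ratings of a track's tags are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical word ratings as (valence, arousal) pairs on an assumed 1-9 scale.
# The values are invented for illustration and are not taken from the paper.
WORD_RATINGS = {
    "happy": (8.5, 6.0),
    "sad": (2.1, 3.5),
    "calm": (6.9, 1.7),
    "angry": (2.5, 6.2),
}


def track_mood(tags):
    """Average the (valence, arousal) ratings of the tags found in the lexicon.

    Returns None when none of the tags is rated, so that such tracks
    can be left out of the labeled dataset.
    """
    rated = [WORD_RATINGS[t.lower()] for t in tags if t.lower() in WORD_RATINGS]
    if not rated:
        return None
    valence = mean(v for v, _ in rated)
    arousal = mean(a for _, a in rated)
    return valence, arousal


if __name__ == "__main__":
    # "rock" has no rating in the toy lexicon and is ignored.
    print(track_mood(["Happy", "calm", "rock"]))
    print(track_mood(["rock"]))
```

    Here valence stands for the negative-to-positive scale and arousal for the calm-to-energetic scale. Tracks whose tags carry no rated word get no mood label in this sketch.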

    The Million Song Dataset just contains metadata for songs, not the songs themselves, so the team then paired all this information to Deezer's catalog using identifiers like song titles, artist names, and album titles. About 60 percent of the resulting dataset of 18,644 tracks was used to train the AI, with the rest used to validate and further test the system.
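
    The matching and splitting described above might look something like the sketch below. The field names (`title`, `artist`, `album`), the exact-match-after-normalization rule, and the even split of the remaining 40 percent into validation and test sets are assumptions for illustration; the article only says that about 60 percent was used for training.

```python
import random
import unicodedata


def normalize(text):
    """Lowercase, strip accents, and collapse whitespace for comparison."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.lower().split())


def match_key(track):
    """Build a comparison key from the (assumed) title, artist, and album fields."""
    return tuple(normalize(track[field]) for field in ("title", "artist", "album"))


def match_tracks(metadata_rows, catalog_rows):
    """Pair each metadata row with the catalog row that has the same key, if any."""
    catalog_index = {match_key(row): row for row in catalog_rows}
    pairs = []
    for row in metadata_rows:
        hit = catalog_index.get(match_key(row))
        if hit is not None:
            pairs.append((row, hit))
    return pairs


def split_dataset(items, train=0.6, validation=0.2, seed=0):
    """Shuffle with a fixed seed, then cut into train, validation, and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )
```

    A real pipeline would likely need fuzzier matching to cope with alternate spellings and reissued albums; exact matching on normalized strings is only the simplest starting point.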

    In the end, the researchers concluded that the AI was better at detecting how calm or energetic a song was than more traditional approaches that did not use AI, and performed about the same when it came to detecting whether a song was positive or negative. "It seems that this gain of performance is the result of the capacity of our model to unveil and use mid-level correlations between audio and lyrics, particularly when it comes to predicting valence," the researchers wrote in the paper.

    The paper notes that in order to really leverage this work, a "database with synchronized lyrics and audio would be of great help to go further." If such a database existed, the team believes they could more finely determine the ambiguity in the mood of tracks, as "in some cases, there can be significant variability between listeners" (people might not always agree on whether a song is positive or negative, for example). Ultimately, the researchers see this sort of work as a way to further look into how music, lyrics, and mood correlate, as well as a step toward deep learning models that can sort through and find unlabeled data in high volume.

    This is far from the first time Deezer has attempted to use AI in order to sort through music. Last year, it took on a challenge at Sónar festival to answer the question, "When a user is at home, how can we detect the context in which they are listening to music and recommend music accordingly?" Deezer could theoretically use this type of machine learning in the future to automatically sort and catalog music — not just with basic metadata, like the artist's name or genre of music, but something more nuanced like mood.

    The Verge

    https://bit.ly/2xN7cpZ