AIST Stories No2

Impact in the following fields! Community life / Industry
◦ Services
◦ Sports and leisure
◦ Education
◦ IT and Telecommunications
◦ Electronics
◦ Services
◦ Software and content

AIST supporting livelihoods!

Toward making singing synthesis more natural
Singing synthesis software, such as Hatsune Miku created by Crypton Future Media, uses VOCALOID, a singing synthesis technology developed by Yamaha. In the past it was difficult to produce natural-sounding singing with this technology, but AIST overcame this problem with "VocaListener". Yamaha started to sell a commercial version of VocaListener in 2012. (From a Yamaha press release of September 18, 2012)

ways to enjoy music are continuing to open up.

Researching singing synthesis technology for creating music

The advance of technology is bringing about a new culture of music. Singing synthesis technology, which has been attracting attention since 2007, has produced, for the first time in history, a culture that can positively enjoy songs with synthesized voices as main vocals. This innovative culture is an expression of Japan's strengths, and AIST is pursuing research to take this culture further forward.

One such singing synthesis technology is called VocaListener ("Bokarisu"). This technology enables the synthesis of singing voices that imitate the way a user sings. The user need only input a sample of their singing into VocaListener, which then controls commercially available singing synthesis software with various voice timbres to naturally synthesize voices that imitate the singing.

In this era of digital music, music-understanding technology is supporting access to the huge body of songs on the Internet, and singing synthesis technology is expanding the scope for enjoyment in creating music.
These trends will not be limited to music but will spread to other forms of content, such as videos.

Goto adds that "There is a problem of whether, as the digitalization of content produces a society in which nothing is forgotten, the future might be buried under the huge and continually growing body of content from the past. For listeners, selecting music is getting more difficult, and for creators, their creations can easily just disappear into obscurity. In addition, as more and more similar content appears, the risk that a person's creations will be criticized for being too similar to other pieces increases, and there is a danger that we will end up with a society in which people hesitate to create and publish their own content. By continuing with research and development powered by AIST’s core technologies, we are aiming for a 'Content-Symbiotic Society' in which everyone can enjoy appreciating and creating music without hesitation while paying due respect to older content."

Songrium
▲ Songrium visualizes various musical relations between music videos on video sharing websites. The user can keep encountering new songs and review a history of old songs. (From an AIST press release of August 27, 2013)

Songle (From an AIST press release of August 29, 2012)
▶ The results of automatic understanding of the content of a song on the Internet are visualized as a music map.
While viewing the structure of the song, the user may listen only to a rousing chorus or may compare repeated sections.

[Figure: Songle, an active music listening service. Music content on the Internet (MP3, Piapro, SoundCloud, Niconico, YouTube) is processed by automatic music-understanding technologies and visualized as a "Music Map". Users (anyone can use it for free) access the service on the web using web browsers. Service functions: music playback, music visualization, correction of estimation errors, chorus search.]

Visualized music content in the "Music Map":
◦ Music structure: chorus sections, repeated sections
◦ Beat structure: musical beats and bar lines
◦ Chords: root note and chord type
◦ Melody line: tone of the vocal melody

Given the user’s singing voice and the lyrics, VocaListener can synthesize a singing voice that mimics the user. Even when the singing synthesis software or the sound source (singer’s voice) is switched, it can always synthesize the singing voice in the same way (tone and loudness of the voice).

[Figure: An example of the user’s singing voice (tone of the voice and loudness of the voice, each plotted over time) and the lyrics “Tachidomarutoki, mata futo furikaeru” are analyzed. Parameters are then guessed and the voice is synthesized with several singing synthesis software packages (software A, software B) and several sound sources (singers’ voices), producing a natural singing voice like a human.]