Welcome ;)

Here are my latest posts

All about Speech, Deep Learning and Life.


  • Popular Music Genres Introduction

    There are so many different types of music, like Jazz, R&B, Blues, etc. If you listen to each of them a few times, you may get a feel for what it sounds like, but still not know why. So the motivation of this article is to describe popular music genres in an academic way. I’ll try to introduce each common genre specifically, with some typical examples where that helps. This is interesting work, and I’ll keep adding to it.

  • Interspeech 2019 Voice Conversion Paper Review

    Interspeech 2019 has two sessions about Voice Conversion (VC). In this post, I mostly review the VC-related papers in the session ‘Neural Techniques for Voice Conversion and Waveform Generation’, which is mainly about transforming speaker information. It is interesting that StarGAN has become very popular this year. All three StarGAN papers try to improve performance by modifying its architecture or training strategy. One-shot learning VC also shows up in three papers, which convert source speech to an arbitrary target speaker using a very limited target-speaker corpus. One of them uses a VAE, while the other two use PPGs. There are also three VC works with Tomoki Toda’s name on them, all based on the VAE framework. (W.I.P)