no code implementations • 7 Aug 2018 • Yu-Hsuan Wang, Hung-Yi Lee, Lin-shan Lee
In this paper, we extend audio Word2Vec from the word level to the utterance level by proposing a new segmental audio Word2Vec, in which unsupervised spoken-word boundary segmentation and audio Word2Vec are jointly learned and mutually enhanced. An utterance can then be directly represented as a sequence of vectors carrying phonetic structure information.
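To make the idea concrete, here is a minimal sketch of the representation step only: given frame-level features and hypothesized word boundaries, each segment is pooled into one vector, so the utterance becomes a sequence of vectors. Mean pooling stands in for the paper's learned segment encoder, and all names here are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np

def segments_to_vectors(frames, boundaries):
    """Turn an utterance into a sequence of segment vectors.

    frames     : (T, d) array of frame-level acoustic features.
    boundaries : sorted list of frame indices where a new segment starts
                 (exclusive ends of the previous segments), e.g. [3, 7].
    Returns an (n_segments, d) array, one pooled vector per segment.
    Mean pooling is a stand-in for the learned encoder in the paper.
    """
    vecs = []
    start = 0
    for end in list(boundaries) + [len(frames)]:
        # assumes boundaries are strictly increasing and within (0, T)
        vecs.append(frames[start:end].mean(axis=0))
        start = end
    return np.stack(vecs)

# example: a 10-frame utterance with 2-dim features, segmented at frames 3 and 7
frames = np.arange(20, dtype=float).reshape(10, 2)
vectors = segments_to_vectors(frames, [3, 7])
```

With boundaries `[3, 7]` the 10 frames yield three segments, so `vectors` has shape `(3, 2)`; the joint learning in the paper would additionally adapt the boundaries themselves from the embedding objective.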
1 code implementation • 22 Mar 2017 • Yu-Hsuan Wang, Cheng-Tao Chung, Hung-Yi Lee
In this paper, we analyze the gate activation signals inside gated recurrent neural networks and find that the temporal structure of these signals is highly correlated with phoneme boundaries.
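The analysis above rests on reading out gate activations over time. As a rough illustration, the sketch below runs a tiny hand-written GRU cell over a feature sequence and records the mean update-gate activation at each frame; a real study would inspect a trained network, and the class and weights here are hypothetical, randomly initialized stand-ins.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

class TinyGRU:
    """Minimal GRU cell, for illustration only (random, untrained weights)."""

    def __init__(self, n_in, n_hid, rng):
        s = 0.1
        self.Wz, self.Uz = rng.normal(0, s, (n_hid, n_in)), rng.normal(0, s, (n_hid, n_hid))
        self.Wr, self.Ur = rng.normal(0, s, (n_hid, n_in)), rng.normal(0, s, (n_hid, n_hid))
        self.Wh, self.Uh = rng.normal(0, s, (n_hid, n_in)), rng.normal(0, s, (n_hid, n_hid))

    def gate_signals(self, xs):
        """Run over frames xs (T, n_in); return the per-frame mean
        update-gate activation z_t — the 'gate activation signal'."""
        h = np.zeros(self.Uz.shape[0])
        signals = []
        for x in xs:
            z = sigmoid(self.Wz @ x + self.Uz @ h)            # update gate
            r = sigmoid(self.Wr @ x + self.Ur @ h)            # reset gate
            h_cand = np.tanh(self.Wh @ x + self.Uh @ (r * h)) # candidate state
            h = z * h + (1.0 - z) * h_cand                    # state update
            signals.append(z.mean())
        return np.array(signals)

rng = np.random.default_rng(0)
gru = TinyGRU(n_in=13, n_hid=8, rng=rng)
xs = rng.normal(size=(50, 13))   # e.g. 50 frames of 13-dim features
signals = gru.gate_signals(xs)   # one value per frame, each in (0, 1)
```

In the paper's setting, sharp temporal changes in such signals from a trained network are what line up with phoneme boundaries; with random weights as here, the signal is only a demonstration of how to extract it.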