no code implementations • IWSLT 2017 • Hao Qin, Takahiro Shinozaki, Kevin Duh
Neural machine translation (NMT) systems have demonstrated promising results in recent years.
no code implementations • SIGDIAL (ACL) 2022 • Longfei Yang, Jiyi Li, Sheng Li, Takahiro Shinozaki
In the slot self-attention layers, we force each slot to involve information from the other k prominent slots and mask the rest out.
Dialogue State Tracking Multi-domain Dialogue State Tracking +1
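As a rough illustration of the masking idea in the snippet above, the following sketch restricts each slot's attention to its k highest-scoring other slots and zeroes out the rest. It is not the authors' implementation: the function name `topk_masked_attention`, the use of raw attention scores to decide which slots are "prominent", and the exclusion of the slot itself are all assumptions made for this example.

```python
import math

def topk_masked_attention(scores, k):
    """Row-wise softmax over the k highest-scoring *other* slots.

    scores: square list of lists, scores[i][j] = raw attention score
            of slot i toward slot j (hypothetical input format).
    k:      number of other slots each slot may attend to (1 <= k <= n - 1).
    """
    n = len(scores)
    weights = []
    for i, row in enumerate(scores):
        # Assumption: a slot does not count itself among the "other" slots.
        others = [j for j in range(n) if j != i]
        keep = sorted(others, key=lambda j: row[j], reverse=True)[:k]
        # Numerically stable softmax restricted to the kept slots.
        m = max(row[j] for j in keep)
        exps = {j: math.exp(row[j] - m) for j in keep}
        total = sum(exps.values())
        # Masked-out slots receive exactly zero attention weight.
        weights.append([exps[j] / total if j in exps else 0.0 for j in range(n)])
    return weights

scores = [
    [0.0, 2.0, 1.0, -1.0],
    [0.5, 0.0, 0.1, 3.0],
    [1.0, 1.5, 0.0, 0.2],
    [2.0, -0.5, 0.3, 0.0],
]
weights = topk_masked_attention(scores, k=2)
```

With k=2, each row of `weights` has exactly two non-zero entries that sum to 1; for example, slot 0 attends only to slots 1 and 2.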
no code implementations • 9 Sep 2022 • Takafumi Moriya, Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Takahiro Shinozaki
We confirm in experiments that our TS-ASR achieves comparable recognition performance with conventional cascade systems in the offline setting, while reducing computation costs and realizing streaming TS-ASR.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
1 code implementation • COLING 2022 • Yidong Wang, Hao Wu, Ao Liu, Wenxin Hou, Zhen Wu, Jindong Wang, Takahiro Shinozaki, Manabu Okumura, Yue Zhang
Limited labeled data increase the risk of distribution shift between test data and training data.
4 code implementations • 12 Aug 2022 • Yidong Wang, Hao Chen, Yue Fan, Wang Sun, Ran Tao, Wenxin Hou, RenJie Wang, Linyi Yang, Zhi Zhou, Lan-Zhe Guo, Heli Qi, Zhen Wu, Yu-Feng Li, Satoshi Nakamura, Wei Ye, Marios Savvides, Bhiksha Raj, Takahiro Shinozaki, Bernt Schiele, Jindong Wang, Xing Xie, Yue Zhang
We further provide the pre-trained versions of the state-of-the-art neural models for CV tasks to make the cost affordable for further tuning.
4 code implementations • 15 May 2022 • Yidong Wang, Hao Chen, Qiang Heng, Wenxin Hou, Yue Fan, Zhen Wu, Jindong Wang, Marios Savvides, Takahiro Shinozaki, Bhiksha Raj, Bernt Schiele, Xing Xie
Semi-supervised Learning (SSL) has witnessed great success owing to the impressive performances brought by various methods based on pseudo labeling and consistency regularization.
1 code implementation • 14 Dec 2021 • Yidong Wang, BoWen Zhang, Wenxin Hou, Zhen Wu, Jindong Wang, Takahiro Shinozaki
The long-tailed class distribution in visual recognition tasks poses great challenges for neural networks on how to handle the biased predictions between head and tail classes, i.e., the model tends to classify tail classes as head classes.
2 code implementations • NeurIPS 2021 • BoWen Zhang, Yidong Wang, Wenxin Hou, Hao Wu, Jindong Wang, Manabu Okumura, Takahiro Shinozaki
However, like other modern SSL algorithms, FixMatch uses a pre-defined constant threshold for all classes to select unlabeled data that contribute to the training, thus failing to consider different learning status and learning difficulties of different classes.
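The constant-threshold selection described above can be sketched as follows. This is a minimal illustration, not the FixMatch codebase: the function name `select_pseudo_labels` and the toy probabilities are hypothetical, and the threshold value 0.95 is only an illustrative choice.

```python
def select_pseudo_labels(probs, threshold=0.95):
    """Return (sample_index, predicted_class) for confident predictions.

    probs:     list of per-sample class-probability lists (toy input).
    threshold: one constant confidence cut-off shared by all classes.
    """
    selected = []
    for i, p in enumerate(probs):
        confidence = max(p)
        # The same threshold applies regardless of which class is predicted.
        if confidence >= threshold:
            selected.append((i, p.index(confidence)))
    return selected

probs = [
    [0.97, 0.02, 0.01],  # confident in class 0: kept
    [0.60, 0.30, 0.10],  # below the threshold: ignored
    [0.03, 0.01, 0.96],  # confident in class 2: kept
]
picked = select_pseudo_labels(probs)
```

Because the cut-off is identical for every class, a class the model is still uncertain about contributes few or no unlabeled samples, which is the limitation the snippet points out.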
2 code implementations • 18 May 2021 • Wenxin Hou, Han Zhu, Yidong Wang, Jindong Wang, Tao Qin, Renjun Xu, Takahiro Shinozaki
Based on our previous MetaAdapter that implicitly leverages adapters, we propose a novel algorithm called SimAdapter for explicitly learning knowledge from adapters.
Ranked #1 on Cross-Lingual ASR on Common Voice
1 code implementation • 15 Apr 2021 • Wenxin Hou, Jindong Wang, Xu Tan, Tao Qin, Takahiro Shinozaki
End-to-end automatic speech recognition (ASR) can achieve promising performance with large-scale training data.
Ranked #1 on Cross-environment ASR on Libri-Adapt
Automatic Speech Recognition Automatic Speech Recognition (ASR) +4
1 code implementation • 25 Oct 2020 • Wenxin Hou, Yue Dong, Bairong Zhuang, Longfei Yang, Jiatong Shi, Takahiro Shinozaki
In this paper, we report a large-scale end-to-end language-independent multilingual model for joint automatic speech recognition (ASR) and language identification (LID).
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
no code implementations • 10 Nov 2017 • Taku Kato, Takahiro Shinozaki
The key problem here is the cost of transcribing speech data.