no code implementations • ECCV 2020 • Linlin Chao, Jingdong Chen, Wei Chu
However, CTC tends to output spiky distributions since it prefers to output blank symbol most of the time.
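The spiky behaviour mentioned above can be seen in a toy example. The sketch below is only an illustration, not the paper's method: the per-frame best path is made up, and `-` is a hypothetical stand-in for the blank symbol. It applies the standard CTC collapse rule (merge repeats, then drop blanks) and measures how much of the path is blank.

```python
from itertools import groupby

BLANK = "-"  # hypothetical symbol chosen here to stand for the CTC blank


def ctc_collapse(frame_labels):
    """Merge consecutive repeats, then drop blanks (the standard CTC collapse rule)."""
    merged = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in merged if label != BLANK)


# A made-up per-frame best path: mostly blanks with isolated label spikes.
best_path = list("---c------a--a-----t---")
blank_ratio = best_path.count(BLANK) / len(best_path)
decoded = ctc_collapse(best_path)

print(decoded, round(blank_ratio, 2))
```

Here only 4 of the 23 frames carry a label, and the two `a` spikes survive as separate characters because a blank sits between them.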
1 code implementation • 31 Jan 2024 • Xingning Dong, Qingpei Guo, Tian Gan, Qing Wang, Jianlong Wu, Xiangyuan Ren, Yuan Cheng, Wei Chu
By employing one shared BERT-type network to refine textual and cross-modal features simultaneously, SNP is lightweight and could support various downstream applications.
no code implementations • 9 Jan 2024 • Xuzheng Yu, Chen Jiang, Wei Zhang, Tian Gan, Linlin Chao, Jianan Zhao, Yuan Cheng, Qingpei Guo, Wei Chu
With the explosive growth of video data in real-world applications, a comprehensive representation of videos becomes increasingly important.
2 code implementations • 12 Nov 2023 • Qiang Zhou, Zhibin Wang, Wei Chu, Yinghui Xu, Hao Li, Yuan Qi
Our experiments demonstrate that preserving the positional information of visual embeddings through the pool-adapter is particularly beneficial for tasks like visual grounding.
Ranked #67 on Visual Question Answering on MM-Vet
1 code implementation • 27 Sep 2023 • Weidi Xu, Jingwei Wang, Lele Xie, Jianshan He, Hongting Zhou, Taifeng Wang, Xiaopei Wan, Jingdong Chen, Chao Qu, Wei Chu
Integrating first-order logic constraints (FOLCs) with neural networks is a crucial but challenging problem since it involves modeling intricate correlations to satisfy the constraints.
1 code implementation • 20 Sep 2023 • Chen Jiang, Hong Liu, Xuzheng Yu, Qing Wang, Yuan Cheng, Jia Xu, Zhongyi Liu, Qingpei Guo, Wei Chu, Ming Yang, Yuan Qi
We thereby present a new Triplet Partial Margin Contrastive Learning (TPM-CL) module to construct partial order triplet samples by automatically generating fine-grained hard negatives for matched text-video pairs.
Ranked #4 on Video Retrieval on MSR-VTT-1kA
no code implementations • 20 Sep 2023 • Chen Jiang, Kaiming Huang, Sifeng He, Xudong Yang, Wei Zhang, Xiaobo Zhang, Yuan Cheng, Lei Yang, Qing Wang, Furong Xu, Tan Pan, Wei Chu
SSAN is based on two newly proposed modules in video retrieval: (1) An efficient Self-supervised Keyframe Extraction (SKE) module to reduce redundant frame features, (2) A robust Similarity Pattern Detection (SPD) module for temporal alignment.
no code implementations • 25 Jun 2023 • Qingpei Guo, Kaisheng Yao, Wei Chu
They can achieve exceptional performance on specific tasks, but face a particularly challenging problem of modality mismatch because of the diversity of input modalities and their fixed structures.
1 code implementation • CVPR 2023 • Tan Pan, Furong Xu, Xudong Yang, Sifeng He, Chen Jiang, Qingpei Guo, Feng Qian, Xiaobo Zhang, Yuan Cheng, Lei Yang, Wei Chu
For traditional model upgrades, the old model will not be replaced by the new one until the embeddings of all the images in the database are re-computed by the new model, which takes days or weeks for a large amount of data.
no code implementations • 15 Apr 2023 • Ruchao Fan, Wei Chu, Peng Chang, Abeer Alwan
During inference, an error-based alignment sampling method is investigated in depth to reduce the alignment mismatch in the training and testing processes.
Automatic Speech Recognition (ASR) +5
1 code implementation • 28 Feb 2023 • Wen Li, Cheng Zou, Meng Wang, Furong Xu, Jianan Zhao, Ruobing Zheng, Yuan Cheng, Wei Chu
In this paper, we propose a Diverse and Compact Transformer (DC-Former) that can achieve a similar effect by splitting embedding space into multiple diverse and compact subspaces.
1 code implementation • 10 Feb 2023 • Lei Zhang, Xiaodong Yan, Jianshan He, Ruopeng Li, Wei Chu
Our experimental results show that our model effectively relieves the problem of over-smoothing in deep GCNs and outperforms the state-of-the-art (SOTA) methods on various benchmark datasets.
no code implementations • 2 Feb 2023 • Zhixuan Chu, Jianmin Huang, Ruopeng Li, Wei Chu, Sheng Li
Causal inference has numerous real-world applications in many domains, such as health care, marketing, political science, and online advertising.
no code implementations • CVPR 2023 • Jiangwei Lao, Weixiang Hong, Xin Guo, Yingying Zhang, Jian Wang, Jingdong Chen, Wei Chu
In this work, we propose a novel feature enhancement network to simultaneously model short- and long-term temporal correlation.
no code implementations • ACL 2022 • Mingzhe Li, Xiexiong Lin, Xiuying Chen, Jinxiong Chang, Qishen Zhang, Feng Wang, Taifeng Wang, Zhongyi Liu, Wei Chu, Dongyan Zhao, Rui Yan
Contrastive learning has achieved impressive success in generation tasks to mitigate the "exposure bias" problem and discriminatively exploit the different quality of references.
2 code implementations • 29 Apr 2022 • Linlin Chao, Xiexiong Lin, Taifeng Wang, Wei Chu
Meanwhile, the inference time grows log-linearly with the number of entities, since all entities are traversed and compared.
1 code implementation • 3 Apr 2022 • Hao Wang, Tai-Wei Chang, Tianqiao Liu, Jianmin Huang, Zhichao Chen, Chao Yu, Ruopeng Li, Wei Chu
In this paper, we theoretically demonstrate that ESMM suffers from the following two problems: (1) Inherent Estimation Bias (IEB), where the estimated CVR of ESMM is inherently higher than the ground truth; (2) Potential Independence Priority (PIP) for CTCVR estimation, where there is a risk that the ESMM overlooks the causality from click to conversion.
2 code implementations • 13 Mar 2022 • Xiaojie Chu, Yongtao Wang, Chunhua Shen, Jingdong Chen, Wei Chu
The development of scene text recognition (STR) in the era of deep learning has been mainly focused on novel architectures of STR models.
no code implementations • CVPR 2022 • Weixiang Hong, Jiangwei Lao, Wang Ren, Jian Wang, Jingdong Chen, Wei Chu
Instead of proposing a specific vision transformer based detector, in this work, our goal is to reveal the insights of training vision transformer based detectors from scratch.
4 code implementations • 1 Jul 2021 • TingTing Liang, Xiaojie Chu, Yudong Liu, Yongtao Wang, Zhi Tang, Wei Chu, Jingdong Chen, Haibin Ling
With multi-scale testing, we push the current best single model result to a new record of 60.1% box AP and 52.3% mask AP without using extra training data.
Ranked #6 on Object Detection on COCO-O (using extra training data)
no code implementations • CVPR 2021 • Weixiang Hong, Qingpei Guo, Wei Zhang, Jingdong Chen, Wei Chu
Panoptic segmentation is a challenging task aiming to simultaneously segment objects (things) at instance level and background contents (stuff) at semantic level.
no code implementations • CVPR 2021 • Furong Xu, Meng Wang, Wei Zhang, Yuan Cheng, Wei Chu
Therefore, there is a need for a training mechanism that enforces the discriminativeness of all the elements in the feature to capture more of the subtle visual cues.
no code implementations • 18 Jun 2021 • Jinhan Wang, Yunzheng Zhu, Ruchao Fan, Wei Chu, Abeer Alwan
About 5 hours of transcribed data and about 60 hours of untranscribed data are provided to develop a German ASR system for children.
no code implementations • 18 Jun 2021 • Ruchao Fan, Wei Chu, Peng Chang, Jing Xiao, Abeer Alwan
For the analyses, we plot attention weight distributions in the decoders to visualize the relationships between token-level acoustic embeddings.
Automatic Speech Recognition (ASR) +4
1 code implementation • 23 May 2021 • Hao Huang, Yongtao Wang, Zhaoyu Chen, Yuze Zhang, Yuheng Li, Zhi Tang, Wei Chu, Jingdong Chen, Weisi Lin, Kai-Kuang Ma
Then, we design a two-level perturbation fusion strategy to alleviate the conflict between the adversarial watermarks generated by different facial images and models.
no code implementations • COLING 2020 • Weipeng Huang, Xingyi Cheng, Kunlong Chen, Taifeng Wang, Wei Chu
The ambiguous annotation criteria lead to divergence of Chinese Word Segmentation (CWS) datasets in various granularities.
1 code implementation • ACL 2021 • Linlin Chao, Jianshan He, Taifeng Wang, Wei Chu
Distance-based knowledge graph embedding methods show promising results on the link prediction task, on which two topics have been widely studied: one is the ability to handle complex relations, such as N-to-1, 1-to-N and N-to-N; the other is the ability to encode various relation patterns, such as symmetry/antisymmetry.
Ranked #12 on Link Property Prediction on ogbl-biokg
no code implementations • 28 Oct 2020 • Ruchao Fan, Wei Chu, Peng Chang, Jing Xiao
This information is used to extract an acoustic representation for each token in parallel, referred to as the token-level acoustic embedding, which substitutes for the word embedding in the autoregressive transformer (AT) to achieve parallel generation in the decoder.
no code implementations • EMNLP 2020 • Kunlong Chen, Weidi Xu, Xingyi Cheng, Zou Xiaochuan, Yuyu Zhang, Le Song, Taifeng Wang, Yuan Qi, Wei Chu
Numerical reasoning over texts, such as addition, subtraction, sorting and counting, is a challenging machine reading comprehension task, since it requires both natural language understanding and arithmetic computation.
Ranked #1 on Question Answering on DROP Test
no code implementations • ACL 2020 • Xiexiong Lin, Weiyu Jian, Jianshan He, Taifeng Wang, Wei Chu
Experiments demonstrate that our model with fewer parameters yields significant improvements over competitive baselines on two datasets: Wizard-of-Wikipedia (average BLEU +87%; abs.
no code implementations • 19 May 2020 • Shijun Wang, Baocheng Zhu, Chen Li, Mingzhe Wu, James Zhang, Wei Chu, Yuan Qi
In this paper, we propose a general Riemannian proximal optimization algorithm with guaranteed convergence to solve Markov decision process (MDP) problems.
1 code implementation • ACL 2020 • Xingyi Cheng, Weidi Xu, Kunlong Chen, Shaohua Jiang, Feng Wang, Taifeng Wang, Wei Chu, Yuan Qi
This paper proposes to incorporate phonological and visual similarity knowledge into language models for CSC via a specialized graph convolutional network (SpellGCN).
no code implementations • 19 Apr 2020 • Chao Qu, Hui Li, Chang Liu, Junwu Xiong, James Zhang, Wei Chu, Weiqiang Wang, Yuan Qi, Le Song
We propose a collaborative multi-agent reinforcement learning algorithm named variational policy propagation (VPP) to learn a joint policy through the interactions over agents.
Multi-agent Reinforcement Learning reinforcement-learning +2
1 code implementation • 8 Sep 2019 • Weidi Xu, Xingyi Cheng, Kunlong Chen, Wei Wang, Bin Bi, Ming Yan, Chen Wu, Luo Si, Wei Chu, Taifeng Wang
To remedy this, we propose to augment the NSP task to a 3-class categorization task, which includes a category for previous sentence prediction (PSP).
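As a rough illustration of what a 3-class sentence-pair task could look like, the sketch below builds labelled pairs from a toy document. The label ids, the third "unrelated" class, the `make_pair` helper, and the sample sentences are all assumptions made for this example; the source only states that the task has three classes, one of them being previous sentence prediction.

```python
import random

# Label ids are hypothetical; the source only says the task has three classes,
# one of which is previous sentence prediction (PSP).
NEXT, PREVIOUS, UNRELATED = 0, 1, 2


def make_pair(sentences, i, other_doc, rng):
    """Build one (sentence_a, sentence_b, label) example around position i.

    Assumes 0 < i < len(sentences) - 1 so both neighbours exist.
    """
    label = rng.choice([NEXT, PREVIOUS, UNRELATED])
    if label == NEXT:
        sentence_b = sentences[i + 1]
    elif label == PREVIOUS:
        sentence_b = sentences[i - 1]
    else:
        sentence_b = rng.choice(other_doc)  # drawn from a different document
    return sentences[i], sentence_b, label


doc = ["A storm hit the coast.", "Power was cut for hours.", "Crews restored it by noon."]
other = ["The recipe needs two eggs.", "Bake for twenty minutes."]
rng = random.Random(0)
examples = [make_pair(doc, 1, other, rng) for _ in range(6)]

for sentence_a, sentence_b, label in examples:
    print(label, "|", sentence_a, "|", sentence_b)
```

A classifier trained on such pairs has to tell forward order from backward order, which the binary next-sentence task never asks of it.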
1 code implementation • 16 Aug 2019 • Weipeng Huang, Xingyi Cheng, Taifeng Wang, Wei Chu
Combining these three contributions, we enhance the information extraction ability of the multi-head selection model and achieve an F1-score of 0.876 on testset-1 with a single model.
no code implementations • 11 Mar 2019 • Xin Chen, Wei Chu, Jinxi Guo, Ning Xu
F0 and aperiodicity are obtained from the original singing voice, and used with acoustic features to reconstruct the target singing voice through a vocoder.
Automatic Speech Recognition (ASR) +2
no code implementations • 11 Mar 2019 • Weipeng Huang, Xingyi Cheng, Kunlong Chen, Taifeng Wang, Wei Chu
The ambiguous annotation criteria lead to divergence of Chinese Word Segmentation (CWS) datasets in various granularities.
no code implementations • 18 Dec 2018 • Yichao Zhou, Wei Chu, Sam Young, Xin Chen
In the learning stage, a sequence of stylistically uniform, multiple-channel music samples was modeled by an RNN.
1 code implementation • 26 Nov 2018 • Chenchen Li, Xiang Yan, Xiaotie Deng, Yuan Qi, Wei Chu, Le Song, Junlong Qiao, Jianshan He, Junwu Xiong
Uplift modeling aims to directly model the incremental impact of a treatment on an individual response.
no code implementations • 21 Nov 2018 • Wanchen Sui, Qing Zhang, Jun Yang, Wei Chu
In this paper, we propose a novel integrated framework for learning both text detection and recognition.
no code implementations • CONLL 2019 • Xingyi Cheng, Weidi Xu, Taifeng Wang, Wei Chu
By disentangling the latent representation into the aspect-specific sentiment and the lexical context, our method induces the underlying sentiment prediction for the unlabeled data, which then benefits the ATSA classifier.
Aspect-Based Sentiment Analysis (ABSA) Natural Language Understanding +1
no code implementations • 23 Aug 2018 • Chenchen Li, Xiang Yan, Xiaotie Deng, Yuan Qi, Wei Chu, Le Song, Junlong Qiao, Jianshan He, Junwu Xiong
Then we develop a variant of Latent Dirichlet Allocation (LDA) to infer latent variables under the current market environment, which represents the preferences of customers and strategies of competitors.
no code implementations • 13 Jul 2018 • Zhangyu Xiao, Zhijian Ou, Wei Chu, Hui Lin
In this paper, we present an end-to-end automatic speech recognition system, which successfully employs subword units in a hybrid CTC-Attention based system.
Automatic Speech Recognition (ASR) +2
no code implementations • 3 May 2018 • Qiangpeng Yang, Mengli Cheng, Wenmeng Zhou, Yan Chen, Minghui Qiu, Wei Lin, Wei Chu
To solve this problem, we propose a novel end-to-end scene text detector IncepText from an instance-aware segmentation perspective.
no code implementations • 12 Jan 2018 • Feng-Lin Li, Minghui Qiu, Haiqing Chen, Xiongwei Wang, Xing Gao, Jun Huang, Juwei Ren, Zhongzhou Zhao, Weipeng Zhao, Lei Wang, Guwei Jin, Wei Chu
We present AliMe Assist, an intelligent assistant designed for creating an innovative online shopping experience in E-commerce.
1 code implementation • 23 Nov 2017 • Jianfei Yu, Minghui Qiu, Jing Jiang, Jun Huang, Shuangyong Song, Wei Chu, Haiqing Chen
In this paper, we study transfer learning for the PI and NLI problems, aiming to propose a general framework, which can effectively and efficiently adapt the shared knowledge learned from a resource-rich source domain to a resource-poor target domain.
no code implementations • ACL 2017 • Minghui Qiu, Feng-Lin Li, Siyu Wang, Xing Gao, Yan Chen, Weipeng Zhao, Haiqing Chen, Jun Huang, Wei Chu
We propose AliMe Chat, an open-domain chatbot engine that integrates the joint results of Information Retrieval (IR) and Sequence to Sequence (Seq2Seq) based generation models.
no code implementations • 20 Apr 2016 • Wei Chu, Ruxin Chen
The previously trained DNN of the matched speaker cluster is used for decoding utterances of the test speaker.
4 code implementations • 31 Mar 2010 • Lihong Li, Wei Chu, John Langford, Xuanhui Wang
Offline evaluation of the effectiveness of new algorithms in these applications is critical for protecting online user experiences but very challenging due to their "partial-label" nature.
11 code implementations • 28 Feb 2010 • Lihong Li, Wei Chu, John Langford, Robert E. Schapire
In this work, we model personalized recommendation of news articles as a contextual bandit problem, a principled approach in which a learning algorithm sequentially selects articles to serve users based on contextual information about the users and articles, while simultaneously adapting its article-selection strategy based on user-click feedback to maximize total user clicks.
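The select-observe-update loop described above can be sketched with a deliberately simple policy. The example below uses epsilon-greedy over per-(context, article) click-rate estimates, which is a swapped-in technique for illustration and not the algorithm from the paper. The class name, the article and context names, and the simulated click probabilities are all made up.

```python
import random
from collections import defaultdict


class EpsilonGreedyContextualBandit:
    """Keeps a click-rate estimate per (user context, article) pair."""

    def __init__(self, articles, epsilon=0.1, seed=0):
        self.articles = list(articles)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.shows = defaultdict(int)
        self.clicks = defaultdict(int)

    def estimate(self, context, article):
        shown = self.shows[(context, article)]
        return self.clicks[(context, article)] / shown if shown else 0.0

    def select(self, context):
        # Explore with probability epsilon, otherwise serve the best-looking article.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.articles)
        return max(self.articles, key=lambda a: self.estimate(context, a))

    def update(self, context, article, clicked):
        self.shows[(context, article)] += 1
        self.clicks[(context, article)] += int(clicked)


# Simulated click probabilities (made up for the demo).
TRUE_CTR = {("sports_fan", "match_report"): 0.6, ("sports_fan", "market_news"): 0.1,
            ("investor", "match_report"): 0.1, ("investor", "market_news"): 0.6}

bandit = EpsilonGreedyContextualBandit(["match_report", "market_news"])
env = random.Random(1)
total_clicks = 0
for _ in range(5000):
    context = env.choice(["sports_fan", "investor"])
    article = bandit.select(context)
    clicked = env.random() < TRUE_CTR[(context, article)]
    bandit.update(context, article, clicked)  # only the served article gets feedback
    total_clicks += clicked

print(total_clicks)
```

Note that feedback arrives only for the article that was actually served, which is what separates this setting from ordinary supervised learning.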
no code implementations • NeurIPS 2007 • Kai Yu, Wei Chu
In this paper we develop a Gaussian process (GP) framework to model a collection of reciprocal random variables defined on the edges of a network.