no code implementations • 1 Jan 2021 • Peng Zhang, Jing Zhang, Xindian Ma, Siwei Rao, Guangjian Tian, Jun Wang
As a novel model that bridges machine learning and quantum theory, the tensor network (TN) has recently gained increasing attention and has been successfully applied to processing natural images.
no code implementations • 28 Jul 2020 • Shuai Zhang, Peng Zhang, Xindian Ma, Junqiu Wei, Ningning Wang, Qun Liu
The Transformer has been widely used in many Natural Language Processing (NLP) tasks, and the scaled dot-product attention between tokens is one of its core modules.
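For reference, scaled dot-product attention computes softmax(QK^T / sqrt(d_k)) V. A minimal NumPy sketch of that standard formula (the array shapes and variable names are illustrative, not taken from the paper):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Standard scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # token-to-token similarity scores
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                              # weighted sum of value vectors

# Illustrative usage: 5 tokens, 8-dimensional queries/keys/values
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((5, 8)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)         # shape (5, 8)
```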
no code implementations • 25 Sep 2019 • Peng Zhang, Xiaoliu Mao, Xindian Ma, Benyou Wang, Jing Zhang, Jun Wang, Dawei Song
We prove that, by applying a mapping (via the trace operator) to the high-dimensional matching matrix, a low-dimensional attention matrix can be derived.
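The abstract does not spell the mapping out, but a partial trace is one standard way a trace operator reduces a matrix defined on a tensor-product space to a lower-dimensional matrix. The sketch below is only an illustration of that general idea; the dimensions and the choice of which subsystem to trace out are assumptions, not the paper's construction:

```python
import numpy as np

def partial_trace(M, d1, d2):
    """Trace out the second subsystem of a (d1*d2) x (d1*d2) matrix,
    returning a d1 x d1 matrix. Illustrative only."""
    T = M.reshape(d1, d2, d1, d2)      # view M as a 4-index tensor
    return np.einsum('ajbj->ab', T)    # sum over the second subsystem's indices

d1, d2 = 4, 6
M = np.random.default_rng(1).standard_normal((d1 * d2, d1 * d2))  # stand-in "matching matrix"
A = partial_trace(M, d1, d2)           # low-dimensional 4 x 4 matrix
```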
no code implementations • 25 Sep 2019 • Xindian Ma, Peng Zhang, Xiaoliu Mao, Yehua Zhang, Nan Duan, Yuexian Hou, Ming Zhou
Then, we show that the lower bound of such a separation rank can reveal the quantitative relation between the network structure (e.g., depth/width) and the ability to model contextual dependency.
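For context, the separation rank of a function f with respect to a partition (A, B) of its inputs is the minimal number of separable terms needed to express it. This is the standard definition used in this line of work on the expressive power of deep networks, not a formula quoted from the paper:

\[
\mathrm{sep}(f;\,A,B) \;=\; \min\Big\{\, R \;:\; f(\mathbf{x}_A,\mathbf{x}_B) \;=\; \sum_{r=1}^{R} g_r(\mathbf{x}_A)\, h_r(\mathbf{x}_B) \,\Big\}
\]

A large separation rank means the function ties the two input groups together strongly, which is why its lower bound serves as a measure of how much contextual dependency a given depth/width can capture.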
1 code implementation • NeurIPS 2019 • Xindian Ma, Peng Zhang, Shuai Zhang, Nan Duan, Yuexian Hou, Dawei Song, Ming Zhou
In this paper, based on the ideas of tensor decomposition and parameter sharing, we propose a novel self-attention model (namely Multi-linear attention) with Block-Term Tensor Decomposition (BTD).
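A Block-Term decomposition expresses a tensor as a sum of low-rank Tucker blocks. The sketch below shows a single such block built from shared Q, K, V factors and a core tensor; it follows the general Tucker/BTD idea only, and the shapes, the dense core, and the variable names are assumptions rather than the paper's exact formulation:

```python
import numpy as np

def single_block_attention(G, Q, K, V):
    """One Tucker-style block of a Block-Term attention tensor:
    T[a, b, c] = sum_{i,j,k} G[i, j, k] * Q[a, i] * K[b, j] * V[c, k]."""
    return np.einsum('ijk,ai,bj,ck->abc', G, Q, K, V)

n, r = 5, 4                                    # sequence length, core rank (illustrative)
rng = np.random.default_rng(2)
Q, K, V = (rng.standard_normal((n, r)) for _ in range(3))  # shared factor matrices
G = rng.standard_normal((r, r, r))             # core tensor of this block
T = single_block_attention(G, Q, K, V)         # 3rd-order tensor, shape (n, n, n)
# Multi-linear attention averages several such blocks with shared Q, K, V and
# then compresses the resulting tensor back to matrix form (omitted here).
```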
1 code implementation • 31 Jan 2019 • Lipeng Zhang, Peng Zhang, Xindian Ma, Shuqin Gu, Zhan Su, Dawei Song
Theoretically, we prove that such a tensor representation is a generalization of the n-gram language model.
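One way to see the connection, sketched below with a made-up vocabulary and corpus: when word vectors are one-hot, the outer (tensor) product of the n vectors in an n-gram is a one-hot n-way tensor marking exactly that n-gram, so accumulating these rank-one tensors over a corpus recovers n-gram counts. With dense embeddings the same construction goes beyond exact counting, which is roughly the sense in which a tensor-space representation generalizes the n-gram model; see the paper for the precise statement.

```python
import numpy as np

vocab = {'the': 0, 'cat': 1, 'sat': 2}
V = len(vocab)

def one_hot(word):
    v = np.zeros(V)
    v[vocab[word]] = 1.0
    return v

counts = np.zeros((V, V))                          # bigram count tensor (n = 2)
corpus = ['the', 'cat', 'sat']
for w1, w2 in zip(corpus, corpus[1:]):
    counts += np.outer(one_hot(w1), one_hot(w2))   # rank-one update per bigram

print(counts[vocab['the'], vocab['cat']])          # 1.0: count of the bigram "the cat"
```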