1 code implementation • 20 Oct 2023 • Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang
Hearing is arguably an essential ability of artificial intelligence (AI) agents in the physical world: the perception and understanding of general auditory information, which consists of at least three types of sound: speech, audio events, and music.
no code implementations • 16 Oct 2023 • Jie Tang, Bin He, Junkai Xu, Tian Tan, Zhipeng Wang, Yanmin Zhou, Shuo Jiang
The proposed method simplifies fall-detection data acquisition experiments, provides a novel avenue for generating low-cost synthetic data in scenarios where acquiring data for machine learning is challenging, and paves the way for customizing machine learning configurations.
2 code implementations • 9 Oct 2023 • Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang
Audio-visual large language models (LLMs) have drawn significant attention, yet the fine-grained combination of both input streams remains under-explored, which is challenging but necessary for LLMs to understand general video inputs.
no code implementations • 25 Sep 2023 • Wenyi Yu, Changli Tang, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang
Q-Former-based LLMs can generalise well to out-of-domain datasets, where 12% relative WER reductions over the Whisper baseline ASR model were achieved on the Eval2000 test set without using any in-domain training data from Switchboard.
no code implementations • 14 Sep 2023 • Peng Wang, Yifan Yang, Zheng Liang, Tian Tan, Shiliang Zhang, Xie Chen
In spite of the excellent strides made by end-to-end (E2E) models in speech recognition in recent years, named entity recognition is still challenging but critical for semantic understanding.
no code implementations • 26 Apr 2023 • Xuhao Jiang, Weimin Tan, Tian Tan, Bo Yan, Liquan Shen
Image-based single-modality compression learning approaches have demonstrated exceptionally powerful encoding and decoding capabilities in the past few years, but suffer from blur and severe semantics loss at extremely low bitrates.
no code implementations • 30 Oct 2021 • Tianren Zhang, Shangqi Guo, Tian Tan, Xiaolin Hu, Feng Chen
Searching in a large goal space poses difficulty for both high-level subgoal generation and low-level policy learning.
no code implementations • 1 Jan 2021 • Haichuan Gao, Zhile Yang, Tian Tan, Feng Chen
Unfortunately, applying traditional Bellman updates to value-function learning can be problematic for learning undiscounted returns and is thus not suitable for optimizing success rate.
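The difficulty mentioned above can be seen with a minimal sketch (a hypothetical two-state toy MDP, not the paper's environment): with a discount factor below one, Bellman backups contract to a fixed point, whereas with the undiscounted return the same backup on a loopy state grows without bound.

```python
import numpy as np

def value_iteration(gamma, n_iters=200):
    # Toy MDP: state 0 loops to itself with reward 1 per step,
    # state 1 is absorbing with value 0.
    V = np.zeros(2)
    history = []
    for _ in range(n_iters):
        # Bellman backup: V(0) <- r + gamma * V(0)
        V = np.array([1.0 + gamma * V[0], 0.0])
        history.append(V[0])
    return history

discounted = value_iteration(gamma=0.9)    # converges to 1 / (1 - 0.9) = 10
undiscounted = value_iteration(gamma=1.0)  # grows by 1 every iteration
```

The sketch shows why an undiscounted objective such as success rate needs a different update rule rather than the standard contraction-based backup.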
no code implementations • 31 Jul 2020 • Qi Liu, Tian Tan, Kai Yu
It is concluded that the beta stabilizer parameter can reduce sensitivity to the learning rate while achieving almost the same performance on DNNs with the ReLU activation function and on LSTMs.
1 code implementation • NeurIPS 2020 • Tianren Zhang, Shangqi Guo, Tian Tan, Xiaolin Hu, Feng Chen
In this paper, we show that this problem can be effectively alleviated by restricting the high-level action space from the whole goal space to a $k$-step adjacent region of the current state using an adjacency constraint.
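A minimal sketch of the adjacency-constraint idea, under simplifying assumptions: here adjacency is computed exactly by breadth-first search on a known transition graph, whereas the paper's method would use a learned notion of adjacency; the helper names (`k_step_adjacent`, `constrain_subgoal`) are hypothetical.

```python
from collections import deque

def k_step_adjacent(state, k, neighbors):
    """Return the set of states reachable from `state` within k
    environment steps, via breadth-first search. `neighbors` maps a
    state to its one-step successors."""
    frontier, seen = deque([(state, 0)]), {state}
    while frontier:
        s, d = frontier.popleft()
        if d == k:
            continue
        for s2 in neighbors(s):
            if s2 not in seen:
                seen.add(s2)
                frontier.append((s2, d + 1))
    return seen

def constrain_subgoal(proposed, state, k, neighbors):
    # Restrict the high-level action space: accept the proposed
    # subgoal only if it lies in the k-step adjacent region of the
    # current state; otherwise fall back to a state in that region.
    region = k_step_adjacent(state, k, neighbors)
    return proposed if proposed in region else min(region)

# Usage on a 1-D chain where each state connects to its two neighbours:
chain = lambda s: [s - 1, s + 1]
region = k_step_adjacent(0, 2, chain)  # states within 2 steps of 0
```

The point of the constraint is that subgoals outside the adjacent region cannot be reached by the low-level policy within the subgoal horizon, so excluding them shrinks the search space without discarding useful subgoals.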
1 code implementation • 23 Dec 2019 • Tian Tan, Zhihan Xiong, Vikranth R. Dwaracherla
We use an indexed value function to represent uncertainty in our action-value estimates.
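One way to read "indexed value function" is as a value function that takes a random index as an extra input, so that different index draws yield different plausible action-value estimates. The sketch below is an illustrative tabular version under that assumption (the class name, dimensions, and linear-in-index form are all hypothetical, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

class IndexedQ:
    """Action values as a function of a sampled index z:
    Q(s, a, z) = mean(s, a) + factors(s, a) . z.
    The spread of Q over draws of z expresses uncertainty in the
    action-value estimates; acting greedily under one sampled z
    gives Thompson-sampling-style exploration."""

    def __init__(self, n_states, n_actions, index_dim=5):
        self.mean = np.zeros((n_states, n_actions))
        self.factors = rng.normal(0.0, 0.1, (n_states, n_actions, index_dim))
        self.index_dim = index_dim

    def sample_index(self):
        # One index draw, typically held fixed for a whole episode.
        return rng.normal(size=self.index_dim)

    def q(self, s, z):
        # Q-values for all actions in state s under index z.
        return self.mean[s] + self.factors[s] @ z

    def act(self, s, z):
        # Greedy action w.r.t. the sampled value function.
        return int(np.argmax(self.q(s, z)))

q = IndexedQ(n_states=3, n_actions=2)
z = q.sample_index()
a = q.act(0, z)
```

As training data accumulates, the index-dependent term would shrink where estimates are certain, collapsing the sampled Q-functions toward a single estimate.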