no code implementations • Findings (ACL) 2022 • Yiqun Yao, Rada Mihalcea
Moreover, for different modalities, the best unimodal models may work under significantly different learning rates due to the nature of the modality and the computational flow of the model; thus, selecting a global learning rate for late-fusion models can result in a vanishing gradient for some modalities.
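The per-modality learning-rate idea can be illustrated with a plain SGD update in which each modality's parameters use their own step size. This is a minimal sketch, not the paper's implementation; the modality names, rates, and toy values are assumptions:

```python
# Hypothetical late-fusion setup: each modality's encoder parameters get
# their own learning rate instead of one global rate. All values are toys.
MODALITY_LRS = {"text": 1e-3, "audio": 1e-4, "video": 5e-5}  # assumed rates

def sgd_step(params, grads, modality_lrs):
    """One SGD update, scaling each modality's gradients by its own LR."""
    updated = {}
    for modality, weights in params.items():
        lr = modality_lrs[modality]
        updated[modality] = [w - lr * g
                             for w, g in zip(weights, grads[modality])]
    return updated

params = {"text": [0.5, -0.2], "audio": [0.1, 0.3], "video": [0.0, 0.7]}
grads  = {"text": [0.1, 0.1], "audio": [0.2, -0.1], "video": [0.05, 0.05]}
params = sgd_step(params, grads, MODALITY_LRS)
```

With a single global rate, the modality whose gradients are smallest in scale would effectively stop learning; per-modality rates sidestep that.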
no code implementations • 25 Apr 2024 • Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang, Shuangyong Song, Yongxiang Li, Zheng Zhang, Bo Zhao, Aixin Sun, Yequan Wang, Zhongjiang He, Zhongyuan Wang, Xuelong Li, Tiejun Huang
Large language models (LLMs) have showcased profound capabilities in language understanding and generation, facilitating a wide array of applications.
no code implementations • 4 Mar 2024 • Zhenru Lin, Yiqun Yao, Yang Yuan
Large language models (LLMs) such as ChatGPT are increasingly proficient in understanding and generating a mixture of code and text.
no code implementations • 7 Sep 2023 • Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Xuying Meng, Siqi Fan, Peng Han, Jing Li, Li Du, Bowen Qin, Zheng Zhang, Aixin Sun, Yequan Wang
We demonstrate that a 101B-parameter LLM with 0.31T tokens can be trained with a budget of 100K US dollars.
1 code implementation • 4 May 2023 • Yiqun Yao, Zheng Zhang, Jing Li, Yequan Wang
In terms of growth schedule, the impact of each individual dimension on a schedule's efficiency is under-explored in existing work.
1 code implementation • 14 Apr 2023 • Yiqun Yao, Siqi Fan, Xiusheng Huang, Xuezhi Fang, Xiang Li, Ziyi Ni, Xin Jiang, Xuying Meng, Peng Han, Shuo Shang, Kang Liu, Aixin Sun, Yequan Wang
With around 14% of the one-time pre-training cost, we can accurately forecast the loss for models up to 52B.
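Forecasting a large model's loss from cheap small-scale runs is in the spirit of scaling-law extrapolation. The following is a generic hedged sketch of that idea, not the paper's actual method: fit a power law L(N) = a · N^(−b) from two small-model measurements, then extrapolate to a larger parameter count (all measurement values are invented):

```python
import math

def fit_power_law(n1, l1, n2, l2):
    """Fit L(N) = a * N**(-b) exactly through two (size, loss) points."""
    b = (math.log(l1) - math.log(l2)) / (math.log(n2) - math.log(n1))
    a = l1 * n1 ** b
    return a, b

def predict_loss(a, b, n):
    return a * n ** (-b)

# Toy measurements from two hypothetical small training runs.
a, b = fit_power_law(1e8, 3.5, 1e9, 3.0)
big_loss = predict_loss(a, b, 52e9)  # extrapolate to a 52B-parameter model
```

Real forecasting methods fit more points and richer functional forms; this only shows the extrapolation step.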
no code implementations • NAACL 2021 • Yiqun Yao, Michalis Papakostas, Mihai Burzo, Mohamed Abouelenien, Rada Mihalcea
The capability to automatically detect human stress can benefit artificially intelligent agents involved in affective computing and human-computer interaction.
no code implementations • NAACL 2019 • Yiqun Yao, Jiaming Xu, Bo Xu
Visual Dialog is a multi-modal task that requires a model to participate in a multi-turn human dialog grounded on an image, and generate correct, human-like responses.
no code implementations • 15 Nov 2018 • Jing Shi, Jiaming Xu, Yiqun Yao, Bo Xu
In this paper, we present a memory-augmented neural network which is motivated by the process of human concept learning.
1 code implementation • EMNLP 2018 • Yiqun Yao, Jiaming Xu, Feng Wang, Bo Xu
Our code is available at https://github.com/FlamingHorizon/CMM-VR.
no code implementations • WS 2016 • Jing Shi, Jiaming Xu, Yiqun Yao, Suncong Zheng, Bo Xu
As the evaluation results show, our solution provides a valuable and compact model that can be applied to question answering or sentence semantic relevance modelling.
1 code implementation • COLING 2016 • Jiaming Xu, Jing Shi, Yiqun Yao, Suncong Zheng, Bo Xu
Recently, end-to-end memory networks have shown promising results on the Question Answering task; they encode past facts into an explicit memory and perform reasoning by making multiple computational steps over that memory.
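The multi-step reasoning over an explicit memory can be sketched as repeated attention hops: score each memory slot against the query, take a softmax-weighted sum of the slots, and fold that response back into the query. A minimal pure-Python illustration with toy vectors (not the paper's architecture, which also learns embedding matrices per hop):

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def memory_hop(query, memory):
    """One reasoning hop: attend over memory slots, return updated query."""
    # Attention scores: dot product of the query with each memory slot.
    scores = [sum(q * m for q, m in zip(query, slot)) for slot in memory]
    probs = softmax(scores)
    # Response vector: attention-weighted sum of the memory slots.
    response = [sum(p * slot[d] for p, slot in zip(probs, memory))
                for d in range(len(query))]
    # Combine response with the query for the next hop.
    return [q + r for q, r in zip(query, response)]

memory = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # encoded facts (toy values)
query = [1.0, 0.0]
for _ in range(3):  # three computational hops over the memory
    query = memory_hop(query, memory)
```

Each hop sharpens the query toward the memory slots it already matches, which is how multiple hops support chained reasoning.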