Search Results for author: Jiawei Zhao

Found 9 papers, 5 papers with code

Mutual-Learning Improves End-to-End Speech Translation

no code implementations • EMNLP 2021 • Jiawei Zhao, Wei Luo, Boxing Chen, Andrew Gilman

In this paper, we propose an alternative: a trainable mutual-learning scenario, where the MT and the ST models are collaboratively trained and are considered as peers, rather than teacher/student.

Knowledge Distillation, Machine Translation +1


GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

1 code implementation • 6 Mar 2024 • Jiawei Zhao, Zhenyu Zhang, Beidi Chen, Zhangyang Wang, Anima Anandkumar, Yuandong Tian

Our approach reduces memory usage by up to 65.5% in optimizer states while maintaining both efficiency and performance for pre-training on LLaMA 1B and 7B architectures with C4 dataset with up to 19.7B tokens, and on fine-tuning RoBERTa on GLUE tasks.

1,134


InRank: Incremental Low-Rank Learning

1 code implementation • 20 Jun 2023 • Jiawei Zhao, Yifei Zhang, Beidi Chen, Florian Schäfer, Anima Anandkumar

To remedy this, we design a new training algorithm Incremental Low-Rank Learning (InRank), which explicitly expresses cumulative weight updates as low-rank matrices while incrementally augmenting their ranks during training.

Computational Efficiency

211


Incremental Spatial and Spectral Learning of Neural Operators for Solving Large-Scale PDEs

no code implementations • 28 Nov 2022 • Robert Joseph George, Jiawei Zhao, Jean Kossaifi, Zongyi Li, Anima Anandkumar

Fourier Neural Operators (FNO) offer a principled approach to solving challenging partial differential equations (PDE) such as turbulent flows.


ZerO Initialization: Initializing Neural Networks with only Zeros and Ones

1 code implementation • 25 Oct 2021 • Jiawei Zhao, Florian Schäfer, Anima Anandkumar

Deep neural networks are usually initialized with random weights, with adequately selected initial variance to ensure stable signal propagation during training.

Image Classification


Transformer-based Dual Relation Graph for Multi-label Image Recognition

1 code implementation • ICCV 2021 • Jiawei Zhao, Ke Yan, Yifan Zhao, Xiaowei Guo, Feiyue Huang, Jia Li

Unlike these studies, in this paper we propose a novel Transformer-based Dual Relation learning framework, constructing complementary relationships by exploring two aspects of correlation, i.e., a structural relation graph and a semantic relation graph.

Ranked #8 on Multi-Label Classification on PASCAL VOC 2007

Multi-Label Classification, Relation


M3TR: Multi-modal Multi-label Recognition with Transformer

1 code implementation • ACM MM 2021 • Jiawei Zhao, Yifan Zhao, Jia Li

Multi-label image recognition aims to recognize multiple objects simultaneously in one image.

Ranked #4 on Multi-Label Classification on PASCAL VOC 2007

Multi-Label Classification


RGB-D Salient Object Detection with Ubiquitous Target Awareness

no code implementations • 8 Sep 2021 • Yifan Zhao, Jiawei Zhao, Jia Li, Xiaowu Chen

To construct our framework and achieve accurate salient detection results, we propose a Ubiquitous Target Awareness (UTA) network to solve three important challenges in the RGB-D SOD task: 1) a depth awareness module to excavate depth information and to mine ambiguous regions via adaptive depth-error weights, 2) a spatial-aware cross-modal interaction and a channel-aware cross-level interaction, exploiting the low-level boundary cues and amplifying high-level salient channels, and 3) a gated multi-scale predictor module to perceive the object saliency in different contextual scales.

Ranked #10 on Thermal Image Segmentation on RGB-T-Glass-Segmentation

Object object-detection +4


LNS-Madam: Low-Precision Training in Logarithmic Number System using Multiplicative Weight Update

no code implementations • 26 Jun 2021 • Jiawei Zhao, Steve Dai, Rangharajan Venkatesan, Brian Zimmer, Mustafa Ali, Ming-Yu Liu, Brucek Khailany, Bill Dally, Anima Anandkumar

Representing deep neural networks (DNNs) in low-precision is a promising approach to enable efficient acceleration and memory reduction.

Quantization

