no code implementations • EMNLP 2021 • Jiawei Zhao, Wei Luo, Boxing Chen, Andrew Gilman
In this paper, we propose an alternative: a trainable mutual-learning scenario, where the MT and the ST models are collaboratively trained and are considered as peers, rather than teacher/student.
1 code implementation • 6 Mar 2024 • Jiawei Zhao, Zhenyu Zhang, Beidi Chen, Zhangyang Wang, Anima Anandkumar, Yuandong Tian
Our approach reduces memory usage by up to 65.5% in optimizer states while maintaining both efficiency and performance for pre-training on LLaMA 1B and 7B architectures on the C4 dataset with up to 19.7B tokens, and for fine-tuning RoBERTa on GLUE tasks.
1 code implementation • 20 Jun 2023 • Jiawei Zhao, Yifei Zhang, Beidi Chen, Florian Schäfer, Anima Anandkumar
To remedy this, we design a new training algorithm, Incremental Low-Rank Learning (InRank), which explicitly expresses cumulative weight updates as low-rank matrices while incrementally augmenting their ranks during training.
no code implementations • 28 Nov 2022 • Robert Joseph George, Jiawei Zhao, Jean Kossaifi, Zongyi Li, Anima Anandkumar
Fourier Neural Operators (FNOs) offer a principled approach to solving challenging partial differential equations (PDEs) such as turbulent flows.
1 code implementation • 25 Oct 2021 • Jiawei Zhao, Florian Schäfer, Anima Anandkumar
Deep neural networks are usually initialized with random weights, with adequately selected initial variance to ensure stable signal propagation during training.
1 code implementation • ICCV 2021 • Jiawei Zhao, Ke Yan, Yifan Zhao, Xiaowei Guo, Feiyue Huang, Jia Li
Unlike these studies, in this paper we propose a novel Transformer-based Dual Relation learning framework, constructing complementary relationships by exploring two aspects of correlation, i.e., a structural relation graph and a semantic relation graph.
Ranked #8 on Multi-Label Classification on PASCAL VOC 2007
1 code implementation • ACM MM 2021 • Jiawei Zhao, Yifan Zhao, Jia Li
Multi-label image recognition aims to recognize multiple objects simultaneously in one image.
Ranked #4 on Multi-Label Classification on PASCAL VOC 2007
no code implementations • 8 Sep 2021 • Yifan Zhao, Jiawei Zhao, Jia Li, Xiaowu Chen
To construct our framework and achieve accurate salient detection results, we propose a Ubiquitous Target Awareness (UTA) network to solve three important challenges in the RGB-D SOD task: 1) a depth awareness module to excavate depth information and to mine ambiguous regions via adaptive depth-error weights, 2) a spatial-aware cross-modal interaction and a channel-aware cross-level interaction, exploiting the low-level boundary cues and amplifying high-level salient channels, and 3) a gated multi-scale predictor module to perceive the object saliency in different contextual scales.
Ranked #10 on Thermal Image Segmentation on RGB-T-Glass-Segmentation
no code implementations • 26 Jun 2021 • Jiawei Zhao, Steve Dai, Rangharajan Venkatesan, Brian Zimmer, Mustafa Ali, Ming-Yu Liu, Brucek Khailany, Bill Dally, Anima Anandkumar
Representing deep neural networks (DNNs) in low precision is a promising approach to enable efficient acceleration and memory reduction.