1 code implementation • 23 Feb 2024 • Zui Chen, Yezeng Chen, Jiaqi Han, Zhijie Huang, Ji Qi, Yi Zhou
Large language models (LLMs) are displaying emergent abilities on math reasoning tasks, and there is growing attention on enhancing the ability of open-source LLMs through supervised fine-tuning (SFT). In this paper, we aim to explore a general data strategy for supervised data to help optimize and expand math reasoning ability. Firstly, we determine the ability boundary of reasoning path augmentation by identifying these paths' minimal optimal set. Secondly, we validate that different abilities of the model can be cumulatively enhanced by a Mix of Minimal Optimal Sets of the corresponding types of data, and our models MMOS achieve SOTA performance on a series of base models at much lower construction cost. Besides, we point out that GSM-HARD is not really hard and that today's LLMs no longer lack numerical robustness. We also provide an Auto Problem Generator for robustness testing and educational applications. Our code and data are publicly available at https://github.com/cyzhh/MMOS.
Ranked #2 on Math Word Problem Solving on ASDiv-A (using extra training data)
1 code implementation • 19 Sep 2023 • Xinda Wu, Zhijie Huang, Kejun Zhang, Jiaxing Yu, Xu Tan, Tieyao Zhang, ZiHao Wang, Lingyun Sun
In particular, subjective evaluations show that, on the melody continuation task, MelodyGLM gains average improvements of 0.82, 0.87, 0.78, and 0.94 in consistency, rhythmicity, structure, and overall quality, respectively.
1 code implementation • 23 May 2023 • Daliang Ouyang, Su He, Guozhong Zhang, Mingzhu Luo, Huaiyong Guo, Jian Zhan, Zhijie Huang
The remarkable effectiveness of channel and spatial attention mechanisms in producing more discernible feature representations has been illustrated in various computer vision tasks.
1 code implementation • 22 Feb 2023 • Sihan Xu, Zelong Jiang, Ruisi Liu, Kaikai Yang, Zhijie Huang
Moreover, our method effectively learns the style features of images across different domains.
1 code implementation • 11 Jan 2023 • Kejun Zhang, Xinda Wu, Tieyao Zhang, Zhijie Huang, Xu Tan, Qihao Liang, Songruoyao Wu, Lingyun Sun
Although deep learning has revolutionized music generation, existing methods for structured melody generation follow an end-to-end left-to-right note-by-note generative paradigm and treat each note equally.
1 code implementation • 10 Aug 2021 • Xiaopeng Guo, Zhijie Huang, Jie Gao, Mingyu Shang, Maojing Shu, Jun Sun
The original and adversarial examples are further used to jointly train the KT model, forcing it not only to be robust to the adversarial examples but also to enhance its generalization over the original ones.
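The joint training scheme described above can be sketched with a toy model. The snippet below is a minimal illustration, not the paper's implementation: it uses plain logistic regression in NumPy, an FGSM-style input perturbation as the adversarial-example generator, and a joint update over both the original and perturbed batches; all function names and hyperparameters are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def joint_adversarial_train(X, y, epsilon=0.1, lr=0.5, steps=200, seed=0):
    """Train a logistic-regression classifier jointly on original and
    adversarially perturbed examples (a generic stand-in for the paper's
    joint training of the KT model)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=X.shape[1])
    b = 0.0
    for _ in range(steps):
        # Gradient of the loss w.r.t. the inputs gives the attack direction:
        # dL/dx_i = (p_i - y_i) * w for cross-entropy loss.
        p = sigmoid(X @ w + b)
        grad_x = np.outer(p - y, w)
        X_adv = X + epsilon * np.sign(grad_x)  # FGSM-style perturbation
        # Joint update: accumulate gradients from both batches so the model
        # must fit the original data and resist the perturbed copies.
        for Xb in (X, X_adv):
            err = sigmoid(Xb @ w + b) - y
            w -= lr * Xb.T @ err / len(y)
            b -= lr * err.mean()
    return w, b

def accuracy(w, b, X, y):
    return float(((sigmoid(X @ w + b) > 0.5) == y).mean())
```

On a linearly separable toy dataset this reaches high accuracy on both the clean and the perturbed inputs, which mirrors the intent of the joint objective: robustness to the adversarial examples without sacrificing generalization on the originals.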