MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts

Fine-tuning Large Language Models (LLMs) is a common practice to adapt pre-trained models for specific applications. While methods like LoRA have effectively addressed GPU memory constraints during fine-tuning, their performance often falls short, especially in multi-task scenarios. In contrast, Mixture-of-Experts (MoE) models, such as Mixtral 8x7B, demonstrate remarkable performance in multi-task learning while maintaining a reduced parameter count. However, the resource requirements of these MoEs remain challenging, particularly for consumer-grade GPUs with less than 24 GB of memory. To tackle these challenges, we propose MixLoRA, an approach to construct a resource-efficient sparse MoE model based on LoRA. MixLoRA inserts multiple LoRA-based experts within the feed-forward network block of a frozen pre-trained dense model and employs a commonly used top-k router. Unlike other LoRA-based MoE methods, MixLoRA further enhances model performance by using independent LoRA adapters in the attention layers. Additionally, an auxiliary load balance loss is employed to address the imbalance problem of the router. Our evaluations show that MixLoRA improves accuracy by about 9% over state-of-the-art PEFT methods in multi-task learning scenarios. We also propose a new high-throughput framework to alleviate the computation and memory bottlenecks during the training and inference of MoE models. This framework reduces GPU memory consumption by 40% and token computation latency by 30% during both training and inference.
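To make the architecture described in the abstract concrete, the sketch below shows a MixLoRA-style MoE feed-forward block in PyTorch: LoRA experts that share the frozen FFN weights, a top-k softmax router, and a Switch-Transformer-style auxiliary load-balance loss. All names (`FrozenFFN`, `LoRAExpert`, `MixLoRAMoE`, `num_experts`, `top_k`) are illustrative assumptions, not the paper's implementation; the independent attention-layer LoRA adapters and LLaMA's SwiGLU gate are omitted for brevity.

```python
# Minimal sketch of a MixLoRA-style sparse MoE FFN block (assumed names, not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class FrozenFFN(nn.Module):
    """Simplified stand-in for the pretrained dense FFN (gate projection omitted)."""
    def __init__(self, hidden: int, inter: int):
        super().__init__()
        self.up = nn.Linear(hidden, inter, bias=False)
        self.down = nn.Linear(inter, hidden, bias=False)
        for p in self.parameters():       # the dense model stays frozen
            p.requires_grad = False


class LoRAExpert(nn.Module):
    """One expert: the shared frozen FFN plus a per-expert low-rank (LoRA) update."""
    def __init__(self, frozen_ffn: FrozenFFN, hidden: int, inter: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.ffn = frozen_ffn             # shared across all experts
        self.scale = alpha / r
        self.up_A = nn.Parameter(torch.randn(hidden, r) * 0.01)   # trainable LoRA factors
        self.up_B = nn.Parameter(torch.zeros(r, inter))
        self.down_A = nn.Parameter(torch.randn(inter, r) * 0.01)
        self.down_B = nn.Parameter(torch.zeros(r, hidden))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.ffn.up(x) + self.scale * (x @ self.up_A @ self.up_B)
        h = F.silu(h)
        return self.ffn.down(h) + self.scale * (h @ self.down_A @ self.down_B)


class MixLoRAMoE(nn.Module):
    """Frozen dense FFN turned into a sparse MoE via a top-k router over LoRA experts."""
    def __init__(self, frozen_ffn: FrozenFFN, hidden: int, inter: int,
                 num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(
            [LoRAExpert(frozen_ffn, hidden, inter) for _ in range(num_experts)]
        )
        self.router = nn.Linear(hidden, num_experts, bias=False)
        self.top_k = top_k

    def forward(self, x: torch.Tensor):
        # x: (tokens, hidden) -- flatten batch and sequence dimensions beforehand.
        logits = self.router(x)
        probs = F.softmax(logits, dim=-1)
        weights, idx = probs.topk(self.top_k, dim=-1)            # (tokens, k)
        weights = weights / weights.sum(dim=-1, keepdim=True)    # renormalize over chosen experts

        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            token_ids, slot = (idx == e).nonzero(as_tuple=True)  # tokens routed to expert e
            if token_ids.numel() == 0:
                continue
            out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * expert(x[token_ids])

        # Auxiliary load-balance loss: penalize correlation between the fraction of
        # tokens each expert receives and its mean routing probability.
        frac_tokens = F.one_hot(idx, len(self.experts)).float().mean(dim=(0, 1))
        mean_probs = probs.mean(dim=0)
        aux_loss = len(self.experts) * (frac_tokens * mean_probs).sum()
        return out, aux_loss


# Toy usage: add `aux` (suitably scaled) to the task loss during training.
ffn = FrozenFFN(hidden=64, inter=256)
moe = MixLoRAMoE(ffn, hidden=64, inter=256, num_experts=4, top_k=2)
tokens = torch.randn(10, 64)
y, aux = moe(tokens)
```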


Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Common Sense Reasoning | ARC (Challenge) | LLaMA-2 7B + MixLoRA | Accuracy | 58.1 | # 23 |
| Common Sense Reasoning | ARC (Challenge) | LLaMA-2 13B + MixLoRA | Accuracy | 69.9 | # 15 |
| Common Sense Reasoning | ARC (Challenge) | LLaMA-3 8B + MixLoRA | Accuracy | 79.9 | # 14 |
| Common Sense Reasoning | ARC (Easy) | LLaMA-2 7B + MixLoRA | Accuracy | 77.7 | # 19 |
| Common Sense Reasoning | ARC (Easy) | LLaMA-2 13B + MixLoRA | Accuracy | 83.5 | # 9 |
| Common Sense Reasoning | ARC (Easy) | LLaMA-3 8B + MixLoRA | Accuracy | 86.5 | # 4 |
| Question Answering | BoolQ | LLaMA-2 7B + MixLoRA | Accuracy | 72.7 | # 38 |
| Question Answering | BoolQ | LLaMA-2 13B + MixLoRA | Accuracy | 77.1 | # 30 |
| Question Answering | BoolQ | LLaMA-3 8B + MixLoRA | Accuracy | 75 | # 35 |
| Sentence Completion | HellaSwag | LLaMA-2 7B + MixLoRA | Accuracy | 93.1 | # 9 |
| Sentence Completion | HellaSwag | LLaMA-2 13B + MixLoRA | Accuracy | 94.7 | # 5 |
| Sentence Completion | HellaSwag | LLaMA-3 8B + MixLoRA | Accuracy | 93.3 | # 8 |
| Question Answering | OpenBookQA | LLaMA-2 7B + MixLoRA | Accuracy | 84.4 | # 16 |
| Question Answering | OpenBookQA | LLaMA-2 13B + MixLoRA | Accuracy | 83 | # 19 |
| Question Answering | OpenBookQA | LLaMA-3 8B + MixLoRA | Accuracy | 84.8 | # 15 |
| Question Answering | PIQA | LLaMA-2 7B + MixLoRA | Accuracy | 83.2 | # 12 |
| Question Answering | PIQA | LLaMA-2 13B + MixLoRA | Accuracy | 86.8 | # 6 |
| Question Answering | PIQA | LLaMA-3 8B + MixLoRA | Accuracy | 87.6 | # 3 |
| Question Answering | SIQA | LLaMA-2 7B + MixLoRA | Accuracy | 78 | # 10 |
| Question Answering | SIQA | LLaMA-2 13B + MixLoRA | Accuracy | 82.5 | # 2 |
| Question Answering | SIQA | LLaMA-3 8B + MixLoRA | Accuracy | 78.8 | # 9 |
| Common Sense Reasoning | WinoGrande | LLaMA-2 7B + MixLoRA | Accuracy | 76.8 | # 23 |
| Common Sense Reasoning | WinoGrande | LLaMA-2 13B + MixLoRA | Accuracy | 86.3 | # 9 |
| Common Sense Reasoning | WinoGrande | LLaMA-3 8B + MixLoRA | Accuracy | 82.1 | # 11 |
