Large Language Model Fine-tuning Method Based on Adaptive Quantization
Keywords:
Large language model, AI, ADAQ-LoRA

Abstract
In recent years, large language models (LLMs) have excelled across a broad range of AI tasks such as text generation, mathematics, abstraction, and code, offering an early glimpse of general artificial intelligence. However, fine-tuning these models consumes a large amount of GPU memory and demands computing resources far beyond what consumer-grade graphics cards can provide. To address the memory consumption problem in fine-tuning large language models, an adaptive-quantization low-rank adaptation (ADAQ-LoRA) fine-tuning algorithm is proposed. It combines quantization and pruning to dramatically reduce GPU memory usage without losing accuracy. ADAQ-LoRA is applied to the ChatGLM2-6B model, and its effectiveness is verified on different fine-tuning datasets and downstream scenarios. Compared with existing fine-tuning methods for large language models, ADAQ-LoRA achieves better performance with lower memory usage.
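For a concrete picture of the setting, the sketch below shows the general quantized low-rank fine-tuning pattern that ADAQ-LoRA builds on (a QLoRA-style setup, not the paper's algorithm itself). It assumes the Hugging Face transformers, peft, and bitsandbytes libraries; the rank, alpha, and dropout values are illustrative only.

# Sketch: quantized low-rank fine-tuning of ChatGLM2-6B (QLoRA-style).
# This illustrates the general quantization + LoRA pattern, not ADAQ-LoRA itself;
# all hyperparameters below are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the base model with 4-bit weight quantization to cut GPU memory use.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "THUDM/chatglm2-6b",
    quantization_config=bnb_config,
    trust_remote_code=True,  # ChatGLM2 ships custom modeling code
)

# Attach small trainable low-rank adapters; the quantized base stays frozen.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["query_key_value"],  # ChatGLM2's fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights train

Under this pattern, only the low-rank adapter weights are updated while the 4-bit base model stays frozen, which is why memory usage drops so sharply; ADAQ-LoRA additionally adapts the quantization and applies pruning, as described in the body of the paper.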