PEFT-RLHF

ML

Overview

About the project

Fine-tuned the FLAN-T5 model for dialogue summarization

Date
March 20, 2024
My Role
Developer

Low-Rank Adaptation (LoRA) and Reinforcement Learning from Human Feedback (RLHF)

Low-Rank Adaptation of Large Language Models (LoRA) [1] is a parameter-efficient fine-tuning method: it freezes the original LLM parameters and injects a pair of trainable low-rank decomposition matrices. The dimensions of the two matrices are chosen so that their product has the same shape as the original weight matrix, while together they contain far fewer parameters. Only these smaller matrices are updated during training. At inference time, the low-rank matrices are multiplied together and added to the frozen LLM weights.
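As a concrete sketch, this is roughly how LoRA adapters are attached to FLAN-T5 with the 🤗 `peft` library; the rank, scaling, and target modules below are illustrative choices, not the project's exact hyperparameters.

```python
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, TaskType, get_peft_model

# The base model stays frozen; only the injected adapters are trained.
base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

lora_config = LoraConfig(
    r=32,                       # rank of the decomposition (illustrative)
    lora_alpha=32,              # scaling factor applied to B @ A
    target_modules=["q", "v"],  # T5 attention query/value projections
    lora_dropout=0.05,
    bias="none",
    task_type=TaskType.SEQ_2_SEQ_LM,
)

peft_model = get_peft_model(base_model, lora_config)
# Typically reports trainable parameters on the order of 1% of the full model.
peft_model.print_trainable_parameters()
```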

*Figure: the LoRA process*

For this project, I used LoRA to fine-tune the FLAN-T5 model [2] for dialogue summarization on the DialogSum dataset. The fine-tuned model improved on every ROUGE metric:

|   Metric   | Percentage Improvement |
|:----------:|:----------------------:|
|  ROUGE-1   |         17.5%          |
|  ROUGE-2   |          8.7%          |
|  ROUGE-L   |         12.4%          |
| ROUGE-Lsum |         12.3%          |
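For reference, a ROUGE comparison like this can be computed with the 🤗 `evaluate` library, as in the minimal sketch below; the two summary lists are hypothetical stand-ins for the model outputs and the DialogSum reference summaries.

```python
import evaluate

rouge = evaluate.load("rouge")

# Hypothetical stand-ins for generated and reference summaries.
model_summaries = ["#Person1# and #Person2# agree to meet at noon."]
reference_summaries = ["#Person1# and #Person2# arrange a meeting at noon."]

scores = rouge.compute(predictions=model_summaries,
                       references=reference_summaries)
print(scores)  # keys: 'rouge1', 'rouge2', 'rougeL', 'rougeLsum'
```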

Using the 🤗 TRL (Transformers Reinforcement Learning) library [3], I then fine-tuned the model to detoxify its summaries. Training used Proximal Policy Optimization (PPO) [4] together with a KL-divergence penalty, which keeps the updated policy from drifting too far from the original policy, and Meta AI's RoBERTa-based hate speech model [5] as the reward model.
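The training loop looks roughly like the sketch below, written against TRL's classic `PPOTrainer` API (v0.x). Several details are assumptions: the reward model id (`facebook/roberta-hate-speech-dynabench-r4-target`, the published Dynabench variant), the DialogSum dataset id (`knkarthick/dialogsum`), the prompt format, and all hyperparameters; the project would also start PPO from the LoRA fine-tuned checkpoint rather than the base model used here.

```python
import torch
from datasets import load_dataset
from transformers import AutoTokenizer, pipeline
from trl import (AutoModelForSeq2SeqLMWithValueHead, PPOConfig, PPOTrainer,
                 create_reference_model)

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")

# Policy: the summarizer wrapped with a value head for PPO.
ppo_model = AutoModelForSeq2SeqLMWithValueHead.from_pretrained("google/flan-t5-base")
# Frozen copy of the starting policy; the KL penalty is computed against it.
ref_model = create_reference_model(ppo_model)

# Reward model: RoBERTa-based hate-speech classifier (assumed model id).
reward_pipe = pipeline(
    "text-classification",
    model="facebook/roberta-hate-speech-dynabench-r4-target",
    top_k=None,  # return scores for both "hate" and "nothate"
)

# DialogSum dialogues turned into summarization prompts (assumed preprocessing).
def tokenize(sample):
    prompt = f"Summarize the following conversation.\n\n{sample['dialogue']}\n\nSummary:"
    sample["input_ids"] = tokenizer.encode(prompt, truncation=True, max_length=512)
    return sample

dataset = load_dataset("knkarthick/dialogsum", split="train[:64]").map(tokenize)
dataset.set_format(type="torch", columns=["input_ids"])

def collator(data):
    return {key: [d[key] for d in data] for key in data[0]}

config = PPOConfig(learning_rate=1.41e-5, batch_size=8, mini_batch_size=4)
ppo_trainer = PPOTrainer(config, ppo_model, ref_model, tokenizer,
                         dataset=dataset, data_collator=collator)

for batch in ppo_trainer.dataloader:
    query_tensors = batch["input_ids"]

    # Sample a summary from the current policy for each query.
    response_tensors = [
        ppo_trainer.generate(q, max_new_tokens=64).squeeze() for q in query_tensors
    ]
    responses = [tokenizer.decode(r, skip_special_tokens=True) for r in response_tensors]

    # Reward = the "nothate" score, so less toxic summaries are reinforced.
    rewards = []
    for scores in reward_pipe(responses):
        nothate = next(s["score"] for s in scores if s["label"] == "nothate")
        rewards.append(torch.tensor(nothate))

    # One PPO update (clipped objective plus KL penalty against ref_model).
    ppo_trainer.step(query_tensors, response_tensors, rewards)
```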

*Figure: the RLHF process*

The RL fine-tuned model achieved a 54% average improvement in non-toxicity score over the PEFT baseline.
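A comparison like this can be sketched with the 🤗 `evaluate` toolkit's `toxicity` measurement, which by default wraps the same RoBERTa hate-speech classifier (it reports the "hate" probability, i.e. one minus the non-toxic score); the two output lists below are hypothetical placeholders.

```python
import numpy as np
import evaluate

# Defaults to the facebook/roberta-hate-speech-dynabench-r4-target classifier.
toxicity = evaluate.load("toxicity", module_type="measurement")

# Hypothetical outputs from the PEFT baseline and the RL fine-tuned model.
peft_outputs = ["You are an idiot, just sign the contract."]
rlhf_outputs = ["Please review and sign the contract when you can."]

before = np.mean(toxicity.compute(predictions=peft_outputs)["toxicity"])
after = np.mean(toxicity.compute(predictions=rlhf_outputs)["toxicity"])
print(f"mean toxicity: {before:.4f} -> {after:.4f}")
```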

View the source code on GitHub

References

  1. Hu et al., "LoRA: Low-Rank Adaptation of Large Language Models."
  2. google/flan-t5-base on 🤗 Hugging Face.
  3. Transformers Reinforcement Learning (TRL) library.
  4. Schulman et al., "Proximal Policy Optimization Algorithms."
  5. facebook/roberta-hate-speech on 🤗 Hugging Face.
  6. Generative AI with LLMs.