Fine-Tuning Techniques

Fine-tuning means continuing to train a pretrained model on task-specific data so that the model adapts to that task.

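Overview

Fine-tuning adapts a general-purpose pretrained model to a specific task. This page covers the common fine-tuning methods, data preparation, training configuration, evaluation, and ways to handle overfitting.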

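Fine-Tuning Methods

LoRA (Low-Rank Adaptation)

LoRA freezes the pretrained model's weights and trains only a pair of small low-rank matrices that are added to selected layers.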

python

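from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=16,                                 # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling factor (the update is scaled by lora_alpha / r)
    target_modules=["q_proj", "v_proj"],  # module names depend on the model architecture
    lora_dropout=0.05,
    bias="none",                          # leave bias terms frozen
    task_type="CAUSAL_LM",
)

# `model` is assumed to be an already loaded causal language model
model = get_peft_model(model, config)
model.print_trainable_parameters()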
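
To see why training only low-rank matrices is cheap, it helps to count parameters. The sketch below is plain arithmetic, not PEFT code: the helper names and the 4096 x 4096 layer size are assumptions chosen for illustration, while `r=16` matches the configuration above.

```python
def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters in one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * d_in + d_out * r


def dense_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight W (d_out x d_in), bias ignored."""
    return d_in * d_out


# Hypothetical 4096 x 4096 attention projection with rank 16.
d = 4096
adapter = lora_trainable_params(d, d, r=16)
dense = dense_params(d, d)
print(f"adapter={adapter}, dense={dense}, ratio={adapter / dense:.4f}")
```

Under these assumptions the adapter holds 1/128 of the parameters of the layer it modifies, which is where the memory savings come from.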

P-Tuning

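P-Tuning adapts the model to a task by optimizing a small set of virtual tokens (continuous prompt embeddings) while the base model stays frozen. The example below uses PEFT's PromptTuningConfig, which implements the closely related prompt tuning method; P-Tuning itself is configured in PEFT through PromptEncoderConfig.

python

from peft import PromptTuningConfig, get_peft_model

config = PromptTuningConfig(
    task_type="CAUSAL_LM",
    num_virtual_tokens=20,
    prompt_tuning_init="TEXT",
    prompt_tuning_init_text="Classify if the tweet is positive or negative:",
    # Required when prompt_tuning_init="TEXT": the tokenizer used to embed the init text.
    # `model_name_or_path` is a placeholder for the base model's name or path.
    tokenizer_name_or_path=model_name_or_path,
)

# `model` is assumed to be an already loaded causal language model
model = get_peft_model(model, config)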
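
Conceptually, virtual tokens are extra embedding vectors placed in front of the real input. The following is a minimal sketch in plain Python lists rather than tensors; `prepend_virtual_tokens` and the tiny `hidden_size` are assumptions for illustration and not part of PEFT.

```python
def prepend_virtual_tokens(prompt_embeddings, input_embeddings):
    """Return the sequence the frozen model sees: trainable prompt
    vectors first, followed by the real token embeddings."""
    return prompt_embeddings + input_embeddings


hidden_size = 4          # toy value; real models use hundreds or thousands
num_virtual_tokens = 20  # same count as in the config above

# Trainable prompt vectors (zeros here) and 5 frozen token embeddings.
prompt = [[0.0] * hidden_size for _ in range(num_virtual_tokens)]
tokens = [[1.0] * hidden_size for _ in range(5)]

sequence = prepend_virtual_tokens(prompt, tokens)
trainable_values = num_virtual_tokens * hidden_size
print(len(sequence), trainable_values)
```

Only the prompt vectors receive gradient updates, so the number of trainable values grows with `num_virtual_tokens * hidden_size` rather than with the size of the model.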

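Full Fine-Tuning

Full fine-tuning updates all of the model's parameters. Freezing some layers, such as the embeddings, is an optional way to reduce memory use; once layers are frozen, the run is no longer strictly full fine-tuning.

python

from torch.optim import AdamW

# Optional: freeze the embedding layers to save memory.
# Skip this loop to update every parameter.
for name, param in model.named_parameters():
    if "embeddings" in name:
        param.requires_grad = False

# Optimize every parameter that is still trainable
trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = AdamW(trainable_params, lr=2e-5)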
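
To check what a freezing loop like the one above leaves trainable, a small helper can total the parameters. This is an illustrative sketch only: `FakeParam` is a hypothetical stand-in so the snippet runs without PyTorch, and `summarize_trainable` is not a library function. With a real model you would pass `model.named_parameters()`, whose items expose the same `numel()` and `requires_grad` members.

```python
from dataclasses import dataclass


@dataclass
class FakeParam:
    """Stand-in for torch.nn.Parameter, exposing only what the helper reads."""
    size: int
    requires_grad: bool = True

    def numel(self) -> int:
        return self.size


def summarize_trainable(named_params):
    """Return (trainable, total) parameter counts."""
    total = trainable = 0
    for _name, param in named_params:
        count = param.numel()
        total += count
        if param.requires_grad:
            trainable += count
    return trainable, total


# Toy model: frozen embeddings plus one trainable layer.
params = [
    ("embeddings.word", FakeParam(1000, requires_grad=False)),
    ("encoder.layer0", FakeParam(500)),
]
trainable, total = summarize_trainable(params)
print(f"trainable {trainable} of {total}")
```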

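Data Preparation

Data Formats

python

# Classification task
classification_example = {
    "text": "This movie is great!",
    "label": "positive"
}

# Question answering task
qa_example = {
    "question": "What is the capital of France?",
    "answer": "Paris"
}

# Generation task
generation_example = {
    "prompt": "Translate to French: Hello world",
    "response": "Bonjour le monde"
}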
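
Records in these shapes are often stored one per line as JSON Lines. The sketch below validates each record before serializing it; the `REQUIRED_KEYS` table, the task names, and both helper functions are assumptions made for this example, and only the field names come from the formats above.

```python
import json

# Required fields per task, taken from the example records above.
REQUIRED_KEYS = {
    "classification": {"text", "label"},
    "qa": {"question", "answer"},
    "generation": {"prompt", "response"},
}


def validate_record(record: dict, task: str) -> None:
    """Raise ValueError if a required field is missing or empty."""
    missing = REQUIRED_KEYS[task] - set(record)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    for key in REQUIRED_KEYS[task]:
        value = record[key]
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"field {key!r} must be a non-empty string")


def to_jsonl(records, task: str) -> str:
    """Validate every record and return the JSON Lines text."""
    lines = []
    for record in records:
        validate_record(record, task)
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"


jsonl_text = to_jsonl(
    [{"text": "This movie is great!", "label": "positive"}],
    "classification",
)
print(jsonl_text, end="")
```

Validating at write time means malformed rows are caught before they silently skew a training run.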

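Data Augmentation

python

import random

def augment_text(text):
    # With probability 0.5, double every space (a crude whitespace perturbation)
    if random.random() > 0.5:
        text = text.replace(" ", "  ")

    # With probability 0.5, truncate long texts to a random length of 50 to 100 characters
    if len(text) > 100 and random.random() > 0.5:
        text = text[:random.randint(50, 100)]

    return text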
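
Because the function above draws from the global random state, two runs can produce different training sets. One possible way to make augmentation reproducible is to pass an explicit, seeded generator, as sketched here. `augment_text_seeded` and `augment_dataset` are hypothetical names; the two perturbations are the same as above.

```python
import random


def augment_text_seeded(text: str, rng: random.Random) -> str:
    """Same two perturbations as above, driven by an explicit RNG."""
    if rng.random() > 0.5:
        text = text.replace(" ", "  ")
    if len(text) > 100 and rng.random() > 0.5:
        text = text[: rng.randint(50, 100)]
    return text


def augment_dataset(texts, seed: int = 0, copies: int = 1):
    """Return the originals followed by `copies` augmented variants of each."""
    rng = random.Random(seed)
    result = list(texts)
    for _ in range(copies):
        result.extend(augment_text_seeded(text, rng) for text in texts)
    return result


train_texts = ["This movie is great!", "word " * 40]
augmented = augment_dataset(train_texts, seed=42)
print(len(augmented))
```

Augmentation would normally be applied to the training split only, so that evaluation data stays unmodified.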

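Training Configuration

Learning Rate Selection

python

# Pick one of the following, depending on the method.

# Small learning rate for full fine-tuning, where every weight is updated
learning_rate = 2e-5

# Larger learning rate for parameter-efficient methods such as LoRA
learning_rate = 1e-4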
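
The values above are usually treated as a peak rather than a constant. A common pattern, shown here as a sketch and not as something this guide prescribes, is a linear warmup to the peak followed by a linear decay to zero. The function name and the step counts are assumptions for illustration.

```python
def linear_warmup_decay(step: int, total_steps: int,
                        peak_lr: float, warmup_steps: int) -> float:
    """Learning rate at `step`: linear ramp up to peak_lr, then linear decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical run: 1000 optimizer steps, 100 of them warmup, peak of 2e-5.
for step in (0, 50, 100, 550, 1000):
    lr = linear_warmup_decay(step, total_steps=1000, peak_lr=2e-5, warmup_steps=100)
    print(step, f"{lr:.2e}")
```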

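Batch Size

python

# Pick one, depending on available GPU memory.
batch_size = 8   # small batches when GPU memory is limited
batch_size = 32  # large batches when GPU memory is plentiful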
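
When memory only allows the small batch, gradient accumulation can recover the larger effective batch by summing gradients over several forward passes before each optimizer step. The arithmetic is sketched below; `accumulation_steps` is a hypothetical helper, not a library call.

```python
def accumulation_steps(target_batch: int, per_device_batch: int,
                       num_devices: int = 1) -> int:
    """Number of micro-batches to accumulate so that
    per_device_batch * num_devices * steps == target_batch."""
    per_step = per_device_batch * num_devices
    if target_batch % per_step != 0:
        raise ValueError("target_batch must be a multiple of per_device_batch * num_devices")
    return target_batch // per_step


# Reach an effective batch of 32 on a single GPU that fits only 8 examples.
steps = accumulation_steps(target_batch=32, per_device_batch=8)
print(steps)
```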

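Number of Epochs

python

# Pick one, depending on dataset size.
epochs = 3   # few epochs to avoid overfitting
epochs = 10  # can be increased when the dataset is large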
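
The epoch count, dataset size, and batch size together determine how many optimizer steps a run takes, which is useful when sizing warmup or estimating runtime. A minimal sketch follows, assuming the last partial batch of each epoch is kept; the function name and dataset size are illustrative.

```python
import math


def total_optimizer_steps(num_examples: int, effective_batch_size: int,
                          epochs: int) -> int:
    """Optimizer steps for a run, counting the final partial batch of each epoch."""
    steps_per_epoch = math.ceil(num_examples / effective_batch_size)
    return steps_per_epoch * epochs


# Hypothetical dataset of 10,000 examples, batch size 32, 3 epochs.
print(total_optimizer_steps(10_000, 32, 3))
```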

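Evaluation Methods

Classification Tasks

python

from sklearn.metrics import accuracy_score, f1_score

# `model.predict` assumes a wrapper with a scikit-learn style interface;
# replace it with your own inference code that returns one label per example.
predictions = model.predict(test_data)
accuracy = accuracy_score(test_labels, predictions)
f1 = f1_score(test_labels, predictions, average="weighted")

print(f"Accuracy: {accuracy:.4f}")
print(f"F1 Score: {f1:.4f}")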
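
For readers who want to see what these two numbers measure, here is a dependency-free sketch of accuracy and support-weighted F1. It is meant as an explanation of the metrics, not a replacement for scikit-learn; the function names and toy labels are assumptions for this example.

```python
from collections import Counter


def accuracy(y_true, y_pred) -> float:
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred) -> float:
    """Per-class F1, averaged with each class weighted by its true count."""
    support = Counter(y_true)
    pairs = list(zip(y_true, y_pred))
    total = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = count - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * count
    return total / len(y_true)


y_true = ["positive", "positive", "negative", "negative"]
y_pred = ["positive", "negative", "negative", "negative"]
print(f"Accuracy: {accuracy(y_true, y_pred):.4f}")
print(f"F1 Score: {weighted_f1(y_true, y_pred):.4f}")
```

Weighting by support keeps a rare class from dominating the average, which is why the two metrics can diverge on imbalanced data.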

Generation Tasks

python

from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(['rouge1', 'rougeL'], use_stemmer=True)
# `reference` is the gold text and `prediction` is the model output (both strings)
scores = scorer.score(reference, prediction)

print(f"ROUGE-1: {scores['rouge1'].fmeasure:.4f}")
print(f"ROUGE-L: {scores['rougeL'].fmeasure:.4f}")
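
ROUGE-1 is a unigram-overlap score. The sketch below computes its F-measure with simple lowercasing and whitespace splitting, so its numbers can differ from the `rouge_score` package, which applies its own tokenizer and optional stemming. `rouge1_f` is a hypothetical helper written for this illustration.

```python
from collections import Counter


def rouge1_f(reference: str, prediction: str) -> float:
    """Unigram-overlap F-measure between a reference and a prediction."""
    ref_counts = Counter(reference.lower().split())
    pred_counts = Counter(prediction.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both.
    overlap = sum((ref_counts & pred_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


print(f"{rouge1_f('Bonjour le monde', 'Bonjour le monde'):.4f}")
print(f"{rouge1_f('the cat sat', 'the cat'):.4f}")
```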

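Handling Overfitting

Early Stopping

python

from transformers import EarlyStoppingCallback

# Pass this to Trainer(callbacks=[callback]). The Trainer must also be configured with
# load_best_model_at_end=True, metric_for_best_model, and periodic evaluation.
callback = EarlyStoppingCallback(
    early_stopping_patience=3,       # stop after 3 evaluations without improvement
    early_stopping_threshold=0.001   # minimum change that counts as an improvement
)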
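
The idea behind the patience and threshold settings can be sketched without any framework. The class below illustrates the logic for a metric where lower is better; it is not the library's internal implementation, and `EarlyStopper` is a hypothetical name.

```python
class EarlyStopper:
    """Signal a stop after `patience` evaluations without a meaningful improvement."""

    def __init__(self, patience: int = 3, threshold: float = 0.001):
        self.patience = patience
        self.threshold = threshold
        self.best = None
        self.bad_evals = 0

    def should_stop(self, eval_loss: float) -> bool:
        # An improvement must beat the best loss by more than the threshold.
        if self.best is None or eval_loss < self.best - self.threshold:
            self.best = eval_loss
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


stopper = EarlyStopper(patience=3, threshold=0.001)
decisions = [stopper.should_stop(loss) for loss in (1.0, 0.9, 0.8995, 0.8999, 0.9)]
print(decisions)
```

In this toy run the loss stalls near 0.9, so the third consecutive evaluation without a gain larger than the threshold triggers the stop.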

Regularization

python

from torch.optim import AdamW

# Dropout: dropout layers are only active in training mode
model.train()

# Weight decay
optimizer = AdamW(model.parameters(), weight_decay=0.01)
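
Weight decay in AdamW is decoupled from the gradient: on every step each weight is shrunk by a factor of `1 - lr * weight_decay`, separately from the gradient-based update. The sketch below isolates just that shrinkage on a single scalar weight. The function name and the learning rate of 0.1 are assumptions chosen to make the effect visible.

```python
def apply_weight_decay(weight: float, lr: float, weight_decay: float) -> float:
    """Decoupled weight decay: shrink the weight toward zero, ignoring gradients."""
    return weight * (1 - lr * weight_decay)


# Toy example: one weight, no gradient signal, 3 steps of decay only.
weight = 1.0
for _ in range(3):
    weight = apply_weight_decay(weight, lr=0.1, weight_decay=0.01)
print(f"{weight:.6f}")
```

Weights that the loss does not actively push on therefore drift toward zero, which is the regularizing effect.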
