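LLM Basics

A large language model (LLM) is a deep-learning model, trained on large text corpora, that can process and generate natural language.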

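Pre-trained Models

Overview

Pre-training trains a model on large-scale text data so that it learns the statistical regularities of language and acquires general semantic representations.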

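Training Process

Data preparation: collect large-scale text data (e.g., Wikipedia, books, web pages)
Model initialization: build a model based on the Transformer architecture
Pre-training: train the model with a self-supervised objective (e.g., next-token prediction for GPT-style models, masked-token prediction for BERT-style models)
Fine-tuning: continue training the model on data for a specific task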
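The self-supervised objective in the pre-training step can be illustrated with a toy example. The sketch below is not a Transformer: it assumes a simple bigram count model as a stand-in, and the function names are made up for illustration. What it shows is the part that carries over to real LLMs: the training labels come from the text itself (each token's label is the token that follows it), and the loss is the average negative log-likelihood of those next tokens.

```python
import math
from collections import Counter, defaultdict


def make_next_token_pairs(tokens):
    """Build (current_token, next_token) pairs; the text supplies its own labels."""
    return list(zip(tokens[:-1], tokens[1:]))


def fit_bigram(pairs):
    """Estimate P(next | current) from counts (toy stand-in for a neural model)."""
    counts = defaultdict(Counter)
    for current, nxt in pairs:
        counts[current][nxt] += 1
    return {
        current: {tok: c / sum(ctr.values()) for tok, c in ctr.items()}
        for current, ctr in counts.items()
    }


def average_nll(model, pairs):
    """Average negative log-likelihood of the observed next tokens."""
    return -sum(math.log(model[cur][nxt]) for cur, nxt in pairs) / len(pairs)


tokens = "the cat sat on the mat".split()
pairs = make_next_token_pairs(tokens)
bigram_model = fit_bigram(pairs)
loss = average_nll(bigram_model, pairs)
print(f"{len(pairs)} training pairs, average loss = {loss:.4f}")
```

Only the token "the" has two possible continuations here ("cat" and "mat"), so it is the only context that contributes to the loss. A real pre-training run replaces the count table with a Transformer and minimizes the same kind of loss by gradient descent.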

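Common Pre-trained Models

GPT (Generative Pre-trained Transformer): a generative model developed by OpenAI
BERT (Bidirectional Encoder Representations from Transformers): a bidirectional encoder developed by Google
LLaMA: an open-source large language model from Meta
Qwen: a large language model developed by Alibaba Cloud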

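Fine-tuning Techniques

Overview

Fine-tuning continues training a pre-trained model on task-specific data so that the model adapts to that task.

Fine-tuning Methods

LoRA (Low-Rank Adaptation)

LoRA freezes the pre-trained weights and trains small low-rank matrices injected into selected layers.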

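python

from peft import LoraConfig, get_peft_model

# `model` is assumed to be an already-loaded causal language model
# (e.g. via AutoModelForCausalLM.from_pretrained).
config = LoraConfig(
    r=16,                                 # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling factor; the update is scaled by lora_alpha / r
    target_modules=["q_proj", "v_proj"],  # modules to adapt; names vary by architecture
    lora_dropout=0.05,                    # dropout applied on the LoRA branch
    bias="none",                          # do not train bias terms
    task_type="CAUSAL_LM",
)

# Wrap the base model so that only the LoRA parameters are trainable
model = get_peft_model(model, config)
model.print_trainable_parameters()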
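To make the `r` and `lora_alpha` settings more concrete, here is a minimal pure-Python sketch of the LoRA computation for a single linear layer. It is an illustration under stated assumptions, not the PEFT implementation: the helper names, the toy dimensions, and the 4096 x 4096 layer size used for the parameter count are all assumptions. The adapted layer computes `W x + (alpha / r) * B (A x)`, where `W` stays frozen and only `A` and `B` are trained; `B` starts at zero, so the adapter initially leaves the output unchanged.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha, r):
    """Frozen base output plus the scaled low-rank update B @ (A @ x)."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


def lora_param_counts(d_out, d_in, r):
    """Return (parameters in the full weight, parameters in the LoRA adapter)."""
    return d_out * d_in, r * (d_in + d_out)


random.seed(0)
d_in, d_out, r, alpha = 6, 4, 2, 4  # toy sizes, chosen only for illustration
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]   # frozen
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]    # trainable
B = [[0.0] * r for _ in range(d_out)]                                   # trainable, zero-initialized
x = [1.0] * d_in

y_base = matvec(W, x)
y_lora = lora_forward(W, A, B, x, alpha, r)

# Parameter count for an assumed 4096 x 4096 projection with r=16
full, adapter = lora_param_counts(4096, 4096, 16)
print(f"full: {full:,}  adapter: {adapter:,}  ratio: {adapter / full:.4%}")
```

For the assumed 4096 x 4096 projection, the adapter trains 131,072 parameters instead of 16,777,216, which is why LoRA fits on much smaller hardware than full fine-tuning.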

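Prompt Tuning

The configuration below uses PEFT's PromptTuningConfig, which implements prompt tuning: trainable virtual-token embeddings are prepended to the input while the base model stays frozen. P-Tuning proper, which produces the prompt embeddings with a trainable encoder, is configured with PromptEncoderConfig instead.

python

from peft import PromptTuningConfig, get_peft_model

# `model` is assumed to be an already-loaded causal language model, and
# `model_name` the Hub ID or local path it was loaded from.
config = PromptTuningConfig(
    task_type="CAUSAL_LM",
    num_virtual_tokens=20,            # number of trainable prompt embeddings
    prompt_tuning_init="TEXT",        # initialize the embeddings from the text below
    prompt_tuning_init_text="Classify if the tweet is positive or negative:",
    tokenizer_name_or_path=model_name,  # required when initializing from text
)

model = get_peft_model(model, config)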
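The effect of `num_virtual_tokens=20` can be sketched without any library. The example below is a simplified illustration, not PEFT's code: the function names and the hidden size of 4096 are assumptions. It shows the two consequences of the setting: the input sequence grows by 20 positions, and the only trainable parameters are the 20 prompt vectors.

```python
def prepend_virtual_tokens(prompt_embeddings, input_embeddings):
    """Place the trainable prompt vectors in front of the frozen token embeddings."""
    return prompt_embeddings + input_embeddings


def prompt_tuning_param_count(num_virtual_tokens, hidden_size):
    """One trainable vector of size hidden_size per virtual token."""
    return num_virtual_tokens * hidden_size


toy_hidden_size = 4      # tiny size so the lists stay readable
num_virtual_tokens = 20  # matches the configuration above

# Trainable prompt vectors (zeros here; in practice initialized from text or at random)
prompt_embeddings = [[0.0] * toy_hidden_size for _ in range(num_virtual_tokens)]
# Embeddings of a 5-token input, produced by the frozen embedding layer
input_embeddings = [[1.0] * toy_hidden_size for _ in range(5)]

sequence = prepend_virtual_tokens(prompt_embeddings, input_embeddings)
trainable = prompt_tuning_param_count(num_virtual_tokens, hidden_size=4096)  # assumed hidden size
print(f"sequence length: {len(sequence)}, trainable parameters: {trainable:,}")
```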

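Model Deployment

Overview

Model deployment puts a trained model into a production environment, usually behind an API that serves inference requests.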
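As a rough sketch of what "serving a model behind an API" can look like, the example below exposes a single POST /generate endpoint using only the Python standard library. Everything here is an assumption for illustration: the endpoint path, the JSON shape, and `generate_text`, which is a hypothetical stub standing in for a real model call. A production service would normally use a dedicated serving framework rather than `http.server`.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def generate_text(prompt: str) -> str:
    """Hypothetical stub; a real service would call the model here."""
    return f"[stub completion for: {prompt}]"


class GenerateHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/generate":
            self.send_error(404, "unknown endpoint")
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            prompt = json.loads(self.rfile.read(length) or b"{}")["prompt"]
        except (json.JSONDecodeError, KeyError, TypeError):
            self.send_error(400, "body must be JSON with a 'prompt' field")
            return
        body = json.dumps({"text": generate_text(prompt)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the demo output quiet


def call_generate(base_url, prompt):
    """Send one request to the /generate endpoint and decode the JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/generate",
        data=json.dumps({"prompt": prompt}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # Bypass any configured proxy so the request reaches localhost directly
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as resp:
        return json.loads(resp.read())


# Demo: bind to an ephemeral localhost port, send one request, then shut down
server = ThreadingHTTPServer(("127.0.0.1", 0), GenerateHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
response = call_generate(f"http://127.0.0.1:{server.server_port}", "hello")
server.shutdown()
server.server_close()
print(response)
```

Swapping the stub for a real call (for example, one of the generation snippets in this section) keeps the HTTP layer unchanged, which is the main reason to separate the two.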

Deployment Options

Using Hugging Face Transformers

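python

from transformers import AutoTokenizer, AutoModelForCausalLM

# Qwen/Qwen-7B ships custom model code on the Hub, so trust_remote_code=True is needed
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)

prompt = "Give an introduction to large language models"
inputs = tokenizer(prompt, return_tensors="pt")
# max_new_tokens limits only the generated tokens; max_length would also count the prompt
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))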

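Accelerating Inference with vLLM

python

from vllm import LLM, SamplingParams

# Qwen/Qwen-7B ships custom model code on the Hub, so trust_remote_code=True is needed
llm = LLM(model="Qwen/Qwen-7B", trust_remote_code=True)
sampling_params = SamplingParams(max_tokens=100)  # generate at most 100 tokens per prompt

# vLLM processes the prompts as a batch
prompts = ["Give an introduction to large language models", "What is artificial intelligence?"]
outputs = llm.generate(prompts, sampling_params)

# Each result holds one or more completions; print the first completion for each prompt
for output in outputs:
    print(output.outputs[0].text)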
