🤖 Transformers (Hugging Face Pipelines in Practice)
This tutorial uses the Hugging Face transformers library to show how to use pretrained models for the following tasks:
- Sentiment analysis
- Text generation
- Translation
- Masked language modeling (fill-mask)
- Zero-shot classification
- Inference with task-specific models (Text2Text Generation, e.g., T5)
- Summarization
- Loading a custom classification model and running inference (e.g., BERT)
📦 Install dependencies
python
!pip install datasets evaluate transformers sentencepiece -q
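After the install finishes, a quick check can confirm that the packages are visible in the current environment. The sketch below uses only the standard library; the helper name `installed_versions` is an illustrative assumption, not part of any of these packages.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# The four packages installed by the pip command above.
report = installed_versions(["datasets", "evaluate", "transformers", "sentencepiece"])
for name, ver in report.items():
    print(f"{name}: {ver or 'not installed'}")
```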
✅ Sentiment analysis
python
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
# A single sentence
classifier("I really enjoy learning new things every day.")
# Several sentences at once
classifier([
    "I really enjoy learning new things every day.",
    "I dislike rainy weather on weekends."
])
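The pipeline returns one dict per input, each with a `label` and a `score`. As a minimal sketch of how that output might be consumed, the helper below pairs predictions with their inputs and flags low-confidence ones. The name `pair_predictions`, the 0.9 threshold, and the sample scores are all assumptions for illustration; the scores are placeholders, not real model output.

```python
def pair_predictions(texts, predictions, min_score=0.9):
    """Pair each input text with its predicted label and flag low-confidence results."""
    rows = []
    for text, pred in zip(texts, predictions):
        rows.append({
            "text": text,
            "label": pred["label"],
            "score": pred["score"],
            "confident": pred["score"] >= min_score,
        })
    return rows


texts = [
    "I really enjoy learning new things every day.",
    "I dislike rainy weather on weekends.",
]
# Placeholder values in the pipeline's output shape; real scores come from classifier(texts).
sample_predictions = [
    {"label": "POSITIVE", "score": 0.99},
    {"label": "NEGATIVE", "score": 0.75},
]
for row in pair_predictions(texts, sample_predictions):
    print(row["label"], round(row["score"], 2), row["confident"], row["text"])
```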
✍️ Text generation (default model)
python
from transformers import pipeline
generator = pipeline("text-generation")
generator("Today we are going to explore something exciting")
🎯 Text generation (with parameters)
python
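# Return two sequences, each at most 15 tokens long
# (max_length counts tokens, not words, and includes the prompt)
generator("Today we are going to explore something exciting", num_return_sequences=2, max_length=15)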
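Because `max_length` includes the prompt's tokens, a long prompt leaves little room for new text. One alternative is `max_new_tokens`, which bounds only the generated part. The wrapper below is a sketch under that assumption: it accepts any text-generation pipeline as an argument, and the name `generate_variants` and its defaults are hypothetical.

```python
def generate_variants(generator, prompt, new_tokens=15, n=2):
    """Ask a text-generation pipeline for n sampled continuations of bounded length."""
    outputs = generator(
        prompt,
        max_new_tokens=new_tokens,  # limits generated tokens only, not the prompt
        num_return_sequences=n,
        do_sample=True,             # sample so the returned sequences differ
    )
    return [item["generated_text"] for item in outputs]


# Usage, assuming `generator` is the pipeline created above:
# for text in generate_variants(generator, "Today we are going to explore something exciting"):
#     print(text)
```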
⚙️ Text generation with the distilgpt2 model
python
from transformers import pipeline
generator = pipeline("text-generation", model="distilgpt2")
generator(
    "Today we are going to explore something exciting",
    max_length=30,
    num_return_sequences=2,
)
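Each returned item stores its text under `generated_text`, and by default that text starts with the prompt itself. A small sketch for keeping only the new part follows; the helper name `continuations` and the sample outputs are illustrative placeholders, not real model output.

```python
def continuations(prompt, outputs):
    """Return only the newly generated part of each text-generation result."""
    results = []
    for item in outputs:
        text = item["generated_text"]
        # By default the pipeline echoes the prompt at the start of generated_text.
        if text.startswith(prompt):
            text = text[len(prompt):]
        results.append(text.strip())
    return results


prompt = "Today we are going to explore something exciting"
# Placeholder output in the pipeline's shape; real text comes from generator(prompt, ...).
sample_outputs = [
    {"generated_text": prompt + " in the world of language models."},
    {"generated_text": prompt + ", so grab a notebook."},
]
print(continuations(prompt, sample_outputs))
```

Passing `return_full_text=False` to the pipeline call should give a similar result without post-processing.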
🌍 Translation (English → German)
python
from transformers import pipeline
translator = pipeline("translation_en_to_de")
translator("This is a great day to learn something new!")
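The task name encodes the language pair, and each result carries its text under `translation_text`. The sketch below wraps both ideas. The helper names are hypothetical, and which pairs work without an explicit `model=` argument depends on the default checkpoint, so treat other pairs as something to verify.

```python
def translation_task(src, tgt):
    """Build a pipeline task name such as 'translation_en_to_de'."""
    return f"translation_{src}_to_{tgt}"


def translate_all(translator, sentences):
    """Translate a list of sentences and return plain strings."""
    outputs = translator(sentences)
    return [item["translation_text"] for item in outputs]


# Usage, assuming `translator` is the pipeline created above:
# print(translate_all(translator, ["This is a great day to learn something new!"]))
print(translation_task("en", "de"))
```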
🧩 Masked language modeling (fill in the blank)
python
from transformers import pipeline
unmasker = pipeline("fill-mask")
unmasker("Life is like a <mask> of chocolates.")
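The mask token depends on the model: RoBERTa-style checkpoints use `<mask>`, while BERT uses `[MASK]`. A sketch that avoids hard-coding it reads the token from the pipeline's tokenizer. The helper name `fill_blank` and the `[BLANK]` placeholder are assumptions made for this example.

```python
def fill_blank(unmasker, template, blank="[BLANK]", top_k=3):
    """Swap a neutral placeholder for the model's mask token and return the top guesses."""
    mask = unmasker.tokenizer.mask_token  # e.g. "<mask>" or "[MASK]", depending on the model
    predictions = unmasker(template.replace(blank, mask), top_k=top_k)
    # token_str may carry a leading space for some tokenizers, so strip it.
    return [(p["token_str"].strip(), p["score"]) for p in predictions]


# Usage, assuming `unmasker` is the pipeline created above:
# for word, score in fill_blank(unmasker, "Life is like a [BLANK] of chocolates."):
#     print(word, round(score, 3))
```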
🧠 Zero-shot classification
python
from transformers import pipeline
classifier = pipeline("zero-shot-classification")
classifier(
    "This tutorial is about machine learning and natural language processing.",
    candidate_labels=["education", "sports", "politics"]
)
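The zero-shot result is a dict with `sequence`, `labels`, and `scores`, where labels are ordered from most to least likely. The sketch below turns that into filtered pairs; the helper name `ranked_labels`, the threshold, and the sample scores are illustrative placeholders rather than real model output.

```python
def ranked_labels(result, threshold=0.0):
    """Turn a zero-shot result into (label, score) pairs at or above a threshold."""
    pairs = zip(result["labels"], result["scores"])
    return [(label, score) for label, score in pairs if score >= threshold]


# Placeholder result in the pipeline's output shape; real scores come from classifier(...).
sample_result = {
    "sequence": "This tutorial is about machine learning and natural language processing.",
    "labels": ["education", "politics", "sports"],
    "scores": [0.9, 0.06, 0.04],
}
print(ranked_labels(sample_result, threshold=0.05))
```

When several labels can apply at once, passing `multi_label=True` to the pipeline call scores each label independently instead of normalizing across them.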
🔁 T5 model (Text2Text)
python
from transformers import pipeline
text2text = pipeline("text2text-generation")
# T5 selects its task from a lowercase text prefix
text2text("translate English to French: How are you today?")
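T5 picks its task from a text prefix at the start of the input, and the original checkpoints were trained with lowercase prefixes such as `translate English to French: ` and `summarize: `. A minimal sketch for building such prompts consistently follows; the dictionary keys and the helper name `t5_prompt` are assumptions for this example.

```python
# Task prefixes used by the original T5 checkpoints (keys are illustrative shorthand).
T5_PREFIXES = {
    "en_to_fr": "translate English to French: ",
    "en_to_de": "translate English to German: ",
    "summarize": "summarize: ",
}


def t5_prompt(task, text):
    """Prepend the T5 task prefix for the given task to the input text."""
    return T5_PREFIXES[task] + text.strip()


print(t5_prompt("en_to_fr", "How are you today?"))
# Usage, assuming `text2text` is the pipeline created above:
# print(text2text(t5_prompt("en_to_fr", "How are you today?"))[0]["generated_text"])
```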
✂️ Summarization
python
from transformers import pipeline
summarizer = pipeline("summarization")
summarizer(
    "Machine learning is a field of artificial intelligence that focuses on enabling machines to learn from data..."
)
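Summary length can be bounded with `max_length` and `min_length`, both counted in tokens, and the text comes back under `summary_text`. The wrapper below is a sketch with hypothetical defaults; `summarize` is not a library function.

```python
def summarize(summarizer, text, max_length=60, min_length=10):
    """Run a summarization pipeline with explicit length bounds and return the text."""
    outputs = summarizer(
        text,
        max_length=max_length,  # upper bound on summary tokens
        min_length=min_length,  # lower bound on summary tokens
        do_sample=False,        # deterministic decoding
    )
    return outputs[0]["summary_text"]


# Usage, assuming `summarizer` is the pipeline created above and `article` is a longer text:
# print(summarize(summarizer, article, max_length=40))
```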
🧪 Using your own model (BERT as an example)
python
from transformers import BertTokenizer, BertForSequenceClassification
from transformers import pipeline
# A BERT checkpoint fine-tuned to rate reviews from 1 to 5 stars
model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
# Load the tokenizer and the classification model explicitly
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)
# Pass both objects to the pipeline instead of relying on the task default
pipe = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
pipe("I had an amazing experience with this product!")
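This checkpoint reports its prediction as a star label such as `1 star` or `5 stars` rather than POSITIVE/NEGATIVE. A short sketch for converting that label into a number is shown below; the helper names `stars_from_label` and `rate` are hypothetical.

```python
def stars_from_label(label):
    """Convert a label such as '4 stars' or '1 star' into an integer rating."""
    return int(label.split()[0])


def rate(pipe, text):
    """Return (stars, score) for one text using a star-rating sentiment pipeline."""
    result = pipe(text)[0]
    return stars_from_label(result["label"]), result["score"]


print(stars_from_label("5 stars"))
# Usage, assuming `pipe` is the pipeline created above:
# stars, score = rate(pipe, "I had an amazing experience with this product!")
```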