An example of constructing prompts with Self-Instruct

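1. Write a batch of prompts by hand to serve as seeds. (Starting with a small seed set of human-written tasks)

2. In each round, put some seeds plus previously generated prompts into the input as few-shot examples, and use the LLM to generate more prompts. (Using the LLM to generate new instructions based on the seed tasks)

3. Filter out the prompts whose quality is too poor, and fix the ones that are salvageable. (Filtering and refining the generated instructions)

4. Feed every generated prompt into the LLM to obtain its output. (Creating input-output instances for the new instructions)

5. Use each Input+Output pair as a training sample for the LLM. (Using the generated dataset to fine-tune the LLM)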
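The five steps above can be sketched as one small loop. This is only an outline under stated assumptions: `run_self_instruct`, `to_jsonl`, the `llm` callable (prompt text in, completion text out) and the `keep` callable (candidate and pool in, True/False out) are hypothetical names introduced for illustration, and the "Task:" prompt layout and the `input`/`output` record keys are arbitrary choices.

```python
import json
import random
from typing import Callable, Dict, List, Optional


def run_self_instruct(
    seed_tasks: List[str],
    llm: Callable[[str], str],               # hypothetical: prompt -> completion
    keep: Callable[[str, List[str]], bool],  # hypothetical: (candidate, pool) -> keep?
    rounds: int = 3,
    shots: int = 2,
    rng: Optional[random.Random] = None,
) -> List[Dict[str, str]]:
    rng = rng or random.Random(0)
    pool = list(seed_tasks)        # step 1: human-written seeds
    generated: List[str] = []

    for _ in range(rounds):
        # Step 2: sample few-shot examples from seeds + earlier generations
        examples = rng.sample(pool, min(shots, len(pool)))
        prompt = "\n".join(f"Task: {t}" for t in examples) + "\nTask:"
        candidate = llm(prompt).strip()

        # Step 3: drop candidates that fail the filter
        if candidate and keep(candidate, pool):
            pool.append(candidate)
            generated.append(candidate)

    # Steps 4 and 5: answer each new instruction and pair it with its output
    return [{"input": t, "output": llm(t).strip()} for t in generated]


def to_jsonl(records: List[Dict[str, str]]) -> str:
    # One JSON object per line, a common layout for fine-tuning data
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

Passing the model in as a plain callable keeps the loop independent of any particular library, so the same outline works with a local model or a hosted API.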

Step 2, generation with the LLM:

import random
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load a pre-trained language model
model_name = "bigcode/starcoderbase-1b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Seed tasks (simplified for demonstration)
seed_tasks = [
    "Write a function to calculate the factorial of a number.",
    "Create a class to represent a bank account.",
    "Implement a binary search algorithm."
]

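def generate_instruction(prompt):
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=50)
    # generate() returns the prompt followed by the continuation,
    # so decode only the newly generated tokens
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    text = tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
    # Keep only the first line so that one call yields one task
    return text.splitlines()[0].strip() if text else ""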

def self_instruct(num_iterations):
    generated_tasks = []
    
    for _ in range(num_iterations):
        # Sample existing tasks
        sampled_tasks = random.sample(seed_tasks + generated_tasks, min(3, len(seed_tasks) + len(generated_tasks)))
        
        # Create a prompt for generating new instructions
        prompt = "Generate a new programming task based on these examples:\n\n"
        prompt += "\n".join(sampled_tasks)
        prompt += "\n\nNew task:"
        
        # Generate a new instruction
        new_task = generate_instruction(prompt)
        
        # In practice, you would filter and refine the generated task here
        
        generated_tasks.append(new_task)
    
    return generated_tasks

# Run Self-Instruct
new_tasks = self_instruct(5)
for i, task in enumerate(new_tasks, 1):
    print(f"Task {i}: {task}")

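Step 3, filtering:

Define some rules by hand to filter out instructions that are too poor. (An LLM can also be used as the judge.)

Goal: ensure both quality and diversity.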

Filter out instructions that are too short or too long

Filter out instructions containing keywords unsuitable for language models (e.g. "image", "graph", "file", "plot")

Filter out instructions starting with punctuation

Filter out instructions starting with non-English characters

Filter out instructions that have high ROUGE-L similarity (above 0.7) with any existing instruction in the task pool
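A minimal sketch of these rules in Python follows. Several details are assumptions rather than part of the list above: the word-count bounds `min_words` and `max_words`, the exact-word matching of keywords, and the helper names `lcs_length`, `rouge_l` and `keep_instruction`. ROUGE-L is computed here as an F1 score over lowercase whitespace tokens using a plain longest-common-subsequence; a dedicated ROUGE library would normally add proper tokenization and stemming.

```python
import re
import string
from typing import List

BLOCKED_KEYWORDS = {"image", "graph", "file", "plot"}  # keywords from the rule list


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate: str, reference: str) -> float:
    """Simplified ROUGE-L F1 over lowercase whitespace tokens."""
    c, r = candidate.lower().split(), reference.lower().split()
    if not c or not r:
        return 0.0
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)


def keep_instruction(
    text: str,
    pool: List[str],
    min_words: int = 3,       # assumed lower bound
    max_words: int = 150,     # assumed upper bound
    max_similarity: float = 0.7,
) -> bool:
    text = text.strip()
    if not text:
        return False
    # Too short or too long
    if not (min_words <= len(text.split()) <= max_words):
        return False
    # Contains a keyword unsuitable for a text-only model (exact word match)
    if BLOCKED_KEYWORDS & set(re.findall(r"[a-z]+", text.lower())):
        return False
    # Starts with punctuation or with a non-ASCII (non-English) character
    if text[0] in string.punctuation or not text[0].isascii():
        return False
    # Too similar to an instruction already in the task pool
    return all(rouge_l(text, existing) <= max_similarity for existing in pool)
```

A function with this signature can be called at the point in the generation loop where a new task is about to be added to the pool, passing the seeds plus everything generated so far as `pool`.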