OpenAI opens gpt-3.5-turbo fine-tuning: a test tutorial

Introduction to OpenAI fine-tuning
Where to find OpenAI fine-tuning

URL: https://platform.openai.com/finetune

Preparing a dataset in JSONL format
  • Use the Chinese-medical-dialogue-data dataset
  • Download it with git clone

git clone https://github.com/Toyhom/Chinese-medical-dialogue-data

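  • Select part of the cardiology data (department value 心血管科) for fine-tuning

    Fine-tuning is a paid service: the more tokens in the training data, the higher the cost. In addition, gpt-3.5-turbo accepts at most 4096 tokens per training example.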
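Because the cost grows with the token count and each example is capped at 4096 tokens, it can help to estimate token usage before uploading anything. The following is a minimal sketch, not OpenAI's billing formula. `estimate_tokens` and `split_by_limit` are hypothetical helper names, the tokenizer is passed in as a plain function, and the count ignores the few formatting tokens added per message, so it is only a lower bound. If the tiktoken package is installed, `tiktoken.encoding_for_model("gpt-3.5-turbo").encode` would be a suitable `encode` argument.

```python
TOKEN_LIMIT = 4096  # per-example limit quoted above for gpt-3.5-turbo


def estimate_tokens(messages, encode):
    """Sum the token counts of every message's content (lower-bound estimate)."""
    return sum(len(encode(m["content"])) for m in messages)


def split_by_limit(conversations, encode, limit=TOKEN_LIMIT):
    """Separate conversations into those within the limit and those above it."""
    ok, too_long = [], []
    for conv in conversations:
        if estimate_tokens(conv, encode) <= limit:
            ok.append(conv)
        else:
            too_long.append(conv)
    return ok, too_long


def whitespace_encode(text):
    # Stand-in tokenizer for demonstration only; real token counts will differ
    return text.split()


# Placeholder conversation, not taken from the dataset
demo = [
    {"role": "system", "content": "You are a cardiologist."},
    {"role": "user", "content": "What causes high blood pressure?"},
]
demo_total = estimate_tokens(demo, whitespace_encode)
```

Multiplying the total over all training examples by the current per-token training price (and by the number of epochs) gives a rough idea of the bill.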

  • Load the CSV file into a DataFrame

import pandas as pd

# The CSV is read with GBK encoding
df = pd.read_csv('Chinese-medical-dialogue-data/样例_内科5000-6000.csv', encoding='gbk')

df
  • Extract the samples

# Keep only the cardiology rows, then take 50 for training and the next 20 for validation
cardio = df[df['department'] == '心血管科']
train_data = cardio.iloc[0:50, :]
valid_data = cardio.iloc[50:70, :]

train_data
  • Build the data in JSONL format

lis1 = []  # training conversations
lis2 = []  # validation conversations
sys_content = "You are a specialist in cardiovascular disease and you will apply your expertise to give your specialized answers to patients."

# Each conversation is the system prompt, the patient's question, and the doctor's answer
for index, row in train_data.iterrows():
    each = [
        {"role": "system", "content": sys_content},
        {"role": "user", "content": row['ask']},
        {"role": "assistant", "content": row['answer']},
    ]
    lis1.append(each)

for index, row in valid_data.iterrows():
    each = [
        {"role": "system", "content": sys_content},
        {"role": "user", "content": row['ask']},
        {"role": "assistant", "content": row['answer']},
    ]
    lis2.append(each)

lis1
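The two loops above do the same work, and the chat fine-tuning format expects every example to be a JSON object with a top-level `messages` key. One possible refactoring is sketched below. `build_examples` is a hypothetical helper, not part of the original post, and it assumes each row exposes `ask` and `answer` keys, for instance the dictionaries returned by `train_data.to_dict('records')`.

```python
def build_examples(rows, system_prompt):
    """Turn question/answer rows into chat fine-tuning examples."""
    examples = []
    for row in rows:
        examples.append({
            "messages": [
                {"role": "system", "content": system_prompt},
                # str() guards against non-string cells such as NaN
                {"role": "user", "content": str(row["ask"])},
                {"role": "assistant", "content": str(row["answer"])},
            ]
        })
    return examples


# Placeholder rows standing in for train_data.to_dict('records')
sample_rows = [
    {"ask": "example question 1", "answer": "example answer 1"},
    {"ask": "example question 2", "answer": "example answer 2"},
]
examples = build_examples(sample_rows, "You are a cardiologist.")
```

With this helper, the training and validation sets come from two calls to the same function instead of two copies of the loop.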
  • Export the JSONL data
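Exporting amounts to writing one JSON object per line. A minimal sketch using only the standard library follows. The function name `write_jsonl` and the file names are assumptions, not taken from the original post. Each conversation is wrapped as `{"messages": [...]}` if it is still a bare list of messages, and `ensure_ascii=False` keeps the Chinese text readable in the file.

```python
import json
import os
import tempfile


def write_jsonl(conversations, path):
    """Write one JSON object per line and return the number of lines written."""
    count = 0
    with open(path, "w", encoding="utf-8") as f:
        for conv in conversations:
            # Accept either a bare list of messages or an already wrapped record
            record = conv if isinstance(conv, dict) else {"messages": list(conv)}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count


# Demonstration with one placeholder conversation in a temporary directory
demo_conversations = [[
    {"role": "system", "content": "You are a cardiologist."},
    {"role": "user", "content": "示例问题"},
    {"role": "assistant", "content": "示例回答"},
]]
demo_path = os.path.join(tempfile.mkdtemp(), "train.jsonl")
written = write_jsonl(demo_conversations, demo_path)
```

In the notebook this would be called as `write_jsonl(lis1, 'train.jsonl')` and `write_jsonl(lis2, 'valid.jsonl')`.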
Click to upload the files
  • Upload the files (I ran out of funds at this point)
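Since uploading and training cost money, a local format check before uploading can be worthwhile. The sketch below is an illustrative helper, not an official validator. `check_jsonl_lines` and `check_jsonl` are hypothetical names, and the checks are limited to the structure used in this tutorial: each line must parse as JSON, carry a non-empty `messages` list, use only the roles `system`, `user`, and `assistant` with string content, and include at least one assistant message.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}  # roles used in this tutorial


def check_jsonl_lines(lines):
    """Return (line_number, problem) tuples; an empty list means nothing was flagged."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append((number, "empty line"))
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "not valid JSON"))
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append((number, "missing or empty 'messages' list"))
            continue
        for m in messages:
            if (not isinstance(m, dict)
                    or m.get("role") not in ALLOWED_ROLES
                    or not isinstance(m.get("content"), str)):
                problems.append((number, "message needs a known role and string content"))
                break
        else:
            # Reached only when every message passed the per-message check
            if not any(m["role"] == "assistant" for m in messages):
                problems.append((number, "no assistant message"))
    return problems


def check_jsonl(path):
    """Run the same checks on a file on disk."""
    with open(path, encoding="utf-8") as f:
        return check_jsonl_lines(f)


# Demonstration on three placeholder lines: one valid, two broken
good = json.dumps({"messages": [{"role": "user", "content": "hi"},
                                {"role": "assistant", "content": "hello"}]})
bad = json.dumps([{"role": "user", "content": "hi"}])  # bare list, no "messages" key
demo_problems = check_jsonl_lines([good, bad, "{oops"])
```

Running `check_jsonl` on the exported training and validation files and fixing anything it reports avoids spending an upload on a malformed file.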