A Test-Drive Tutorial for OpenAI's Newly Opened gpt-3.5-turbo Fine-Tuning

Introduction to OpenAI fine-tuning
The OpenAI fine-tuning page

URL: https://platform.openai.com/finetune

Preparing the dataset in JSONL format
  • Use the Chinese-medical-dialogue-data dataset
  • Download it with git clone

git clone https://github.com/Toyhom/Chinese-medical-dialogue-data

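  • Select a subset of the cardiovascular-department data for fine-tuning

    Fine-tuning is billed by usage: the more tokens, the higher the cost. gpt-3.5-turbo also caps the token count at 4096.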
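
Because the bill grows with the token count, it can help to estimate it before uploading anything. The sketch below assumes training is billed as tokens in the training file × number of epochs × a price per 1K tokens. The price and epoch count used here are placeholder values, not quoted rates, and the per-example token counts are assumed to come from a tokenizer (for example tiktoken), which is not shown.

```python
MAX_TOKENS_PER_EXAMPLE = 4096  # limit quoted above for gpt-3.5-turbo


def estimate_training_cost(token_counts, epochs, price_per_1k_tokens):
    """Return (billable_tokens, cost) for a list of per-example token counts.

    Assumption: cost = tokens in the file * epochs * price per 1K tokens.
    """
    billable_tokens = sum(token_counts) * epochs
    cost = billable_tokens / 1000 * price_per_1k_tokens
    return billable_tokens, cost


def oversized_examples(token_counts, limit=MAX_TOKENS_PER_EXAMPLE):
    """Return the indices of examples whose token count exceeds the limit."""
    return [i for i, n in enumerate(token_counts) if n > limit]


# Hypothetical numbers, purely for illustration
counts = [350, 420, 5000]
tokens, cost = estimate_training_cost(counts, epochs=3, price_per_1k_tokens=0.008)
print(tokens, round(cost, 4), oversized_examples(counts))
```

Keeping the price as a parameter means the same function still works when the rates change.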

  • Load the CSV file into a DataFrame

import pandas as pd

# Read the sample CSV using GBK encoding
df = pd.read_csv('Chinese-medical-dialogue-data/样例_内科5000-6000.csv', encoding='gbk')

df  # notebook cell output: preview the DataFrame
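  • Extract the samples

# Keep only cardiovascular-department rows ('心血管科'), then split them:
# the first 50 rows for training and the next 20 for validation
cardio = df[df['department']=='心血管科']
train_data = cardio.iloc[0:50,:]
valid_data = cardio.iloc[50:70,:]

train_data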
  • Build the JSONL-format data

# System prompt placed at the start of every conversation
sys_content = "You are a specialist in cardiovascular disease and you will apply your expertise to give your specialized answers to patients."

def build_messages(frame):
    """Turn each (ask, answer) row into a system/user/assistant message list."""
    conversations = []
    for index, row in frame.iterrows():
        conversations.append([
            {"role": "system", "content": sys_content},
            {"role": "user", "content": row['ask']},
            {"role": "assistant", "content": row['answer']},
        ])
    return conversations

lis1 = build_messages(train_data)  # training conversations
lis2 = build_messages(valid_data)  # validation conversations

lis1
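
Before exporting, a quick structural check can catch rows that would make a poor training example. This is a minimal sketch, assuming each example is a list of role/content dicts like the ones built above. `check_conversation` is a hypothetical helper, not part of any OpenAI library.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def check_conversation(messages):
    """Return a list of problems found in one conversation (empty if none)."""
    problems = []
    for i, msg in enumerate(messages):
        if msg.get("role") not in ALLOWED_ROLES:
            problems.append(f"message {i}: unexpected role {msg.get('role')!r}")
        content = msg.get("content")
        # Missing CSV cells arrive from pandas as NaN (a float), not a string
        if not isinstance(content, str) or not content.strip():
            problems.append(f"message {i}: content is missing or empty")
    if not any(m.get("role") == "assistant" for m in messages):
        problems.append("no assistant message to learn from")
    return problems


# Tiny hand-written examples standing in for the entries of lis1
good = [
    {"role": "system", "content": "You are a specialist."},
    {"role": "user", "content": "question"},
    {"role": "assistant", "content": "answer"},
]
bad = [
    {"role": "user", "content": "question"},
    {"role": "assistant", "content": float("nan")},
]
print(check_conversation(good))
print(check_conversation(bad))
```

Running it over every entry of `lis1` and `lis2` and dropping the conversations that report problems avoids paying to train on broken rows.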
  • Export the JSONL data
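
Writing the conversations to disk can be sketched as follows. It assumes the chat fine-tuning format, in which every line of the file is one JSON object with a `messages` key. The helper name `write_jsonl`, the file name, and the placeholder conversation are illustrative, not taken from the original post.

```python
import json
import os
import tempfile


def write_jsonl(conversations, path):
    """Write one {"messages": [...]} JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for messages in conversations:
            # ensure_ascii=False keeps the Chinese text readable in the file
            f.write(json.dumps({"messages": messages}, ensure_ascii=False) + "\n")


# Placeholder conversation standing in for the entries of lis1
sample = [[
    {"role": "system", "content": "You are a specialist in cardiovascular disease."},
    {"role": "user", "content": "示例问题"},
    {"role": "assistant", "content": "示例回答"},
]]

# The demo writes into a temporary directory so it leaves no stray files
out_path = os.path.join(tempfile.mkdtemp(), "train.jsonl")
write_jsonl(sample, out_path)

with open(out_path, encoding="utf-8") as f:
    print(f.read())
```

For the data in this walkthrough, the calls would be along the lines of `write_jsonl(lis1, "train.jsonl")` and `write_jsonl(lis2, "valid.jsonl")`.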
Click to upload the file
  • Upload the file (I had run out of credit)
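
The upload can also be done from code instead of the web page. The sketch below makes several assumptions: it uses the `openai` Python package (v1-style client) with an API key in the `OPENAI_API_KEY` environment variable, and the function name `start_fine_tune` and the file names are placeholders. Creating a job is billed, so the real call is left commented out.

```python
def start_fine_tune(client, train_path, valid_path, model="gpt-3.5-turbo"):
    """Upload both JSONL files and create a fine-tuning job; return the job."""
    with open(train_path, "rb") as f:
        train_file = client.files.create(file=f, purpose="fine-tune")
    with open(valid_path, "rb") as f:
        valid_file = client.files.create(file=f, purpose="fine-tune")
    return client.fine_tuning.jobs.create(
        training_file=train_file.id,
        validation_file=valid_file.id,
        model=model,
    )


# Usage (not run here: it needs network access and a funded account):
# from openai import OpenAI
# client = OpenAI()  # reads OPENAI_API_KEY from the environment
# job = start_fine_tune(client, "train.jsonl", "valid.jsonl")
# print(job.id, job.status)
```

Passing the client in as an argument keeps the function easy to exercise with a stub object before any money is spent.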