OpenAI开放gpt-3.5turbo微调fine-tuning测试教程

文章目录

openai微调 fine-tuning介绍
openai微调地址

网址:https://platform.openai.com/finetune

jsonl格式数据集准备
  • 使用Chinese-medical-dialogue-data数据集
  • git clone进行下载

git clone https://github.com/Toyhom/Chinese-medical-dialogue-data

  • 选择其中心血管科中的部分数据进行微调

    微调需要进行付费,token越多收费越多,并且gpt-3.5-turbotoken数最多为4096

  • dataframe导入csv文件

python 复制代码
import pandas as pd

df = pd.read_csv('Chinese-medical-dialogue-data/样例_内科5000-6000.csv',encoding='gbk')

df
  • 提取样本
python 复制代码
train_data = df[df['department']=='心血管科'].iloc[0:50,:]
valid_data = df[df['department']=='心血管科'].iloc[50:70,:]

train_data
  • jsonl格式数据构建
python 复制代码
lis1 = []
lis2 = []
sys_content = "You are a specialist in cardiovascular disease and you will apply your expertise to give your specialized answers to patients."

for index,row in train_data.iterrows():
    each = []
    each.append({"role":"system","content":sys_content})
    each.append({"role":"user","content":row['ask']})
    each.append({"role":"assistant","content":row['answer']})
    #print(each)
    lis1.append(each)

for index,row in valid_data.iterrows():
    each = []
    each.append({"role":"system","content":sys_content})
    each.append({"role":"user","content":row['ask']})
    each.append({"role":"assistant","content":row['answer']})
    #print(each)
    lis2.append(each)

lis1
  • jsonl数据导出
python 复制代码
lis1 = []
lis2 = []
sys_content = "You are a specialist in cardiovascular disease and you will apply your expertise to give your specialized answers to patients."

for index,row in train_data.iterrows():
    each = []
    each.append({"role":"system","content":sys_content})
    each.append({"role":"user","content":row['ask']})
    each.append({"role":"assistant","content":row['answer']})
    #print(each)
    lis1.append(each)

for index,row in valid_data.iterrows():
    each = []
    each.append({"role":"system","content":sys_content})
    each.append({"role":"user","content":row['ask']})
    each.append({"role":"assistant","content":row['answer']})
    #print(each)
    lis2.append(each)

lis1
点击上传文件
  • 上传文件(钱不够了)
相关推荐
ANNENBERG27 分钟前
git分支开发管理
git
坤坤藤椒牛肉面33 分钟前
GIT的使用
git
w32963627138 分钟前
使用 OpenCode 在 Windows 上加速安装 Playwright 的完整指南
windows·git
宝桥南山41 分钟前
GitHub Copilot - 尝试使用一下Azure Devops MCP server
microsoft·微软·github·aigc·copilot·devops
MicrosoftReactor42 分钟前
技术速递|提升 GitHub Agentic Workflows 的 Token 使用效率
ai·github·copilot·智能体
DogDaoDao1 小时前
【GitHub】last30days-skill 深度技术解析
深度学习·程序员·大模型·github·ai agent·agent skill
IT WorryFree1 小时前
GitHub / Gitee / Gitea / GitLab 四平台完整对比(定位、优缺点、适用场景)
gitee·github·gitea
Dontla1 小时前
Github Personal Access Token(个人访问令牌)添加workflow scope(更新GitHub Actions工作流文件必须)
github
難釋懷1 小时前
Nginx对上游服务器使用keepalive
服务器·nginx·github
Lethehong1 小时前
去芜存菁:NextChat 本地部署与物流“数字客服”的优雅落地
ai·github·蓝耘·蓝耘元生代