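Video walkthrough 1: Bilibili video walkthrough
Video walkthrough 2: https://www.douyin.com/video/7651157553740598554
Source code: https://github.com/KeepTryingTo/MindSpore-Framework/tree/main
Official documentation: MindSpore Overall Architecture | MindSpore 1.2 Documentation | MindSpore Community
Installation guide: MindSpore Installation | MindSpore Community
MindYOLO: Home - MindYOLO Docs
MindSpore Transformers: Pretraining Practice | MindSpore Transformers 1.9.0 Documentation | MindSpore Community
MindNLP: MindNLP Docs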

Contents
Differences between mindspore.nn and mindspore.mint.nn
Background
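MindSpore is an AI computing framework developed by Huawei and designed for device-edge-cloud scenarios, with the goal of closing the gap between AI algorithm research and production deployment. Its core design principles are ease of development, efficient execution, and coverage of all scenarios, and it supports heterogeneous compute including CPU, GPU, and Ascend NPU.
On the programming side, MindSpore offers a unified dynamic/static programming paradigm: changing a single line of code switches between dynamic graph mode, which is easy to debug, and static graph mode, which performs better, so neither development efficiency nor execution performance has to be sacrificed. It uses functional automatic differentiation based on source-code transformation, which lets developers focus on expressing the model mathematically, and it natively supports programming that combines AI with scientific computing.
On the architecture side, a unified intermediate representation shared by device and cloud (MindIR) decouples models from the underlying hardware, so a model can be trained once and deployed many times. The built-in automatic parallelism analyzes the model and selects a sharding strategy, which greatly lowers the barrier to distributed training of models at the hundred-billion-parameter scale. MindSpore also provides APIs ranging from low-level tensor operations to high-level model management, together with the MindSpore Insight visualization and tuning tool and the MindSpore Armour security library for enterprise data privacy and model security. Together these give computer vision, natural language processing, and large-model work a simple, efficient, and trustworthy foundation.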
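Comparison of frameworks
| Dimension | PyTorch | TensorFlow | MindSpore | PaddlePaddle |
|---|---|---|---|---|
| Origin | Developed by Meta AI (Facebook) | Developed by the Google Brain team | Developed and open-sourced by Huawei | Developed and open-sourced by Baidu |
| Core positioning | First choice for academic research; flexible and easy to use | Mature industrial deployment; complete end-to-end engineering stack | Huawei ecosystem, Ascend support, device-edge-cloud collaboration | Grown out of industrial practice; industrial-grade deep learning platform |
| Execution mode | Dynamic graph first (define-by-run) | Eager execution by default (dynamic graph) | Unified dynamic and static graphs (the same code switches between them) | Imperative (dynamic graph) by default; one line of code converts to a static graph |
| Hardware support | Mainly GPU (mature CUDA ecosystem) | Deep GPU / TPU integration | Best on Ascend NPU; also runs on GPU/CPU | General-purpose hardware such as CPU, GPU, and Raspberry Pi |
| Distributed training | Semi-automatic configuration (DDP/FSDP data parallelism) | Multiple distribution strategies (tf.distribute abstraction) | Fully automatic parallel strategy (automatic model sharding) | Fleet API; about ten lines of code to go distributed |
| Deployment | Requires conversion (TorchScript / ONNX) | Native support (TF Serving / TFLite) | Native device-edge-cloud support (unified MindIR format) | Rich toolkits (FastDeploy / Paddle Lite / Paddle.js) |
| Ecosystem and tooling | Most active community; default for paper code; richest ecosystem | Complete industrial deployment toolchain (TFX, etc.) | MindInsight tuning, MindArmour security | PaddleX low-code tool, VisualDL visualization |
| Typical use | Academic experiments, rapid prototyping, frontier algorithm research | Large-scale production in established industry, mobile deployment | Huawei Ascend hardware ecosystem, integrated device-edge-cloud deployment | Fast enterprise POC validation, industrial algorithm development and rollout |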
Hands-on
(1) Import libraries
python
import os
import numpy as np
import mindspore
from mindspore import nn, context
from mindspore.dataset import ImageFolderDataset
import mindspore.dataset.vision as vision
import mindspore.dataset.transforms as transforms
# Accuracy is the evaluation metric passed to Model in step (4)
from mindspore.train import Model, LossMonitor, Accuracy
from mindspore.train import CheckpointConfig, EarlyStopping, ModelCheckpoint
from mindvision.classification.models import mobilenet_v2
from mindvision.dataset import DownLoad
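Set up the runtime environment
python
# Set the execution mode (static graph) and the target device
context.set_context(mode=context.GRAPH_MODE, device_target="CPU")
(2) Load data
Loading with the built-in dataset classes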
python
def create_dataset(data_path, batch_size=32, is_train=True):
    # Expects data_path/train and data_path/test, each with one sub-folder per class
    data_path = os.path.join(data_path, 'train' if is_train else 'test')
    dataset = ImageFolderDataset(data_path, shuffle=is_train)
    # ImageNet mean/std, applied after the pixels are rescaled to [0, 1]
    mean = [0.485, 0.456, 0.406]
    std = [0.229, 0.224, 0.225]
    if is_train:
        transform_list = [
            vision.RandomCropDecodeResize(size=224, scale=(0.8, 1.0)),
            vision.RandomHorizontalFlip(prob=0.5),
            vision.Rescale(1.0 / 255.0, 0.0),
            vision.Normalize(mean=mean, std=std),
            vision.HWC2CHW()
        ]
    else:
        transform_list = [
            vision.Decode(),
            # An int size only scales the shorter edge, so crop to a fixed 224x224
            # afterwards; otherwise images of different shapes cannot be batched
            vision.Resize(256),
            vision.CenterCrop(224),
            vision.Rescale(1.0 / 255.0, 0.0),
            vision.Normalize(mean=mean, std=std),
            vision.HWC2CHW()
        ]
    dataset = dataset.map(operations=transform_list, input_columns="image")
    # The sparse cross-entropy loss expects int32 labels
    dataset = dataset.map(operations=transforms.TypeCast(mindspore.int32), input_columns="label")
    dataset = dataset.batch(batch_size, drop_remainder=True)
    return dataset
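The Rescale and Normalize steps above are plain per-channel arithmetic, and drop_remainder decides how many batches one epoch yields. The following framework-free sketch works through both. The pixel value and the sample count are made-up illustrations, and `normalize_pixel` and `batches_per_epoch` are hypothetical helper names, not MindSpore APIs.

```python
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Apply Rescale(1/255, 0) and then Normalize(mean, std) to one RGB pixel."""
    rescaled = [v * (1.0 / 255.0) + 0.0 for v in rgb]
    return [(v - m) / s for v, m, s in zip(rescaled, MEAN, STD)]


def batches_per_epoch(num_samples, batch_size, drop_remainder=True):
    """Number of batches in one pass over the data."""
    full, rest = divmod(num_samples, batch_size)
    return full if drop_remainder or rest == 0 else full + 1


pixel = normalize_pixel([255, 0, 128])  # illustrative pixel value
print([round(c, 4) for c in pixel])
print(batches_per_epoch(100, 32, drop_remainder=True))
print(batches_per_epoch(100, 32, drop_remainder=False))
```

With drop_remainder=True the trailing partial batch is discarded, which keeps every batch the same shape but silently skips a few samples during evaluation.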
Loading with a custom dataset
python
import numpy as np
import mindspore.dataset as ds
class MyDataset:
    """Random-access source: GeneratorDataset only needs __getitem__ and __len__."""
    def __init__(self):
        self.data = np.ones((5, 2))
        self.label = np.zeros((5, 1))
    def __getitem__(self, index):
        return self.data[index], self.label[index]
    def __len__(self):
        return len(self.data)
my_dataset = MyDataset()
dataset = ds.GeneratorDataset(source=my_dataset, column_names=["data", "label"], shuffle=True)
dataset = dataset.batch(batch_size=2)
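To make the random-access protocol concrete, here is a minimal pure-Python sketch of what a loader conceptually does with such a source: visit each index once, optionally in shuffled order, and group the samples into batches. `ToySource` and `iter_batches` are hypothetical names used only for illustration; this is not how GeneratorDataset is implemented internally.

```python
import random


class ToySource:
    """Random-access source exposing only __getitem__ and __len__."""

    def __init__(self, n):
        self.data = [[1.0, 1.0] for _ in range(n)]
        self.label = [[0.0] for _ in range(n)]

    def __getitem__(self, index):
        return self.data[index], self.label[index]

    def __len__(self):
        return len(self.data)


def iter_batches(source, batch_size, shuffle=True, seed=0):
    """Visit every index once, then group the samples into batches."""
    indices = list(range(len(source)))
    if shuffle:
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        samples = [source[i] for i in indices[start:start + batch_size]]
        data = [sample[0] for sample in samples]
        label = [sample[1] for sample in samples]
        yield data, label


batches = list(iter_batches(ToySource(5), batch_size=2))
print([len(data) for data, _ in batches])  # → [2, 2, 1]
```

Five samples with a batch size of 2 leave a final batch of one sample, which is what happens when the remainder is not dropped.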
(3) Load the model
Loading method 1:
python
def filter_ckpt_parameter(origin_dict, param_filter):
    """Remove every checkpoint entry whose name contains one of the filter strings."""
    for key in list(origin_dict.keys()):
        for name in param_filter:
            if name in key:
                print(f"Delete parameter from checkpoint: {key}")
                del origin_dict[key]
                break
class myMobileNet(nn.Cell):
    def __init__(self, num_classes=5, pretrained=True, img_size=(224, 224)):
        super().__init__()
        self.num_classes = num_classes
        self.pretrained = pretrained
        self.img_size = img_size
        self.backbone = mobilenet_v2(num_classes=self.num_classes, resize=self.img_size[0])
        if self.pretrained:
            models_url = "https://download.mindspore.cn/vision/classification/mobilenet_v2_1.0_224.ckpt"
            # To download the file instead of using a local copy:
            # dl = DownLoad()
            # ckpt_path = dl.download_url(models_url)  # downloads into the current directory
            ckpt_path = r'/data/myProjects/myProjects/mindspore/classification/my_projects/checkpoints/mobilenet_v2_1.0_224.ckpt'
            if not os.path.exists(ckpt_path):
                raise FileNotFoundError(
                    f"Pretrained checkpoint not found at {ckpt_path}. "
                    f"Download it from {models_url} first.")
            print(f"Loading pretrained checkpoint from: {ckpt_path}")
            # Pretrained weights
            param_dict = mindspore.load_checkpoint(ckpt_path)
            # Drop the classifier weights: their shape depends on the number of classes
            filter_list = [x.name for x in self.backbone.head.classifier.get_parameters()]
            filter_ckpt_parameter(param_dict, filter_list)
            mindspore.load_param_into_net(self.backbone, param_dict)
    def construct(self, x):
        return self.backbone(x)
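The filtering step operates on an ordinary dictionary, so its effect can be checked without MindSpore. The sketch below applies the same substring rule to a stand-in checkpoint; the parameter names and shapes are hypothetical and chosen only for illustration.

```python
def filter_ckpt_parameter(origin_dict, param_filter):
    """Delete every entry whose key contains one of the filter strings."""
    for key in list(origin_dict.keys()):
        if any(name in key for name in param_filter):
            del origin_dict[key]


# Hypothetical names and shapes standing in for a loaded checkpoint
param_dict = {
    "backbone.features.0.weight": (32, 3, 3, 3),
    "backbone.features.0.bias": (32,),
    "head.classifier.weight": (1001, 1280),
    "head.classifier.bias": (1001,),
}
filter_ckpt_parameter(param_dict, ["head.classifier.weight", "head.classifier.bias"])
print(sorted(param_dict))
```

Because the match is a substring test, a short filter string can remove more entries than intended, so it is safer to pass full parameter names as the original code does.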
Loading method 2:
python
import mindcv
# create a dataset
dataset = mindcv.create_dataset('cifar10', download=True)
# create a model
network = mindcv.create_model('resnet50', pretrained=True)
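Differences between mindspore.nn and mindspore.mint.nn
mindspore.nn is MindSpore's native, standard interface.
mindspore.mint.nn is a high-compatibility interface introduced so that PyTorch users can migrate with little friction.
In other words, the two provide largely the same functionality.
The main difference is that mindspore.mint.nn uses the same interface names as PyTorch (for example, Linear instead of Dense).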
python
import mindspore
from mindspore import nn as mp_nn
from mindspore.mint import nn as mpm_nn
class NativeNet(mp_nn.Cell):
    def __init__(self):
        super().__init__()
        # Native interface: Conv2d and Dense
        self.conv = mp_nn.Conv2d(
            in_channels=3,
            out_channels=64,
            kernel_size=3,
            pad_mode='pad',  # MindSpore-native parameter name
            padding=1
        )
        self.relu = mp_nn.ReLU()
        # MindSpore calls the fully connected layer Dense
        self.fc = mp_nn.Dense(64 * 224 * 224, 10)
    def construct(self, x):
        x = self.conv(x)
        x = self.relu(x)
        x = x.view(x.shape[0], -1)  # flatten to (batch, features)
        x = self.fc(x)
        return x
class MintNet(mp_nn.Cell):  # mint.nn layers are composed inside a regular nn.Cell
    def __init__(self):
        super().__init__()
        # mint interface: maps to Ascend-optimized operators under the hood, with names aligned to PyTorch
        self.conv = mpm_nn.Conv2d(
            in_channels=3,
            out_channels=64,
            kernel_size=3,
            padding=1,
            bias=True,
            dtype=mindspore.float32  # PyTorch-style dtype argument; kept equal to the input and Linear dtype
        )
        # In PyTorch style, activations are often called through functional in the forward pass;
        # mint offers an interface very similar to torch.nn.functional
        self.relu = mpm_nn.ReLU()
        # PyTorch calls the fully connected layer Linear
        self.fc = mpm_nn.Linear(64 * 224 * 224, 10)
    def construct(self, x):
        x = self.conv(x)
        x = self.relu(x)
        x = x.view(x.shape[0], -1)  # flatten to (batch, features)
        x = self.fc(x)
        return x
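Both networks hard-code 64 * 224 * 224 as the input width of the fully connected layer. That number follows from the standard convolution output-size formula, assuming a 224x224 input image (an assumption, since the text does not state the input size). `conv2d_out_size` below is a hypothetical helper written for this check.

```python
def conv2d_out_size(size, kernel_size, padding=0, stride=1, dilation=1):
    """Spatial output size of a 2D convolution along one dimension."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


height = conv2d_out_size(224, kernel_size=3, padding=1)
width = conv2d_out_size(224, kernel_size=3, padding=1)
flat_features = 64 * height * width
fc_parameters = flat_features * 10 + 10  # weights plus biases of the 10-way layer

print(height, width, flat_features)  # → 224 224 3211264
print(fc_parameters)                 # → 32112650
```

A 3x3 kernel with padding 1 preserves the spatial size, so the flattened vector has 3,211,264 entries and the single fully connected layer already holds about 32 million parameters. This is fine for demonstrating the API, but a real model would pool before the classifier.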
(4) Set up the callbacks, optimizer, and loss function
python
# Build the datasets with create_dataset from step (2); data_dir is the
# flower_photos folder that the PyTorch comparison also uses
data_dir = "/data/myProjects/myDatasets/flower_photos"
train_dataset = create_dataset(data_dir, batch_size=32, is_train=True)
eval_dataset = create_dataset(data_dir, batch_size=32, is_train=False)
loss_fn = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
network = myMobileNet(num_classes=5, img_size=(224, 224), pretrained=True)
optimizer = nn.Adam(params=network.trainable_params(), learning_rate=0.001)
# Save a checkpoint after every epoch and keep at most 2 checkpoint files
config_ckpt = CheckpointConfig(
    save_checkpoint_steps=train_dataset.get_dataset_size(),  # number of steps in one epoch
    keep_checkpoint_max=2
)
ckpt_cb = ModelCheckpoint(
    prefix="mobilenet_v2_finetune",  # file-name prefix
    directory="./checkpoints",  # output folder
    config=config_ckpt
)
early_cb = EarlyStopping(
    monitor='loss',  # model.train() runs no evaluation, so only the training loss can be monitored
    patience=5,
    verbose=True
)
model = Model(
    network,
    loss_fn=loss_fn,
    optimizer=optimizer,
    metrics={"accuracy": Accuracy()}
)
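If you write the training and validation loop yourself, these callbacks are not needed; you control the details of training and validation directly.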
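For a hand-written loop, the patience rule behind the EarlyStopping callback is easy to reproduce. The sketch below is a framework-independent illustration of the idea: stop after a given number of consecutive epochs without improvement. `PatienceTracker` is a hypothetical helper and the loss values are made up; MindSpore's own callback may differ in details such as min_delta handling.

```python
class PatienceTracker:
    """Signal a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.wait = 0

    def update(self, value):
        """Record one monitored value; return True when training should stop."""
        if value < self.best - self.min_delta:
            self.best = value
            self.wait = 0
        else:
            self.wait += 1
        return self.wait >= self.patience


tracker = PatienceTracker(patience=2)
losses = [1.0, 0.8, 0.9, 0.85, 0.7]  # made-up monitored losses, one per epoch
stopped_at = None
for epoch, loss in enumerate(losses, start=1):
    if tracker.update(loss):
        stopped_at = epoch
        break

print(stopped_at)  # → 4
```

Training stops at epoch 4 even though epoch 5 would have improved, which is the usual trade-off of a small patience value.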
(5) Train and validate the model
Method 1
python
print("========== Start fine-tuning MobileNetV2 ==========")
lossmonitor = LossMonitor(per_print_times=5)
model.train(
    epoch=50,
    train_dataset=train_dataset,
    callbacks=[lossmonitor, ckpt_cb, early_cb]  # pass all callbacks in one list
)
print("========== Start evaluation ==========")
acc = model.eval(eval_dataset)
print(f"Evaluation result: {acc}")
Method 2
python
network = myMobileNet(num_classes=5, pretrained=True)
loss_fn = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
# nn.Adam has no momentum argument; beta1 plays that role
optimizer = nn.Adam(network.trainable_params(), learning_rate=0.001, beta1=0.9, beta2=0.999)
def forward_fn(data, label):
    logits = network(data)
    loss = loss_fn(logits, label)
    return loss
# value_and_grad returns a function that computes the loss and its gradients.
# grad_position=None: no gradients with respect to the inputs (data, label)
# weights=optimizer.parameters: differentiate with respect to the parameters being updated
grad_fn = mindspore.value_and_grad(
    fn=forward_fn,
    grad_position=None,
    weights=optimizer.parameters,
    has_aux=False
)
@mindspore.jit  # compile the step into a static graph
def train_step(data, label):
    loss, grads = grad_fn(data, label)
    optimizer(grads)  # apply the gradient update
    return loss
def eval_loop(dataset, network, loss_fn):
    network.set_train(False)  # switch Dropout/BatchNorm to inference behaviour
    total_loss = 0.0
    total_correct = 0
    total_samples = 0
    for data, label in dataset.create_tuple_iterator():
        pred = network(data)
        loss = loss_fn(pred, label)
        batch_size = data.shape[0]
        total_loss += float(loss.asnumpy()) * batch_size
        total_correct += int((pred.argmax(1) == label).asnumpy().sum())
        total_samples += batch_size
    avg_loss = total_loss / total_samples
    accuracy = total_correct / total_samples
    return avg_loss, accuracy
def load_checkpoint(network, optimizer, ckpt_path):
    # Network and optimizer parameters were saved into one flat dict (see save_checkpoint)
    checkpoint = mindspore.load_checkpoint(ckpt_path)
    mindspore.load_param_into_net(network, checkpoint)
    mindspore.load_param_into_net(optimizer, checkpoint)
    epoch = int(checkpoint["epoch"].asnumpy()) if "epoch" in checkpoint else -1
    print(f"Loaded model weights and optimizer state, last finished epoch index: {epoch}")
    return epoch + 1  # resume with the epoch after the saved one
def save_checkpoint(network, optimizer, epoch, output_dir):
    os.makedirs(output_dir, exist_ok=True)
    # append_dict only takes scalar, string, or Tensor values, so the optimizer
    # state is stored alongside the network parameters instead of as a nested dict
    merged = {**network.parameters_dict(), **optimizer.parameters_dict()}
    save_list = [{"name": name, "data": param} for name, param in merged.items()]
    mindspore.save_checkpoint(
        save_obj=save_list,
        ckpt_file_name=os.path.join(output_dir, f'checkpoint_{epoch}.ckpt'),
        append_dict={"epoch": epoch}
    )
print("========== Start custom training loop ==========")
epochs = 100
best_acc = 0.
best_loss = float("inf")
output_dir = r'./checkpoints'
resume_path = None
start_epoch = 0
steps_per_epoch = train_dataset.get_dataset_size()
if resume_path:
    start_epoch = load_checkpoint(network, optimizer, ckpt_path=resume_path)
for epoch in range(start_epoch, epochs):
    network.set_train(True)
    train_loss = 0.0  # reset every epoch so the reported average covers this epoch only
    for step, (data, label) in enumerate(train_dataset.create_tuple_iterator(), start=1):
        loss = float(train_step(data, label).asnumpy())
        train_loss += loss
        # LossMonitor expects the RunContext that Model.train provides, so log manually here
        if step % 50 == 0:
            print(f"epoch: {epoch + 1} step: {step}, loss is {loss}")
    val_loss, val_acc = eval_loop(eval_dataset, network, loss_fn)
    # Save whenever the validation loss or the validation accuracy improves
    if best_loss > val_loss or val_acc > best_acc:
        best_loss = min(best_loss, val_loss)
        best_acc = max(best_acc, val_acc)
        save_checkpoint(network, optimizer, epoch, output_dir)
    print(f"Epoch: {epoch + 1}, Train Loss: {(train_loss / steps_per_epoch):.4f}, "
          f"Val Loss: {val_loss:.4f}, Val Acc: {val_acc:.4f}")
print("========== Training finished ==========")
python
========== Start fine-tuning MobileNetV2 ==========
epoch: 1 step: 5, loss is 1.6564632654190063
epoch: 1 step: 10, loss is 1.2783548831939697
epoch: 1 step: 15, loss is 1.438764214515686
...
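The eval_loop above multiplies each batch loss by its batch size before dividing by the total sample count. A small sketch with made-up numbers shows why: when batches differ in size, the plain mean of batch means is biased towards small batches. `weighted_average` is a hypothetical helper name.

```python
def weighted_average(batch_losses, batch_sizes):
    """Per-sample average of losses that were each averaged over one batch."""
    total = sum(loss * size for loss, size in zip(batch_losses, batch_sizes))
    return total / sum(batch_sizes)


batch_losses = [0.5, 1.0]  # made-up mean loss of each batch
batch_sizes = [30, 10]     # the last batch is smaller

per_sample = weighted_average(batch_losses, batch_sizes)
mean_of_means = sum(batch_losses) / len(batch_losses)
print(per_sample)     # → 0.625
print(mean_of_means)  # → 0.75
```

When every batch has the same size, as with drop_remainder=True, the two averages coincide, which is why the training loop can simply divide the summed loss by the number of steps.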
Comparison with the PyTorch implementation
python
import os
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms
from torchvision.models import MobileNet_V2_Weights
# Select the device: use the GPU when CUDA is available, otherwise the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
# 1. Data preprocessing
def create_dataloaders(data_path, batch_size=32):
    # Training-set augmentation and preprocessing.
    # Normalize goes after ToTensor, because ToTensor converts [0, 255] to [0.0, 1.0]
    train_transform = transforms.Compose([
        transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
        transforms.RandomHorizontalFlip(p=0.5),
        transforms.ToTensor(),  # ToTensor also rescales (divides by 255)
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    # Validation-set preprocessing
    val_transform = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    # Load the datasets (expects data_path/train/ and data_path/test/)
    train_dataset = datasets.ImageFolder(root=os.path.join(data_path, 'train'),
                                         transform=train_transform)
    val_dataset = datasets.ImageFolder(root=os.path.join(data_path, 'test'),
                                       transform=val_transform)
    train_loader = DataLoader(
        train_dataset,
        batch_size=batch_size,
        shuffle=True,
        num_workers=8,
        drop_last=True
    )
    val_loader = DataLoader(
        val_dataset,
        batch_size=batch_size,
        shuffle=False,
        num_workers=8,
        drop_last=True
    )
    return train_loader, val_loader
data_dir = "/data/myProjects/myDatasets/flower_photos"
train_loader, val_loader = create_dataloaders(data_dir, batch_size=32)
def create_model(num_classes=5, pretrained=True):
    # Load MobileNetV2 weights pretrained on ImageNet
    weights = MobileNet_V2_Weights.IMAGENET1K_V1 if pretrained else None
    model = models.mobilenet_v2(weights=weights)
    # Replace the final classifier.
    # MobileNetV2 ends with a Sequential made of Dropout and Linear
    in_features = model.classifier[1].in_features
    model.classifier = nn.Sequential(
        nn.Dropout(p=0.2, inplace=True),
        nn.Linear(in_features, num_classes)
    )
    # With weights=..., the pretrained weights are loaded before the classifier is
    # replaced, so no filtering is needed here. When loading a local .pth file into
    # a model whose classifier has a different shape, load_state_dict raises a
    # size-mismatch error (strict=False only tolerates missing or unexpected keys),
    # so filter the classifier entries by hand, mirroring the MindSpore logic:
    # if pretrained and os.path.exists(local_ckpt_path):
    #     state_dict = torch.load(local_ckpt_path)
    #     model_dict = model.state_dict()
    #     filtered_dict = {k: v for k, v in state_dict.items() if not k.startswith('classifier')}
    #     model_dict.update(filtered_dict)
    #     model.load_state_dict(model_dict)
    return model
model = create_model(num_classes=5, pretrained=True)
model.to(device)
criterion = nn.CrossEntropyLoss()
# betas are Adam's momentum coefficients; (0.9, 0.999) is the PyTorch default
optimizer = optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.999))
def train_one_epoch(model, dataloader, criterion, optimizer, device):
    model.train()
    running_loss = 0.0
    for inputs, labels in dataloader:
        inputs, labels = inputs.to(device), labels.to(device)
        # Reset the gradients
        optimizer.zero_grad()
        # Forward pass
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        # Backward pass and parameter update
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    return running_loss / len(dataloader)
def validate(model, dataloader, criterion, device):
    model.eval()
    val_loss = 0.0
    corrects = 0
    total_samples = 0
    with torch.no_grad():
        for inputs, labels in dataloader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            # Weight each batch loss by its size to get a per-sample average
            val_loss += loss.item() * inputs.size(0)
            _, preds = torch.max(outputs, 1)
            corrects += (preds == labels).sum().item()
            total_samples += inputs.size(0)
    avg_loss = val_loss / total_samples
    accuracy = corrects / total_samples
    return avg_loss, accuracy
def save_checkpoint(model, optimizer, epoch, val_loss, output_dir, filename="checkpoint.pth"):
    os.makedirs(output_dir, exist_ok=True)
    state = {
        'epoch': epoch,
        'model_state_dict': model.state_dict(),
        'optimizer_state_dict': optimizer.state_dict(),
        'loss': val_loss,  # validation loss at the time of saving
    }
    torch.save(state, os.path.join(output_dir, filename))
    print(f"Model saved at epoch {epoch}")
def load_checkpoint(model, optimizer, ckpt_path):
    if os.path.exists(ckpt_path):
        # map_location lets a checkpoint saved on GPU load on a CPU-only machine
        checkpoint = torch.load(ckpt_path, map_location=device)
        model.load_state_dict(checkpoint['model_state_dict'])
        optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
        start_epoch = checkpoint['epoch'] + 1
        print(f"Loaded checkpoint from epoch {checkpoint['epoch']}")
        return start_epoch
    return 0
print("========== Start PyTorch training loop ==========")
num_epochs = 100
output_dir = "./checkpoints_torch"
best_acc = 0.0
start_epoch = 0
# start_epoch = load_checkpoint(model, optimizer, "checkpoints_torch/checkpoint.pth")
for epoch in range(start_epoch, num_epochs):
    # Train
    train_loss = train_one_epoch(model, train_loader, criterion, optimizer, device)
    # Validate
    val_loss, val_acc = validate(model, val_loader, criterion, device)
    # Log
    print(f"Epoch: {epoch + 1}/{num_epochs}, "
          f"Train Loss: {train_loss:.4f}, "
          f"Val Loss: {val_loss:.4f}, "
          f"Val Acc: {val_acc:.4f}")
    # Save the best model (mirrors the MindSpore code)
    if val_acc > best_acc:
        best_acc = val_acc
        save_checkpoint(model, optimizer, epoch, val_loss, output_dir,
                        filename=f"best_checkpoint_epoch_{epoch + 1}.pth")
print(f"========== Training finished, best accuracy: {best_acc:.4f} ==========")
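The resume logic in load_checkpoint returns checkpoint['epoch'] + 1 because the stored value is the zero-based index of the last finished epoch. The sketch below spells out that bookkeeping without any framework; `epochs_to_run` is a hypothetical helper used only for illustration.

```python
def epochs_to_run(num_epochs, last_saved_epoch=None):
    """Return the zero-based epoch indices that remain to be trained.

    last_saved_epoch is the index stored in the checkpoint, or None for a fresh run.
    """
    start_epoch = 0 if last_saved_epoch is None else last_saved_epoch + 1
    return list(range(start_epoch, num_epochs))


print(epochs_to_run(5))                      # → [0, 1, 2, 3, 4]
print(epochs_to_run(5, last_saved_epoch=2))  # → [3, 4]
```

Returning the saved index unchanged would repeat the epoch that had already finished when the checkpoint was written.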
MindNLP
MindNLP: MindNLP Docs
| Feature | MindNLP | PyTorch + HF | TensorFlow + HF |
|---|---|---|---|
| HuggingFace models | ✅ 200K+ | ✅ 200K+ | ⚠️ Limited |
| Ascend NPU support | ✅ Native | ❌ | ❌ |
| Zero-code migration | ✅ | - | ❌ |
| Chinese model support | ✅ Excellent | ✅ Good | ⚠️ Limited |
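Device handling differences between MindSpore and PyTorch
| Dimension | PyTorch | MindSpore |
|---|---|---|
| Device management | Explicit: torch.device(...) + .to(device) | Implicit/global: context.set_context(device_target=...) |
| Multi-device / heterogeneous | DataParallel / DistributedDataParallel | set_auto_parallel_context / Model wrapper |
| NPU support | ❌ (requires a third-party extension) | ✅ Native support for Huawei Ascend NPU |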