- The `torchvision.datasets` module provides many built-in datasets;
- it also provides utility classes for building your own datasets.
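
Built-in datasets
- All built-in datasets are subclasses of `torch.utils.data.Dataset`;
- that is, they all implement the `__getitem__` and `__len__` methods;
- every built-in dataset can therefore be passed to a `torch.utils.data.DataLoader`,
- which loads multiple samples in parallel using `torch.multiprocessing` workers.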
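- Example:

```python
import torch.utils.data
import torchvision.datasets

# Load ImageNet from a local root directory (the data must already be on disk).
imagenet_data = torchvision.datasets.ImageNet('path/to/imagenet_root/')
# Batch and shuffle the samples. `args` is assumed to be an argparse namespace
# defined elsewhere; nThreads sets the number of worker processes.
data_loader = torch.utils.data.DataLoader(imagenet_data,
                                          batch_size=4,
                                          shuffle=True,
                                          num_workers=args.nThreads)
```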
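- All the datasets have a similar API.
- Nearly all of them accept two common arguments, `transform` and `target_transform`, which transform the input and the target independently; the optical flow and stereo datasets take a single joint `transforms` argument instead, as their signatures below show.
- You can also create your own datasets using the base classes that PyTorch provides.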

Image classification datasets

| Dataset | Description |
|---|---|
| Caltech101(root, target_type, transform, ...) | Caltech 101 Dataset. |
| Caltech256(root, transform, ...) | Caltech 256 Dataset. |
| CelebA(root, split, target_type, ...) | Large-scale CelebFaces Attributes (CelebA) Dataset. |
| CIFAR10(root, train, transform, ...) | CIFAR10 Dataset. |
| CIFAR100(root, train, transform, ...) | CIFAR100 Dataset. |
| Country211(root, split, transform, ...) | The Country211 Data Set from OpenAI. |
| DTD(root, split, partition, transform, ...) | Describable Textures Dataset (DTD). |
| EMNIST(root, split, **kwargs) | EMNIST Dataset. |
| EuroSAT(root, transform, target_transform, ...) | RGB version of the EuroSAT Dataset. |
| FakeData(size, image_size, num_classes, ...) | A fake dataset that returns randomly generated images as PIL images. |
| FashionMNIST(root, train, transform, ...) | Fashion-MNIST Dataset. |
| FER2013(root, split, transform, ...) | FER2013 Dataset. |
| FGVCAircraft(root, split, ...) | FGVC Aircraft Dataset. |
| Flickr8k(root, ann_file, transform, ...) | Flickr8k Entities Dataset. |
| Flickr30k(root, ann_file, transform, ...) | Flickr30k Entities Dataset. |
| Flowers102(root, split, transform, ...) | Oxford 102 Flower Dataset. |
| Food101(root, split, transform, ...) | The Food-101 Data Set. |
| GTSRB(root, split, transform, ...) | German Traffic Sign Recognition Benchmark (GTSRB) Dataset. |
| INaturalist(root, version, target_type, ...) | iNaturalist Dataset. |
| ImageNet(root, split) | ImageNet 2012 Classification Dataset. |
| Imagenette(root, split, size, download, ...) | Imagenette image classification dataset. |
| KMNIST(root, train, transform, ...) | Kuzushiji-MNIST Dataset. |
| LFWPeople(root, split, image_set, ...) | LFW Dataset. |
| LSUN(root, classes, transform, ...) | LSUN dataset. |
| MNIST(root, train, transform, ...) | MNIST Dataset. |
| Omniglot(root, background, transform, ...) | Omniglot Dataset. |
| OxfordIIITPet(root, split, target_types, ...) | Oxford-IIIT Pet Dataset. |
| Places365(root, split, small, download, ...) | Places365 classification dataset. |
| PCAM(root, split, transform, ...) | PCAM Dataset. |
| QMNIST(root, what, compat, train) | QMNIST Dataset. |
| RenderedSST2(root, split, transform, ...) | The Rendered SST2 Dataset. |
| SEMEION(root, transform, target_transform, ...) | SEMEION Dataset. |
| SBU(root, transform, target_transform, ...) | SBU Captioned Photo Dataset. |
| StanfordCars(root, split, transform, ...) | Stanford Cars Dataset. |
| STL10(root, split, folds, transform, ...) | STL10 Dataset. |
| SUN397(root, transform, target_transform, ...) | The SUN397 Data Set. |
| SVHN(root, split, transform, ...) | SVHN Dataset. |
| USPS(root, train, transform, ...) | USPS Dataset. |

Image detection or segmentation datasets

| Dataset | Description |
|---|---|
| CocoDetection(root, annFile, transform, ...) | MS Coco Detection Dataset. |
| CelebA(root, split, target_type, ...) | Large-scale CelebFaces Attributes (CelebA) Dataset. |
| Cityscapes(root, split, mode, target_type, ...) | Cityscapes Dataset. |
| Kitti(root, train, transform, ...) | KITTI Dataset. |
| OxfordIIITPet(root, split, target_types, ...) | Oxford-IIIT Pet Dataset. |
| SBDataset(root, image_set, mode, download, ...) | Semantic Boundaries Dataset. |
| VOCSegmentation(root, year, image_set, ...) | Pascal VOC Segmentation Dataset. |
| VOCDetection(root, year, image_set, ...) | Pascal VOC Detection Dataset. |
| WIDERFace(root, split, transform, ...) | WIDERFace Dataset. |

Optical flow datasets

| Dataset | Description |
|---|---|
| FlyingChairs(root, split, transforms) | FlyingChairs Dataset for optical flow. |
| FlyingThings3D(root, split, pass_name, ...) | FlyingThings3D dataset for optical flow. |
| HD1K(root, split, transforms) | HD1K dataset for optical flow. |
| KittiFlow(root, split, transforms) | KITTI dataset for optical flow (2015). |
| Sintel(root, split, pass_name, transforms) | Sintel Dataset for optical flow. |

Stereo matching datasets

| Dataset | Description |
|---|---|
| CarlaStereo(root, transforms) | Carla simulator data linked in the CREStereo github repo. |
| Kitti2012Stereo(root, split, transforms) | KITTI dataset from the 2012 stereo evaluation benchmark. |
| Kitti2015Stereo(root, split, transforms) | KITTI dataset from the 2015 stereo evaluation benchmark. |
| CREStereo(root, transforms) | Synthetic dataset used in training the CREStereo architecture. |
| FallingThingsStereo(root, variant, transforms) | FallingThings dataset. |
| SceneFlowStereo(root, variant, pass_name, ...) | Dataset interface for Scene Flow datasets. |
| SintelStereo(root, pass_name, transforms) | Sintel Stereo Dataset. |
| InStereo2k(root, split, transforms) | InStereo2k dataset. |
| ETH3DStereo(root, split, transforms) | ETH3D Low-Res Two-View dataset. |
| Middlebury2014Stereo(root, split, ...) | Publicly available scenes from the Middlebury dataset, 2014 version: <https://vision.middlebury.edu/stereo/data/scenes2014/>. |

Image pairs datasets

| Dataset | Description |
|---|---|
| LFWPairs(root, split, image_set, ...) | LFW Dataset. |
| PhotoTour(root, name, train, transform, ...) | Multi-view Stereo Correspondence Dataset. |

Image captioning datasets

| Dataset | Description |
|---|---|
| CocoCaptions(root, annFile, transform, ...) | MS Coco Captions Dataset. |

Video classification datasets

| Dataset | Description |
|---|---|
| HMDB51(root, annotation_path, frames_per_clip) | HMDB51 dataset. |
| Kinetics(root, frames_per_clip, ...) | Generic Kinetics dataset. |
| UCF101(root, annotation_path, frames_per_clip) | UCF101 dataset. |

Video prediction datasets

| Dataset | Description |
|---|---|
| MovingMNIST(root, split, split_ratio, ...) | MovingMNIST Dataset. |

Base classes for custom datasets

| Class | Description |
|---|---|
| DatasetFolder(root, loader, extensions, ...) | A generic data loader. |
| ImageFolder(root, transform, ...) | A generic data loader where the images are arranged, by default, in one subdirectory per class (for example `root/dog/xxx.png`). |
| VisionDataset(root, transforms, transform, ...) | Base Class For making datasets which are compatible with torchvision. |

Transforms v2

| Function | Description |
|---|---|
| wrap_dataset_for_transforms_v2(dataset, ...) | Wrap a torchvision.dataset for usage with torchvision.transforms.v2. |