Training on a Custom Image Dataset

In this example, a simple CNN model is trained from scratch on the CIFAR-10 dataset by leveraging the ImageFolderDataset struct to retrieve images from a folder structure on disk.

Because the original CIFAR-10 release is distributed in a binary format, the example downloads the data from a fastai mirror, which provides it as .png images in the following folder structure:

cifar10
├── labels.txt
├── test
│   ├── airplane
│   ├── automobile
│   ├── bird
│   ├── cat
│   ├── deer
│   ├── dog
│   ├── frog
│   ├── horse
│   ├── ship
│   └── truck
└── train
    ├── airplane
    ├── automobile
    ├── bird
    ├── cat
    ├── deer
    ├── dog
    ├── frog
    ├── horse
    ├── ship
    └── truck

To load the training and test splits, provide the root path of each folder, as is done in CIFAR10Loader for this example:

let train_ds = ImageFolderDataset::new_classification("/path/to/cifar10/train").unwrap();
let test_ds = ImageFolderDataset::new_classification("/path/to/cifar10/test").unwrap();
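
To illustrate why a root path is enough, here is a minimal standard-library sketch of how a folder layout like the one above can be turned into labeled items: each sub-folder name is treated as a class, and classes are indexed in sorted order. This is not Burn's implementation. The helper name `scan_image_folder`, the .png-only filter, and the sorted-name label indexing are assumptions made for illustration.

```rust
use std::collections::BTreeMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of Burn): scans `root` and returns
/// `(image path, class index)` pairs together with the sorted class names.
fn scan_image_folder(root: &Path) -> io::Result<(Vec<(PathBuf, usize)>, Vec<String>)> {
    // Group image files by the name of their parent folder.
    // A BTreeMap keeps the class names in sorted order.
    let mut by_class: BTreeMap<String, Vec<PathBuf>> = BTreeMap::new();

    for entry in fs::read_dir(root)? {
        let class_dir = entry?.path();
        if !class_dir.is_dir() {
            continue; // skip stray files such as labels.txt
        }
        let class_name = match class_dir.file_name().and_then(|n| n.to_str()) {
            Some(name) => name.to_string(),
            None => continue, // skip folder names that are not valid UTF-8
        };

        let mut files = Vec::new();
        for file in fs::read_dir(&class_dir)? {
            let path = file?.path();
            // Assumption: only .png files count as images in this sketch.
            let is_png = path
                .extension()
                .and_then(|e| e.to_str())
                .map_or(false, |e| e.eq_ignore_ascii_case("png"));
            if path.is_file() && is_png {
                files.push(path);
            }
        }
        files.sort(); // deterministic item order within a class
        by_class.insert(class_name, files);
    }

    // The class index is the position of the folder name in sorted order.
    let classes: Vec<String> = by_class.keys().cloned().collect();
    let mut items = Vec::new();
    for (index, files) in by_class.into_values().enumerate() {
        for path in files {
            items.push((path, index));
        }
    }
    Ok((items, classes))
}
```

Under these assumptions, calling `scan_image_folder(Path::new("/path/to/cifar10/train"))` on the tree above would yield ten class names, with `airplane` mapped to index 0 and `truck` to index 9.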

Example Usage

The CNN model and training recipe used in this example are deliberately simple, since the objective is to demonstrate how to load a custom image classification dataset from disk. Nonetheless, the model still achieves 70-80% accuracy on the test set after just 30 epochs.

Run it with the Torch GPU backend:

export TORCH_CUDA_VERSION=cu121
cargo run --example custom-image-dataset --release --features tch-gpu

Run it with our WGPU backend:

cargo run --example custom-image-dataset --release --features wgpu