# Configs

We use Python files as configs and incorporate a modular and inheritance design into our config system, which makes it convenient to conduct various experiments.

## Structure

The file structure of configs is as follows:

```shell
configs
|----_base_
     |----datasets
     |----default_runtime.py
|----animal_2d_keypoint
|----body_2d_keypoint
|----body_3d_keypoint
|----face_2d_keypoint
|----fashion_2d_keypoint
|----hand_2d_keypoint
|----hand_3d_keypoint
|----wholebody_2d_keypoint
```

## Introduction

MMPose is equipped with a powerful config system. Together with the Registry, a config file can organize all the configurations in the form of Python dictionaries and create instances of the corresponding modules.

Here is a simple example of a vanilla PyTorch module definition that shows how the config system works:

```Python
# Definition of Loss_A in loss_a.py
import torch.nn as nn


class Loss_A(nn.Module):
    def __init__(self, param1, param2):
        super().__init__()
        self.param1 = param1
        self.param2 = param2

    def forward(self, x):
        return x


# Init the module
loss = Loss_A(param1=1.0, param2=True)
```

All you need to do is to register the module to the pre-defined Registry `MODELS`:

```Python
# Definition of Loss_A in loss_a.py
import torch.nn as nn

from mmpose.registry import MODELS


@MODELS.register_module()  # register the module to MODELS
class Loss_A(nn.Module):
    def __init__(self, param1, param2):
        super().__init__()
        self.param1 = param1
        self.param2 = param2

    def forward(self, x):
        return x
```

Then import the new module in `__init__.py` in the corresponding directory:

```Python
# __init__.py of mmpose/models/losses
from .loss_a import Loss_A


__all__ = ['Loss_A']
```

Then you can build the module anywhere you want:

```Python
# config_file.py
loss_cfg = dict(
    type='Loss_A',  # specify the registered module via `type`
    param1=1.0,     # pass parameters to __init__() of the module
    param2=True
)

# Init the module
loss = MODELS.build(loss_cfg)  # equivalent to `loss = Loss_A(param1=1.0, param2=True)`
```

```{note}
Note that all new modules need to be registered using `Registry` and imported in `__init__.py` in the corresponding directory before we can create their instances from configs.
```
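
When configs are nested, a registered module typically builds its own sub-modules from the inner config dicts. Below is a minimal sketch of this pattern; `Head_B` is a made-up class for illustration, not an MMPose module, and it assumes `Loss_A` above is already registered:

```Python
import torch.nn as nn

from mmpose.registry import MODELS


@MODELS.register_module()
class Head_B(nn.Module):  # hypothetical head for illustration
    def __init__(self, in_channels, loss):
        super().__init__()
        self.in_channels = in_channels
        self.loss_module = MODELS.build(loss)  # build the sub-module from its config dict


head_cfg = dict(
    type='Head_B',
    in_channels=32,
    loss=dict(type='Loss_A', param1=1.0, param2=True))  # nested config

head = MODELS.build(head_cfg)  # Loss_A is built inside Head_B.__init__()
```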

Here is a list of pre-defined registries in MMPose:

- `DATASETS`: data-related modules
- `TRANSFORMS`: data transformations
- `MODELS`: all kinds of modules inheriting `nn.Module` (Backbone, Neck, Head, Loss, etc.)
- `VISUALIZERS`: visualization tools
- `VISBACKENDS`: visualizer backends
- `METRICS`: all kinds of evaluation metrics
- `KEYPOINT_CODECS`: keypoint encoders/decoders
- `HOOKS`: all kinds of hooks like `CheckpointHook`

All registries are defined in `$MMPOSE/mmpose/registry.py`.
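
Registering into any of these registries follows the same pattern as the `MODELS` example above. As a minimal sketch, a custom data transformation (the `MyScale` class is a made-up example) would go into `TRANSFORMS`:

```Python
from mmcv.transforms import BaseTransform

from mmpose.registry import TRANSFORMS


@TRANSFORMS.register_module()
class MyScale(BaseTransform):  # hypothetical transform for illustration
    """Scale the input image by a constant factor."""

    def __init__(self, factor=1.0):
        self.factor = factor

    def transform(self, results):
        results['img'] = results['img'] * self.factor
        return results
```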

## Config System

It is best practice to layer your configs in five sections:

- **General**: basic configurations unrelated to training or testing, such as Timer, Logger, Visualizer and other Hooks, as well as distributed environment settings

- **Data**: dataset, dataloader and data augmentation

- **Training**: resume, weights loading, optimizer, learning rate scheduling, epochs, validation interval, etc.

- **Model**: structure, modules, loss functions, etc.

- **Evaluation**: metrics

You can find all the provided configs under `$MMPOSE/configs`. A config can inherit contents from another config. To keep config files simple and easy to read, we store some necessary but unremarkable configurations in `$MMPOSE/configs/_base_`. You can inspect the complete configuration of a config file by:

```Bash
python tools/analysis_tools/print_config.py /PATH/TO/CONFIG
```
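
If you prefer to inspect a config programmatically, the same flattened view is available through MMEngine's `Config` class; a minimal sketch:

```Python
from mmengine.config import Config

cfg = Config.fromfile('/PATH/TO/CONFIG')  # inherited fields are already merged
print(cfg.pretty_text)  # the complete, flattened configuration
```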

### General

General configuration refers to the necessary settings unrelated to training or testing, mainly including:

- **Default Hooks**: time statistics, training logs, checkpoints, etc.

- **Environment**: distributed backend, cudnn, multi-processing, etc.

- **Visualizer**: visualization backend and strategy

- **Log**: log level, format, printing and recording interval, etc.

Here is the description of General configuration:

```Python
# General
default_scope = 'mmpose'
default_hooks = dict(
    # time the data processing and model inference
    timer=dict(type='IterTimerHook'),

    # interval to print logs, 50 iters by default
    logger=dict(type='LoggerHook', interval=50),

    # update lr according to the lr scheduler
    param_scheduler=dict(type='ParamSchedulerHook'),

    checkpoint=dict(
        # interval to save ckpt
        # e.g.
        # save_best='coco/AP' means save the best ckpt according to coco/AP of CocoMetric
        # save_best='PCK' means save the best ckpt according to PCK of PCKAccuracy
        type='CheckpointHook', interval=1, save_best='coco/AP',

        # rule to judge the metric
        # 'greater' means the larger the better
        # 'less' means the smaller the better
        rule='greater'),

    sampler_seed=dict(type='DistSamplerSeedHook'))  # set the distributed seed
env_cfg = dict(
    cudnn_benchmark=False,  # cudnn benchmark flag
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),  # num of opencv threads
    dist_cfg=dict(backend='nccl'))  # distributed training backend
vis_backends = [dict(type='LocalVisBackend')]  # visualizer backend
visualizer = dict(  # config of visualizer
    type='PoseLocalVisualizer',
    vis_backends=vis_backends,
    name='visualizer')
log_processor = dict(  # format and interval of logging
    type='LogProcessor', window_size=50, by_epoch=True, num_digits=6)
log_level = 'INFO'  # the level of logging
```

```{note}
We now support two visualizer backends: LocalVisBackend and TensorboardVisBackend. The former is for local visualization and the latter is for Tensorboard visualization. You can choose according to your needs. See [Train and Test](./train_and_test.md) for details.
```
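
For example, to log to Tensorboard in addition to local files, you can override the backend list in your config; a sketch:

```Python
# log to both local files and Tensorboard
vis_backends = [
    dict(type='LocalVisBackend'),
    dict(type='TensorboardVisBackend'),
]
visualizer = dict(
    type='PoseLocalVisualizer',
    vis_backends=vis_backends,
    name='visualizer')
```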

General configuration is stored alone in `$MMPOSE/configs/_base_` and inherited by doing:

```Python
_base_ = ['../../../_base_/default_runtime.py']  # the relative path is resolved with respect to the current config file
```
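
Inherited General settings can then be overridden field by field in the child config. For instance, a sketch that changes only the logging interval while keeping every other inherited hook:

```Python
_base_ = ['../../../_base_/default_runtime.py']

# only the logger interval is overridden; the other default_hooks
# fields inherited from default_runtime.py are kept
default_hooks = dict(logger=dict(type='LoggerHook', interval=100))
```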

### Data

Data configuration refers to the data processing related settings, mainly including:

- **File Client**: data storage backend; the default is `disk`, and we also support `LMDB`, `S3 Bucket`, etc.

- **Dataset**: image and annotation file paths

- **Dataloader**: loading configuration, batch size, etc.

- **Pipeline**: data augmentation

- **Input Encoder**: encoding the annotations into a specific form of target

Here is the description of Data configuration:

```Python
backend_args = dict(backend='local')  # data storage backend
dataset_type = 'CocoDataset'  # name of dataset
data_mode = 'topdown'  # type of the model
data_root = 'data/coco/'  # root of the dataset
# config of codec, to generate targets and decode preds into coordinates
codec = dict(
    type='MSRAHeatmap', input_size=(192, 256), heatmap_size=(48, 64), sigma=2)
train_pipeline = [  # data aug in training
    dict(type='LoadImage', backend_args=backend_args),  # image loading
    dict(type='GetBBoxCenterScale'),  # calculate center and scale of bbox
    dict(type='RandomBBoxTransform'),  # config of scaling, rotation and shifting
    dict(type='RandomFlip', direction='horizontal'),  # config of random flipping
    dict(type='RandomHalfBody'),  # config of half-body aug
    dict(type='TopdownAffine', input_size=codec['input_size']),  # update inputs via transform matrix
    dict(
        type='GenerateTarget',  # generate targets via transformed inputs
        encoder=codec),  # get encoder from codec
    dict(type='PackPoseInputs')  # pack targets
]
test_pipeline = [  # data aug in testing
    dict(type='LoadImage', backend_args=backend_args),  # image loading
    dict(type='GetBBoxCenterScale'),  # calculate center and scale of bbox
    dict(type='TopdownAffine', input_size=codec['input_size']),  # update inputs via transform matrix
    dict(type='PackPoseInputs')  # pack targets
]
train_dataloader = dict(
    batch_size=64,  # batch size of each single GPU during training
    num_workers=2,  # workers to pre-fetch data for each single GPU
    persistent_workers=True,  # workers will stay around (with their state) waiting for another call into that dataloader
    sampler=dict(type='DefaultSampler', shuffle=True),  # data sampler, shuffle in training
    dataset=dict(
        type=dataset_type,  # name of dataset
        data_root=data_root,  # root of dataset
        data_mode=data_mode,  # type of the model
        ann_file='annotations/person_keypoints_train2017.json',  # path to annotation file
        data_prefix=dict(img='train2017/'),  # path to images
        pipeline=train_pipeline
    ))
val_dataloader = dict(
    batch_size=32,  # batch size of each single GPU during validation
    num_workers=2,  # workers to pre-fetch data for each single GPU
    persistent_workers=True,  # workers will stay around (with their state) waiting for another call into that dataloader
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),  # data sampler
    dataset=dict(
        type=dataset_type,  # name of dataset
        data_root=data_root,  # root of dataset
        data_mode=data_mode,  # type of the model
        ann_file='annotations/person_keypoints_val2017.json',  # path to annotation file
        bbox_file='data/coco/person_detection_results/'
        'COCO_val2017_detections_AP_H_56_person.json',  # bbox file used for evaluation
        data_prefix=dict(img='val2017/'),  # path to images
        test_mode=True,
        pipeline=test_pipeline
    ))
test_dataloader = val_dataloader  # use val as test by default
```
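
To see what the codec produces, it can be built and run on its own. Below is a minimal sketch, assuming the `encode` interface of MMPose 1.x keypoint codecs:

```Python
import numpy as np

from mmpose.registry import KEYPOINT_CODECS

codec = dict(
    type='MSRAHeatmap', input_size=(192, 256), heatmap_size=(48, 64), sigma=2)
encoder = KEYPOINT_CODECS.build(codec)

# one instance with 17 keypoints in input-image coordinates
keypoints = np.random.rand(1, 17, 2) * (192, 256)
keypoints_visible = np.ones((1, 17), dtype=np.float32)

encoded = encoder.encode(keypoints, keypoints_visible)
print(encoded['heatmaps'].shape)  # (17, 64, 48): one heatmap per keypoint
```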

```{note}
Common Usages:

- [Resume training](https://mmpose.readthedocs.io/en/dev-1.x/user_guides/train_and_test.html#resume-training)
- [Automatic mixed precision (AMP) training](https://mmpose.readthedocs.io/en/dev-1.x/user_guides/train_and_test.html#automatic-mixed-precision-amp-training)
- [Set the random seed](https://mmpose.readthedocs.io/en/dev-1.x/user_guides/train_and_test.html#set-the-random-seed)
```

### Training

Training configuration refers to the training-related settings, including:

- Resume training

- Model weights loading

- Epochs of training and the interval to validate

- Learning rate adjustment strategies like warm-up, scheduling, etc.

- Optimizer and initial learning rate

- Advanced tricks like auto learning rate scaling

Here is the description of Training configuration:

```Python
resume = False  # resume checkpoints from a given path, the training will be resumed from the epoch when the checkpoint was saved
load_from = None  # load models as a pre-trained model from a given path
train_cfg = dict(by_epoch=True, max_epochs=210, val_interval=10)  # max epochs of training, interval to validate
param_scheduler = [
    dict(  # warmup strategy
        type='LinearLR', begin=0, end=500, start_factor=0.001, by_epoch=False),
    dict(  # scheduler
        type='MultiStepLR',
        begin=0,
        end=210,
        milestones=[170, 200],
        gamma=0.1,
        by_epoch=True)
]
optim_wrapper = dict(optimizer=dict(type='Adam', lr=0.0005))  # optimizer and initial lr
auto_scale_lr = dict(base_batch_size=512)  # auto scale the lr according to batch size
```
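
As a worked example of what this schedule does (plain arithmetic, not an MMEngine API): the warm-up ramps the lr linearly from `0.0005 * 0.001` to `0.0005` over the first 500 iterations, and `MultiStepLR` then multiplies it by `gamma=0.1` at epochs 170 and 200. `auto_scale_lr` rescales the lr by the ratio of the actual total batch size to `base_batch_size`; with 8 GPUs x 64 samples = 512, the lr here is unchanged.

```Python
base_lr = 0.0005

def lr_at(epoch, iteration):
    """Sketch of the effective lr under the schedule above."""
    lr = base_lr
    if iteration < 500:  # linear warm-up over the first 500 iters
        lr *= 0.001 + (1 - 0.001) * iteration / 500
    for milestone in [170, 200]:  # step decay at epochs 170 and 200
        if epoch >= milestone:
            lr *= 0.1
    return lr

print(lr_at(epoch=0, iteration=0))        # 5e-07 (start of warm-up)
print(lr_at(epoch=100, iteration=10000))  # 0.0005 (after warm-up)
print(lr_at(epoch=180, iteration=50000))  # 5e-05 (after the first decay)
```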

### Model

Model configuration refers to model training and inference related settings, including:

- Model Structure

- Loss Function

- Output Decoding

- Test-time Augmentation

Here is the description of Model configuration, which defines a top-down heatmap-based HRNet-w32:

```Python
# config of codec, if already defined in the data configuration section, no need to define it again
codec = dict(
    type='MSRAHeatmap', input_size=(192, 256), heatmap_size=(48, 64), sigma=2)

model = dict(
    type='TopdownPoseEstimator',  # macro model structure
    data_preprocessor=dict(  # data normalization and channel transposition
        type='PoseDataPreprocessor',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        bgr_to_rgb=True),
    backbone=dict(  # config of backbone
        type='HRNet',
        in_channels=3,
        extra=dict(
            stage1=dict(
                num_modules=1,
                num_branches=1,
                block='BOTTLENECK',
                num_blocks=(4, ),
                num_channels=(64, )),
            stage2=dict(
                num_modules=1,
                num_branches=2,
                block='BASIC',
                num_blocks=(4, 4),
                num_channels=(32, 64)),
            stage3=dict(
                num_modules=4,
                num_branches=3,
                block='BASIC',
                num_blocks=(4, 4, 4),
                num_channels=(32, 64, 128)),
            stage4=dict(
                num_modules=3,
                num_branches=4,
                block='BASIC',
                num_blocks=(4, 4, 4, 4),
                num_channels=(32, 64, 128, 256))),
        init_cfg=dict(
            type='Pretrained',  # load pretrained weights to backbone
            checkpoint='https://download.openmmlab.com/mmpose'
            '/pretrain_models/hrnet_w32-36af842e.pth'),
    ),
    head=dict(  # config of head
        type='HeatmapHead',
        in_channels=32,
        out_channels=17,
        deconv_out_channels=None,
        loss=dict(type='KeypointMSELoss', use_target_weight=True),  # config of loss function
        decoder=codec),  # get decoder from codec
    test_cfg=dict(
        flip_test=True,  # flag of flip test
        flip_mode='heatmap',  # heatmap flipping
        shift_heatmap=True,  # shift the flipped heatmap several pixels for better performance
    ))
```
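
Once a full config like this is saved to disk, a common way to turn it into a ready-to-run model is `init_model` from `mmpose.apis`; the paths below are placeholders:

```Python
from mmpose.apis import init_model

config_path = 'PATH/TO/CONFIG.py'           # placeholder
checkpoint_path = 'PATH/TO/CHECKPOINT.pth'  # placeholder

model = init_model(config_path, checkpoint_path, device='cpu')
print(type(model).__name__)  # e.g. TopdownPoseEstimator
```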

### Evaluation

Evaluation configuration refers to the metrics commonly used by public datasets for keypoint detection tasks, mainly including:

- AR, AP and mAP

- PCK, PCKh, tPCK

- AUC

- EPE

- NME

Here is the description of Evaluation configuration, which defines a COCO metric evaluator:

```Python
val_evaluator = dict(
    type='CocoMetric',  # coco AP
    ann_file=data_root + 'annotations/person_keypoints_val2017.json')  # path to annotation file
test_evaluator = val_evaluator  # use val as test by default
```
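
The evaluator field also accepts a list, so several metrics can be computed in one run. A sketch using other registered MMPose metrics (make sure the chosen metric suits your dataset):

```Python
# e.g. evaluate PCK and AUC together on a dataset without COCO-style AP
val_evaluator = [
    dict(type='PCKAccuracy', thr=0.2),  # PCK with a threshold of 0.2
    dict(type='AUC'),
]
test_evaluator = val_evaluator
```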

## Config File Naming Convention

MMPose follows the style below to name config files:

```Python
{{algorithm info}}_{{module info}}_{{training info}}_{{data info}}.py
```

The filename is divided into four parts:

- **Algorithm Information**: the name of the algorithm, such as `topdown-heatmap`, `topdown-rle`

- **Module Information**: list of intermediate modules in forward order, such as `res101`, `hrnet-w48`

- **Training Information**: training settings (e.g. `batch_size`, `scheduler`), such as `8xb64-210e`

- **Data Information**: the name of the dataset and the input shape, such as `ap10k-256x256`, `zebra-160x160`

Words between different parts are connected by `'_'`, and those within the same part are connected by `'-'`.

To avoid overly long filenames, some strongly related modules in `{{module info}}` are omitted, such as `gap` in the `RLE` algorithm and `deconv` in heatmap-based algorithms.
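
For instance, a typical body 2D keypoint config name decomposes as follows (taking `td-hm_hrnet-w32_8xb64-210e_coco-256x192.py` as the example, assuming the abbreviations used in the repository):

```Python
# td-hm_hrnet-w32_8xb64-210e_coco-256x192.py
#  |     |          |          |
#  |     |          |          '-- data info: COCO dataset, 256x192 input
#  |     |          '-- training info: 8 GPUs x batch size 64, 210 epochs
#  |     '-- module info: HRNet-w32 backbone
#  '-- algorithm info: top-down (td), heatmap-based (hm)
```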

Contributors are advised to follow the same style.

## Common Usage

### Inheritance

Inheritance is often used to reuse configurations from other config files. Let's assume two configs like:

`optimizer_cfg.py`:

```Python
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
```

`resnet50.py`:

```Python
_base_ = ['optimizer_cfg.py']
model = dict(type='ResNet', depth=50)
```

Although we did not define `optimizer` in `resnet50.py`, all configurations in `optimizer_cfg.py` are inherited by setting `_base_ = ['optimizer_cfg.py']`:

```Python
cfg = Config.fromfile('resnet50.py')
cfg.optimizer  # ConfigDict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
```

### Modification

For configurations already set in a base config, you can directly modify the arguments specific to that module.

`resnet50_lr0.01.py`:

```Python
_base_ = ['optimizer_cfg.py']
model = dict(type='ResNet', depth=50)
optimizer = dict(lr=0.01)  # modify a specific field
```

Now only `lr` is modified:

```Python
cfg = Config.fromfile('resnet50_lr0.01.py')
cfg.optimizer  # ConfigDict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
```

### Delete

For configurations already set in a base config, if you wish to modify some specific arguments and delete the remainder (in other words, discard the previous settings and redefine the module), you can set `_delete_=True`.

`resnet50_delete_key.py`:

```Python
_base_ = ['optimizer_cfg.py', 'runtime_cfg.py']
model = dict(type='ResNet', depth=50)
optimizer = dict(_delete_=True, type='SGD', lr=0.01)  # discard the previous settings and redefine the module
```

Now only `type` and `lr` are kept:

```Python
cfg = Config.fromfile('resnet50_delete_key.py')
cfg.optimizer  # ConfigDict(type='SGD', lr=0.01)
```

```{note}
If you wish to learn more about advanced usages of the config system, please refer to [MMEngine Config](https://mmengine.readthedocs.io/en/latest/advanced_tutorials/config.html).
```