Skip to content

Commit

Permalink
[Refactor] Refactor auto-generated model zoo and dataset zoo(#2552)
Browse files Browse the repository at this point in the history
  • Loading branch information
cir7 committed Jun 27, 2023
1 parent 227a49c commit e28ec68
Show file tree
Hide file tree
Showing 52 changed files with 657 additions and 393 deletions.
10 changes: 9 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -64,8 +64,16 @@ instance/
# Scrapy stuff:
.scrapy

# Sphinx documentation
# Auto generate documentation
docs/*/_build/
docs/*/model_zoo/
docs/*/dataset_zoo/
docs/*/_model_zoo.rst
docs/*/modelzoo_statistics.md
docs/*/datasetzoo_statistics.md
docs/*/projectzoo.md
docs/*/papers/
docs/*/api/generated/

# PyBuilder
target/
Expand Down
2 changes: 1 addition & 1 deletion .readthedocs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ version: 2
build:
os: ubuntu-22.04
tools:
python: "3.7"
python: "3.9"

formats:
- epub
Expand Down
4 changes: 2 additions & 2 deletions configs/detection/videomae/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,7 @@ python tools/train.py configs/detection/ava_kinetics/vit-base-p16_videomae-k400-
--cfg-options randomness.seed=0 randomness.deterministic=True
```

For more details, you can refer to the **Training** part in the [Training and Test Tutorial](/docs/en/user_guides/4_train_test.md).
For more details, you can refer to the **Training** part in the [Training and Test Tutorial](/docs/en/user_guides/train_test.md).

## Test

Expand All @@ -61,7 +61,7 @@ python tools/test.py configs/detection/ava_kinetics/vit-base-p16_videomae-k400-p
checkpoints/SOME_CHECKPOINT.pth --dump result.pkl
```

For more details, you can refer to the **Test** part in the [Training and Test Tutorial](/docs/en/user_guides/4_train_test.md).
For more details, you can refer to the **Test** part in the [Training and Test Tutorial](/docs/en/user_guides/train_test.md).

## Citation

Expand Down
2 changes: 1 addition & 1 deletion configs/localization/bsn/metafile.yml
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
Collections:
- Name: BMN
- Name: BSN
README: configs/localization/bsn/README.md
Paper:
URL: https://arxiv.org/abs/1806.02964
Expand Down
2 changes: 1 addition & 1 deletion configs/recognition/mvit/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,7 @@ the corresponding result without repeat augment is as follows:

| frame sampling strategy | resolution | backbone | pretrain | top1 acc | top5 acc | reference top1 acc | reference top5 acc | testing protocol | FLOPs | params | config | ckpt | log |
| :---------------------: | :--------: | :------: | :------: | :------: | :------: | :---------------------------: | :----------------------------: | :--------------: | :---: | :----: | :----------------: | :--------------: | :-------------: |
| uniform 16 | 224x224 | MViTv2-S | K400 | 68.2 | 91.3 | [68.2](https://github.com/facebookresearch/SlowFast/blob/main/projects/mvitv2/README.md) | [91.4](https://github.com/facebookresearch/SlowFast/blob/main/projects/mvitv2/README.md) | 1 clips x 3 crop | 64G | 34.4M | [config](/configs/recognition/mvit/mvit-small-p244_k400-pre_16xb16-u16-100e_sthv2-rgb) | [ckpt](https://download.openmmlab.com/mmaction/v1.0/recognition/mvit/mvit-small-p244_k400-pre_16xb16-u16-100e_sthv2-rgb/mvit-small-p244_k400-pre_16xb16-u16-100e_sthv2-rgb_20230201-4065c1b9.pth) | [log](https://download.openmmlab.com/mmaction/v1.0/recognition/mvit/mvit-small-p244_k400-pre_16xb16-u16-100e_sthv2-rgb/mvit-small-p244_k400-pre_16xb16-u16-100e_sthv2-rgb.log) |
| uniform 16 | 224x224 | MViTv2-S | K400 | 68.2 | 91.3 | [68.2](https://github.com/facebookresearch/SlowFast/blob/main/projects/mvitv2/README.md) | [91.4](https://github.com/facebookresearch/SlowFast/blob/main/projects/mvitv2/README.md) | 1 clips x 3 crop | 64G | 34.4M | [config](/configs/recognition/mvit/mvit-small-p244_k400-pre_16xb16-u16-100e_sthv2-rgb.py) | [ckpt](https://download.openmmlab.com/mmaction/v1.0/recognition/mvit/mvit-small-p244_k400-pre_16xb16-u16-100e_sthv2-rgb/mvit-small-p244_k400-pre_16xb16-u16-100e_sthv2-rgb_20230201-4065c1b9.pth) | [log](https://download.openmmlab.com/mmaction/v1.0/recognition/mvit/mvit-small-p244_k400-pre_16xb16-u16-100e_sthv2-rgb/mvit-small-p244_k400-pre_16xb16-u16-100e_sthv2-rgb.log) |

For more details on data preparation, you can refer to

Expand Down
6 changes: 5 additions & 1 deletion configs/recognition/tin/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,11 @@ Here, we use `finetune` to indicate that we use [TSM model](https://download.ope

:::

For more details on data preparation, you can refer to Kinetics400, Something-Something V1 and Something-Something V2 in [Prepare Datasets](/docs/en/user_guides/2_data_prepare.md).
For more details on data preparation, you can refer to

- [Kinetics](/tools/data/kinetics/README.md)
- [Something-something V1](/tools/data/sthv1/README.md)
- [Something-something V2](/tools/data/sthv2/README.md)

## Train

Expand Down
6 changes: 5 additions & 1 deletion configs/recognition/tpn/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,11 @@ Visual tempo characterizes the dynamics and the temporal scale of an action. Mod

:::

For more details on data preparation, you can refer to Kinetics400, Something-Something V1 and Something-Something V2 in [Data Preparation](/docs/data_preparation.md).
For more details on data preparation, you can refer to

- [Kinetics](/tools/data/kinetics/README.md)
- [Something-something V1](/tools/data/sthv1/README.md)
- [Something-something V2](/tools/data/sthv2/README.md)

## Train

Expand Down
4 changes: 2 additions & 2 deletions configs/recognition/tpn/metafile.yml
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@ Models:
- Dataset: Kinetics-400
Metrics:
Top 1 Accuracy: 74.20
top5 accuracy: 91.48
Top 5 Accuracy: 91.48
Task: Action Recognition
Training Log: https://download.openmmlab.com/mmaction/v1.0/recognition/tpn/tpn-slowonly_r50_8xb8-8x8x1-150e_kinetics400-rgb/tpn-slowonly_r50_8xb8-8x8x1-150e_kinetics400-rgb.log
Weights: https://download.openmmlab.com/mmaction/v1.0/recognition/tpn/tpn-slowonly_r50_8xb8-8x8x1-150e_kinetics400-rgb/tpn-slowonly_r50_8xb8-8x8x1-150e_kinetics400-rgb_20220913-97d0835d.pth
Expand All @@ -45,7 +45,7 @@ Models:
- Dataset: Kinetics-400
Metrics:
Top 1 Accuracy: 76.74
top5 accuracy: 92.57
Top 5 Accuracy: 92.57
Task: Action Recognition
Training Log: https://download.openmmlab.com/mmaction/v1.0/recognition/tpn/tpn-slowonly_imagenet-pretrained-r50_8xb8-8x8x1-150e_kinetics400-rgb/tpn-slowonly_imagenet-pretrained-r50_8xb8-8x8x1-150e_kinetics400-rgb.log
Weights: https://download.openmmlab.com/mmaction/v1.0/recognition/tpn/tpn-slowonly_imagenet-pretrained-r50_8xb8-8x8x1-150e_kinetics400-rgb/tpn-slowonly_imagenet-pretrained-r50_8xb8-8x8x1-150e_kinetics400-rgb_20220913-fed3f4c1.pth
Expand Down
10 changes: 5 additions & 5 deletions configs/recognition/uniformer/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,11 +20,11 @@ It is a challenging task to learn rich and multi-scale spatiotemporal semantics

### Kinetics-400

| frame sampling strategy | resolution | backbone | top1 acc | top5 acc | [reference](<(https://github.com/Sense-X/UniFormer/blob/main/video_classification/README.md)>) top1 acc | [reference](<(https://github.com/Sense-X/UniFormer/blob/main/video_classification/README.md)>) top5 acc | mm-Kinetics top1 acc | mm-Kinetics top5 acc | testing protocol | FLOPs | params | config | ckpt |
| :---------------------: | :------------: | :---------: | :------: | :------: | :-----------------------------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------: | :------------------: | :------------------: | :--------------: | :---: | :----: | :-----------------------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------: |
| 16x4x1 | short-side 320 | UniFormer-S | 80.9 | 94.6 | 80.8 | 94.7 | 80.9 | 94.6 | 4 clips x 1 crop | 41.8G | 21.4M | [config](/configs/recognition/uniformer/uniformer-small_imagenet1k-pre_16x4x1_kinetics400-rgb.py) | [ckpt](https://download.openmmlab.com/mmaction/v1.0/recognition/uniformerv1/uniformer-small_imagenet1k-pre_16x4x1_kinetics400-rgb_20221219-c630a037.pth) |
| 16x4x1 | short-side 320 | UniFormer-B | 82.0 | 95.0 | 82.0 | 95.1 | 82.0 | 95.0 | 4 clips x 1 crop | 96.7G | 49.8M | [config](/configs/recognition/uniformer/uniformer-base_imagenet1k-pre_16x4x1_kinetics400-rgb.py) | [ckpt](https://download.openmmlab.com/mmaction/v1.0/recognition/uniformerv1/uniformer-base_imagenet1k-pre_16x4x1_kinetics400-rgb_20221219-157c2e66.pth) |
| 32x4x1 | short-side 320 | UniFormer-B | 83.1 | 95.3 | 82.9 | 95.4 | 83.0 | 95.3 | 4 clips x 1 crop | 59G | 49.8M | [config](/configs/recognition/uniformer/uniformer-base_imagenet1k-pre_32x4x1_kinetics400-rgb.py) | [ckpt](https://download.openmmlab.com/mmaction/v1.0/recognition/uniformerv1/uniformer-base_imagenet1k-pre_32x4x1_kinetics400-rgb_20221219-b776322c.pth) |
| frame sampling strategy | resolution | backbone | top1 acc | top5 acc | [reference](https://github.com/Sense-X/UniFormer/blob/main/video_classification/README.md) top1 acc | [reference](https://github.com/Sense-X/UniFormer/blob/main/video_classification/README.md) top5 acc | mm-Kinetics top1 acc | mm-Kinetics top5 acc | testing protocol | FLOPs | params | config | ckpt |
| :---------------------: | :------------: | :---------: | :------: | :------: | :-------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------: | :------------------: | :------------------: | :--------------: | :---: | :----: | :-----------------------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------: |
| 16x4x1 | short-side 320 | UniFormer-S | 80.9 | 94.6 | 80.8 | 94.7 | 80.9 | 94.6 | 4 clips x 1 crop | 41.8G | 21.4M | [config](/configs/recognition/uniformer/uniformer-small_imagenet1k-pre_16x4x1_kinetics400-rgb.py) | [ckpt](https://download.openmmlab.com/mmaction/v1.0/recognition/uniformerv1/uniformer-small_imagenet1k-pre_16x4x1_kinetics400-rgb_20221219-c630a037.pth) |
| 16x4x1 | short-side 320 | UniFormer-B | 82.0 | 95.0 | 82.0 | 95.1 | 82.0 | 95.0 | 4 clips x 1 crop | 96.7G | 49.8M | [config](/configs/recognition/uniformer/uniformer-base_imagenet1k-pre_16x4x1_kinetics400-rgb.py) | [ckpt](https://download.openmmlab.com/mmaction/v1.0/recognition/uniformerv1/uniformer-base_imagenet1k-pre_16x4x1_kinetics400-rgb_20221219-157c2e66.pth) |
| 32x4x1 | short-side 320 | UniFormer-B | 83.1 | 95.3 | 82.9 | 95.4 | 83.0 | 95.3 | 4 clips x 1 crop | 59G | 49.8M | [config](/configs/recognition/uniformer/uniformer-base_imagenet1k-pre_32x4x1_kinetics400-rgb.py) | [ckpt](https://download.openmmlab.com/mmaction/v1.0/recognition/uniformerv1/uniformer-base_imagenet1k-pre_32x4x1_kinetics400-rgb_20221219-b776322c.pth) |

The models are ported from the repo [UniFormer](https://github.com/Sense-X/UniFormer/blob/main/video_classification/README.md) and tested on our data. Currently, we only support the testing of UniFormer models, training will be available soon.

Expand Down
Loading

0 comments on commit e28ec68

Please sign in to comment.