Replies: 8 comments
-
Or is this performance boost a result of pretraining on ImageNet-22K? What is the pretraining dataset for SegFormer-B5 and ConvNeXt-XL? Why is there a big gap between their performance and the leaderboard results (~60 mIoU on the ADE20K val set)? Thanks!
-
It's a good question how different pretrained models affect results on ADE20K, and many excellent scientists are trying to figure it out. In fact, we are planning to run some experiments on this problem, but due to limited human and computational resources, the plan has not been carried out yet. As I remember from the papers, the pretraining dataset of SegFormer-B5 is ImageNet-1K and that of ConvNeXt-XL is ImageNet-22K; you can check the training schedule details in their papers or in mmclassification, which is the pretrained backbone model zoo and codebase in OpenMMLab.
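If it helps, here is a minimal sketch of how a different pretrained backbone can be plugged into an mmseg (0.x-style) config; the checkpoint path is a placeholder, and the exact keys may differ across versions:

```python
# Minimal sketch of pointing an MMSegmentation (0.x-style) config at a
# pretrained backbone checkpoint. The checkpoint path is a placeholder;
# replace it with a real file or URL from the mmclassification model zoo.
model = dict(
    backbone=dict(
        init_cfg=dict(
            type='Pretrained',
            checkpoint='path/to/imagenet_pretrained_backbone.pth',  # placeholder
        ),
    ),
)
```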
-
I suspect some of the pretraining data may have visual similarities to ADE20K. Reproducing the entire experiment, including pretraining on ImageNet-22K, requires a large dataset download, many training tricks (even down to the random seeds), and, most importantly, a lot of machine and human time. For me, it is not realistic. However, benchmarks such as
I'm sorry, their results are 55 mIoU on ADE20K, not dramatically higher than ConvNeXt.
-
Thanks for your reminder.
-
I notice that for BEiT, training for 320k iterations lets the model achieve 56+ mIoU. Training for a longer time seems to be a potential cause. But this is not guaranteed, as no validation set is available for early stopping.
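For reference, extending the schedule in an mmseg (0.x-style) config looks roughly like this; the interval values below are illustrative, not from any paper:

```python
# Sketch of a longer training schedule in an MMSegmentation (0.x-style)
# config, with periodic checkpointing and mIoU evaluation so that progress
# can at least be monitored along the way.
runner = dict(type='IterBasedRunner', max_iters=320000)
checkpoint_config = dict(by_epoch=False, interval=16000)
evaluation = dict(interval=16000, metric='mIoU')
```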
-
Hi, have you reproduced such a high mIoU? I think the run-to-run fluctuation on semantic segmentation datasets (Cityscapes, ADE20K) is very large compared to object detection (COCO), meaning different random seeds can bring obvious performance variation. At least, that's what my experiments with MMSegmentation suggest. I don't know how those papers report their performance; maybe they run many times and report the best mIoU.
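One way to check this yourself is to launch the same config under several seeds. A rough sketch, assuming mmseg's tools/train.py (which in 0.x accepts --seed and --deterministic); the config path is a placeholder:

```python
# Sketch: train the same config with several seeds to estimate the
# run-to-run variance of mIoU. Paths are placeholders for your setup.
import subprocess

CONFIG = 'configs/your_model/your_config_ade20k.py'  # placeholder config path

for seed in (0, 1, 2):
    subprocess.run(
        ['python', 'tools/train.py', CONFIG,
         '--seed', str(seed), '--deterministic',
         '--work-dir', f'work_dirs/seed_{seed}'],
        check=True,
    )
```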
-
I estimate that the variance of some SOTA methods such as Segmenter and BEiT is more than 1 point of mIoU on ADE20K.
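Once you have a few runs, summarizing the spread is trivial; the numbers below are made-up placeholders, not measured results:

```python
# Summarize mIoU spread across repeated runs of the same config.
from statistics import mean, stdev

mious = [55.1, 54.2, 55.8]  # placeholder per-seed results
print(f'mIoU: {mean(mious):.2f} +/- {stdev(mious):.2f}')
```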
-
Hi, no, I did not reproduce this score. There is variance, indeed. However, I believe the comparison should at least use the same random seed, so that the initialization of the networks is identical. Running the model with multiple random seeds is too wasteful of computational resources.
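For what it's worth, here is a sketch of pinning the common sources of randomness in plain PyTorch so two runs start from the same initialization; full determinism also depends on cuDNN settings and data-loader workers:

```python
# Fix the usual sources of randomness so two training runs start from the
# same network initialization. This does not guarantee bitwise-identical
# training, but it removes seed-dependent initialization differences.
import random

import numpy as np
import torch


def set_seed(seed: int) -> None:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```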
-
Mask DINO reports pretraining on the Objects365 detection dataset, and its semantic segmentation result on ADE20K reaches 60.8 mIoU, which is far higher than backbones such as ConvNeXt and SegFormer. Is there an easy way to reproduce this result using mmseg?