support OHD-SJTU dataset and object heading detection #704

Open
wants to merge 1 commit into base: dev-1.x

Conversation

@sltlls (Contributor) commented Jan 19, 2023

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry; just create the pull request and ask the maintainers for help.

Motivation

- Support the OHD-SJTU dataset.
- Support object heading detection.
- Complete task 8 and task 9 in #626.

Modification

Implement the OHD-SJTU dataset, creating a new box type "RotatedHeadBoxes" to carry the box head quadrant ground truth.
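For reference, a minimal sketch of the kind of box structure this adds (a rotated box plus an absolute head point); the class layout and attribute names here are assumptions for illustration, not the PR's exact implementation:

```python
import torch

class RotatedHeadBoxes:
    """Rotated boxes (cx, cy, w, h, angle) plus an absolute head point (x, y).

    A simplified, hypothetical stand-in for the box type added in this PR.
    """

    def __init__(self, boxes: torch.Tensor, head_points: torch.Tensor):
        assert boxes.shape[-1] == 5 and head_points.shape[-1] == 2
        self.tensor = boxes              # (N, 5) rotated boxes
        self.head_points = head_points   # (N, 2) head coordinates

    @property
    def centers(self) -> torch.Tensor:
        return self.tensor[:, :2]
```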

Implement object heading detection by adding a head classification branch to Rotated RetinaNet, supervised by FocalLoss, and changing the code related to the prediction and loss-calculation functions.
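For reference, a minimal sketch of what such a head-quadrant classification branch might look like on top of a RetinaNet-style head; the layer layout, channel counts, and the 4-way output are assumptions for illustration, not the PR's actual code:

```python
import torch
import torch.nn as nn

class HeadQuadrantBranch(nn.Module):
    """Extra conv branch predicting a 4-way head-quadrant score per anchor."""

    def __init__(self, in_channels: int = 256, num_anchors: int = 9,
                 num_quadrants: int = 4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1),
            nn.ReLU(inplace=True),
        )
        # one quadrant distribution per anchor at every spatial location
        self.head_pred = nn.Conv2d(in_channels, num_anchors * num_quadrants,
                                   3, padding=1)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # (B, num_anchors * 4, H, W) quadrant logits for FocalLoss
        return self.head_pred(self.conv(feat))
```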

Current performance (on the OHD-SJTU-L dataset):

1) TP and HEAD_TP: IoU > 0.5, and both the category and the head quadrant must be predicted correctly.

| model | backbone | angle | lr schd | mAP | head_acc |
| :---: | :---: | :---: | :---: | :---: | :---: |
| Rotated RetinaNet | ResNet-50 | le90 | 1x | 15.6 | 4.7 |

2) TP is the same as in classic mAP; HEAD_TP: IoU > 0.5, and both the category and the head quadrant must be predicted correctly.

| model | backbone | angle | lr schd | mAP | head_acc |
| :---: | :---: | :---: | :---: | :---: | :---: |
| Rotated RetinaNet | ResNet-50 | le90 | 1x | 58.3 | 4.0 |
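For reference, a small sketch of one way the HEAD_TP criterion above could be checked for an already-matched prediction/ground-truth pair; the function name and the integer 0-3 quadrant encoding are assumptions, not the PR's evaluation code:

```python
def is_head_tp(iou: float, pred_label: int, gt_label: int,
               pred_quadrant: int, gt_quadrant: int) -> bool:
    """Assumed HEAD_TP check: IoU > 0.5, correct category, correct head quadrant."""
    return iou > 0.5 and pred_label == gt_label and pred_quadrant == gt_quadrant
```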

Problem:
The current experiment results suggest that the model hardly learns to predict the box head quadrant. I think the large difference in magnitude between loss_head and loss_cls/loss_bbox may be the reason, as shown below.

Epoch(train) [1][ 50/2615] lr: 3.9880e-04 grad_norm: 1.3056 loss: 2.9622 loss_cls: 1.2014 loss_bbox: 1.1142 loss_head: 0.6467
Epoch(train) [1][ 100/2615] lr: 4.6560e-04 grad_norm: 1.0562 loss: 2.7534 loss_cls: 1.1452 loss_bbox: 0.9708 loss_head: 0.6374
Epoch(train) [1][ 150/2615] lr: 5.3240e-04 grad_norm: 1.3597 loss: 2.7922 loss_cls: 1.1481 loss_bbox: 1.0197 loss_head: 0.6244
Epoch(train) [1][ 200/2615] lr: 5.9920e-04 grad_norm: 2.0736 loss: 2.7875 loss_cls: 1.1531 loss_bbox: 1.0378 loss_head: 0.5966
Epoch(train) [1][ 250/2615] lr: 6.6600e-04 grad_norm: 6.9892 loss: 2.5320 loss_cls: 1.1415 loss_bbox: 1.0527 loss_head: 0.3378
Epoch(train) [1][ 300/2615] lr: 7.3280e-04 grad_norm: 7.8637 loss: 2.3590 loss_cls: 1.1344 loss_bbox: 1.1265 loss_head: 0.0981
Epoch(train) [1][ 350/2615] lr: 7.9960e-04 grad_norm: 6.0629 loss: 2.2941 loss_cls: 1.1327 loss_bbox: 1.0805 loss_head: 0.0809
Epoch(train) [1][ 400/2615] lr: 8.6640e-04 grad_norm: 8.0070 loss: 2.2093 loss_cls: 1.0657 loss_bbox: 1.0669 loss_head: 0.0767
Epoch(train) [1][ 450/2615] lr: 9.3320e-04 grad_norm: 10.7075 loss: 2.1032 loss_cls: 0.9992 loss_bbox: 1.0426 loss_head: 0.0615
........................................................................................
Epoch(train) [2][2000/2615] lr: 1.0000e-03 grad_norm: 6.9920 loss: 1.1298 loss_cls: 0.3704 loss_bbox: 0.7534 loss_head: 0.0060
Epoch(train) [2][2050/2615] lr: 1.0000e-03 grad_norm: 6.6444 loss: 1.1109 loss_cls: 0.3370 loss_bbox: 0.7677 loss_head: 0.0061
Epoch(train) [2][2100/2615] lr: 1.0000e-03 grad_norm: 7.8718 loss: 1.1643 loss_cls: 0.4220 loss_bbox: 0.7350 loss_head: 0.0073
Epoch(train) [2][2150/2615] lr: 1.0000e-03 grad_norm: 8.7219 loss: 1.2167 loss_cls: 0.5056 loss_bbox: 0.7037 loss_head: 0.0074
Epoch(train) [2][2200/2615] lr: 1.0000e-03 grad_norm: 7.6649 loss: 1.1461 loss_cls: 0.4118 loss_bbox: 0.7260 loss_head: 0.0083

The whole structure for task 8 and task 9 in #626 has been completed, but I don't know why loss_head descends so fast that the model can't learn anything about the box head, so I opened this PR to get some help.
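For reference, one way to probe the magnitude gap would be to raise the weight of the head branch loss in the config. This is only a sketch assuming an mmdet-style FocalLoss entry; the loss_head key name and the weight value are hypothetical and not part of this PR:

```python
# Hypothetical bbox_head config fragment: increase loss_weight on the head branch
# to counteract loss_head being much smaller than loss_cls / loss_bbox.
loss_head = dict(
    type='mmdet.FocalLoss',
    use_sigmoid=True,
    gamma=2.0,
    alpha=0.25,
    loss_weight=5.0,  # assumed value; the default would be 1.0
)
```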

BC-breaking (Optional)

Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.

Checklist

  1. Pre-commit or other linting tools are used to fix potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  3. The documentation has been modified accordingly, e.g. docstrings or example tutorials.

@CLAassistant commented Jan 19, 2023

CLA assistant check
All committers have signed the CLA.

@RangiLyu changed the base branch from 1.x to dev-1.x on February 1, 2023 05:33
@liuyanyi (Collaborator) commented Feb 1, 2023

Thanks for your contribution. This PR looks like a great start for head detection.

Implementing head detection is a big job; I think we should start by designing the overall framework.

The first step is to represent the rotated box with a head, and there could be different representations: for example (x, y, w, h, a, x_head, y_head) as in this PR, (x, y, w, h, a) with a 360° angle range, or (x, y, w, h, a, axis) as in the OHD-SJTU paper. This part needs more discussion.

I guess the weird loss_head here could be fixed if we encode the head point: it seems the head keeps its original image coordinates, and I think it should be encoded as (x, y)_head - (x, y)_center or (x, y)_head - (x, y)_anchor.
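For reference, a minimal sketch of the kind of encoding suggested above (head point expressed as an offset from the box center, normalized by the box size); the function names and the normalization are assumptions for illustration, not an agreed design:

```python
import torch

def encode_head(head_xy: torch.Tensor, boxes: torch.Tensor) -> torch.Tensor:
    """Encode head points as offsets from the box center, normalized by (w, h).

    head_xy: (N, 2) absolute head coordinates.
    boxes:   (N, 5) rotated boxes as (cx, cy, w, h, angle).
    """
    center = boxes[:, :2]
    wh = boxes[:, 2:4]
    return (head_xy - center) / wh  # small, roughly zero-centered targets

def decode_head(deltas: torch.Tensor, boxes: torch.Tensor) -> torch.Tensor:
    """Inverse of encode_head: recover absolute head coordinates."""
    return deltas * boxes[:, 2:4] + boxes[:, :2]
```

Normalizing by the box size keeps the regression targets in a similar range regardless of object scale, which is usually easier to optimize than raw image coordinates.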

@zytx121 (Collaborator) commented Feb 16, 2023

Please refer to #720.

Maybe you can support the OHD-SJTU dataset based on 360° detection.

@sltlls (Contributor, Author) commented Feb 18, 2023

> Thanks for your contribution. This PR looks like a great start for head detection.
>
> Implementing head detection is a big job; I think we should start by designing the overall framework.
>
> The first step is to represent the rotated box with a head, and there could be different representations: for example (x, y, w, h, a, x_head, y_head) as in this PR, (x, y, w, h, a) with a 360° angle range, or (x, y, w, h, a, axis) as in the OHD-SJTU paper. This part needs more discussion.
>
> I guess the weird loss_head here could be fixed if we encode the head point: it seems the head keeps its original image coordinates, and I think it should be encoded as (x, y)_head - (x, y)_center or (x, y)_head - (x, y)_anchor.

But the (x, y)_head coordinates only exist in RotatedHeadBoxes; they are transformed into a head quadrant (0/1/2/3) in the model head. In the model head I directly predict one head quadrant per object (just like classification) and use FocalLoss to compute the head loss between the predicted quadrant and the target quadrant (transformed from the x, y head coordinates).
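For reference, a minimal sketch of that head-point-to-quadrant transform as described above; the quadrant numbering convention and the function name are assumptions, not the PR's actual code:

```python
import torch

def head_point_to_quadrant(head_xy: torch.Tensor, boxes: torch.Tensor) -> torch.Tensor:
    """Map an absolute head point to a quadrant label in {0, 1, 2, 3}.

    The head point is expressed in the box's local frame (rotated by -angle
    around the box center), and the quadrant is read off from the signs.
    head_xy: (N, 2), boxes: (N, 5) as (cx, cy, w, h, angle in radians).
    """
    offset = head_xy - boxes[:, :2]
    angle = boxes[:, 4]
    cos, sin = torch.cos(angle), torch.sin(angle)
    # rotate the offset into the box's local coordinate frame
    local_x = cos * offset[:, 0] + sin * offset[:, 1]
    local_y = -sin * offset[:, 0] + cos * offset[:, 1]
    # assumed quadrant encoding: 0: +x+y, 1: -x+y, 2: -x-y, 3: +x-y
    quadrant = torch.zeros_like(local_x, dtype=torch.long)
    quadrant[(local_x < 0) & (local_y >= 0)] = 1
    quadrant[(local_x < 0) & (local_y < 0)] = 2
    quadrant[(local_x >= 0) & (local_y < 0)] = 3
    return quadrant
```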

@123456hxh commented

I ran into a problem that I have tried many times and cannot solve; what is the reason for this error? Thanks.
File "<__array_function__ internals>", line 180, in concatenate
ValueError: need at least one array to concatenate

@sltlls (Contributor, Author) commented Mar 27, 2023

> I ran into a problem that I have tried many times and cannot solve; what is the reason for this error? Thanks. File "<__array_function__ internals>", line 180, in concatenate ValueError: need at least one array to concatenate

I think we need more details to find the problem
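For reference, numpy raises that error whenever it is asked to concatenate an empty list of arrays, which in detection pipelines often means a dataset or split ended up with no valid annotations after filtering; that interpretation is only a guess, and a minimal reproduction of the error itself is:

```python
import numpy as np

# Reproduces: ValueError: need at least one array to concatenate
np.concatenate([])
```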
