I have transformed the DAIR-V2X infrastructure-side data into KITTI dataset format, as explained in the DAIR-V2X repo. For context, I attach an example image, calib and label:
When I browse the dataset with python tools/misc/browse_dataset.py configs/_base_/datasets/dair-infrastructure-3d-3class.py --task lidar_det --show-interval -1,
dair-infrastructure-3d-3class.py
# dataset settings
dataset_type = 'KittiDataset'
data_root = 'data/DAIR-V2X/cooperative-vehicle-infrastructure-kittiformat/infrastructure-side/'
class_names = ['Pedestrian', 'Cyclist', 'Car']
point_cloud_range = [0, -60, -2, 200, 60, 1]
input_modality = dict(use_lidar=True, use_camera=False)
metainfo = dict(classes=class_names)
# Example to use different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer from prefix (not support LMDB and Memcache yet)
# data_root = 's3://openmmlab/datasets/detection3d/kitti/'
# Method 2: Use backend_args, file_client_args in versions before 1.1.0
# backend_args = dict(
#     backend='petrel',
#     path_mapping=dict({
#         './data/': 's3://openmmlab/datasets/detection3d/',
#         'data/': 's3://openmmlab/datasets/detection3d/'
#     }))
backend_args = None
db_sampler = dict(
    data_root=data_root,
    info_path=data_root + 'dair_dbinfos_train.pkl',
    rate=1.0,
    prepare=dict(
        filter_by_difficulty=[-1],
        filter_by_min_points=dict(Car=5, Pedestrian=10, Cyclist=10)),
    classes=class_names,
    sample_groups=dict(Car=12, Pedestrian=6, Cyclist=6),
    points_loader=dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=4,
        use_dim=4,
        backend_args=backend_args),
    backend_args=backend_args)
train_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=4,  # x, y, z, intensity
        use_dim=4,
        backend_args=backend_args),
    dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
    dict(type='ObjectSample', db_sampler=db_sampler),
    dict(
        type='ObjectNoise',
        num_try=100,
        translation_std=[1.0, 1.0, 0.5],
        global_rot_range=[0.0, 0.0],
        rot_range=[-0.78539816, 0.78539816]),
    dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
    dict(
        type='GlobalRotScaleTrans',
        rot_range=[-0.78539816, 0.78539816],
        scale_ratio_range=[0.95, 1.05]),
    dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='PointShuffle'),
    dict(
        type='Pack3DDetInputs',
        keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=4,
        use_dim=4,
        backend_args=backend_args),
    dict(
        type='MultiScaleFlipAug3D',
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type='GlobalRotScaleTrans',
                rot_range=[0, 0],
                scale_ratio_range=[1., 1.],
                translation_std=[0, 0, 0]),
            dict(type='RandomFlip3D'),
            dict(
                type='PointsRangeFilter', point_cloud_range=point_cloud_range)
        ]),
    dict(type='Pack3DDetInputs', keys=['points'])
]
# construct a pipeline for data and gt loading in show function
# please keep its loading function consistent with test_pipeline (e.g. client)
eval_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=4,
        use_dim=4,
        backend_args=backend_args),
    dict(type='Pack3DDetInputs', keys=['points'])
]
train_dataloader = dict(
    batch_size=16,
    num_workers=2,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type='RepeatDataset',
        times=2,
        dataset=dict(
            type=dataset_type,
            data_root=data_root,
            ann_file='dair_infos_train.pkl',
            # data_prefix=dict(pts='training/velodyne_reduced'),
            data_prefix=dict(pts='training/velodyne'),
            pipeline=train_pipeline,
            modality=input_modality,
            test_mode=False,
            metainfo=metainfo,
            # we use box_type_3d='LiDAR' in kitti and nuscenes dataset
            # and box_type_3d='Depth' in sunrgbd and scannet dataset.
            box_type_3d='LiDAR',
            backend_args=backend_args)))
val_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        # data_prefix=dict(pts='training/velodyne_reduced'),
        data_prefix=dict(pts='training/velodyne'),
        ann_file='dair_infos_val.pkl',
        pipeline=test_pipeline,
        modality=input_modality,
        test_mode=True,
        metainfo=metainfo,
        box_type_3d='LiDAR',
        backend_args=backend_args))
test_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        # data_prefix=dict(pts='training/velodyne_reduced'),
        data_prefix=dict(pts='training/velodyne'),
        ann_file='dair_infos_val.pkl',
        pipeline=test_pipeline,
        modality=input_modality,
        test_mode=True,
        metainfo=metainfo,
        box_type_3d='LiDAR',
        backend_args=backend_args))
val_evaluator = dict(
    type='KittiMetric',
    ann_file=data_root + 'dair_infos_val.pkl',
    metric='bbox',
    backend_args=backend_args)
test_evaluator = val_evaluator
vis_backends = [dict(type='LocalVisBackend')]
visualizer = dict(
    type='Det3DLocalVisualizer', vis_backends=vis_backends, name='visualizer')
I get perfectly placed 3D bboxes in the point cloud.
Results with point cloud (perfect)
BUT when I browse the dataset with python tools/misc/browse_dataset.py configs/_base_/datasets/dair-infrastructure-mono3d.py --task mono_det --show-interval -1, I get the misplaced 2D and 3D bboxes:
It seems like the 3D bboxes' orientation is rendered as if this were the original KITTI dataset (the vanishing point of the boxes sits at the center of the image), and the 2D box coordinates appear to be derived from the boundaries of the projected 3D bboxes.
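For reference, deriving a 2D box from the boundaries of the projected 3D corners works roughly like this (a minimal numpy sketch of the general technique, not mmdetection3d's actual code; the projection matrix and box values below are made up):

```python
import numpy as np

def box3d_corners_cam(loc, dims, ry):
    """8 corners of a KITTI-style 3D box in camera coordinates.
    loc: bottom-center (x, y, z); dims: (h, w, l); ry: rotation around the camera y-axis."""
    h, w, l = dims
    # corner offsets relative to the bottom center (x right, y down, z forward)
    x = np.array([ l,  l, -l, -l,  l,  l, -l, -l]) / 2
    y = np.array([ 0,  0,  0,  0, -h, -h, -h, -h])
    z = np.array([ w, -w, -w,  w,  w, -w, -w,  w]) / 2
    R = np.array([[ np.cos(ry), 0, np.sin(ry)],
                  [ 0,          1, 0         ],
                  [-np.sin(ry), 0, np.cos(ry)]])
    return (R @ np.vstack([x, y, z])).T + np.asarray(loc)

def project_to_2d_box(corners_cam, P):
    """Project the 8 corners with a 3x4 camera matrix and take the min/max envelope."""
    pts = np.hstack([corners_cam, np.ones((8, 1))]) @ P.T  # (8, 3) homogeneous pixels
    uv = pts[:, :2] / pts[:, 2:3]                          # perspective divide
    return np.array([uv[:, 0].min(), uv[:, 1].min(),
                     uv[:, 0].max(), uv[:, 1].max()])

# hypothetical intrinsics and box, for illustration only
P = np.array([[700., 0., 600., 0.],
              [0., 700., 300., 0.],
              [0., 0., 1., 0.]])
corners = box3d_corners_cam(loc=(2.0, 1.5, 20.0), dims=(1.5, 1.6, 4.0), ry=0.0)
bbox2d = project_to_2d_box(corners, P)
```

If this envelope is computed with the wrong extrinsics (or before/after an image resize), the 2D boxes shift exactly the way described above.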
Troubleshooting
I already checked the lidar2cam and cam2img transformation matrices and the labels, since I can successfully load the 2D and 3D bboxes manually. However, if the label coordinates were incorrect, I guess I could not have seen the correctly placed 3D bboxes in LiDAR detection.
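One way to double-check that calibration chain end to end is to push a single LiDAR point through lidar2cam and then cam2img and confirm it lands where expected. A sketch with made-up matrices (the real `Tr_velo_to_cam` and `P2` would come from the calib file):

```python
import numpy as np

def lidar_to_pixel(pts_lidar, lidar2cam, cam2img):
    """Map N x 3 LiDAR points to pixel coordinates and camera-frame depth.
    lidar2cam: 4x4 extrinsic matrix; cam2img: 3x4 projection matrix."""
    pts_h = np.hstack([pts_lidar, np.ones((len(pts_lidar), 1))])  # homogeneous
    pts_cam = pts_h @ lidar2cam.T                                 # camera frame
    uvw = pts_cam @ cam2img.T                                     # image plane
    return uvw[:, :2] / uvw[:, 2:3], pts_cam[:, 2]                # pixels, depth

# hypothetical calibration: camera looks along the LiDAR +x axis (KITTI-style axes)
lidar2cam = np.array([[0., -1., 0., 0.],
                      [0., 0., -1., 0.],
                      [1., 0., 0., 0.],
                      [0., 0., 0., 1.]])
cam2img = np.array([[700., 0., 960., 0.],
                    [0., 700., 540., 0.],
                    [0., 0., 1., 0.]])
# a point 15 m straight ahead should project near the principal point
uv, depth = lidar_to_pixel(np.array([[15.0, 0.0, 0.0]]), lidar2cam, cam2img)
```

If this manual chain places points correctly but the visualizer does not, the problem is likely in how the pipeline applies the matrices, not in the calib files themselves.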
Correct 2D bboxes
Correct 3D bboxes
Additional information
No response
ramajoballester changed the title from "Misplaced 3D bboxes in DAIR-V2X-C infrastructure-side dataset in monocular 3D detection" to "Misplaced 2D and 3D bboxes in DAIR-V2X-C infrastructure-side dataset in monocular 3D detection" on Mar 7, 2024.
I am having a similar issue with a custom dataset in KITTI format: only the projected 3D bounding box in the image is rendered incorrectly, while it looks correct in the point cloud when using multi-modality_det. I am on the dev-1.x branch.
I fixed this in #2923. It turns out the images were being resized but the projected 2D bounding box points were not being scaled down accordingly.
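The fix described above boils down to applying the same scale factors to the projected box coordinates as to the image. An illustrative sketch (not the actual patch in #2923; the resolutions below are made up):

```python
def scale_bbox2d(bbox, scale_w, scale_h):
    """Scale an (x1, y1, x2, y2) box by the horizontal/vertical image resize factors."""
    x1, y1, x2, y2 = bbox
    return (x1 * scale_w, y1 * scale_h, x2 * scale_w, y2 * scale_h)

# image resized from 1920x1080 to 1333x750 -> the box must shrink by the same ratios
scale_w, scale_h = 1333 / 1920, 750 / 1080
scaled = scale_bbox2d((960.0, 540.0, 1200.0, 800.0), scale_w, scale_h)
```

Forgetting this step leaves the box in the original image's pixel coordinates, which shows up as systematically misplaced 2D boxes on the resized image.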
Hi, @yeetypete!
Thank you very much for your response. Unfortunately, your proposed solution does not solve the issue. Let me know if you need any data samples to work with.
Prerequisite
Task
I have modified the scripts/configs, or I'm working on my own tasks/models/datasets.
Branch
main branch https://github.com/open-mmlab/mmdetection3d
Environment
Reproduces the problem - code sample
Example files: image_2/000009.jpg, calib/000009.txt, label_2/000009.txt
Reproduces the problem - command or script
python tools/misc/browse_dataset.py configs/_base_/datasets/dair-infrastructure-mono3d.py --task mono_det --show-interval -1
Additional information
No response