Better to go and drink tea than to clutch a thousand verses in vain

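1. Introduction

The question for this post: how do we change the predicted classes in OpenPCDet?

Suppose the model originally predicts only the Car class. How can we make it also predict Pedestrian and Cyclist by adding nothing more than some configuration to the yaml file?

Below is a short reproduction of the problem.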

2. Reproduction

We use VirConv as the example:

[Screenshot: the class list in VirConv.yaml]

In VirConv.yaml, the class list shown above contains only ['Car'], so the first step is to add the other two classes:

CLASS_NAMES: ['Car','Pedestrian', 'Cyclist']  # prediction class

If we run with only this change, the predictions look like this:

[Screenshot: prediction results after changing only CLASS_NAMES]

In other words, no class other than Car is predicted successfully. This is because we have not yet added the anchor generator parameters for the new classes:

    DENSE_HEAD:
        NAME: AnchorHeadSingle
        CLASS_AGNOSTIC: False

        USE_DIRECTION_CLASSIFIER: True
        DIR_OFFSET: 0.78539
        DIR_LIMIT_OFFSET: 0.0
        NUM_DIR_BINS: 2
        # Add the anchor boxes for the predicted classes here
        ANCHOR_GENERATOR_CONFIG: [  # prediction anchor generation
            {
                'class_name': 'Car',
                'anchor_sizes': [[3.9, 1.6, 1.56]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [-1.78],
                'align_center': False,
                'feature_map_stride': 8,
                'matched_threshold': 0.6,
                'unmatched_threshold': 0.45
            },
            {
                'class_name': 'Pedestrian',
                'anchor_sizes': [[0.8, 0.6, 1.73]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [-0.6],
                'align_center': False,
                'feature_map_stride': 8,
                'matched_threshold': 0.5,
                'unmatched_threshold': 0.35
            },
            {
                'class_name': 'Cyclist',
                'anchor_sizes': [[1.76, 0.6, 1.73]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [-0.6],
                'align_center': False,
                'feature_map_stride': 8,
                'matched_threshold': 0.5,
                'unmatched_threshold': 0.35
            }
        ]
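
The mismatch described above (a class listed in CLASS_NAMES but missing from ANCHOR_GENERATOR_CONFIG) can be caught with a quick check before training. The following is a minimal sketch on plain Python data and is not part of OpenPCDet: the helper names `missing_anchor_classes`, `anchors_per_location` and `make_entry` are hypothetical, the values are copied by hand from the yaml, and the anchor count assumes each class contributes sizes x rotations x bottom heights anchors per feature-map location.

```python
def missing_anchor_classes(class_names, anchor_configs):
    """Return the classes in CLASS_NAMES that have no anchor entry."""
    configured = {cfg['class_name'] for cfg in anchor_configs}
    return [name for name in class_names if name not in configured]


def anchors_per_location(anchor_configs):
    """Count anchors per feature-map cell (assumed: sizes x rotations x heights)."""
    return sum(
        len(cfg['anchor_sizes'])
        * len(cfg['anchor_rotations'])
        * len(cfg['anchor_bottom_heights'])
        for cfg in anchor_configs
    )


def make_entry(name, size, bottom_height):
    """Build a trimmed anchor entry holding only the keys this sketch reads."""
    return {
        'class_name': name,
        'anchor_sizes': [size],
        'anchor_rotations': [0, 1.57],
        'anchor_bottom_heights': [bottom_height],
    }


# Values copied by hand from the yaml above.
class_names = ['Car', 'Pedestrian', 'Cyclist']
car_only = [make_entry('Car', [3.9, 1.6, 1.56], -1.78)]
all_three = car_only + [
    make_entry('Pedestrian', [0.8, 0.6, 1.73], -0.6),
    make_entry('Cyclist', [1.76, 0.6, 1.73], -0.6),
]

print(missing_anchor_classes(class_names, car_only))   # ['Pedestrian', 'Cyclist']
print(missing_anchor_classes(class_names, all_three))  # []
print(anchors_per_location(all_three))                 # 6
```

With only the Car entry, the check reports the two classes that have no anchors. With all three entries it reports none.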

With these entries added, all three classes are predicted and evaluated:

2024-04-22 20:20:45,398   INFO  Car AP@0.70, 0.70, 0.70:
bbox AP:99.0242, 89.8920, 89.7941
bev  AP:90.5798, 88.8974, 88.8224
3d   AP:90.2740, 87.3502, 87.2354
aos  AP:98.71, 89.46, 89.26
Car AP_R40@0.70, 0.70, 0.70:
bbox AP:99.4942, 95.7422, 95.6544
bev  AP:96.6726, 92.3975, 92.3774
3d   AP:96.1565, 88.9776, 88.9854
aos  AP:99.16, 95.20, 94.96
Car AP@0.70, 0.50, 0.50:
bbox AP:99.0242, 89.8920, 89.7941
bev  AP:98.9991, 97.3256, 89.7406
3d   AP:98.9965, 89.8333, 89.7366
aos  AP:98.71, 89.46, 89.26
Car AP_R40@0.70, 0.50, 0.50:
bbox AP:99.4942, 95.7422, 95.6544
bev  AP:99.4721, 97.7615, 95.6498
3d   AP:99.4714, 95.6845, 95.6274
aos  AP:99.16, 95.20, 94.96
Pedestrian AP@0.50, 0.50, 0.50:
bbox AP:59.1360, 50.5516, 42.3628
bev  AP:59.0754, 50.3845, 42.3691
3d   AP:58.5248, 49.8588, 42.0124
aos  AP:32.92, 28.50, 24.50
Pedestrian AP_R40@0.50, 0.50, 0.50:
bbox AP:61.8966, 50.3211, 43.5737
bev  AP:61.6259, 50.0281, 43.1782
3d   AP:60.9391, 49.4208, 42.6651
aos  AP:30.05, 23.97, 20.56
Pedestrian AP@0.50, 0.25, 0.25:
bbox AP:59.1360, 50.5516, 42.3628
bev  AP:68.7782, 59.6901, 51.3097
3d   AP:68.7782, 59.6901, 51.3086
aos  AP:32.92, 28.50, 24.50
Pedestrian AP_R40@0.50, 0.25, 0.25:
bbox AP:61.8966, 50.3211, 43.5737
bev  AP:68.3972, 56.2436, 46.9825
3d   AP:68.3972, 56.2436, 46.9822
aos  AP:30.05, 23.97, 20.56
Cyclist AP@0.50, 0.50, 0.50:
bbox AP:60.3033, 42.8206, 34.7173
bev  AP:59.6499, 41.9951, 34.3846
3d   AP:57.8227, 41.0860, 33.6334
aos  AP:29.64, 22.72, 19.19
Cyclist AP_R40@0.50, 0.50, 0.50:
bbox AP:61.2353, 37.5554, 35.1295
bev  AP:58.4128, 36.8041, 34.4358
3d   AP:56.3943, 35.5600, 31.3543
aos  AP:26.01, 15.93, 14.79
Cyclist AP@0.50, 0.25, 0.25:
bbox AP:60.3033, 42.8206, 34.7173
bev  AP:59.7645, 42.0996, 34.3846
3d   AP:59.7645, 42.0996, 34.3846
aos  AP:29.64, 22.72, 19.19
Cyclist AP_R40@0.50, 0.25, 0.25:
bbox AP:61.2353, 37.5554, 35.1295
bev  AP:60.5544, 36.9441, 34.5113
3d   AP:60.5544, 36.9441, 34.5113
aos  AP:26.01, 15.93, 14.79

3. Note

One more point: the yaml file mentions the three classes in one other place, the gt-sampling augmentation:

  AUG_CONFIG_LIST:
            - NAME: gt_sampling
              AUG_WITH_IMAGE: True # use PC-Image Aug
              JOINT_SAMPLE: True # joint sample with point
              KEEP_RAW: False # keep original PC
              POINT_REFINE: True # refine points with different calib
              BOX_IOU_THRES: 0.5
              IMG_AUG_TYPE: by_depth  
              AUG_USE_TYPE: annotation 
              IMG_ROOT_PATH: semi/image_2

              USE_ROAD_PLANE: False
              DB_INFO_PATH:
                  - kitti_dbinfos_trainsemi.pkl
              PREPARE: {
                  filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Cyclist:5'],
                  filter_by_difficulty: [-1],
              }
              # Parameter used by the gt-sampling augmentation
              SAMPLE_GROUPS: ['Car:15', 'Pedestrian:15', 'Cyclist:15']
              NUM_POINT_FEATURES: 8
              DATABASE_WITH_FAKELIDAR: False  
              REMOVE_EXTRA_WIDTH: [0.0, 0.0, 0]
              LIMIT_WHOLE_SCENE: False

            - NAME: random_world_rotation
              WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]

            - NAME: random_world_flip
              ALONG_AXIS_LIST: ['x']

            - NAME: random_world_scaling
              WORLD_SCALE_RANGE: [0.95, 1.05]

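The meaning of this parameter is simple: it sets how many objects of each class are sampled from other frames, so Car:15 means 15 cars are sampled. In practice, these settings have little to do with which classes are predicted, unlike the changes above; they only affect prediction accuracy to some extent.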
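
To make the 'Class:count' format concrete, here is a minimal sketch that turns such a list into a mapping from class name to sample count. It only illustrates the string format; the function name `parse_sample_groups` is hypothetical and this is not OpenPCDet's own parsing code.

```python
def parse_sample_groups(sample_groups):
    """Turn entries such as 'Car:15' into a {class_name: count} mapping."""
    counts = {}
    for entry in sample_groups:
        class_name, num = entry.split(':')
        counts[class_name] = int(num)
    return counts


# SAMPLE_GROUPS copied from the yaml above.
groups = parse_sample_groups(['Car:15', 'Pedestrian:15', 'Cyclist:15'])
print(groups)  # {'Car': 15, 'Pedestrian': 15, 'Cyclist': 15}
```

The filter_by_min_points entries ('Car:5' and so on) use the same 'Class:number' shape, so the same helper would read them too.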

04-23 18:06