MMDetection Tracking Dataset
Sample annotation file
{
"categories": [
{
"id": 1,
"name": "pedestrian"
}
],
"videos": [
{
"id": 1,
"name": "aid052N1D3_tp1_stack1_default_aug_False_epoch_19_theta_0.5_delta_0.1_Test",
"fps": 1,
"width": 512,
"height": 512
},
...
],
"images": [
{
"id": 1,
"video_id": 1,
"file_name": "aid052N1D3_tp1_stack1_default_aug_False_epoch_19_theta_0.5_delta_0.1_Test\\img\\000001.png",
"height": 512,
"width": 512,
"frame_id": 0,
"mot_frame_id": 1
},
...
],
"annotations": [
{
"category_id": 1,
"bbox": [
22.0,
129.0,
18.0,
14.0
],
"area": 252.0,
"iscrowd": false,
"visibility": 1.0,
"mot_instance_id": 0,
"mot_conf": 1.0,
"mot_class_id": 0,
"id": 1,
"image_id": 1,
"instance_id": 0
},
...
],
"num_instances": 784
}
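To make the structure concrete, here is a minimal sketch of loading an annotation file of this format with the Python standard library and indexing its annotations by image. The path data/MOT17/annotations/train_cocoformat.json is only an assumption; point it at your own annotation file.
import json
from collections import defaultdict

# Assumed path; point this at your own annotation file.
ANN_FILE = "data/MOT17/annotations/train_cocoformat.json"

with open(ANN_FILE, "r") as f:
    dataset = json.load(f)

print("categories: ", [c["name"] for c in dataset["categories"]])
print("videos:     ", len(dataset["videos"]))
print("images:     ", len(dataset["images"]))
print("annotations:", len(dataset["annotations"]))
print("num_instances:", dataset["num_instances"])

# Index annotations by image_id for quick per-frame lookup.
anns_by_image = defaultdict(list)
for ann in dataset["annotations"]:
    anns_by_image[ann["image_id"]].append(ann)

first_image = dataset["images"][0]
print(first_image["file_name"], "has", len(anns_by_image[first_image["id"]]), "boxes")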
JSON annotation file structure
erDiagram
DATASET {
int num_instances "Number of distinct annotation instance ids"
}
CATEGORY }o--|| DATASET : has
VIDEO }o--|| DATASET : has
IMAGE }o--|| DATASET : has
ANNOTATION }o--|| DATASET : has
Category entities
erDiagram
CATEGORY {
int id PK "The category ID starting from 1"
string name "The category name"
}
Video entities
erDiagram
VIDEO {
int id PK "The sequence / video ID"
string name "The sequence / video name"
int fps "Frames per second of the video"
int width "Frame / image width"
int height "Frame / image height"
}
Image entities
erDiagram
IMAGE {
int id PK "Image ID starting from 1"
int video_id FK "Video ID"
string file_name "Image filename (relative to the train directory)"
int height "Image height"
int width "Image width"
int frame_id "Frame ID within the video (starting from 0)"
int mot_frame_id "MOT frame ID (starting from 1)"
}
Annotation entities
erDiagram
ANNOTATION {
int id PK "Annotation ID (starting from 1)"
int image_id FK "ID of the image this annotation belongs to"
int category_id FK "Category ID of this bounding box"
list[float] bbox "Bounding box in XYWH format (x, y, width, height)"
float area "Bounding box area (width * height)"
bool iscrowd "Always false"
float visibility "Always 1.0"
int mot_instance_id "Instance ID of this object (every distinct object has its own ID)"
float mot_conf "Confidence score (always 1.0 for ground truth)"
int mot_class_id "Class ID (category_id - 1)"
int instance_id "Instance ID"
}
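Because every annotation carries an image_id (which maps to a video and frame) and an instance_id, object tracks can be reconstructed by grouping boxes per video and instance. A minimal sketch, again assuming a hypothetical annotation file path:
import json
from collections import defaultdict

# Assumed path; point this at your own annotation file.
with open("data/MOT17/annotations/train_cocoformat.json", "r") as f:
    dataset = json.load(f)

# Map image_id -> (video_id, frame_id) so each box can be placed on a timeline.
frame_of = {img["id"]: (img["video_id"], img["frame_id"]) for img in dataset["images"]}

# tracks[(video_id, instance_id)] -> list of (frame_id, bbox), i.e. one track per object.
tracks = defaultdict(list)
for ann in dataset["annotations"]:
    video_id, frame_id = frame_of[ann["image_id"]]
    tracks[(video_id, ann["instance_id"])].append((frame_id, ann["bbox"]))

for track in tracks.values():
    track.sort()  # chronological order within each track

(video_id, instance_id), boxes = next(iter(tracks.items()))
print(f"video {video_id}, instance {instance_id}: {len(boxes)} boxes across frames")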
Convert MOT to MMDetection COCO format
MMDetection ships a script, tools/dataset_converters/mot2coco.py, that converts MOT-format annotations to the MMDetection COCO format. First, place your dataset in the data/MOT17 directory with the following structure:
train
|- <stack_train1>
|  |- det
|  |  |- det.txt
|  |- gt
|  |  |- gt_half-train.txt
|  |  |- gt_half-val.txt
|  |  |- gt.txt
|  |- img
|  |  |- 000001.png
|  |  |- 000002.png
|  |  |- 000003.png
|  |  |- ...
|  |- seqinfo.ini
|- <stack_train2>
|  |- ...
|- ...
val
|- <stack_val1>
|- ...
test
|- <stack_test1>
|- ...
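Before running the converters, it can help to sanity-check that every sequence directory follows this layout. The sketch below is a hypothetical helper (not part of MMDetection): it reads the standard MOT seqinfo.ini keys (imDir, imExt, seqLength) and reports whether ground truth is present.
import configparser
from pathlib import Path

root = Path("data/MOT17")  # assumed dataset root, matching the layout above

for split in ("train", "val", "test"):
    split_dir = root / split
    if not split_dir.is_dir():
        continue
    for seq_dir in sorted(split_dir.iterdir()):
        if not seq_dir.is_dir():
            continue
        ini = seq_dir / "seqinfo.ini"
        if not ini.exists():
            print(f"{seq_dir.name}: missing seqinfo.ini")
            continue
        cfg = configparser.ConfigParser()
        cfg.read(ini)
        seq = cfg["Sequence"]
        img_dir = seq_dir / seq.get("imDir", "img")
        n_frames = len(list(img_dir.glob(f"*{seq.get('imExt', '.png')}")))
        has_gt = (seq_dir / "gt" / "gt.txt").exists()  # test sequences normally have no gt
        print(f"{seq_dir.name}: {n_frames} frames "
              f"(seqLength={seq.get('seqLength')}), gt={'found' if has_gt else 'missing'}")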
Next, run the following scripts to create the COCO tracking and ReID datasets.
# Create COCO tracking dataset
python ./tools/dataset_converters/mot2coco.py -i ./data/MOT17 -o ./data/MOT17/annotations --split-train --convert-det
# Create REID annotations
python ./tools/dataset_converters/mot2reid.py -i ./data/MOT17/ -o ./data/MOT17/reid --val-split 0.2 --vis-threshold 0.3
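After conversion, a quick check of the generated files confirms that everything was written. The file names below follow the converter's usual *_cocoformat.json naming, but treat them as assumptions and list the annotations directory to see the actual output.
import json
from pathlib import Path

ann_dir = Path("data/MOT17/annotations")

# Assumed output names; list the directory to see which files were actually written.
for name in ("train_cocoformat.json", "half-train_cocoformat.json",
             "half-val_cocoformat.json", "test_cocoformat.json"):
    path = ann_dir / name
    if not path.exists():
        print(f"{name}: not found")
        continue
    data = json.loads(path.read_text())
    print(f"{name}: {len(data.get('videos', []))} videos, "
          f"{len(data.get('images', []))} images, "
          f"{len(data.get('annotations', []))} annotations")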