Track 1 Results Submissions Instructions

Submission format
As part of Track 1 of the NVIDIA AI City Challenge, teams build models that can detect, localize, and classify objects in keyframes extracted from videos at several intersections. Teams were given training and validation subsets from three datasets, aic480, aic540, and aic1080. A few days before the challenge results are due, test sets will be provided, in the form of sets of keyframe images. For each image, teams will execute prediction models and provide results in one of two formats:

- A zip archive (NOT tar.gz, .z, .rar, or any other type of archive) containing one file for each test image, with the same name as the image, except using the '.txt' extension. Text files should not be in a sub-directory. Each text file will have one line for each predicted bounding box, in the following format:

class xmin ymin xmax ymax confidence

- A JSON file containing a dictionary with an entry for each image in the test set, in the following format:

  "great_neck_first_colonial_20140604_00016": [
     "class": "Van",
     "confidence": 0.93,
     "xmax": 506.0,
     "xmin": 424.0,
     "ymax": 297.0,
     "ymin": 252.0
     "class": "SUV",
     "confidence": 0.24,
     "xmax": 281.0,
     "xmin": 179.0,
     "ymax": 348.0,
     "ymin": 293.0

Note that image names in the JSON format do not include the file extension. Confidence scores are float values in the range [0, 1]. Bounding box coordinates are measured in pixels from the lower-left corner of the image. Class is the string value representing the class (e.g., Car). For derived datasets that have numeric classes, please note that the 0-indexed class list can be found in <dataset>/custom_classes.txt, where <dataset> is the AIC version of the dataset (e.g., 2 -> SUV). An example of each submission format is included as an attachment.
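The two formats above can be produced from the same in-memory predictions. The sketch below is illustrative only: the predictions dictionary, file names, and helper functions are hypothetical, not part of the official tooling; only the line layout ('class xmin ymin xmax ymax confidence') and the JSON shape follow the specification above.

```python
import json
import zipfile

# Hypothetical predictions keyed by image name (no file extension).
# The image name, classes, and coordinates are illustrative values.
predictions = {
    "great_neck_first_colonial_20140604_00016": [
        {"class": "Van", "confidence": 0.93,
         "xmin": 424.0, "ymin": 252.0, "xmax": 506.0, "ymax": 297.0},
        {"class": "SUV", "confidence": 0.24,
         "xmin": 179.0, "ymin": 293.0, "xmax": 281.0, "ymax": 348.0},
    ]
}

def write_txt_zip(predictions, zip_path):
    """Write one .txt file per image, flat in the archive (no
    sub-directory), one 'class xmin ymin xmax ymax confidence'
    line per predicted bounding box."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, boxes in predictions.items():
            lines = [
                f"{b['class']} {b['xmin']} {b['ymin']} "
                f"{b['xmax']} {b['ymax']} {b['confidence']}"
                for b in boxes
            ]
            zf.writestr(name + ".txt", "\n".join(lines) + "\n")

def write_json(predictions, json_path):
    """Write the JSON submission: a dict keyed by image name."""
    with open(json_path, "w") as f:
        json.dump(predictions, f, indent=2, sort_keys=True)

write_txt_zip(predictions, "track1_submission.zip")
write_json(predictions, "track1_submission.json")
```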

Submission site
A Web-based submission site is currently being built and will be available once test sets are provided to the teams. Teams will log in using the same credentials they were provided for the annotation system. Each team will be allowed a maximum of 5 submissions for each dataset, and their best result will be used when computing the final team score. Teams should not use submissions for tuning model parameters. Each submission on a dataset should be for a different model.

An evaluation script will be provided to the teams soon. Using this script, they can test their model performance on their own training, test, or validation sets and choose to submit only their top-performing models. For each dataset and object class combination, we will compute several prediction scores (e.g., Average Precision, F1-score). We follow the Pascal VOC 2012 challenge in our methodology for computing these scores. The strategy for combining these scores for the purpose of ranking teams will be revealed at a later time.
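While waiting for the official evaluation script, teams can approximate VOC-style scoring themselves. The sketch below is not the official script; it assumes the usual Pascal VOC conventions (a detection is a true positive when its IoU with an unmatched same-class ground-truth box is at least 0.5, and AP is the area under the monotone precision-recall envelope, as in VOC 2012). Box tuples are (xmin, ymin, xmax, ymax).

```python
def iou(a, b):
    """Intersection-over-union of two (xmin, ymin, xmax, ymax) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def average_precision(detections, gt_boxes, iou_thresh=0.5):
    """AP for one class: detections are (confidence, box) pairs;
    each ground-truth box may be matched at most once."""
    detections = sorted(detections, key=lambda d: -d[0])
    matched = [False] * len(gt_boxes)
    tps = []
    for _, box in detections:
        best, best_i = 0.0, -1
        for i, gt in enumerate(gt_boxes):
            if not matched[i]:
                o = iou(box, gt)
                if o > best:
                    best, best_i = o, i
        if best_i >= 0 and best >= iou_thresh:
            matched[best_i] = True
            tps.append(1)
        else:
            tps.append(0)
    # Precision/recall at every rank, then area under the
    # precision envelope (VOC 2012 uses all points, not 11-point).
    precisions, recalls, tp_count = [], [], 0
    for k, tp in enumerate(tps, 1):
        tp_count += tp
        precisions.append(tp_count / k)
        recalls.append(tp_count / len(gt_boxes))
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_r)
        prev_r = r
    return ap
```

A perfect single detection over one ground-truth box yields an AP of 1.0; adding a low-confidence false positive after it does not lower the VOC-2012 AP, since the extra detection adds no recall.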

Reporting results
Teams should prepare a short report of 4 to 6 pages, in IEEE 2-column format, describing the modeling methods they employed and the results they obtained as part of the challenge. Additional details regarding the report submission site will be provided at a later time.


Docker for Jetson TX2

This information is being made available to the NVIDIA AI City Challenge Participants courtesy of Chris Dye and team at IBM.
Challenge teams can pull prebuilt images from the Docker Hub repo or build them themselves using the Dockerfiles and other supporting code for the Jetson TX1 and Jetson TX2.

Docker hub repo:

Wikis: (getting started with Jetson and Docker; pulling and building images)


Docker images
TX1: Caffe, Darknet (w/ Yolo)
TX2: Caffe, TensorRT, DustyNV, Darknet (w/ Yolo)

For any questions, please email