Long Tail Visual Relationship Recognition Challenge
Co-organized with the L3D-IVU Workshop CVPR 2023



Submission Servers are open!


About

The main goal of this challenge is to evaluate and benchmark new and better methods on the GQA-LT and VG8K-LT benchmarks proposed in Exploring Long Tail Visual Relationship Recognition with Large Vocabulary. A high-level motivation is to encourage researchers to develop methods for long-tail visual relationship recognition, an important problem for fine-grained image understanding. Both benchmarks, GQA-LT and VG8K-LT, are long-tailed and created from the GQA and VG datasets. Since bounding boxes are provided at both training and test time, participants are required to submit a csv file containing the detection results for the subject/object and the relation classes in the corresponding triplet.


Task & Metrics

The main task for both benchmarks is to predict the relationship and, at the same time, the subject/object associated with the triplet. Ground-truth bounding boxes are provided during both training and evaluation, so the main aim is the classification of the subject/object (sbj/obj) and the relation (rel) associated with them. The main metric used for the challenge is average per-class accuracy: the accuracy of each class is calculated separately, then averaged. Average per-class accuracy is a commonly used metric in the long-tail literature. Further details about the metrics on the leaderboard are given below; a short code sketch of the main metric follows the list:

  • rel_all_per_class: The per-class relationship classification performance averaged across all classes.
  • sbj_obj_all_per_class: The per-class subject/object classification performance averaged across all classes.
  • trip_sro_scores_all: The triplet classification performance averaged across all possible triplet types.
  • trip_or_scores_all: The triplet classification performance averaged across all possible object-relation combination types.
  • trip_sr_scores_all: The triplet classification performance averaged across all possible subject-relation combination types.
  • trip_so_scores_all: The triplet classification performance averaged across all possible subject-object combination types.
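
As a reference, here is a minimal sketch of how average per-class accuracy can be computed; the function and variable names are illustrative and not taken from the starter code:

    import numpy as np

    def average_per_class_accuracy(y_true, y_pred, num_classes):
        # Accuracy is computed separately for each class, then averaged.
        # y_true, y_pred: 1-D integer arrays of ground-truth and predicted labels.
        per_class = []
        for c in range(num_classes):
            mask = (y_true == c)
            if mask.sum() == 0:
                continue  # skip classes absent from the evaluation split
            per_class.append(float((y_pred[mask] == c).mean()))
        return float(np.mean(per_class))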


Submission Format

The submission must be a csv file (named rel_detections_gt_boxes_prdcls.csv when the given evaluation code is run on the starter code), renamed to answer.csv and compressed into a .zip file named submission.zip before submission. No other submission format will be accepted.
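
For convenience, one way to package the output with Python's standard library (file names as specified above):

    import shutil
    import zipfile

    # Rename the evaluation output and compress it as required by the server.
    shutil.copy("rel_detections_gt_boxes_prdcls.csv", "answer.csv")
    with zipfile.ZipFile("submission.zip", "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write("answer.csv")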


Servers

  • Challenge 1: GQA-LT Benchmark: Link
  • Challenge 2: VG8K-LT Benchmark: Link


Download Dataset

  • GQA-LT:
    You can download the annotations and splits necessary for the GQA-LT benchmark here. Once unzipped, you should see a 'gvqa' folder containing a seed folder called 'seed0' with .json annotations suited to the dataloader used in our implementations. Also, download the GQA images from here.
  • VG8K-LT:
    You can download the annotations and splits necessary for the VG8K-LT benchmark here. Once unzipped, you should see a 'vg8k' folder containing a seed folder called 'seed3' with .json annotations suited to the dataloader used in our implementations. Also, download the VG images from here.
Further instructions about dataset placement and the dataloader can be found in the README of the starter code here.
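
As a quick sanity check before training, you can verify that the annotation folders are in place; the 'data/' root below is an assumption, so adjust it to wherever the README tells you to place the data:

    import os

    # 'data/' is a placeholder root; see the starter-code README for the actual paths.
    for path in ["data/gvqa/seed0", "data/vg8k/seed3"]:
        assert os.path.isdir(path), f"missing {path}; check dataset placement"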


Code

You can find the starter code that can be used for submissions here. Please follow the code in the branch 'ltvrd-challenge-2023' to create output in the format required for submission. To run the baseline models, follow the README instructions. You can also download some of the pre-trained models available there. After running the test file (test_net_rel.py), the csv file to be used for submission will be named rel_detections_gt_boxes_prdcls.csv, which should be renamed according to the rules mentioned above. For any queries, please feel free to contact the people listed below.


Important Dates

  • 16th February, 2023: The challenge portal for submissions opens for both benchmarks (GQA-LT & VG8K-LT).
  • 1st June, 2023: The submission deadline for both benchmarks.
  • 19th June, 2023: L3D-IVU Workshop at CVPR 2023.


Organizers

  • Arushi Goel, University of Edinburgh
  • Aniket Agarwal, KAUST Intern / IIT Roorkee
  • Jun Chen, KAUST
  • Mohamed Elhoseiny, KAUST


Citation

The website template was borrowed from Michaël Gharbi.