The AToMiC dataset for the TREC 2023 evaluation is now available at the following locations:

To aid exploration of the dataset, we have included notebooks here. Additionally, the resource paper that accompanies the dataset is now available on arXiv.

Changes

  • Expanded the text collection by including text-only samples (English Wikipedia articles) without any associated images. The previous version (v0.1) only contained paired image-text examples.
  • Expanded the image collection by incorporating images attached to non-English articles. The previous version only included images attached to English articles.