The training dataset can be downloaded here.


No training data is available for Track B on the modern dataset. Note that the modern dataset is annotated differently from the historical dataset: in the modern dataset a cell region is described by the bounding box of the cell's content, whereas in the historical dataset it is described by the cell area. The requested output differs accordingly. See the samples in the Dataset/Description section.

Ground truth files follow this naming convention: documents from the modern dataset have the prefix "cTDaR_t1", and documents from the historical dataset have the prefix "cTDaR_t0".
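
As a minimal sketch of how the prefixes can be used to sort ground truth files by dataset, the Python snippet below maps a file name to "modern" or "historical". The function name `dataset_of` and the example file names are hypothetical and serve only as illustration; only the two prefixes come from the convention described here.

```python
from pathlib import Path

# Prefixes taken from the naming convention; everything else is illustrative.
PREFIX_TO_DATASET = {
    "cTDaR_t1": "modern",
    "cTDaR_t0": "historical",
}


def dataset_of(path):
    """Return 'modern', 'historical', or None if the name matches neither prefix."""
    name = Path(path).name  # ignore any directory part of the path
    for prefix, dataset in PREFIX_TO_DATASET.items():
        if name.startswith(prefix):
            return dataset
    return None


if __name__ == "__main__":
    # Hypothetical file names, used only to demonstrate the lookup.
    for example in ["cTDaR_t10001.xml", "gt/cTDaR_t00012.xml", "readme.txt"]:
        print(example, dataset_of(example))
```

Returning `None` for unrecognized names, rather than raising an error, lets a caller skip unrelated files in the same directory.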