cBAD Dataset

See.

The database of Track A [Simple Documents] consists of 755 images extracted from 9 different archival collections. The dataset comprises images with accompanying PAGE XML files. Each PAGE XML contains text regions, e.g. paragraphs, so layout analysis or text detection need not be performed on this dataset. Only handwritten text is present and the dataset contains no tables. The ground truth of the test set will be released after all submitted methods have been evaluated and the final results made public.

Track B [Complex Documents] contains mixed documents. Although most documents are handwritten, this track also contains printed documents, book covers, empty pages, and tables. While Track A has locally skewed text lines, text lines in Track B are rotated by up to 180°.

In this example, only the Complex Track is used.

Usage:

./run.sh

The dataset is larger than 2 GB; make sure you have at least 6 GB of free disk space to store the whole experiment.
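As a convenience, the free-space requirement can be checked before launching the run. The snippet below is a minimal sketch, not part of this repository: the function name `has_free_space` and the choice of the current directory as the target path are assumptions made for illustration.

```python
import shutil

REQUIRED_GB = 6  # free space recommended in the note above


def has_free_space(path=".", required_gb=REQUIRED_GB):
    """Return True if the filesystem holding `path` has enough free space."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024 ** 3


if has_free_space():
    print("Enough free disk space for the experiment.")
else:
    print(f"Less than {REQUIRED_GB} GB free; consider freeing space first.")
```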

See config for details about training parameters.

ICDAR 2017 Results

Complex Track

The following table shows the results published in the ICDAR 2017 proceedings, plus the results of this experiment (P2PaLA row) with 95% nonparametric bootstrap confidence intervals (10000 repetitions).

| Method | P | R | F1 |
|--------|---|---|----|
| DMRZ | 85.4 | 86.3 | 85.9 |
| P2PaLA | 84.8 [83.9, 85.7] | 85.4 [84.4, 86.4] | 85.1 |
| BYU | 77.3 | 82.0 | 79.6 |
| IRISA | 69.2 | 77.2 | 73.0 |
| UPVLC | 83.3 | 60.6 | 70.2 |
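The F1 column is the harmonic mean of precision (P) and recall (R). As a sanity check, the sketch below recomputes F1 from the rounded P and R values in the table; the helper name `f1_score` is illustrative. Because the inputs are already rounded to one decimal, the recomputed values can differ from the reported ones in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) copied from the results table.
ROWS = {
    "DMRZ": (85.4, 86.3, 85.9),
    "P2PaLA": (84.8, 85.4, 85.1),
    "BYU": (77.3, 82.0, 79.6),
    "IRISA": (69.2, 77.2, 73.0),
    "UPVLC": (83.3, 60.6, 70.2),
}

for name, (p, r, reported) in ROWS.items():
    print(f"{name}: recomputed F1 = {f1_score(p, r):.2f}, reported = {reported}")
```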

The results are close to those of the competition winner (F1 of 85.1 versus 85.9 for DMRZ), even though no hyperparameter tuning was performed.
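For readers unfamiliar with nonparametric bootstrapping, the following is a minimal sketch of a percentile bootstrap confidence interval. It assumes that the statistic is the mean of per-page scores and that the percentile method is used; the README does not state which variant produced the intervals in the table, and the per-page values below are made up for illustration.

```python
import random


def bootstrap_ci(values, repetitions=10000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    means = []
    for _ in range(repetitions):
        # Resample n items with replacement and record the resampled mean.
        sample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    alpha = (1.0 - confidence) / 2.0
    lower = means[int(alpha * repetitions)]
    upper = means[min(repetitions - 1, int((1.0 - alpha) * repetitions))]
    return lower, upper


# Hypothetical per-page precision values, not taken from the experiment.
page_precision = [0.90, 0.80, 0.85, 0.70, 0.95, 0.88, 0.76, 0.91]
low, high = bootstrap_ci(page_precision)
print(f"mean = {sum(page_precision) / len(page_precision):.3f}, "
      f"95% CI = [{low:.3f}, {high:.3f}]")
```

Fixing the seed makes the interval reproducible across runs; with 10000 repetitions the interval endpoints are usually stable to a few decimal places.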

Corpus Notes

Complex Track

Train data

  • number of pages: 270
  • color schema: 60 Gray, 210 sRGB
  • size: 209 different sizes, from 1504x1194 to 7456x6104
  • orientation: both portrait and landscape
  • Baselines:
    • total: 21684
    • average per page: 80.3
    • min: 0
    • max: 472
    • histogram
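Baseline statistics like those above can be gathered by counting `Baseline` elements in each PAGE XML file. The sketch below is one possible way to do this with the Python standard library, not the script used to produce the numbers in this README. It matches on local tag names so that it works regardless of the PAGE schema version; the sample document and the helper names `count_baselines` and `summarize` are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Tiny illustrative PAGE XML document with two text lines (not real data).
SAMPLE = """
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="example.jpg" imageWidth="1504" imageHeight="1194">
    <TextRegion id="r1">
      <TextLine id="l1"><Baseline points="10,20 200,22"/></TextLine>
      <TextLine id="l2"><Baseline points="10,60 200,61"/></TextLine>
    </TextRegion>
  </Page>
</PcGts>
"""


def count_baselines(page_xml: str) -> int:
    """Count Baseline elements in a PAGE XML string, ignoring the namespace."""
    root = ET.fromstring(page_xml)
    return sum(
        1
        for element in root.iter()
        if isinstance(element.tag, str)
        and element.tag.rsplit("}", 1)[-1] == "Baseline"
    )


def summarize(counts):
    """Total, average, min and max of per-page baseline counts."""
    return {
        "total": sum(counts),
        "average": sum(counts) / len(counts),
        "min": min(counts),
        "max": max(counts),
    }


print(count_baselines(SAMPLE))
print(summarize([count_baselines(SAMPLE), 0, 4]))
```

The reported average is simply the total divided by the number of pages: 21684 / 270 ≈ 80.3.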

Test data

  • number of pages: 1010
  • color schema: 163 Gray, 847 sRGB
  • size: 678 different sizes, from 982x3127 to 7472x6088
  • orientation: both portrait and landscape
  • Baselines: blind test