Several datasets were created to evaluate the performance of the different algorithms involved in the project.
A dataset containing more than 40 000 images of text-only documents with print and scan noise:
A dataset containing 960 images of different layouts with print and scan noise:
A dataset containing 990 copies of 55 documents with print and scan noise: