Several datasets were created to evaluate the performance of the different algorithms involved in the project.
A dataset containing more than 40 000 images of text-only documents with print and scan noise: http://navidomass.univ-lr.fr/TextCopies/
A dataset containing 960 images of different layouts with print and scan noise: http://navidomass.univ-lr.fr/layoutcopies/
A dataset containing 990 copies of 55 documents with print and scan noise: http://l3i-share.univ-lr.fr/datasets/DocCopiesWebsite/DocCopiesDataset.html