ImUnipen image data set for writer identification (N=208) - vectorial handwriting converted to usable images
Description
==============
Terms of Usage
==============
The ImUnipen data set is intended for non-commercial, scientific use,
and is distributed under the auspices of the Unipen Foundation.
Please always refer to the following paper in IEEE PAMI when using
the ImUnipen data set:
Bulacu, M.; Schomaker, L.
Text-Independent Writer Identification and Verification
Using Textural and Allographic Features
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Volume 29, Issue 4, April 2007 Page(s):701 - 717
The ImUnipen data set is derived from the Unipen (unipen.org)
data set of on-line (i.e., vectorial, xy) handwriting.
The xy-coordinates and a line-generator algorithm are used
to generate a raster image, as if the data were optically scanned.
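The conversion from pen coordinates to pixels can be illustrated with a small sketch. The description does not say which line generator was used for ImUnipen, so the code below substitutes Bresenham's algorithm as a common choice. The function names, the stroke format (lists of integer (x, y) points) and the one-pixel pen width are all assumptions made for illustration.

```python
def bresenham(x0, y0, x1, y1):
    """Yield the integer pixels on the line from (x0, y0) to (x1, y1)."""
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        yield x0, y0
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:      # step in x
            err += dy
            x0 += sx
        if e2 <= dx:      # step in y
            err += dx
            y0 += sy


def rasterize(strokes, width, height):
    """Draw pen-down strokes as ink (1) on a blank (0) grid.

    Assumed input: each stroke is a list of integer (x, y) points.
    """
    image = [[0] * width for _ in range(height)]
    for stroke in strokes:
        if not stroke:
            continue
        # A single-point stroke is drawn as one dot.
        segments = list(zip(stroke, stroke[1:])) or [(stroke[0], stroke[0])]
        for (ax, ay), (bx, by) in segments:
            for x, y in bresenham(ax, ay, bx, by):
                if 0 <= x < width and 0 <= y < height:
                    image[y][x] = 1
    return image
```

A realistic rendering would also need coordinate scaling and a pen width larger than one pixel; both are left out here.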
Contents: for each of the 208 writers, there are two PNG images, each showing
an artificially constructed table of naturally written words (49 MByte).
These words are pasted onto a white page. For systematic reasons,
we call such a page a Paragraph, see below.
The file names are organized as (example):
Writ990221.Doc01.Par00.png
Writ990221.Doc01.Par01.png
meaning: writer number 990221, document 01 (only Doc01 exists),
and the two images of the artificial "paragraph" of isolated words,
"Par00" and "Par01".
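The naming scheme above can be split into its parts with a short sketch like the following. The pattern is inferred only from the two example names; the names `NAME_RE`, `PageId` and `parse_name` are hypothetical, and the fields are kept as strings so that leading zeros survive.

```python
import re
from typing import NamedTuple

# Pattern inferred from the example file names; digit widths are not assumed.
NAME_RE = re.compile(r"^Writ(\d+)\.Doc(\d+)\.Par(\d+)\.png$")


class PageId(NamedTuple):
    writer: str  # e.g. "990221"
    doc: str     # e.g. "01"
    par: str     # e.g. "00" or "01"


def parse_name(filename: str) -> PageId:
    """Split an ImUnipen-style file name into writer, document and paragraph."""
    m = NAME_RE.match(filename)
    if m is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return PageId(*m.groups())
```

For example, `parse_name("Writ990221.Doc01.Par00.png")` gives writer "990221", document "01" and paragraph "00".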
The Par00 and Par01 images are typically used as the query
and best match in a leave-one-out setting for writer identification.
For instance, Par00 is the query, and Par01 is added to the total set
of all other images as the attractor for an identification search.
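A minimal sketch of such a leave-one-out search follows. It assumes that a feature vector has already been computed for every image; the feature extraction itself, the Euclidean distance and the function names are placeholders chosen for illustration, not the method of the cited paper.

```python
import math


def writer_of(name):
    """Return the writer prefix of a file name, e.g. 'Writ990221'."""
    return name.split(".")[0]


def identify(query_name, features):
    """Return the nearest image to the query among all other images.

    `features` maps file name -> feature vector (hypothetical input).
    The query itself is left out of the candidate set.
    """
    q = features[query_name]
    candidates = (n for n in features if n != query_name)
    return min(candidates, key=lambda n: math.dist(q, features[n]))


def top1_accuracy(features):
    """Fraction of queries whose nearest neighbour is by the same writer."""
    hits = sum(writer_of(identify(n, features)) == writer_of(n)
               for n in features)
    return hits / len(features)
```

Each image serves as the query in turn, and a hit is counted when its nearest neighbour is the other page of the same writer.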
Word labels are deliberately not given in this data set,
as the goal is to test recognition-free writer identification
methods.
For a description of the regular
Unipen data set, please visit http://unipen.org
Lambert Schomaker constructed this set in 2005
Files
Archive size: 44.2 MB
md5:e89e976ee72e49f9842da53f28aa4998
References
- Bulacu, M.; Schomaker, L. Text-Independent Writer Identification and Verification Using Textural and Allographic Features. IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 29, Issue 4, April 2007, pp. 701-717.