Pattern Compressed Files

PCF stands for Pattern Compressed Files. This is a special format developed by Géza Makay using an idea of Árpád Kurusa.

PCF is a compressed format for black and white bitmapped images. It can be used for electronically available documents (see, for example, my publications on my homepage) or for scanned images of printed documents (for example, all previously published issues of a printed journal).

Compression works as follows: we look for repeating parts (patterns) in the bitmap and store each of them only once. Hence, the compression ratio depends on several properties of the document.

The best case is an electronically available document typeset in a single font, at a single magnification and at a low resolution. Electronically available documents at 300 dpi (like my publications on my homepage) can usually be compressed to 3-5 KB/page, and the size of the compressed file grows logarithmically with the resolution. Scanned images of documents can be compressed to 7-15 KB/page at 300 dpi, and the size of the compressed file (usually) grows linearly with the resolution. These figures hold only if the document has at least 10 pages, since repeating patterns are searched for throughout the whole document.
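The idea of storing each repeating pattern only once can be illustrated with a toy sketch. This is not the actual PCF algorithm, whose details are not given here. It assumes fixed-size rectangular tiles and exact matches, and the names compress_pages and decompress_page, as well as the representation of a bitmap as a list of '0'/'1' strings, are hypothetical. It does show why a shared pattern table pays off over a whole document rather than a single page.

```python
from typing import Dict, List, Tuple

Bitmap = List[str]       # assumption: each row is a string of '0'/'1' pixels
Tile = Tuple[str, ...]   # a rectangular block of pixels, stored row by row


def compress_pages(pages: List[Bitmap], tile_w: int, tile_h: int):
    """Cut every page into tiles and store each distinct tile only once."""
    patterns: List[Tile] = []        # pattern table shared by the whole document
    index_of: Dict[Tile, int] = {}   # tile -> its position in the pattern table
    encoded: List[List[List[int]]] = []
    for page in pages:
        rows = []
        for y in range(0, len(page), tile_h):
            row = []
            for x in range(0, len(page[0]), tile_w):
                tile = tuple(line[x:x + tile_w] for line in page[y:y + tile_h])
                if tile not in index_of:          # first time this pattern is seen
                    index_of[tile] = len(patterns)
                    patterns.append(tile)
                row.append(index_of[tile])        # the page keeps only an index
            rows.append(row)
        encoded.append(rows)
    return patterns, encoded


def decompress_page(patterns: List[Tile], rows: List[List[int]]) -> Bitmap:
    """Rebuild one page from its tile indices and the shared pattern table."""
    out: Bitmap = []
    for row in rows:
        tiles = [patterns[i] for i in row]
        for k in range(len(tiles[0])):            # tiles in a row share one height
            out.append("".join(t[k] for t in tiles))
    return out
```

In this sketch a second page that reuses the tiles of the first adds only indices and no new patterns, which matches the observation that the quoted sizes are reached only for documents with enough pages.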

Compression requires about 12 MB plus the storage size of a single bitmap, which comes to about 13 MB for 300 dpi A4 images. On a 133 MHz Pentium machine with Windows NT 3.51 and 16 MB of RAM, 300 dpi A4 scanned images are compressed in about 20-25 seconds/page, while electronically available documents need 5-6 seconds/page.

The viewer program can be driven through OLE Automation: all menu items, tool tips, status messages and title messages can be changed. Since the viewer can also call back the calling program, some menu and toolbar commands can be replaced by custom routines in the calling program, for example all page-changing menu items (Next/Previous Page, First/Last Page, etc.), the Exit command, the Bitmap Save command, etc.
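The figure of about 13 MB quoted above for compressing 300 dpi A4 images can be checked with a short calculation. The sketch below assumes an uncompressed bitmap at 1 bit per pixel with byte-aligned rows and an A4 page of 210 x 297 mm. The function name bitmap_bytes is hypothetical.

```python
import math


def bitmap_bytes(width_mm: float, height_mm: float, dpi: int):
    """Pixel dimensions and raw size of a 1-bit-per-pixel bitmap (assumed layout)."""
    width_px = round(width_mm / 25.4 * dpi)    # 25.4 mm per inch
    height_px = round(height_mm / 25.4 * dpi)
    row_bytes = math.ceil(width_px / 8)        # 8 pixels per byte, rows byte-aligned
    return width_px, height_px, row_bytes * height_px


w, h, size = bitmap_bytes(210, 297, 300)
print(w, h, size, round(size / 2**20, 2))      # dimensions, bytes, size in MB
```

Under these assumptions a single page bitmap takes roughly 1 MB, which together with the fixed 12 MB gives the total of about 13 MB.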

To view or print these files you need a viewer, which is available from our server for Windows 3.x and for Windows 95/NT. The viewer runs on 386 machines with 4 MB of RAM, although a faster processor is recommended. No drawbacks or bugs are known in this format or in its viewer. If you have problems or comments, or need more information, please write me an e-mail.