ZVR Compressed Text File Reader
Introduction
The Psion series 3a pocket organiser is a wonderful machine. I
was particularly impressed when I came across a program by Ewan Paton called vr3a
which allowed text files to be read “vertically” on it,
i.e., with the machine turned through ninety degrees. A Psion 3a
fits neatly into the hand when held this way, and you can read
the text in much the same way as you would use a paper book. Your
thumb on the space bar flips to the next page.
Unfortunately, the main memory on Psion 3a machines is limited: even the most capacious models only allow for 2MB of main memory. Although you can plug in Solid State Disks (SSDs) to extend this, they are quite expensive.
So, it seemed reasonable to extend the original vr3a
program to allow reading of specially prepared compressed text
files, thus getting more text into the same space and allowing me
to carry around more reading material. The resulting program zvr
(along with some documentation and an early compressor in source
form) was made available in April 1995 to a few people who had
expressed an interest. The file has since found its way onto a
number of Psion archive sites.
Some people had problems with compiling the compressor (it
needs a 32-bit flat memory model system: any modern Unix, a DOS
extender or Win32) and there were a couple of bugs to fix. The
latest version of the compressor (now called zvrz) is
now separate from the reader, and I’ve provided executables for
people who don’t want to get into the business of compiling the
compressor themselves, or simply don’t have a suitable compiler.
Download the Reader: ZVR V1.1
To get zvr working on your Psion 3a, follow these steps:
- Download and install Ewan Paton’s original vr3a application from VERT.ZIP. Try it on some plain text files to make sure it is working.
- Download the zvr archive and install zvr itself from ZVR110.ZIP. Try zvr on the included ALADDIN.ZVR file to confirm its operation.
- Discard the compression and decompression program source code provided.
Download the Compressor: ZVRZ V1.2
If you just want to compress some files, and you have a system which I can build executables for, I’ll try and provide an executable of the compressor program here. No documentation is provided: just execute the program with an empty command line and it will tell you all you need to know.
- The Win32 console mode executable (26KB) should work as a command-line application for Windows 95 or Windows NT.
- The DOS4GW executable (45KB)
is the same application compiled for use as a
command-line application for DOS or Windows 3.1. It will
also execute in some other environments. If you don’t
have a copy already, you’ll also need to download the
DOS4GW
DOS extender (140KB).
If you’re interested in the compression program itself, or
have a system I can’t support, you need to pick up the source archive instead. Included are the zvrz.c
main file and a getopt.c
and getopt.h
in case
you don’t have them on your system.
What Can I Read?
The best answer to this question is to find something you want
to read that exists in electronic form and compress it yourself
using the zvrz
program; Project
Gutenberg is a good place to start if your taste is in older
books. A couple of my favourite classics, run through zvrz
and then zipped, are The Time Machine
(65KB) and Dr. Jekyll and Mr. Hyde (50KB).
About the Compression Scheme
It’s one of those “well known facts” that English text has about 1.3 bits of entropy per character: i.e., an 8-bit character in a text file contains perhaps 1.3 bits of information. From this, it’s a simple step to saying that we should be able to compress English text by a factor of 8/1.3, or about 6. My compressor manages to reduce most files by about 50% of their original size: a compression factor of only 2.
This looks pretty sloppy until you look at some other aspects of the problem:
- The decompressor needed to be written in OPL.
- The decompression process needed to be very fast.
- Random seeks within the compressed file needed to be possible, to allow rapid movement within large compressed texts.
Now, all modern compression systems compress the input sequence into a sequence of output compression symbols of different lengths. This means that the more commonly occurring input sequences can be represented by short output symbols (only a few bits in length), with longer output symbols used for less common sequences in the input, a procedure which can greatly enhance compression. Another technique used is to accumulate a buffer of text into which the compressed text can refer for symbol definitions.
These modern compression techniques often achieve compression
factors up to 4:1 on English text, but they were unsuitable for
use in zvr:
- OPL has no facilities for fast manipulation of variable-length bit-fields.
- Large compression buffers are inappropriate on limited-memory machines.
- Most importantly, all of these techniques introduce state into the compressed text, which means that it is extremely hard to quickly reposition within it.
The compression scheme understood by zvr
is,
therefore, well behind the state of the art: the compressed file
starts with a dictionary of the 256 possible symbols which might
appear in the compressed file. The dictionary simply contains a
fixed replacement string for each of these symbols. Decompression
is therefore very fast even in OPL, and can start at any point in
the compressed file.
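As a rough illustration of why this is fast and seekable, here is a minimal Python sketch of dictionary expansion. It is not the OPL code from zvr, and it does not model the on-disk layout of a compressed file: the dictionary is simply assumed to be an in-memory list of 256 byte strings, and the names `decompress` and `decompress_from` are made up for this example.

```python
def decompress(dictionary, data):
    """Expand each compressed byte into its fixed replacement string."""
    return b"".join(dictionary[symbol] for symbol in data)


def decompress_from(dictionary, data, offset, count):
    """Expand only `count` compressed bytes starting at `offset`.

    No earlier part of the data is needed, because each symbol's
    expansion depends only on the dictionary.
    """
    return decompress(dictionary, data[offset:offset + count])


# Toy dictionary (an assumption for the example): every byte stands
# for itself, except two symbols that expand to longer strings.
dictionary = [bytes([i]) for i in range(256)]
dictionary[0x80] = b"th"
dictionary[0x81] = b"the "

compressed = bytes([0x81]) + b"cat sat"

print(decompress(dictionary, compressed))             # whole text
print(decompress_from(dictionary, compressed, 1, 3))  # starts mid-file
```

Because every compressed byte expands independently of its neighbours, a reader can jump to any byte offset and start decoding there, which is the property the stateful schemes above lack.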
The compression algorithm used by zvrz
is simply to
repeatedly scan the uncompressed file for the most common symbol
pair, and define a compression symbol to represent that pair. The
result of replacing that pair whenever it appears is then used as
data for the next pass. This process repeats until no more
compression can take place because no more unused compression
symbols are available.
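The pair-replacement loop described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code in zvrz.c: the function name `compress` is hypothetical, pair counts include overlapping occurrences, ties are broken arbitrarily, and the loop also stops early when no pair occurs at least twice (the text only mentions running out of symbols).

```python
from collections import Counter


def compress(text, symbol_count=256):
    """Repeatedly replace the most common adjacent symbol pair."""
    # Byte values that occur in the input keep standing for themselves.
    dictionary = {b: bytes([b]) for b in set(text)}
    # Every other value is free to be defined as a compression symbol.
    unused = [s for s in range(symbol_count) if s not in dictionary]
    data = list(text)

    while unused:
        pairs = Counter(zip(data, data[1:]))
        if not pairs:
            break
        (first, second), freq = pairs.most_common(1)[0]
        if freq < 2:
            break  # assumption: a pair seen once is not worth a symbol
        new_symbol = unused.pop(0)
        dictionary[new_symbol] = dictionary[first] + dictionary[second]

        # Replace the pair left to right; the result feeds the next pass.
        out = []
        i = 0
        while i < len(data):
            if i + 1 < len(data) and data[i] == first and data[i + 1] == second:
                out.append(new_symbol)
                i += 2
            else:
                out.append(data[i])
                i += 1
        data = out

    return dictionary, bytes(data)


text = b"the cat sat on the mat, the cat sat on the hat"
dictionary, packed = compress(text)
restored = b"".join(dictionary[symbol] for symbol in packed)
print(len(text), len(packed), restored == text)
```

Since a new symbol can be built from symbols defined in earlier passes, dictionary entries grow beyond two characters as the passes continue, so frequent words and phrases end up as single bytes.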
.TCR Files and the Reader Application
At the time I built the zvr
program and its
compressor, several people had vertical readers; my hope was that
one of them might incorporate my ideas into their product and
then I could just use that and not bother with zvr
myself any more.
This eventually happened: after some discussion with me, Barry Childress took my primitive effort
apart and built a much better (but incompatible) variant into his
READER
program: files with a .TCR
extension are
for use with READER. Barry’s application is so much
better than zvr
that it’s the application I use myself
now, and as a result zvr
is unlikely ever to be changed
(by me) again. Steve
Litchfield also reviewed
READER
very positively, and a copy of the reader application is linked
to this page.
[Update 2018-01-17: the links in the above paragraph use the
Internet Archive Wayback Machine,
as the 3lib.ukonline.co.uk
site is now defunct.]
The compressed file reading part of READER
is only
available in the registered version; contact
Barry for details.