This is the README for bzip2, a block-sorting file compressor, version
1.0.3. This version is fully compatible with the previous public
releases, versions 0.1pl2, 0.9.0, 0.9.5, 1.0.0, 1.0.1 and 1.0.2.
bzip2-1.0.3 is distributed under a BSD-style license. For details,
see the file LICENSE.
Complete documentation is available in PostScript form (manual.ps),
PDF (manual.pdf) or HTML (manual.html). A plain-text version of the
manual page is available as bzip2.txt. A statement about Y2K issues
is now included in the file Y2K_INFO.
HOW TO BUILD -- UNIX
Type 'make'. This builds the library libbz2.a and then the
programs bzip2 and bzip2recover. Six self-tests are run.
If the self-tests complete OK, carry on to installation:
To install in /usr/bin, /usr/lib, /usr/man and /usr/include, type
   make install
To install somewhere else, e.g., /xxx/yyy/{bin,lib,man,include}, type
   make install PREFIX=/xxx/yyy
If you are (justifiably) paranoid and want to see what 'make install'
is going to do, you can first do
   make -n install
or
   make -n install PREFIX=/xxx/yyy
respectively.
The -n instructs make to show the commands it would execute, but
not actually execute them.
HOW TO BUILD -- UNIX, shared library libbz2.so.
Do 'make -f Makefile-libbz2_so'. This Makefile seems to work for
Linux-ELF (RedHat 7.2 on an x86 box), with gcc. I make no claims
that it works for any other platform, though I suspect it probably
will work for most platforms employing both ELF and gcc.
bzip2-shared, a client of the shared library, is also built, but not
self-tested. So I suggest you also build using the normal Makefile,
since that conducts a self-test. A second reason to prefer the
version statically linked to the library is that, on x86 platforms,
building shared objects makes a valuable register (%ebx) unavailable
to gcc, resulting in a slowdown of 10%-20%, at least for bzip2.
Important note for people upgrading .so's from 0.9.0/0.9.5 to version
1.0.X. All the functions in the library have been renamed, from (e.g.)
bzCompress to BZ2_bzCompress, to avoid namespace pollution.
Unfortunately this means that the libbz2.so created by
Makefile-libbz2_so will not work with any program which used an older
version of the library. Sorry. I do encourage library clients to
make the effort to upgrade to use version 1.0, since it is both faster
and more robust than previous versions.
HOW TO BUILD -- Windows 95, NT, DOS, Mac, etc.
It's difficult for me to support compilation on all these platforms.
My approach is to collect binaries for these platforms, and put them
on the master web page (http://sources.redhat.com/bzip2). Look there.
However (FWIW), bzip2-1.0.X is very standard ANSI C and should compile
unmodified with MS Visual C. If you have difficulties building, you
might want to read README.COMPILATION.PROBLEMS.
At least using MS Visual C++ 6, you can build from the unmodified
sources by issuing, in a command shell:
   nmake -f makefile.msc
(you may need to first run the MSVC-provided script VCVARS32.BAT
so as to set up paths to the MSVC tools correctly).
VALIDATION
Correct operation, in the sense that a compressed file can always be
decompressed to reproduce the original, is obviously of paramount
importance. To validate bzip2, I used a modified version of Mark
Nelson's churn program. Churn is an automated test driver which
recursively traverses a directory structure, using bzip2 to compress
and then decompress each file it encounters, and checking that the
decompressed data is the same as the original.
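
The round-trip check described above can be sketched in a few lines of
Python. This is only an illustration of the idea, not Mark Nelson's churn
program or the modified version used to validate bzip2. The function name
churn_check is hypothetical, and the sketch assumes that Python's standard
bz2 module (a wrapper around libbzip2) is an acceptable stand-in for the
bzip2 program. It also reads each file fully into memory.

```python
import bz2
import os


def churn_check(root):
    """Round-trip every regular file under `root` through bz2.

    Hypothetical helper: returns (files_checked, failed_paths), where
    failed_paths lists files whose decompressed data differed from the
    original.
    """
    checked = 0
    failed_paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as fh:
                original = fh.read()
            # Compress, then decompress, entirely in memory.
            restored = bz2.decompress(bz2.compress(original))
            checked += 1
            if restored != original:
                failed_paths.append(path)
    return checked, failed_paths
```

Calling churn_check("/some/dir") on a directory tree should return an
empty failure list if every file survives the round trip.
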
Please read and be aware of the following:
WARNING:
This program (attempts to) compress data by performing several
non-trivial transformations on it. Unless you are 100% familiar
with *all* the algorithms contained herein, and with the
consequences of modifying them, you should NOT meddle with the
compression or decompression machinery. Incorrect changes can and
very likely *will* lead to disastrous loss of data.
DISCLAIMER:
I TAKE NO RESPONSIBILITY FOR ANY LOSS OF DATA ARISING FROM THE
USE OF THIS PROGRAM, HOWSOEVER CAUSED.
Every compression of a file implies an assumption that the
compressed file can be decompressed to reproduce the original.
Great efforts in design, coding and testing have been made to
ensure that this program works correctly. However, the complexity
of the algorithms, and, in particular, the presence of various
special cases in the code which occur with very low but non-zero
probability make it impossible to rule out the possibility of bugs
remaining in the program. DO NOT COMPRESS ANY DATA WITH THIS
PROGRAM UNLESS YOU ARE PREPARED TO ACCEPT THE POSSIBILITY, HOWEVER
SMALL, THAT THE DATA WILL NOT BE RECOVERABLE.
That is not to say this program is inherently unreliable. Indeed,
I very much hope the opposite is true. bzip2 has been carefully
constructed and extensively tested.
PATENTS:
To the best of my knowledge, bzip2 does not use any patented
algorithms. However, I do not have the resources to carry out
a patent search. Therefore I cannot give any guarantee of the
above statement.
End of legalities.
WHAT'S NEW IN 0.9.0 (as compared to 0.1pl2) ?
* Approx 10% faster compression, 30% faster decompression
* -t (test mode) is a lot quicker
* Can decompress concatenated compressed files
* Programming interface, so programs can directly read/write .bz2 files
* Less restrictive (BSD-style) licensing
* Flag handling more compatible with GNU gzip
* Much more documentation, i.e., a proper user manual
* Hopefully, improved portability (at least of the library)
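
As an illustration of the concatenated-files item above, here is a small
Python sketch. It assumes Python 3.3 or later, whose standard bz2 module
(built on libbzip2) accepts multi-stream input. The helper name
join_and_decompress is hypothetical. The sketch shows the library-level
behaviour only, not the bzip2 command-line program itself.

```python
import bz2


def join_and_decompress(chunks):
    """Compress each chunk separately, concatenate the compressed
    streams byte for byte, and decompress the result in one call."""
    # Equivalent in spirit to: cat a.bz2 b.bz2 > both.bz2
    combined = b"".join(bz2.compress(chunk) for chunk in chunks)
    # bz2.decompress handles several back-to-back streams and returns
    # their decompressed contents joined in order.
    return bz2.decompress(combined)


restored = join_and_decompress([b"first file\n", b"second file\n"])
```

Here restored holds the two original chunks joined in order.
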
WHAT'S NEW IN 0.9.5 ?
* Compression speed is much less sensitive to the input
data than in previous versions. Specifically, the very
slow performance caused by repetitive data is fixed.
* Many small improvements in file and flag handling.
* A Y2K statement.
WHAT'S NEW IN 1.0.0 ?
See the CHANGES file.
WHAT'S NEW IN 1.0.2 ?
See the CHANGES file.
WHAT'S NEW IN 1.0.3 ?
See the CHANGES file.
I hope you find bzip2 useful. Feel free to contact me at
jseward@bzip.org
if you have any suggestions or queries. Many people mailed me with
comments, suggestions and patches after the releases of bzip-0.15,
bzip-0.21, and bzip2 versions 0.1pl2, 0.9.0, 0.9.5, 1.0.0, 1.0.1 and
1.0.2, and the changes in bzip2 are largely a result of this feedback.
I thank you for your comments.
At least for the time being, bzip2's "home" is (or can be reached via)
http://www.bzip.org
Julian Seward
jseward@bzip.org
Cambridge, UK.
18 July 1996 (version 0.15)
25 August 1996 (version 0.21)
7 August 1997 (bzip2, version 0.1)
29 August 1997 (bzip2, version 0.1pl2)
23 August 1998 (bzip2, version 0.9.0)
8 June 1999 (bzip2, version 0.9.5)
4 Sept 1999 (bzip2, version 0.9.5d)
5 May 2000 (bzip2, version 1.0pre8)
30 December 2001 (bzip2, version 1.0.2pre1)
15 February 2005 (bzip2, version 1.0.3)