Lzip

Introduction

Lzip is a lossless data compressor with a user interface similar to that of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov chain-Algorithm' (LZMA) stream format to maximize interoperability. The maximum dictionary size is 512 MiB so that any lzip file can be decompressed on 32-bit machines. Lzip provides accurate and robust 3-factor integrity checking. Lzip can compress about as fast as gzip (lzip -0) or compress most files more than bzip2 (lzip -9). Decompression speed is intermediate between gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery perspective. Lzip has been designed, written, and tested with great care to replace gzip and bzip2 as the standard general-purpose compressed format for Unix-like systems.

For compressing/decompressing large files on multiprocessor machines, plzip can be much faster than lzip at the cost of a slightly reduced compression ratio.

For creation and manipulation of compressed tar archives, tarlz can be more efficient than using tar and plzip because tarlz is able to keep the alignment between tar members and lzip members.

The lzip file format is designed for data sharing and long-term archiving, taking into account both data integrity and decoder availability:

The lzip format provides very safe integrity checking and some data recovery means. The program lziprecover can repair bit flip errors (one of the most common forms of data corruption) in lzip files, and provides data recovery capabilities, including error-checked merging of damaged copies of a file.
The lzip format is as simple as possible (but not simpler). The lzip manual provides the source code of a simple decompressor along with a detailed explanation of how it works, so that with the help of the lzip manual alone it would be possible for a digital archaeologist to extract the data from an lzip file long after quantum computers eventually render LZMA obsolete.
Additionally the lzip reference implementation is copylefted, which guarantees that it will remain free forever.

A nice feature of the lzip format is that a corrupt byte is easier to repair the nearer it is to the beginning of the file. Therefore, with the help of lziprecover, losing an entire archive just because of a corrupt byte near the beginning is a thing of the past.

Lzip uses the same well-defined exit status values used by bzip2, which makes it safer than compressors returning ambiguous warning values (like gzip) when it is used as a back end for other programs like tar or zutils.
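
As a hedged illustration of how a calling script might act on these exit statuses, the sketch below maps each value to its meaning as listed in the lzip manual (0 for a normal exit, 1 for environmental problems, 2 for a corrupt or invalid input file, 3 for an internal consistency error). The function name describe_lzip_status is hypothetical and is not part of lzip.

```shell
#!/bin/sh
# Hypothetical helper (not part of lzip): translate an lzip exit status
# into a short message, following the values listed in the lzip manual.
describe_lzip_status() {
  case "$1" in
    0) echo "ok" ;;
    1) echo "environmental problem (file not found, invalid option, I/O error)" ;;
    2) echo "corrupt or invalid input file" ;;
    3) echo "internal consistency error (bug)" ;;
    *) echo "unexpected status $1" ;;
  esac
}

# Usage sketch: test an archive, then report the outcome.
#   lzip -t foo.tar.lz; describe_lzip_status $?
```

Because each status has a single meaning, a back end such as this one can distinguish a damaged file (2) from a missing one (1) without parsing error messages.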

Introductory links

Benchmark - Some tests showing the ability of lzip to replace gzip and bzip2 as the general-purpose compressor for Unix-like systems from a performance point of view.

Quality assurance - Design, development, and testing of lzip.

Safety of the lzip format - This article measures the safety of lzip's integrity checking and explains why lzip achieves a high accuracy in the detection of errors.

Lzip Compressed Format and the application/lzip Media Type - Internet-Draft at the IETF site.

The lzip format (slides) - Talk given at the GNU Hackers Meeting 2019.

Xz format inadequate for long-term archiving - This article describes the reasons why you should switch to lzip if you are using xz for anything other than compressing short-lived executables.

Other features

Lzip automatically uses for each file the largest dictionary size that exceeds neither the file size nor the limit given. Keep in mind that the decompression memory requirement is affected at compression time by the choice of dictionary size limit.
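
The selection rule can be sketched as simple arithmetic. This is a simplified reading of the rule, not lzip's actual code: the function name effective_dict_size is hypothetical, sizes are in bytes, and the 4 KiB floor is the minimum dictionary size of the lzip format.

```shell
#!/bin/sh
# Sketch of the selection rule: the dictionary is capped by both the
# limit and the file size, but never drops below the 4 KiB format minimum.
# effective_dict_size is a hypothetical name used only for illustration.
effective_dict_size() {
  file_size=$1
  limit=$2
  min_size=4096                      # smallest dictionary the format allows
  size=$limit
  if [ "$file_size" -lt "$size" ]; then size=$file_size; fi
  if [ "$size" -lt "$min_size" ]; then size=$min_size; fi
  echo "$size"
}

effective_dict_size 100000 8388608    # file smaller than an 8 MiB limit
effective_dict_size 50000000 8388608  # file larger than the limit
```

The first call prints 100000 and the second prints 8388608, so a small file never forces the decompressor to allocate a dictionary larger than the data itself.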

When compressing, lzip replaces every file given in the command line with a compressed version of itself, with the name "original_name.lz".

(De)compressing a file is much like copying or moving it; therefore lzip preserves the access and modification dates, permissions, and, if you have appropriate privileges, ownership of the file just as 'cp -p' does. (If the user ID or the group ID can't be duplicated, the file permission bits S_ISUID and S_ISGID are cleared.)

Lzip is able to read from some types of non-regular files if either the option '-c' or the option '-o' is specified.

If no file names are specified, lzip compresses (or decompresses) from standard input to standard output. Lzip refuses to read compressed data from a terminal or write compressed data to a terminal, as this would be entirely incomprehensible and might leave the terminal in an abnormal state.

Lzip correctly decompresses a file which is the concatenation of two or more compressed files. The result is the concatenation of the corresponding decompressed files. Integrity testing of concatenated compressed files is also supported.
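
The behavior described above can be checked with a few commands. The sketch below assumes lzip is installed and uses the options '-k' (keep input files), '-t' (test), '-c' (write to standard output), and '-d' (decompress) from the lzip manual. The function name concat_demo and the temporary file names are arbitrary.

```shell
#!/bin/sh
# Sketch: verify that concatenated lzip files decompress to the
# concatenation of the original data. Requires lzip in PATH.
concat_demo() {
  dir=$(mktemp -d) || return 1
  printf 'first part\n'  > "$dir/a"
  printf 'second part\n' > "$dir/b"
  lzip -k "$dir/a" "$dir/b"                   # -k keeps the input files
  cat "$dir/a.lz" "$dir/b.lz" > "$dir/ab.lz"  # plain byte concatenation
  lzip -t "$dir/ab.lz" &&                     # integrity test of the combined file
    lzip -cd "$dir/ab.lz"                     # prints both parts in order
  status=$?
  rm -rf "$dir"
  return $status
}

# Run the demonstration only when lzip is available.
if command -v lzip >/dev/null 2>&1; then concat_demo; fi
```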

Lzip can produce multimember files, and lziprecover can safely recover the undamaged members in case of file damage. Lzip can also split the compressed output in volumes of a given size, even when reading from standard input. This allows the direct creation of multivolume compressed tar archives.
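
A multivolume compressed tar archive might be created and extracted as sketched below. The sketch assumes tar and lzip are installed, and uses the options '-S' (volume size) and '-o' (output name, used as a prefix for the numbered volumes) as described in the lzip manual; check them against your installed version. The function names and the example arguments are placeholders.

```shell
#!/bin/sh
# Sketch: create and extract a multivolume compressed tar archive.
# Requires tar and lzip in PATH. Function names are hypothetical.

# make_volumes SRC_DIR PREFIX SIZE
# Writes numbered volumes whose names start with PREFIX and end in ".lz".
make_volumes() {
  tar -cf - "$1" | lzip -S "$3" -o "$2" -
}

# extract_volumes PREFIX DEST_DIR
# Decompresses all volumes in name order and unpacks them into DEST_DIR.
extract_volumes() {
  lzip -cd "$1"*.lz | tar -xf - -C "$2"
}

# Usage sketch (names are placeholders):
#   make_volumes some_directory volume_name 1440KiB
#   extract_volumes volume_name /tmp/restore
```

Since decompressing the volumes in order yields the original tar stream, no separate joining step is needed before extraction.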

Lzip is able to compress and decompress streams of unlimited size by automatically creating multimember output. The members so created are large, about 2 PiB each.

In spite of its name (Lempel-Ziv-Markov chain-Algorithm), LZMA is not a concrete algorithm; it is more like "any algorithm using the LZMA coding scheme". For example, the option '-0' of lzip uses the scheme in almost the simplest way possible: it issues the longest match it can find, or a literal byte if it can't find a match. Conversely, a much more elaborate way of finding coding sequences of minimum size than the one currently used by lzip could be developed, and the resulting sequence could also be coded using the LZMA coding scheme.

Lzip currently implements two variants of the LZMA algorithm: fast (used by option '-0') and normal (used by all other compression levels).

The high compression of LZMA comes from combining two basic, well-proven compression ideas: sliding dictionaries (LZ77) and Markov models (the thing used by every compression algorithm that uses a range encoder or similar order-0 entropy coder as its last stage) with segregation of contexts according to what the bits are used for.

The ideas embodied in lzip are due to (at least) the following people: Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the definition of Markov chains), G.N.N. Martin (for the definition of range encoding), Igor Pavlov (for putting all the above together in LZMA), and Julian Seward (for bzip2's CLI).

Related projects

Plzip - A multi-threaded compressor using the lzip file format.

Tarlz - An archiver with multi-threaded lzip (de)compression.

Lzlib - A compression library for the lzip file format, written in C.

Lziprecover - A data recovery tool and decompressor for the lzip format.

Zutils - Replacement for zcat, zdiff, zgrep, etc., that understands lzip, bzip2, and gzip formats.

Lunzip - A decompressor for lzip files, written in C.

Clzip - A C implementation of lzip for systems lacking a C++ compiler.

Pdlzip - A limited, "public domain" C implementation of the lzip data compressor, intended for those who can't distribute GPL licensed Free Software. Pdlzip is also able to decompress legacy lzma-alone (.lzma) files.

Lzd - An educational decompressor for the lzip format.

Xlunzip - A test tool for the lzip_decompress Linux module.

Documentation

The manual is available in the info system of the GNU Operating System. Use 'info' to access the top level info page. Use 'info lzip' to access the lzip section directly.

An online manual for lzip can be found at manual/lzip_manual.html.

Download

The latest released version of lzip can be found at http://download.savannah.gnu.org/releases/lzip/. You may also subscribe to lzip-bug and receive an email every time a new version is released.

A Windows32 port of lzip can be downloaded from the Savannah download link just above. More ports of lzip for Windows can be found in the Links section below. A Windows port (32 and 64 bits) of plzip can be downloaded from the plzip page above.

You may compile and optionally install lzip by running the following commands:

tar -xf lzip[version].tar.gz
cd lzip[version] && ./configure && make check

then (as root) type:

make install

Once lzip is installed, the files from archive "foo.tar.lz" can be extracted using the commands "tar -xf foo.tar.lz" or "lzip -cd foo.tar.lz | tar -xf -".

How to get help

For general discussion of errors (bugs) in lzip, the mailing list lzip-bug@nongnu.org is the most appropriate forum. Please send messages as plain text. Please do not send messages encoded as HTML or base64, or in multiple formats. Please include a descriptive subject line with the word "lzip" in it.

An archive of the bug report mailing list is available at http://lists.gnu.org/mailman/listinfo/lzip-bug.

How to help

To contact the author, either to report an error (bug) or to contribute fixes or improvements, send mail to lzip-bug@nongnu.org. Please send messages as plain text. Patches should be in unified diff format against the latest version and should include a text description.

You may also help lzip by donating money via PayPal or debit/credit card.

See also the lzip project page at Savannah.

Links

7-Zip ZStandard Edition - A version of 7-Zip with lzip decompression support built in.

Atool, Patool - Command line archive managers that understand lzip files.

GNU Automake - A Makefile generator able to create lzip-compressed tarballs.

Dragora GNU/Linux - A GNU/Linux distribution using lzip in its package system.

File Roller - An archive manager for GNOME that understands lzip files.

Lesspipe.sh - View the contents of lzipped files with the pager less.

Libarchive - Multi-format archive and compression library with lzip support.

Littleutils - Convert your files to lzip format.

Man-db - An implementation of the Unix man command able to read lzipped pages.

Midnight Commander - A visual file manager that understands lzip files.

RPM - RPM Package Manager that uses lzip to compress packages.

GNU Tar - Automatically create and extract lzip-compressed tar archives.

GNU Texinfo - The GNU Documentation System understands lzip-compressed manuals.

Z - A simple, safe, and convenient front-end for lzip, bzip2, and gzip.

Ports

Download lzip for AIX, ALT Linux, Amiga (Aminet), Arch Linux, DOS, Debian, Fedora, FreeBSD, Gentoo, HP-UX, Mac (fink), NetBSD, NixOS, Slackware, Solaris (OpenCSW), Windows (Cygwin), Windows (ezwinports).

Bindings (Interfaces to languages other than C/C++)

Common Lisp, Haskell.

Licensing

Lzip is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version.

You are free to copy, modify, and distribute all or part of this article without limitation.

Updated: 2024-03-15
