"Xz format inadequate for long-term archiving"

lgillis
Posts: 137
Joined: Mon May 09, 2022 8:40 am

Post by lgillis »

Xz is found everywhere in the Linux world; even the kernel source is compressed with it. I would like to briefly point to the following article. I cannot judge its claims myself, but I note that I have not seen any rebuttal from the other side (xz).

Xz format inadequate for long-term archiving:
Abstract

One of the challenges of digital preservation is the evaluation of data formats. It is important to choose well-designed data formats for long-term archiving. This article describes the reasons why the xz compressed data format is inadequate for long-term archiving and inadvisable for data sharing and for free software distribution. The relevant weaknesses and design errors in the xz format are analyzed and, where applicable, compared with the corresponding behavior of the bzip2, gzip and lzip formats. Key findings include: (1) safe interoperability among xz implementations is not guaranteed; (2) xz's extensibility is unreasonable and problematic; (3) xz is vulnerable to unprotected flags and length fields; (4) LZMA2 is unsafe and less efficient than the original LZMA; (5) xz includes useless features that increase the number of false positives for corruption; (6) xz shows inconsistent behavior with respect to trailing data; (7) error detection in xz is several times less accurate than in bzip2, gzip and lzip.

Disclosure statement: The author of the article is also the author of the lzip format.

Re: "Xz format inadequate for long-term archiving"

Post by lgillis »

Good that you ask. I don't use xz or lzip for my backups. The time and processor load that both programs need to save a few bytes is disproportionate. Most file formats are already compressed internally, and the rest is compressed "transparently", in much the same way that more modern file systems such as ZFS, Btrfs and bcachefs can do internally.
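
If you want to check the time-versus-size trade-off on your own data, here is a minimal sketch using only Python's standard library. The function name `compare_codecs` and the sample data are made up for illustration, lzip is left out because the standard library has no module for it, and the numbers will depend heavily on the input and the machine.

```python
import bz2
import gzip
import lzma
import time


def compare_codecs(data: bytes) -> dict:
    """Return {codec name: (compressed size in bytes, seconds taken)}.

    Hypothetical helper for illustration; uses each module's default settings.
    """
    codecs = {
        "gzip": gzip.compress,
        "bzip2": bz2.compress,
        "xz": lzma.compress,  # lzma.compress writes the .xz container by default
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        results[name] = (len(packed), elapsed)
    return results


if __name__ == "__main__":
    # Assumed sample: repetitive text. Replace it with bytes read from a
    # real backup file to get numbers that mean something for you.
    sample = b"some fairly repetitive log line\n" * 50_000
    print(f"input  {len(sample):>10d} bytes")
    for name, (size, secs) in compare_codecs(sample).items():
        print(f"{name:6s} {size:>10d} bytes  {secs:.3f} s")
```

Feeding it data that is already compressed (JPEG, video, most office documents) should show little or no size reduction for any of the three, which is the point made above.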