Opened 8 years ago

Last modified 3 months ago

#52000 new enhancement

Add xz to "base" to allow changing the default compression format to .tar.xz in the future

Reported by: mojca (Mojca Miklavec) Owned by: macports-tickets@…
Priority: Normal Milestone:
Component: base Version: 2.3.4
Keywords: Cc: ryandesign (Ryan Carsten Schmidt), neverpanic (Clemens Lang), raimue (Rainer Müller), larryv (Lawrence Velázquez), ci42, cooljeanius (Eric Gallager)
Port:

Description

It would be nice to add xz to base (in a similar way as it is done for Tcl) to be able to switch to .tar.xz in the future and to save space.

(I don't want to open the can of worms discussing how to do the transition from .tar.bz2 to .tar.xz, but merely to provide the prerequisite to be able to do so when we are ready.)

xz and xzdec together need less than 300k in total (and perhaps a tiny bit more if built universally).

(We could also build them just on 10.6 for x86_64 and 10.5 for i386/ppc and use the same binary everywhere.)

Somewhat related tickets:

  • #33037
    background archiving
  • #50448
    Change filenames of binary packages built against libc++ on < 10.9

Change History (17)

comment:1 Changed 8 years ago by ryandesign (Ryan Carsten Schmidt)

Yes, this is interesting. I also wanted to investigate whether we could use tar/libarchive to do xz compression since it's already included in OS X 10.9 and later. Or we could include and build the libarchive source code. Its BSD license might be preferable to xz's legally ambiguous "public domain".

comment:2 Changed 8 years ago by larryv (Lawrence Velázquez)

Cc: larryv@… added

Cc Me!

comment:3 Changed 8 years ago by ryandesign (Ryan Carsten Schmidt)

See #47255 for some previous discussion.

comment:4 Changed 8 years ago by mojca (Mojca Miklavec)

Thank you for the link. I was searching for tickets in base and didn't find that one previously. I would call it a duplicate – at least the desired goal is the same. But bundling xz with base would be "easier to handle" than the chicken-and-egg problem with the xz port itself.

comment:5 Changed 8 years ago by raimue (Rainer Müller)

Has duplicate #52797.

comment:6 in reply to:  1 Changed 8 years ago by RJVB (René Bertin)

Replying to ryandesign:

Its BSD license might be preferable to xz's legally ambiguous "public domain".

Does that really make a difference? IIUC libarchive uses liblzma for supporting xz compression, so unless the library is the part of port:xz that's GPL'ed, using it doesn't really make a difference. Also, I don't know about the rest of the world (who really does...), but "public domain" is not an ambiguous term here. When a work falls into it, that means it is no longer under any form of copyright, nor anyone's property, and freely usable by everyone.

comment:7 Changed 8 years ago by RJVB (René Bertin)

xz and xzdec together need less than 300k in total (and perhaps a tiny bit more if built universally).

That can be even less if you figure out how best to leverage hfsCompression (or the new FS's transparent compression if indeed it turns out to exist). Also, I don't think there's a need for xzdec; "base" uses the -d option to uncompress using the *z family of compressors.

For the rest: I've been using .txz software archives for close to 2 years now, ever since I discovered the support for it in macports.conf. The only issue used to be updating one of port:xz's dependencies or the port itself, but that's over now that I build a static, persistent copy.

As to the conversion: locally it seems to work just fine to do an archivefetch, convert the .tbz2 in incoming/verified into a .txz, and then do the install.
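The recompression step mentioned above can be sketched in a few lines. This is only an illustration in Python using the standard library, not what MacPorts base (which is written in Tcl) or the commenter actually ran; the function name and file paths are hypothetical. The tar payload is streamed through unchanged and only the outer compression is swapped.

```python
import bz2
import lzma
import shutil


def recompress_tbz2_to_txz(src_path, dst_path, chunk_size=1024 * 1024):
    """Stream a bzip2-compressed tarball into an xz-compressed one.

    Hypothetical helper: the tar data itself is copied byte for byte,
    only the compression wrapper changes from bzip2 to xz.
    """
    with bz2.open(src_path, "rb") as src, lzma.open(dst_path, "wb") as dst:
        # Copy in chunks so large archives never have to fit in memory.
        shutil.copyfileobj(src, dst, chunk_size)
```

Because the copy is streamed, memory use stays bounded by `chunk_size` regardless of archive size.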

comment:8 Changed 8 years ago by ryandesign (Ryan Carsten Schmidt)

My understanding of United States law is that "public domain" is more ambiguous than a clear license like BSD which confers specific rights.

You're right that libarchive links with liblzma so libarchive's "better" license may not be relevant.

Recent versions of OS X provide a copy of liblzma and libarchive but there doesn't seem to be an OS copy of the xz executable. MacPorts currently handles decompression by running a separate executable. OS X's tar command links with libarchive so it should understand xz; we would just have to change MacPorts to decompress and extract the tarball as a single operation instead of two operations as it is currently.
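To make the "one operation instead of two" distinction concrete, here is a minimal sketch in Python with the standard library. It is an analogy only: MacPorts base is written in Tcl and invokes external tools, and the function names and the intermediate .tar file below are assumptions made for illustration.

```python
import lzma
import shutil
import tarfile


def extract_two_steps(archive_path, tar_path, dest_dir):
    """Decompress with a separate step, then extract the plain tarball."""
    # Step 1: a dedicated decompressor turns .tar.xz into .tar.
    with lzma.open(archive_path, "rb") as src, open(tar_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # Step 2: the tar layer only ever sees uncompressed data.
    with tarfile.open(tar_path, "r:") as tar:
        tar.extractall(dest_dir)


def extract_single_step(archive_path, dest_dir):
    """Let the tar layer decompress xz itself while extracting."""
    with tarfile.open(archive_path, "r:xz") as tar:
        tar.extractall(dest_dir)
```

The single-step form is the analogue of relying on a tar that understands xz through its library, so no standalone xz executable is required.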

This ticket is not about hfsCompression; that's a separate matter.

comment:9 in reply to:  8 Changed 8 years ago by neverpanic (Clemens Lang)

Replying to ryandesign:

My understanding of United States law is that "public domain" is more ambiguous than a clear license like BSD which confers specific rights.

Some jurisdictions do not have the concept of "public domain" (e.g. consider Germany, where there is no legal method to give up the equivalent of your copyright, even if you want to), and from a corporate background I know that not all "public domain" licenses are created equal. It probably doesn't matter much for us as an open source project, though.

comment:10 Changed 8 years ago by ci42

Cc: ci42 added

comment:11 Changed 7 years ago by yan12125 (Chih-Hsuan Yen)

Cc: yan12125 added

comment:12 Changed 7 years ago by yan12125 (Chih-Hsuan Yen)

Recent versions of OS X provide a copy of liblzma and libarchive but there doesn't seem to be an OS copy of the xz executable. MacPorts currently handles decompression by running a separate executable. OS X's tar command links with libarchive so it should understand xz; we would just have to change MacPorts to decompress and extract the tarball as a single operation instead of two operations as it is currently.

I just checked, and it seems lzma support in libarchive was added in 10.9. libarchive-29, which was introduced in 10.9, has HAVE_LZMA_H in its config.h [2], while this is not the case for libarchive-25.1 in 10.8 [1]. If MacPorts aims to support xz on all OS X versions, bundling xz is simpler.
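The config.h comparison can be repeated mechanically. The sketch below is a hypothetical helper in Python that looks for an active `#define` of a macro in header text; the two sample excerpts are made up for illustration and are not copied from Apple's sources.

```python
import re


def defines_macro(config_h_text, macro):
    """Return True if the header text contains an active '#define <macro>'."""
    pattern = re.compile(
        r"^\s*#\s*define\s+" + re.escape(macro) + r"\b", re.MULTILINE
    )
    return bool(pattern.search(config_h_text))


# Hypothetical excerpts: one header with lzma support configured, one without.
with_lzma = "#define HAVE_LIMITS_H 1\n#define HAVE_LZMA_H 1\n"
without_lzma = "#define HAVE_LIMITS_H 1\n/* #undef HAVE_LZMA_H */\n"

print(defines_macro(with_lzma, "HAVE_LZMA_H"))
print(defines_macro(without_lzma, "HAVE_LZMA_H"))
```

A commented-out `#undef` line, as autoconf-style headers use for missing features, does not count as a definition here.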

As the first step to enable xz support in MacPorts, I created a simple patch that bundles xz-5.2.3 into macports-base [3]. It creates a binary /opt/local/libexec/macports/bin/xz, which can be used for creating and extracting xz-based binary packages.

[1] https://opensource.apple.com/source/libarchive/libarchive-25.1/config.h.auto.html

[2] https://opensource.apple.com/source/libarchive/libarchive-29/config.h.auto.html

[3] https://github.com/yan12125/macports-base/tree/bundle-xz

Last edited 7 years ago by ryandesign (Ryan Carsten Schmidt)

comment:13 Changed 7 years ago by mojca (Mojca Miklavec)

Wonderful, thank you very much. Now, I would probably:

  • Remove all the installed man pages
  • Remove all the header files
  • Remove the part of documentation that we are allowed to remove.
  • Remove the binaries which we likely won't need, as well as things like static library etc. (We could probably build static binaries, but that's a different/unrelated topic anyway.)

But let's see what others think first before doing any of that.

comment:14 Changed 5 years ago by yan12125 (Chih-Hsuan Yen)

Cc: yan12125 removed

comment:15 Changed 3 years ago by cooljeanius (Eric Gallager)

Cc: cooljeanius added

comment:16 Changed 3 months ago by cooljeanius (Eric Gallager)

In light of the xz-utils backdoor from earlier this year, I'd just like to say that I'm glad that this hasn't been done yet: https://en.wikipedia.org/wiki/XZ_Utils_backdoor

comment:17 Changed 3 months ago by ryandesign (Ryan Carsten Schmidt)

I don't think it's helpful to remind us of the xz-utils backdoor here. Yes, a possibly state-sponsored actor spent years infiltrating the xz-utils project to insert a sophisticated backdoor (which never worked on macOS, by the way, and so would not have affected us had we started using xz to compress our archives, and has never affected us even though many ports use xz-compressed distfiles). Could that actor have targeted macOS if they wanted to? Sure. If they hadn't had success with xz-utils could they have targeted bzip2, zlib, zstd, or any other compression method we might choose to use? Sure. Does that mean we should avoid using all software just because it could at some point in the future be compromised by someone with a government's amount of resources?

The more general goal of this ticket, per its first line, is to enhance MacPorts base so that we could compress archives better. xz files are one way to reach that goal. xz files use the lzma algorithm. There are other containers using that algorithm, like lz (lzip) and 7z (7zip). And there are other compression algorithms that might be worth consideration, like zstd. The advantage of xz is that the capability to decompress xz files is already built into OS X 10.9 and later—using libarchive, not xz-utils.
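Since several container formats are in play, a tool handling archives has to tell them apart. One common way, sketched here in Python as an illustration and not as anything MacPorts does, is to inspect the leading magic bytes of a file. The function name is hypothetical; the signatures are the published ones for each format.

```python
import bz2
import gzip
import lzma

# Leading magic bytes of the container formats mentioned in this ticket.
MAGIC = [
    (b"\xfd7zXZ\x00", "xz"),
    (b"BZh", "bzip2"),
    (b"\x1f\x8b", "gzip"),
    (b"\x28\xb5\x2f\xfd", "zstd"),
    (b"LZIP", "lzip"),
    (b"7z\xbc\xaf\x27\x1c", "7z"),
]


def sniff_container(data):
    """Return the container name for the given leading bytes, or None."""
    for magic, name in MAGIC:
        if data.startswith(magic):
            return name
    return None


print(sniff_container(lzma.compress(b"example")))
print(sniff_container(bz2.compress(b"example")))
print(sniff_container(gzip.compress(b"example")))
```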
