#43145 closed defect (fixed)
Perl PortGroup fails to reinplace non-ascii Makefiles
Reported by: | mojca (Mojca Miklavec) | Owned by: | macports-tickets@… |
---|---|---|---|
Priority: | Normal | Milestone: | |
Component: | ports | Version: | 2.2.1 |
Keywords: | Cc: | danielluke (Daniel J. Luke), neverpanic (Clemens Lang), anddam (Andrea D'Amore), cooljeanius (Eric Gallager), jeremyhu (Jeremy Huddleston Sequoia) | |
Port: | p5-test-base p5-yaml |
Description (last modified by anddam (Andrea D'Amore))
When building p5-yaml on 10.8 and 10.9, sed fails with
DEBUG: Executing reinplace: /usr/bin/sed {/^CCFLAGS *=/s/$/ /} < /opt/local/var/macports/build/_opt_mports_dports_perl_p5-yaml/p5.16-yaml/work/YAML-0.90/Makefile >@ file14 DEBUG: sed: RE error: illegal byte sequence
(see https://build.macports.org/builders/buildports-mavericks-x86_64/builds/2454) because the Makefile
contains a non-ascii character (in Latin 1 encoding, invalid UTF).
This comes from the perl PortGroup:
# CCFLAGS can be passed in to "configure" but it's not necessarily inherited. # LDFLAGS can't be passed in (or if it can, it's not easy to figure out how). post-configure { fs-traverse file ${configure.dir} { if {[file isfile ${file}] && [file tail ${file}] eq "Makefile"} { ui_info "Fixing flags in [string map "${configure.dir}/ {}" ${file}]" reinplace "/^CCFLAGS *=/s/$/ [get_canonical_archflags cc]/" ${file} reinplace "/^OTHERLDFLAGS *=/s/$/ [get_canonical_archflags ld]/" ${file} } } }
Suggestions from IRC:
neverpanic:
set LC_ALL to C I think that's a common issue with ill-formatted UTF-8 files /usr/bin/sed changed on 10.8 and above You can either patch that to be valid UTF-8 or set configure.env LC_ALL=C in the Portfile It is kind of weird to use the locale settings for that. I mean you could set the correct locale for the file, but then any messages to your terminal would be broken…
Dar1us:
it refuses to process files which aren't validly encoded given the current LC settings .. and by default that is UTF-8
Change History (18)
comment:1 Changed 11 years ago by cooljeanius (Eric Gallager)
Cc: | egall@… added |
---|
comment:2 Changed 11 years ago by mojca (Mojca Miklavec)
Port: | p5-test-base added |
---|
comment:3 Changed 11 years ago by danielluke (Daniel J. Luke)
see also: https://lists.macosforge.org/pipermail/macports-dev/2012-August/019993.html
(yes the perl portgroup should be updated)
comment:4 Changed 11 years ago by anddam (Andrea D'Amore)
Description: | modified (diff) |
---|
The portgroup cannot possibly know what encoding will the produced Makefile use.
Port p5-yaml is using ExtUtils::MakeMaker that in turn relies on Pod::Man. Pod::Man can produce utf8 Makefiles, search utf8 in in this page but seems that MakeMaker doesn't have any option to specify that, see this request.
Since the portgroup is patching Makefiles we could sanitize the file by adding a new variable specifying a charset to convert from. In this case the portfile writer would specify something like
perl5.makefile_encoding iso8859-1 perl5.setup YAML 0.90
and the post-configure phase would take care of convert the file to UTF-8 (or any other target encoding depending on the darwin release) before reinplaces.
comment:5 Changed 11 years ago by danielluke (Daniel J. Luke)
... or we could just set LANG=C and things will work just like they used to.
comment:6 Changed 11 years ago by jmroot (Joshua Root)
Cc: | jeremyhu@… added |
---|
The nicest solution would be to iconv the Makefile.PL to UTF-8 first so any info that ends up being installed doesn’t have a weird encoding. But in any case, this is the exact reason why reinplace has a -locale flag.
comment:7 Changed 11 years ago by mojca (Mojca Miklavec)
Just to explain where the weird encoding comes from.
The original file dist.ini
contains UTF-8 encoded
author = Ingy döt Net
Then this apparently gets translated into
"AUTHOR" => "Ingy d\x{f6}t Net",
in Makefile.PL
. During the configure phase, the following command is called (more or less):
perl5.18 Makefile.PL
which results in
# --- MakeMaker ppd section: # Creates a PPD (Perl Package Description) for a binary distribution. ppd : $(NOECHO) $(ECHO) '<SOFTPKG NAME="$(DISTNAME)" VERSION="$(VERSION)">' > $(DISTNAME).ppd $(NOECHO) $(ECHO) ' <ABSTRACT>YAML Ain'\''t Markup Language (tm)</ABSTRACT>' >> $(DISTNAME).ppd $(NOECHO) $(ECHO) ' <AUTHOR>Ingy döt Net</AUTHOR>' >> $(DISTNAME).ppd $(NOECHO) $(ECHO) ' <IMPLEMENTATION>' >> $(DISTNAME).ppd $(NOECHO) $(ECHO) ' <ARCHITECTURE NAME="darwin-thread-multi-2level-5.18" />' >> $(DISTNAME).ppd $(NOECHO) $(ECHO) ' <CODEBASE HREF="" />' >> $(DISTNAME).ppd $(NOECHO) $(ECHO) ' </IMPLEMENTATION>' >> $(DISTNAME).ppd $(NOECHO) $(ECHO) '</SOFTPKG>' >> $(DISTNAME).ppd
in Latin 1 encoding.
Maybe a proper fix in ExtUtils::MakeMaker
(to allow UTF-8 output) would be better (other than asking the upstream to remove non-ascii from the Makefile.PL
).
comment:8 follow-up: 9 Changed 11 years ago by jmroot (Joshua Root)
If the output is always Latin 1, that at least makes it easy to pass the right locale to reinplace.
comment:9 Changed 11 years ago by anddam (Andrea D'Amore)
Replying to jmr@…:
The nicest solution would be to iconv the Makefile.PL to UTF-8 first so any info that ends up being installed doesn’t have a weird encoding.
I'm not sure that this would help since Pod::Man needs a specific flag to enable utf8 and seems that ExtUtils::MakeMaker isn't passing it when calling Pod::Man. Also you'd need to specify the =encoding in the temporary produced POD file.
utf8 By default, Pod::Man produces the most conservative possible *roff output to try to ensure that it will work with as many different *roff implementations as possible. […] Be aware that, when using this option, the input encoding of your POD source must be properly declared unless it is US-ASCII or Latin-1. POD input without an =encoding command will be assumed to be in Latin-1, and if it's actually in UTF-8, the output will be double-encoded.
But in any case, this is the exact reason why reinplace has a -locale flag.
Correct, so we could specify a perl5.reinplace_locale variable and pass that.
Since the post-configure is traversing the filesystem checking for all "Makefile" named files and the encoding isn't necessarily the same for all files I think that this reinplace_locale should be a structure of {{file_name file_locale} …} couples that should be parsed per each file.
Replying to jmr@…:
If the output is always Latin 1, that at least makes it easy to pass the right locale to reinplace.
The output could be plain ASCII (but I guess that wouldn't be a problem), but not all modules use ExtUtiles::MakeMaker, I've seen at least on Makefile.PL directly generating the Makefile.
comment:10 follow-up: 12 Changed 11 years ago by mojca (Mojca Miklavec)
We are neither recoding files nor changing any non-ascii portions of the text. So I would say that encoding is pretty much irrelevant even if it's not Latin 1, as long as it's kept intact. The "C" local should do just as well.
comment:11 Changed 10 years ago by mojca (Mojca Miklavec)
See also #44630.
At least the following ports are affected (but there might be more):
- p5-io-all
- p5-spiffy
- p5-test-base
- p5-yaml
- p5-yaml-libyaml
comment:12 Changed 10 years ago by danielluke (Daniel J. Luke)
Replying to mojca@…:
We are neither recoding files nor changing any non-ascii portions of the text. So I would say that encoding is pretty much irrelevant even if it's not Latin 1, as long as it's kept intact. The "C" local should do just as well.
+1 for adding -locale C to the reinplace calls in the perl5 portgroup
comment:13 Changed 10 years ago by danielluke (Daniel J. Luke)
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:14 follow-up: 15 Changed 10 years ago by mojca (Mojca Miklavec)
Please also remove the patch in the other four ports then.
comment:15 Changed 10 years ago by danielluke (Daniel J. Luke)
Replying to mojca@…:
Please also remove the patch in the other four ports then.
done, although since the perl5 portgroup change didn't break them, I don't believe I really had an obligation to correct them.
comment:16 follow-up: 17 Changed 10 years ago by mojca (Mojca Miklavec)
comment:17 Changed 10 years ago by danielluke (Daniel J. Luke)
Replying to mojca@…:
Sure, the ports weren't broken, but we would probably forget to remove the patch(es) later and could keep them forever without any good reason.
historically, it has been up to the individual port maintainers to handle non-breaking modifications to base (or portgroups), not the person/persons making changes to base or portgroups.
comment:18 Changed 10 years ago by mojca (Mojca Miklavec)
(You can probably also move the checksums out of the brackets in p5-test-base
.)
Cc Me!