Opened 42 hours ago
Last modified 32 hours ago
#71237 new defect
sed: RE error: illegal byte sequence
Reported by: | ballapete (Peter "Pete" Dyballa) | Owned by: | |
---|---|---|---|
Priority: | Normal | Milestone: | |
Component: | ports | Version: | 2.10.2 |
Keywords: | | Cc: | |
Port: | Perl modules | | |
Description (last modified by ryandesign (Ryan Carsten Schmidt))
While trying to prove that Perl 5.38 is ready for production use, I tried as a final step to patch all the test files that start with #!/usr/bin/perl (or similar) so that they start with #!/usr/bin/env perl or ${perl5.bin}, and ran into this sed problem. On the command line it gives:
pete 313 /\ head -20 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-dbd-csv/p5.38-dbd-csv/work/DBD-CSV-0.60/t/80_rt.t | tail -3 | sed -e 's:/usr/bin/perl:/usr/bin/env perl:'
while (<DATA>) {
sed: RE error: illegal byte sequence
Exit 1
pete 314 /\ which sed
and also:
pete 312 /\ head -20 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-dbd-csv/p5.38-dbd-csv/work/DBD-CSV-0.60/t/80_rt.t | tail -3 | gsed -e 's:/usr/bin/perl:/usr/bin/env perl:'
while (<DATA>) {
if (s/^\253(\d+)\273\s*-?\s*//) {
chomp;
So it's likely that using gsed instead of the system's sed will solve the problem.
How can I make port use gsed instead of the system's sed?
Another example:
pete 318 /\ head -47 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-dbd-csv/p5.38-dbd-csv/work/DBD-CSV-0.60/t/42_bindparam.t | tail -3 | sed -e 's:/usr/bin/perl:/usr/bin/env perl:'
# Now try the explicit type settings
ok ($sth->bind_param (1, " 4", &SQL_INTEGER), "bind 4 int");
sed: RE error: illegal byte sequence
Exit 1
pete 319 /\ head -47 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-dbd-csv/p5.38-dbd-csv/work/DBD-CSV-0.60/t/42_bindparam.t | tail -3 | gsed -e 's:/usr/bin/perl:/usr/bin/env perl:'
# Now try the explicit type settings
ok ($sth->bind_param (1, " 4", &SQL_INTEGER), "bind 4 int");
ok ($sth->bind_param (2, "Andreas K\366nig"), "bind str");
Another one would be
head -55 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-encode/p5.38-encode/work/Encode-3.21/t/at-cn.t | tail -7 | {g,}sed -e 's:/usr/bin/perl:/usr/bin/env perl:'
where Chinese characters are somehow "encoded".
Obviously sed "knows" some "forbidden" characters it cannot work on, and obviously it has problems with encodings other than 7 or 8 bit that gsed has learned to overcome.
Change History (4)
comment:1 Changed 33 hours ago by ryandesign (Ryan Carsten Schmidt)
Description: modified (diff)
comment:2 Changed 33 hours ago by ryandesign (Ryan Carsten Schmidt)
From portfiles or portgroups, sed is usually invoked using reinplace. You can use its -locale flag to set LC_CTYPE, for example reinplace -locale C, as you see in several places in the perl5-1.0 portgroup already.
comment:3 Changed 33 hours ago by ballapete (Peter "Pete" Dyballa)
These files are a bit hard to handle:
p5.38-dbd-csv/work/DBD-CSV-0.60/t/42_bindparam.t:1:#!/usr/bin/perl
p5.38-dbd-csv/work/DBD-CSV-0.60/t/80_rt.t:1:#!/usr/bin/perl
p5.38-html-tree/work/HTML-Tree-5.07/t/split.t:1:#!/usr/bin/perl -T
p5.38-text-csv_xs/work/Text-CSV_XS-1.56/t/20_file.t:1:#!/usr/bin/perl
p5.38-text-csv_xs/work/Text-CSV_XS-1.56/t/21_lexicalio.t:1:#!/usr/bin/perl
p5.38-text-csv_xs/work/Text-CSV_XS-1.56/t/22_scalario.t:1:#!/usr/bin/perl
p5.38-text-csv_xs/work/Text-CSV_XS-1.56/t/55_combi.t:1:#!/usr/bin/perl
p5.38-text-csv_xs/work/Text-CSV_XS-1.56/t/70_rt.t
p5.38-tk/work/Tk-804.036/t/unicode.t:1:#!/usr/bin/perl -w
Certainly in Perl 5.34 too!
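One way to find such files ahead of time is to check whether each one decodes as UTF-8. The following is only a sketch, not something used in this ticket: the helper name is_not_utf8 and the two sample files are made up for illustration, and it assumes an iconv that exits with a non-zero status on invalid input.

```shell
# Hypothetical helper: succeeds when the file is NOT valid UTF-8.
# Assumes iconv returns a non-zero status on an invalid byte sequence.
is_not_utf8() {
    ! iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Two throwaway samples: one pure ASCII, one with a lone Latin-1 byte
# (octal 366), similar to the "K\366nig" line in 42_bindparam.t.
dir=$(mktemp -d)
printf '#!/usr/bin/perl\nprint "ok";\n' > "$dir/ascii.t"
printf '#!/usr/bin/perl\nprint "K\366nig";\n' > "$dir/latin1.t"

# Report only the files that are not valid UTF-8.
for f in "$dir"/*.t; do
    if is_not_utf8 "$f"; then
        printf 'not UTF-8: %s\n' "${f##*/}"
    fi
done

rm -rf "$dir"
```

Files reported this way are the ones where sed in a UTF-8 locale is likely to stop with "illegal byte sequence".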
comment:4 Changed 32 hours ago by ballapete (Peter "Pete" Dyballa)
This works fine! Right now 636 Perl modules are built and installed, and most of them could be tested. Tests failed with p5.38-fcgi, p5.38-devel-callparser, p5.38-file-homedir, p5.38-future-io, p5.38-inline-c, p5.38-kavorka, p5.38-mac-fsevents, p5.38-math-pari, p5.38-moops, p5.38-poe, p5.38-safe and p5.38-unix-process (also with Perl 5.34). The first one might receive an update; if not, it would need a MacPorts-specific patch to shorten the path name of the test socket used, see #71221.
Tomorrow I'll check whether the tests still run fine or report the need for even more modules!
System sed expects input to be UTF-8 by default; you're trying to use it on files that aren't UTF-8. Working around this by using gsed instead is neither necessary nor recommended. You can still use system sed as long as you set the LC_CTYPE environment variable either to the correct locale or just to C.
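The LC_CTYPE approach can be tried directly in a shell. This is a minimal sketch under stated assumptions: the sample file and its contents are invented for illustration, and only the sed expression is taken from the description.

```shell
# Build a small sample containing one Latin-1 byte (octal 366), which is
# not a valid UTF-8 sequence on its own. The file is a throwaway
# placeholder, not one of the real test files.
sample=$(mktemp)
printf '#!/usr/bin/perl\nok ("Andreas K\366nig");\n' > "$sample"

# Run the substitution from the ticket with LC_CTYPE=C, so that sed
# treats the input as plain bytes instead of decoding it as UTF-8.
fixed=$(LC_CTYPE=C sed -e 's:/usr/bin/perl:/usr/bin/env perl:' "$sample")

# Show the rewritten first line.
printf '%s\n' "$fixed" | head -1

rm -f "$sample"
```

Note that LC_ALL, if it is set in the environment, takes precedence over LC_CTYPE, so the setting above has no effect while LC_ALL names a UTF-8 locale.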