Opened 42 hours ago

Last modified 32 hours ago

#71237 new defect

sed: RE error: illegal byte sequence

Reported by: ballapete (Peter "Pete" Dyballa)
Owned by:
Priority: Normal
Milestone:
Component: ports
Version: 2.10.2
Keywords:
Cc:
Port: Perl modules

Description (last modified by ryandesign (Ryan Carsten Schmidt))

While trying to prove that Perl 5.38 is ready for production use, I tried as a final step to patch all the test files that start with #!/usr/bin/perl (or similar) so that they start with #!/usr/bin/env perl or ${perl5.bin}, and ran into this sed problem. On the command line it gives:

pete 313 /\ head -20 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-dbd-csv/p5.38-dbd-csv/work/DBD-CSV-0.60/t/80_rt.t | tail -3 | sed -e 's:/usr/bin/perl:/usr/bin/env perl:'
while (<DATA>) {
sed: RE error: illegal byte sequence
Exit 1
pete 314 /\ which sed

and also:

pete 312 /\ head -20 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-dbd-csv/p5.38-dbd-csv/work/DBD-CSV-0.60/t/80_rt.t | tail -3 | gsed -e 's:/usr/bin/perl:/usr/bin/env perl:'
while (<DATA>) {
    if (s/^\253(\d+)\273\s*-?\s*//) {
	chomp;

So it's likely that using gsed instead of the system's sed will solve the problem.

How can I make port use gsed instead of the system's sed?

Another example:

pete 318 /\ head -47 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-dbd-csv/p5.38-dbd-csv/work/DBD-CSV-0.60/t/42_bindparam.t | tail -3 | sed -e 's:/usr/bin/perl:/usr/bin/env perl:'
# Now try the explicit type settings
ok ($sth->bind_param (1, " 4", &SQL_INTEGER),	"bind 4 int");
sed: RE error: illegal byte sequence
Exit 1

pete 319 /\ head -47 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-dbd-csv/p5.38-dbd-csv/work/DBD-CSV-0.60/t/42_bindparam.t | tail -3 | gsed -e 's:/usr/bin/perl:/usr/bin/env perl:'
# Now try the explicit type settings
ok ($sth->bind_param (1, " 4", &SQL_INTEGER),	"bind 4 int");
ok ($sth->bind_param (2, "Andreas K\366nig"),	"bind str");

Another one would be

head -55 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-encode/p5.38-encode/work/Encode-3.21/t/at-cn.t  | tail -7 | {g,}sed -e 's:/usr/bin/perl:/usr/bin/env perl:'

where Chinese characters are somehow "encoded".

Obviously sed "knows" some "forbidden" characters it cannot work on, and obviously it has problems with encodings other than 7 or 8 bit that gsed has learned to overcome.

Change History (4)

comment:1 Changed 33 hours ago by ryandesign (Ryan Carsten Schmidt)

Description: modified (diff)

System sed expects input to be UTF-8 by default; you're trying to use it on files that aren't UTF-8. Working around this by using gsed instead is neither necessary nor recommended. You can still use system sed as long as you set the LC_CTYPE environment variable either to the correct locale or just to C.
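
A minimal sketch of that workaround on the command line, assuming a throwaway sample file rather than the actual test files from the ports above. The file contents and the variable names sample and first_line are made up for illustration; the sed expression is the one from the description.

```shell
# Hypothetical sample file: a shebang line plus a line containing the byte 0xF6
# ("o" with umlaut in ISO-8859-1), which is not a valid UTF-8 sequence by itself.
sample=$(mktemp)
printf '#!/usr/bin/perl\nAndreas K\366nig\n' > "$sample"

# Set LC_CTYPE=C for this one sed invocation only, so every byte is treated
# as a single character. Note that LC_ALL, if set in the environment, takes
# precedence over LC_CTYPE.
first_line=$(LC_CTYPE=C sed -e 's:/usr/bin/perl:/usr/bin/env perl:' "$sample" | head -n 1)
printf '%s\n' "$first_line"

rm -f "$sample"
```

Setting the variable as a prefix to the command keeps the change local to that sed call and leaves the rest of the shell session in its usual locale.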

comment:2 Changed 33 hours ago by ryandesign (Ryan Carsten Schmidt)

From portfiles or portgroups, sed is usually invoked using reinplace. You can use its -locale flag to set LC_CTYPE, for example reinplace -locale C as you see in several places in the perl5-1.0 portgroup already.

comment:3 Changed 33 hours ago by ballapete (Peter "Pete" Dyballa)

These files are a bit hard to handle:

p5.38-dbd-csv/work/DBD-CSV-0.60/t/42_bindparam.t:1:#!/usr/bin/perl
p5.38-dbd-csv/work/DBD-CSV-0.60/t/80_rt.t:1:#!/usr/bin/perl
p5.38-html-tree/work/HTML-Tree-5.07/t/split.t:1:#!/usr/bin/perl -T
p5.38-text-csv_xs/work/Text-CSV_XS-1.56/t/20_file.t:1:#!/usr/bin/perl
p5.38-text-csv_xs/work/Text-CSV_XS-1.56/t/21_lexicalio.t:1:#!/usr/bin/perl
p5.38-text-csv_xs/work/Text-CSV_XS-1.56/t/22_scalario.t:1:#!/usr/bin/perl
p5.38-text-csv_xs/work/Text-CSV_XS-1.56/t/55_combi.t:1:#!/usr/bin/perl
p5.38-text-csv_xs/work/Text-CSV_XS-1.56/t/70_rt.t
p5.38-tk/work/Tk-804.036/t/unicode.t:1:#!/usr/bin/perl -w

Certainly in Perl 5.34 too!

comment:4 Changed 32 hours ago by ballapete (Peter "Pete" Dyballa)

This works fine! So far 636 Perl modules have been built and installed, and most of them could be tested. Tests failed with p5.38-fcgi p5.38-devel-callparser p5.38-file-homedir p5.38-future-io p5.38-inline-c p5.38-kavorka p5.38-mac-fsevents p5.38-math-pari p5.38-moops p5.38-poe p5.38-safe p5.38-unix-process (also with Perl 5.34). The first one might receive an update; if not, it would need a MacPorts-specific patch to shorten the path name of the test socket used, see #71221.

Tomorrow I'll check whether the tests still run fine or report the need for even more modules!
