Opened 2 days ago

Last modified 38 hours ago

#71237 new defect

sed: RE error: illegal byte sequence

Reported by: ballapete (Peter "Pete" Dyballa) Owned by:
Priority: Normal Milestone:
Component: ports Version: 2.10.2
Keywords: Cc:
Port: Perl modules

Description (last modified by ryandesign (Ryan Carsten Schmidt))

While trying to prove that Perl 5.38 is ready for production use, I tried as a final step to patch all the test files that start with #!/usr/bin/perl (or similar) so that they start with #!/usr/bin/env perl or ${perl5.bin} instead, and ran into this sed problem. On the command line, the system sed gives:

pete 313 /\ head -20 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-dbd-csv/p5.38-dbd-csv/work/DBD-CSV-0.60/t/80_rt.t | tail -3 | sed -e 's:/usr/bin/perl:/usr/bin/env perl:'
while (<DATA>) {
sed: RE error: illegal byte sequence
Exit 1
pete 314 /\ which sed

With gsed, the same input is processed without error:

pete 312 /\ head -20 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-dbd-csv/p5.38-dbd-csv/work/DBD-CSV-0.60/t/80_rt.t | tail -3 | gsed -e 's:/usr/bin/perl:/usr/bin/env perl:'
while (<DATA>) {
    if (s/^\253(\d+)\273\s*-?\s*//) {
	chomp;

So it's likely that using gsed instead of the system's sed will solve the problem.

How can I make port use gsed instead of the system's sed?

Another example:

pete 318 /\ head -47 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-dbd-csv/p5.38-dbd-csv/work/DBD-CSV-0.60/t/42_bindparam.t | tail -3 | sed -e 's:/usr/bin/perl:/usr/bin/env perl:'
# Now try the explicit type settings
ok ($sth->bind_param (1, " 4", &SQL_INTEGER),	"bind 4 int");
sed: RE error: illegal byte sequence
Exit 1

pete 319 /\ head -47 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-dbd-csv/p5.38-dbd-csv/work/DBD-CSV-0.60/t/42_bindparam.t | tail -3 | gsed -e 's:/usr/bin/perl:/usr/bin/env perl:'
# Now try the explicit type settings
ok ($sth->bind_param (1, " 4", &SQL_INTEGER),	"bind 4 int");
ok ($sth->bind_param (2, "Andreas K\366nig"),	"bind str");

A further example would be:

head -55 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-encode/p5.38-encode/work/Encode-3.21/t/at-cn.t  | tail -7 | {g,}sed -e 's:/usr/bin/perl:/usr/bin/env perl:'

where Chinese characters are somehow "encoded".

Apparently the system sed "knows" some "forbidden" byte sequences it cannot work on, and it seems to have problems with encodings other than 7-bit or 8-bit ones, which gsed has learned to overcome.

Change History (1)

comment:1 Changed 39 hours ago by ryandesign (Ryan Carsten Schmidt)

Description: modified (diff)

System sed expects input to be UTF-8 by default; you're trying to use it on files that aren't UTF-8. Working around this by using gsed instead is neither necessary nor recommended. You can still use system sed as long as you set the LC_CTYPE environment variable either to the correct locale or just to C.
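
A minimal sketch of that workaround, assuming a throwaway sample file created with mktemp (the file and its contents are made up for illustration, modeled on the \366 byte in 42_bindparam.t). It sets LC_CTYPE=C for the single sed invocation only, so the rest of the shell session keeps its locale. Note that if LC_ALL is set to a non-empty value in your environment, it takes precedence over LC_CTYPE and would have to be overridden instead.

```shell
# Hypothetical sample file containing a Latin-1 byte (octal 366),
# which on its own is not a valid UTF-8 sequence.
sample=$(mktemp)
printf '#!/usr/bin/perl\nok ("Andreas K\366nig");\n' > "$sample"

# Set LC_CTYPE=C for this one command only, so sed treats every byte
# as a single-byte character instead of decoding the input as UTF-8.
result=$(LC_CTYPE=C sed -e 's:/usr/bin/perl:/usr/bin/env perl:' "$sample")

# Show the rewritten shebang line, then clean up the sample file.
printf '%s\n' "$result" | head -n 1
rm -f "$sample"
```

The same prefix should work in front of the pipelines from the description, e.g. `... | tail -3 | LC_CTYPE=C sed -e '...'`.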
