Opened 15 years ago

Closed 10 years ago

#20686 closed defect (worksforme)

gsed fails to handle non-ASCII characters (bytes with top-bit set) in C locale

Reported by: vinc17@…
Owned by: Schamschula (Marius Schamschula)
Priority: Normal
Milestone:
Component: ports
Version: 1.7.1
Keywords:
Cc:
Port: gsed

Description

For instance:

$ echo "abécd" | LC_ALL=C gsed -e 's/.*//'
écd

With the sed shipped with Mac OS X and with GNU sed under GNU/Linux, one gets a blank line. I suppose this is what users expect, even though é is not part of the US-ASCII character set specified by the C locale (and even though, with some expressions, the result could depend on the encoding).

The consequence is that building ocaml fails if gsed is installed with the with_default_names variant (see bug #20275).
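
The check above depends on how the terminal encodes é. A minimal sketch of an encoding-independent version follows, assuming a POSIX shell and that é is sent as the UTF-8 bytes 0xC3 0xA9; the helper name check_sed_c_locale is hypothetical and not part of any port or tool mentioned here.

```shell
# Hypothetical helper: report whether a given sed lets '.' match a byte
# with the top bit set when run in the C locale.
# Prints "ok" if 's/.*//' empties the line, "leftover" if bytes remain,
# and "error" if the sed command itself could not be run.
check_sed_c_locale() {
    sed_cmd=${1:-sed}
    # printf with octal escapes emits the bytes 0xC3 0xA9 (UTF-8 "é")
    # regardless of the terminal or editor encoding.
    out=$(printf 'ab\303\251cd\n' | LC_ALL=C "$sed_cmd" -e 's/.*//') || {
        echo "error"
        return 2
    }
    if [ -z "$out" ]; then
        echo "ok"
    else
        echo "leftover"
    fi
}

# Check the default sed; pass another command name (for example gsed)
# as the first argument to check that one instead.
check_sed_c_locale sed
```

A "leftover" result corresponds to the écd output shown above, and "ok" corresponds to the blank line.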

Change History (14)

comment:1 Changed 15 years ago by jabronson@…

Cc: jabronson@… added

Cc Me!

comment:2 Changed 15 years ago by nox@…

I don't get an empty line with Mac OS X sed:

Bellcross:~ nox$ which sed gsed
/usr/bin/sed
/opt/local/bin/gsed
Bellcross:~ nox$ echo "abécd" | LC_ALL=C sed -e 's/.*//'
écd
Bellcross:~ nox$ echo "abécd" | LC_ALL=C gsed -e 's/.*//'
écd

comment:3 Changed 15 years ago by vinc17@…

That's strange. I have Mac OS X 10.4.11. If you have Leopard, perhaps Apple introduced a bug.

comment:4 Changed 15 years ago by vinc17@…

BTW, does bug #20275 occur on your machine?

comment:5 Changed 15 years ago by nox@…

I don't use +with_default_names.

comment:6 Changed 15 years ago by vinc17@…

Yes, but even without +with_default_names (or without gsed installed), you should probably be able to reproduce the bug, because your Mac OS X sed is buggy too.

comment:7 Changed 15 years ago by rbubley

I encountered this problem too. For me (on Tiger):

machine:~/bin user$ which sed gsed
/usr/bin/sed
/opt/local/bin/gsed
machine:~/bin user$ echo "ab\303\251cd" | LC_ALL=C sed -e 's/.*//'

machine:~/bin user$  echo "ab\303\251cd" | LC_ALL=C gsed -e 's/.*//'
écd

comment:8 Changed 14 years ago by jmroot (Joshua Root)

Has this been reported upstream?

comment:9 in reply to:  8 Changed 14 years ago by vinc17@…

Replying to jmr@…:

Has this been reported upstream?

Yes, as "sed fails to handle bytes with top-bit set in C locale under Mac OS X".

comment:10 Changed 13 years ago by jmroot (Joshua Root)

Owner: changed from nox@… to macports-tickets@…

-> nomaintainer

comment:11 Changed 10 years ago by mf2k (Frank Schima)

Owner: changed from macports-tickets@… to mschamschula@…

comment:12 Changed 10 years ago by jabronson@…

Cc: jabronson@… removed

Cc Me!

comment:13 Changed 10 years ago by Schamschula (Marius Schamschula)

This ticket can be closed. With the current version of gsed, @4.2.2, I get the expected empty line (Tested on OS X 10.5.8 and 10.9.4).

comment:14 Changed 10 years ago by mf2k (Frank Schima)

Resolution: worksforme
Status: changed from new to closed