Opened 18 years ago
Closed 17 years ago
#11978 closed defect (fixed)
portindex doesn't encode non-ASCII characters correctly in ISO-8859-1 locales
Reported by: | vinc17@… | Owned by: | kballard (Lily Ballard) |
---|---|---|---|
Priority: | Normal | Milestone: | |
Component: | ports | Version: | 1.4.40 |
Keywords: | | Cc: | vinc17@… |
Port: | | | |
Description
In ISO-8859-1 locales, portindex does a UTF-8 encoding twice:
$ grep xmlittre PortIndex
stardict-xmlittre 329 portdir textproc/stardict-xmlittre variants universal description {XMLittré dictionary for stardict} name stardict-xmlittre version 2.4.2 categories textproc homepage http://francois.gannaz.free.fr/Littre/accueil.php revision 0 epoch 0 maintainers vincent-opdarw@vinc17.org long_description {XMLittré dictionary for stardict.}
instead of
stardict-xmlittre 327 portdir textproc/stardict-xmlittre variants universal description {XMLittré dictionary for stardict} name stardict-xmlittre version 2.4.2 categories textproc homepage http://francois.gannaz.free.fr/Littre/accueil.php revision 0 epoch 0 maintainers vincent-opdarw@vinc17.org long_description {XMLittré dictionary for stardict.}
Change History (6)
comment:1 Changed 18 years ago by kballard (Lily Ballard)
comment:2 Changed 18 years ago by kballard (Lily Ballard)
Owner: | changed from macports-dev@… to eridius@… |
---|---|
Status: | new → assigned |
comment:3 Changed 17 years ago by kballard (Lily Ballard)
I set my locale to ISO-8859-1 and re-ran PortIndex, and the entry for xmlittre is still UTF-8-encoded. Can you give me instructions to reproduce your results?
comment:4 Changed 17 years ago by vinc17@…
The locale command outputs:
LANG="POSIX"
LC_COLLATE="POSIX"
LC_CTYPE="en_US.ISO8859-1"
LC_MESSAGES="POSIX"
LC_MONETARY="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_ALL="POSIX/en_US.ISO8859-1/POSIX/POSIX/POSIX/POSIX"
(LANG and LC_COLLATE are set to POSIX, LC_CTYPE is set to en_US.ISO8859-1, and the other variables are not set.)
prunille:~/software/dports> grep xmlittr PortIndex | hexdump -C | tail -5
00000130  64 65 73 63 72 69 70 74  69 6f 6e 20 7b 58 4d 4c  |description {XML|
00000140  69 74 74 72 c3 83 c2 a9  20 64 69 63 74 69 6f 6e  |ittrÃ.© diction|
00000150  61 72 79 20 66 6f 72 20  73 74 61 72 64 69 63 74  |ary for stardict|
00000160  2e 7d 0a                                          |.}.|
00000163
comment:5 Changed 17 years ago by kballard (Lily Ballard)
Ah hah. I guess I wasn't setting it to ISO-8859-1 correctly. I can reproduce this issue now. Seems like the solution is to replace source with a combination of open/fconfigure/read/close.
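The general idea of reading a file with an explicit encoding, rather than whatever the locale implies, can be sketched outside Tcl. The Python below is only an analogy and not the MacPorts code: `read_portfile_text` and `demo` are hypothetical names, and a temporary file stands in for a Portfile.

```python
import os
import tempfile


def read_portfile_text(path, encoding="utf-8"):
    """Read a file with an explicit encoding instead of the locale default.

    Hypothetical helper for illustration; not part of MacPorts.
    """
    with open(path, "r", encoding=encoding) as fd:
        return fd.read()


def demo():
    """Return (original, misread, correct) for a small UTF-8 test file."""
    original = "description {XMLittr\u00e9 dictionary for stardict}\n"
    fd, path = tempfile.mkstemp()
    try:
        # Write the text as UTF-8 bytes, as a Portfile would be stored.
        with os.fdopen(fd, "wb") as out:
            out.write(original.encode("utf-8"))
        # Stand-in for an ISO-8859-1 locale default: every byte becomes one character.
        misread = read_portfile_text(path, encoding="iso-8859-1")
        # An explicit UTF-8 read gives back the original text.
        correct = read_portfile_text(path, encoding="utf-8")
    finally:
        os.remove(path)
    return original, misread, correct
```

With the ISO-8859-1 read, the two UTF-8 bytes of "é" come back as two separate characters, so the string is one character longer than the original.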
comment:6 Changed 17 years ago by kballard (Lily Ballard)
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Ok, I killed the silly [fconfigure $fd -encoding utf-8] calls I had in there and now I do an [encoding system utf-8] in dportinit. This will cause all file access to default to utf-8 (but stdin and stdout keep their original values, so it should display non-ASCII text just fine).
Committed in r25975.
It sounds like it's reading the UTF-8-encoded Portfile as ISO-8859-1, and then UTF-8-encoding that for output. That's a bit odd, as I tried testing exactly that case and it seemed to autodetect UTF-8, but I guess it's not working right.
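The bytes `c3 83 c2 a9` in the hexdump from comment:4 are consistent with that explanation. A short Python check, given purely as an illustration of the byte arithmetic and not of anything portindex does internally (`double_encode` and `undo_double_encode` are hypothetical names):

```python
def double_encode(text):
    """Encode as UTF-8, misread those bytes as ISO-8859-1, then encode as UTF-8 again."""
    utf8_bytes = text.encode("utf-8")          # "é" becomes c3 a9
    misread = utf8_bytes.decode("iso-8859-1")  # two characters: U+00C3 and U+00A9
    return misread.encode("utf-8")             # each of those becomes two bytes


def undo_double_encode(data):
    """Reverse double_encode: recover the single UTF-8 encoding of the text."""
    return data.decode("utf-8").encode("iso-8859-1")


if __name__ == "__main__":
    print(double_encode("\u00e9").hex(" "))
```

The reverse direction only works when every character of the misread text fits in ISO-8859-1, which is the case for text damaged in exactly this way.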