Opened 15 years ago
Closed 14 years ago
#21517 closed enhancement (fixed)
python25, python26: japanese locale errors
Reported by: | null.atou@… | Owned by: | jyrkiwahlstedt |
---|---|---|---|
Priority: | Normal | Milestone: | |
Component: | ports | Version: | 1.8.0 |
Keywords: | Cc: | MarcusCalhoun-Lopez (Marcus Calhoun-Lopez) | |
Port: | python26 python25 |
Description
When the Python script eyeD3 is executed in a Japanese environment, 'LookupError: unknown encoding: x-mac-japanese' occurs.
% __CF_USER_TEXT_ENCODING=$UID:0:0 eyeD3 hoge.mp3    <--- encoding is 'mac-roman'
(snip, no error but many Mojibake occur!)
% __CF_USER_TEXT_ENCODING=$UID:1:14 eyeD3 hoge.mp3   <--- In Japanese environment, __CF_USER_TEXT_ENCODING already set
(snip)
Uncaught exception: unknown encoding: x-mac-japanese
Traceback (most recent call last):
  File "/opt/local/bin/eyeD3", line 1215, in <module>
    retval = main();
  File "/opt/local/bin/eyeD3", line 1192, in main
    retval = app.handleFile(f);
  File "/opt/local/bin/eyeD3", line 566, in handleFile
    self.printTag(self.tag);
  File "/opt/local/bin/eyeD3", line 937, in printTag
    "replace"),
LookupError: unknown encoding: x-mac-japanese
This happens because python26 (and also python25) do not look at the LANG env-var (ja_JP.UTF-8 in most cases in Japan). Instead they get the encoding name 'x-mac-japanese' from the CoreFoundation functions CFStringGetSystemEncoding() and CFStringConvertEncodingToIANACharSetName() (see 'Lib/locale.py' and the source 'Modules/_localemodule.c'). Unfortunately, Python only knows the codecs listed in its codec table, http://docs.python.org/library/codecs#standard-encodings. That table has no entry for 'x-mac-japanese', 'x-mac-trad-chinese', 'x-mac-korean', etc. Here is a simple test:
% __CF_USER_TEXT_ENCODING=$UID:1:14 python -c 'import locale; print "getdefaultlocale is", locale.getdefaultlocale(), ", getpreferredencoding :", locale.getpreferredencoding();'
getdefaultlocale is (None, 'x-mac-japanese') , getpreferredencoding : x-mac-japanese
% __CF_USER_TEXT_ENCODING=$UID:1:14 /usr/bin/python -c 'import locale; print "getdefaultlocale is", locale.getdefaultlocale(), ", getpreferredencoding :", locale.getpreferredencoding();'
getdefaultlocale is ('ja_JP', 'UTF8') , getpreferredencoding : UTF-8
Yes, Apple's python 2.6.1 in Snow Leopard (and also Apple's python 2.5.1 in Leopard) does not look at CF_... but at LANG.
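For reference, whether an encoding name is usable at all comes down to a lookup in Python's codec registry. Below is a minimal sketch, written for Python 3, that checks a few names against the registry. `codec_known` is a hypothetical helper for illustration, not part of eyeD3 or the port. The result for 'x-mac-japanese' depends on the interpreter and its alias table, so no expected output is given.

```python
import codecs


def codec_known(name):
    """Return True if the codec registry can resolve `name`."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        # Same exception type as in the eyeD3 traceback.
        return False


if __name__ == "__main__":
    for name in ("utf-8", "mac-roman", "x-mac-japanese"):
        print(name, codec_known(name))
```

Running this under both /usr/bin/python and the MacPorts python should show quickly whether the name returned by locale.getpreferredencoding() can actually be used for encoding.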
I referred to web pages (in Japanese) here and here, and made a simple patch around the locale problem. With this patch, python26 looks at the LANG env-var and gets the well-known encoding 'UTF-8', so no error and no mojibake occur in eyeD3 :-)
By the way, there is no unknown-encoding error in python31, because a similar change was applied to Modules/_localemodule.c in version 3.1.
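Independent of the interpreter fix, a script can guard against an unusable locale encoding. Here is a sketch, assuming that falling back to UTF-8 is acceptable for the application. `resolve_encoding` is a hypothetical helper, not what eyeD3 or the attached patch actually does.

```python
import codecs
import locale


def resolve_encoding(name=None, fallback="utf-8"):
    """Return a codec name that Python can actually use.

    `name` defaults to locale.getpreferredencoding(); if the codec
    registry does not know it, `fallback` is returned instead.
    """
    if name is None:
        name = locale.getpreferredencoding()
    try:
        return codecs.lookup(name).name
    except LookupError:
        return fallback


if __name__ == "__main__":
    enc = resolve_encoding()
    # Encode with "replace", the error handler seen in the printTag traceback.
    print(u"\u30a2\u30c3\u30d7\u30eb".encode(enc, "replace"))
```

This only avoids the LookupError. It does not make Python honor LANG, so the output encoding may still differ from what the terminal expects.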
Attachments (2)
Change History (12)
Changed 15 years ago by null.atou@…
Attachment: | patch-locale-from-apple-darwinsource.diff added |
---|
comment:1 Changed 15 years ago by jmroot (Joshua Root)
Cc: | mcalhoun@… mww@… added |
---|---|
Keywords: | Python locale removed |
Owner: | changed from macports-tickets@… to blb@… |
Port: | python26, python25 → python26 python25 |
You should also open a bug upstream (if you haven't already).
comment:2 Changed 15 years ago by blb@…
This looks like it could be python issue 1276, which has a different fix (which was only committed to the py3k branch and not 2.x). Does that fix work as well? Do you have a file which can be used to reproduce this issue (if that mp3 isn't shareable)?
comment:3 Changed 15 years ago by null.atou@…
In a Japanese environment, when a new user is created, the default LANG is set to ja_JP.UTF-8, while the dot file ~/.CFUserTextEncoding (this file affects __CF_USER_TEXT_ENCODING) contains '1:14'. The '1' means that the encoding is CP10001, i.e. x-mac-japanese. But we Japanese do not use x-mac-japanese in Terminal; we use UTF-8. Yes, we prefer UTF-8, not x-mac-japanese.
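To make the '1:14' value concrete, here is a small sketch that splits such a value into its fields. It assumes the layout implied by the commands in this ticket: `script:region` in the dot file, and `uid:script:region` in the environment variable. `parse_cf_user_text_encoding` and `SCRIPT_NAMES` are hypothetical names, and the table only contains the two codes mentioned in this ticket.

```python
# Script codes taken from this ticket only; the real table is larger.
SCRIPT_NAMES = {0: "mac-roman", 1: "x-mac-japanese"}


def parse_cf_user_text_encoding(value):
    """Parse 'script:region' or 'uid:script:region'.

    Returns (script, region, name), where name is None for script
    codes that are not listed in SCRIPT_NAMES.
    """
    parts = value.strip().split(":")
    if len(parts) == 3:
        # Environment-variable form: drop the leading uid field.
        parts = parts[1:]
    if len(parts) != 2:
        raise ValueError("unexpected format: %r" % (value,))
    # Base 0 accepts plain decimal as well as a 0x-prefixed form.
    script, region = (int(p, 0) for p in parts)
    return script, region, SCRIPT_NAMES.get(script)


if __name__ == "__main__":
    print(parse_cf_user_text_encoding("1:14"))
```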
At a glance, the approach of python issue 1276 seems to be adding encodings (x-mac-japanese, etc.) to the codec table. But those patches do not correct the locale problem in a LANG=ja_JP.UTF-8 environment, because LANG is still ignored. I also found that Python 3.1 (more precisely 3.1rc2) removed the special routine for darwin or __APPLE__ that uses the CoreFoundation functions mentioned above, so that it follows the standard UNIX manner. This change looks the same as Apple's patches.
% LANG=ja_JP.UTF-8 /usr/bin/python2.6 -c 'import locale; print(locale.getpreferredencoding());'
UTF-8
% LANG=ja_JP.UTF-8 /opt/local/bin/python3.1 -c 'import locale; print(locale.getpreferredencoding());'
UTF-8
% LANG=ja_JP.UTF-8 /opt/local/bin/python2.6 -c 'import locale; print(locale.getpreferredencoding());'
x-mac-japanese   # bad!
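The "standard UNIX manner" mentioned above amounts to deriving the encoding from the locale environment variables. Here is a simplified sketch of that idea, assuming the usual precedence LC_ALL, then LC_CTYPE, then LANG. `codeset_from_env` is a hypothetical helper; an actual implementation would normally ask the C library rather than parse the string itself.

```python
import os


def codeset_from_env(env=None):
    """Return the codeset part of LC_ALL, LC_CTYPE or LANG, or None."""
    if env is None:
        env = os.environ
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if not value:
            continue
        # Drop an optional "@modifier", then take what follows the dot.
        value = value.split("@", 1)[0]
        if "." in value:
            return value.split(".", 1)[1]
        return None  # variable is set but names no codeset (e.g. "C")
    return None


if __name__ == "__main__":
    print(codeset_from_env({"LANG": "ja_JP.UTF-8"}))
```

With LANG=ja_JP.UTF-8 this yields the 'UTF-8' that the Apple and 3.1 interpreters report above, regardless of what ~/.CFUserTextEncoding says.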
Changed 15 years ago by null.atou@…
ID3 v2.3; artist and album tags are in Japanese
comment:4 Changed 15 years ago by null.atou@…
You can test eyeD3 hoge.mp3
% __CF_USER_TEXT_ENCODING=$UID:1:0 /opt/local/bin/eyeD3 hoge.mp3
(LookupError)
% __CF_USER_TEXT_ENCODING=$UID:0:0 /opt/local/bin/eyeD3 hoge.mp3
(snip)
title:          artist: ????????
album: ????????         year: 2009
Hmm, not mojibake but ??????? :-) With a private build from source installed in /usr/local/bin and using /usr/bin/python, the result is as expected.
% LANG=ja_JP.UTF-8 /usr/local/bin/eyeD3 hoge.mp3
hoge.mp3 [ 18.00 KB ]
-------------------------------------------------------------------------------
Time: 00:01 MPEG1, Layer III [ 128 kb/s @ 44100 Hz - Joint stereo ]
-------------------------------------------------------------------------------
ID3 v2.3:
title:          artist: アップルの中の人
album: システムサウンド         year: 2009
comment:5 follow-up: 6 Changed 15 years ago by blb@…
Great, thanks for the reproducible tests; python26 fixed in r58097.
The other question is whether we want to bother with python25, since it doesn't work quite right on 10.6 and isn't as well maintained upstream anymore?
comment:6 Changed 15 years ago by null.atou@…
comment:7 Changed 15 years ago by jmroot (Joshua Root)
Cc: | jwa@… added; mww@… removed |
---|
comment:8 Changed 14 years ago by blb@…
Cc: | mcalhoun@… removed |
---|---|
Owner: | changed from blb@… to mcalhoun@… |
comment:9 Changed 14 years ago by jmroot (Joshua Root)
Cc: | mcalhoun@… added; jwa@… removed |
---|---|
Owner: | changed from mcalhoun@… to jwa@… |
Summary: | py26-eyed3-0.6.17 LookupError: unknown encoding: x-mac-japanese → python25, python26: japanese locale errors |
comment:10 Changed 14 years ago by jmroot (Joshua Root)
Resolution: | → fixed |
---|---|
Status: | new → closed |
Applied to python25 in r74671.
patch to Lib/locale.py and Modules/_localemodule.c