Opened 15 years ago

Closed 14 years ago

#21517 closed enhancement (fixed)

python25, python26: japanese locale errors

Reported by: null.atou@… Owned by: jyrkiwahlstedt
Priority: Normal Milestone:
Component: ports Version: 1.8.0
Keywords: Cc: MarcusCalhoun-Lopez (Marcus Calhoun-Lopez)
Port: python26 python25

Description

When the Python script eyeD3 is run in a Japanese environment, it fails with 'LookupError: unknown encoding: x-mac-japanese'.

% __CF_USER_TEXT_ENCODING=$UID:0:0 eyeD3 hoge.mp3   <--- encoding is 'mac-roman'
(snip; no error, but a lot of mojibake appears!)
% __CF_USER_TEXT_ENCODING=$UID:1:14 eyeD3 hoge.mp3  <--- in a Japanese environment, __CF_USER_TEXT_ENCODING is already set like this
(snip)
Uncaught exception: unknown encoding: x-mac-japanese
Traceback (most recent call last):
  File "/opt/local/bin/eyeD3", line 1215, in <module>
    retval = main();
  File "/opt/local/bin/eyeD3", line 1192, in main
    retval = app.handleFile(f);
  File "/opt/local/bin/eyeD3", line 566, in handleFile
    self.printTag(self.tag);
  File "/opt/local/bin/eyeD3", line 937, in printTag
    "replace"),
LookupError: unknown encoding: x-mac-japanese

This is because python26 (and also python25) does not look at the LANG environment variable (ja_JP.UTF-8 in most cases in Japan), but instead gets the encoding name 'x-mac-japanese' from the CoreFoundation functions CFStringGetSystemEncoding() and CFStringConvertEncodingToIANACharSetName() (see 'Lib/locale.py' and the source file 'Modules/_localemodule.c'). Unfortunately, Python only knows the codecs listed in its codec table: http://docs.python.org/library/codecs#standard-encodings. That table has no 'x-mac-japanese', 'x-mac-trad-chinese', 'x-mac-korean', etc. Here is a simple test:

% __CF_USER_TEXT_ENCODING=$UID:1:14 python -c 'import locale; print "getdefaultlocale is", locale.getdefaultlocale(), ", getpreferredencoding :", locale.getpreferredencoding();'
getdefaultlocale is (None, 'x-mac-japanese') , getpreferredencoding : x-mac-japanese
% __CF_USER_TEXT_ENCODING=$UID:1:14 /usr/bin/python -c 'import locale; print "getdefaultlocale is", locale.getdefaultlocale(), ", getpreferredencoding :", locale.getpreferredencoding();'
getdefaultlocale is ('ja_JP', 'UTF8') , getpreferredencoding : UTF-8

Yes, Apple's python2.6.1 in Snow Leopard (and also Apple's python2.5.1 in Leopard) looks at LANG, not at CF_...
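To illustrate why an unknown encoding name ends in a LookupError, here is a minimal sketch of a defensive lookup, written for a current Python 3. It is not part of the attached patch. The helper name resolve_encoding and the made-up codec name are assumptions for illustration only, and whether a particular name such as 'x-mac-japanese' resolves depends on the Python version in use.

```python
import codecs


def resolve_encoding(name, fallback="utf-8"):
    """Return the canonical name of a codec Python knows, else the fallback's.

    resolve_encoding is a hypothetical helper, not an API of Python or eyeD3.
    """
    try:
        return codecs.lookup(name).name
    except LookupError:
        # codecs.lookup() raises LookupError for names missing from the
        # codec table; this is the error shown in the traceback above.
        return codecs.lookup(fallback).name


# "x-hypothetical-codec" is a made-up name standing in for any encoding
# name that the running interpreter does not recognise.
print(resolve_encoding("x-hypothetical-codec"))
print(resolve_encoding("UTF-8"))
```

A fallback like this only hides the symptom in a single script. The patch discussed in this ticket instead changes where Python gets the encoding name from.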

I referred to web pages (in Japanese) here and here, and made a simple patch around the locale problem. With this patch, python26 looks at the LANG environment variable and gets the well-known encoding 'UTF-8', so eyeD3 runs with no error and no mojibake :-)

By the way, there is no unknown encoding error in python31, because a similar change was applied to Modules/_localemodule.c in version 3.1.

Attachments (2)

patch-locale-from-apple-darwinsource.diff (1.1 KB) - added by null.atou@… 15 years ago.
patch to Lib/locale.py and Modules/_localemodule.c
hoge.mp3 (18.0 KB) - added by null.atou@… 15 years ago.
ID3 v2.3; the artist and album tags are in Japanese


Change History (12)

Changed 15 years ago by null.atou@…

patch to Lib/locale.py and Modules/_localemodule.c

comment:1 Changed 15 years ago by jmroot (Joshua Root)

Cc: mcalhoun@… mww@… added
Keywords: Python locale removed
Owner: changed from macports-tickets@… to blb@…
Port: changed from python26, python25 to python26 python25

You should also open a bug upstream (if you haven't already).

comment:2 Changed 15 years ago by blb@…

This looks like it could be Python issue 1276, which has a different fix (which was only committed to the py3k branch and not to 2.x). Does that fix work as well? Do you have a file which can be used to reproduce this issue (if that mp3 isn't shareable)?

comment:3 Changed 15 years ago by null.atou@…

In a Japanese environment, when a new user is created, the default LANG is set to ja_JP.UTF-8, while the dot file ~/.CFUserTextEncoding (this file affects __CF_USER_TEXT_ENCODING) contains '1:14'. The '1' means that the encoding is CP10001, i.e. x-mac-japanese. But we Japanese users don't use x-mac-japanese in Terminal; we use UTF-8. Yes, we prefer UTF-8, not x-mac-japanese.

At a glance, the approach of python-issue-1276 seems to be adding encodings (x-mac-japanese, etc.) to the codec table. But those patches don't fix the locale problem in an environment with LANG=ja_JP.UTF-8, because LANG is still ignored. I also found that in Python 3.1 (more precisely 3.1rc2) they removed the special routine for darwin or __APPLE__ that uses the CoreFoundation functions mentioned above, so as to follow standard UNIX behavior. This change looks the same as Apple's patches.

% LANG=ja_JP.UTF-8 /usr/bin/python2.6 -c 'import locale; print(locale.getpreferredencoding());'
UTF-8
% LANG=ja_JP.UTF-8 /opt/local/bin/python3.1 -c 'import locale; print(locale.getpreferredencoding());'
UTF-8
% LANG=ja_JP.UTF-8 /opt/local/bin/python2.6 -c 'import locale; print(locale.getpreferredencoding());' 
x-mac-japanese  # bad!
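For readers unfamiliar with the "standard UNIX manner" mentioned above, the following is a rough sketch of how a codeset can be read from the locale environment variables. It is neither the attached patch nor Python's actual implementation. The function name encoding_from_environment is hypothetical, and the LC_ALL, LC_CTYPE, LANG precedence follows the usual POSIX convention.

```python
def encoding_from_environment(env):
    """Return the codeset named by LC_ALL, LC_CTYPE or LANG, or None.

    Hypothetical helper for illustration; `env` is any mapping such as
    os.environ.
    """
    for variable in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(variable)
        if not value:
            continue  # unset or empty: try the next variable
        # Locale names look like language[_territory][.codeset][@modifier].
        value = value.split("@", 1)[0]
        if "." in value:
            return value.split(".", 1)[1]
        return None  # the first variable that is set names no codeset
    return None


print(encoding_from_environment({"LANG": "ja_JP.UTF-8"}))
```

With LANG=ja_JP.UTF-8 this yields UTF-8, which is what the /usr/bin/python2.6 and python3.1 runs above report.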

Changed 15 years ago by null.atou@…

Attachment: hoge.mp3 added

ID3 v2.3; the artist and album tags are in Japanese

comment:4 Changed 15 years ago by null.atou@…

You can test with eyeD3 and hoge.mp3:

% __CF_USER_TEXT_ENCODING=$UID:1:0 /opt/local/bin/eyeD3 hoge.mp3
(LookupError)
% __CF_USER_TEXT_ENCODING=$UID:0:0 /opt/local/bin/eyeD3 hoge.mp3
(snip)
title: 		artist: ????????
album: ????????		year: 2009

Hmm, not mojibake but ??????? :-) With a private build of eyeD3, installed from source into /usr/local/bin and using /usr/bin/python, the result is as expected.

% LANG=ja_JP.UTF-8 /usr/local/bin/eyeD3 hoge.mp3

hoge.mp3	[ 18.00 KB ]
-------------------------------------------------------------------------------
Time: 00:01	MPEG1, Layer III	[ 128 kb/s @ 44100 Hz - Joint stereo ]
-------------------------------------------------------------------------------
ID3 v2.3:
title: 		artist: アップルの中の人
album: システムサウンド		year: 2009
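The rows of question marks in the mac-roman run are consistent with the "replace" argument visible in the eyeD3 traceback in the description. Below is a minimal sketch for a current Python 3, assuming that eyeD3 encodes tag text with the preferred encoding and the "replace" error handler. The ticket does not show that code in full, so treat this as an illustration and not as eyeD3's actual logic.

```python
# Artist tag taken from the expected output above.
artist = "アップルの中の人"

# mac-roman cannot represent these characters, so the "replace" error
# handler substitutes one "?" for each of them instead of raising.
as_mac_roman = artist.encode("mac-roman", "replace")
print(as_mac_roman.decode("mac-roman"))

# UTF-8 can represent every character, so nothing is replaced.
as_utf8 = artist.encode("utf-8", "replace")
print(as_utf8.decode("utf-8"))
```

This would explain why the mac-roman run reports no error but shows no readable text, while the UTF-8 run prints the tags correctly.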

comment:5 Changed 15 years ago by blb@…

Great, thanks for the reproducible tests; python26 fixed in r58097.

The other question is whether we want to bother with python25, since it doesn't work quite right on 10.6 and isn't as well maintained upstream anymore?

comment:6 in reply to:  5 Changed 15 years ago by null.atou@…

Replying to blb@…:

python26 fixed in r58097.

Thank you! eyeD3 works well, and python now looks at LANG. No more errors, no more mojibake. Thank you!

The other question is whether we want to bother with python25

Yes, of course, because python25 has the same behavior with respect to the LANG env-var.

comment:7 Changed 15 years ago by jmroot (Joshua Root)

Cc: jwa@… added; mww@… removed

comment:8 Changed 14 years ago by blb@…

Cc: mcalhoun@… removed
Owner: changed from blb@… to mcalhoun@…

comment:9 Changed 14 years ago by jmroot (Joshua Root)

Cc: mcalhoun@… added; jwa@… removed
Owner: changed from mcalhoun@… to jwa@…
Summary: changed from py26-eyed3-0.6.17 LookupError: unknown encoding: x-mac-japanese to python25, python26: japanese locale errors

comment:10 Changed 14 years ago by jmroot (Joshua Root)

Resolution: fixed
Status: changed from new to closed

Applied to python25 in r74671.
