Opened 17 years ago

Last modified 15 years ago

#11774 closed enhancement

Python >= 2.2 should allow for UCS-4 builds — at Version 2

Reported by: jarkko.laiho@… Owned by: mww@…
Priority: Normal Milestone:
Component: ports Version:
Keywords: python unicode ucs2 ucs4 Cc: jarkko.laiho@…, mww@…, jann@…
Port:

Description (last modified by mww@…)

Since version 2.2, it has been possible to compile Python with the option --enable-unicode=ucs4, enabling Python Unicode strings to contain characters beyond the Basic Multilingual Plane (e.g. unichr(65535+1) would no longer result in a ValueError). Without that specific option, Python defaults to using the rather obsolete UCS-2 (resulting in a "narrow build").

This can be seen by issuing the following commands in the interactive interpreter:

>>> import sys
>>> sys.maxunicode
65535

The result is the same with both the Python version (2.3) that ships with Mac OS X 10.4 Tiger and python25 from MacPorts. I have not tested the ports of other Python versions.

In contrast, my Gentoo Linux system produces the following output, since its Python (as on all modern Linux distributions) is a "wide build":

>>> import sys
>>> sys.maxunicode
1114111
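
The same check can be wrapped in a small function. This is only a sketch: the name build_width is made up for illustration, and the optional argument exists solely so that both cases can be exercised on a single interpreter.

```python
import sys

def build_width(maxunicode=None):
    """Classify a Python build as 'narrow' (UCS-2) or 'wide' (UCS-4).

    build_width is a hypothetical helper name, not part of any library.
    With no argument it inspects the running interpreter.
    """
    if maxunicode is None:
        maxunicode = sys.maxunicode
    # A narrow build caps code points at 0xFFFF (65535);
    # a wide build goes up to 0x10FFFF (1114111).
    if maxunicode == 0xFFFF:
        return "narrow"
    return "wide"

if __name__ == "__main__":
    print(build_width())
```

Passing 65535 or 1114111 explicitly reproduces the two results shown in the sessions above.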

This build option is a source of some confusion in various Python implementations and platforms.

wchar_t on Mac OS X is 32 bits, and PEP 261 states: "It is also proposed that one day --enable-unicode will just default to the width of your platforms wchar_t." UCS-4 builds of Python would seem to make sense on the Mac.
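
The width of wchar_t can be checked from Python itself. A minimal sketch, assuming the ctypes module is available (it is bundled with Python from 2.5 onward); the function name wchar_bits is made up for illustration:

```python
import ctypes

def wchar_bits():
    """Return the width of the platform's wchar_t in bits.

    wchar_bits is a hypothetical helper name. ctypes.c_wchar maps to
    the C type wchar_t, and ctypes.sizeof reports its size in bytes.
    """
    return ctypes.sizeof(ctypes.c_wchar) * 8

if __name__ == "__main__":
    print(wchar_bits())
```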

If not as the new default, could a variant be added to the Python ports >= 2.2 to compile with UCS-4 support? Is there a reason not to go UCS-4?

Change History (2)

comment:1 Changed 17 years ago by roederja

Cc: mww@… jann@… added

mww, what do you think about this?

comment:2 Changed 17 years ago by mww@…

Description: modified (diff)
Owner: changed from macports-dev@… to mww@…
Status: changed from new to assigned

If this does not break compatibility in some way, I have no objection to adding this to the default build options.

Are there any problems that might occur? (besides people using "clever tricks" that include provoking ValueError...)
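
For illustration, a "clever trick" of this kind might look like the sketch below. It is an assumption about what such code could look like, not something taken from a real project, and the function name is made up. Code like this would simply start reporting a wide build after the switch.

```python
try:
    unichr  # Python 2 name
except NameError:
    unichr = chr  # Python 3 has only chr

def is_narrow_build():
    """Detect a narrow build by provoking ValueError.

    Hypothetical example of the trick: on a narrow (UCS-2) Python 2
    build, unichr() rejects code points above 0xFFFF with ValueError.
    """
    try:
        unichr(0x10000)  # first code point beyond the BMP
    except ValueError:
        return True
    return False

if __name__ == "__main__":
    print(is_narrow_build())
```

Checking sys.maxunicode directly is the more straightforward way to ask the same question.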
