Opened 17 years ago

Last modified 15 years ago

#11774 closed enhancement

Python >= 2.2 should allow for UCS-4 builds

Reported by: jarkko.laiho@…
Owned by: macports-dev@…
Priority: Normal
Milestone:
Component: ports
Version:
Keywords: python unicode ucs2 ucs4
Cc: jarkko.laiho@…
Port:

Description

Since version 2.2, Python can be compiled with the option --enable-unicode=ucs4, which allows Python Unicode strings to contain characters beyond the Basic Multilingual Plane (e.g. unichr(65535+1) no longer raises a ValueError). Without that option, Python defaults to the rather obsolete UCS-2, resulting in a "narrow build".

This can be seen by issuing the following commands in the interactive interpreter:

>>> import sys
>>> sys.maxunicode
65535

The result is the same with both the Python (2.3) that ships with Mac OS X 10.4 Tiger and python25 from MacPorts. I have not tested the ports of other Python versions.

In contrast, my Gentoo Linux system produces the following output, since its Python (as in all modern Linux distributions) is a "wide build":

>>> import sys
>>> sys.maxunicode
1114111
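
For anyone who wants to check their own interpreter without remembering the two magic numbers, the comparison in the sessions above can be wrapped in a small script. This is only a sketch: the helper name unicode_build is made up for illustration, and the only real API it relies on is sys.maxunicode.

```python
import sys

BMP_MAX = 0xFFFF        # 65535, highest code point in the Basic Multilingual Plane
UNICODE_MAX = 0x10FFFF  # 1114111, highest code point in Unicode


def unicode_build(maxunicode=None):
    """Classify an interpreter as 'narrow' or 'wide' from sys.maxunicode.

    unicode_build is a hypothetical helper name, not part of Python itself.
    """
    if maxunicode is None:
        maxunicode = sys.maxunicode
    # A narrow (UCS-2) build cannot index code points above the BMP directly.
    return "wide" if maxunicode > BMP_MAX else "narrow"


if __name__ == "__main__":
    print("sys.maxunicode = %d (%s build)" % (sys.maxunicode, unicode_build()))
```

Saved to a file and run with each installed interpreter, this should report "narrow" for the builds described above and "wide" for a UCS-4 build.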

This build option is a source of some confusion across Python implementations and platforms.

wchar_t on Mac OS X is 32 bits, and PEP 261 states: "It is also proposed that one day --enable-unicode will just default to the width of your platforms wchar_t." UCS-4 builds of Python would seem to make sense on the Mac.
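
The wchar_t width can be checked from Python itself, assuming the ctypes module is available (it is in the standard library from Python 2.5 onwards, so this would not work on the 2.3 that ships with Tiger). A minimal sketch, with wchar_bits being a name chosen here for illustration:

```python
import ctypes


def wchar_bits():
    """Return the width of the platform's wchar_t in bits.

    wchar_bits is a hypothetical helper name; ctypes.c_wchar maps to the
    C wchar_t type of the platform Python was built on.
    """
    return ctypes.sizeof(ctypes.c_wchar) * 8


if __name__ == "__main__":
    print("wchar_t is %d bits wide" % wchar_bits())
```

If this reports 32 bits while sys.maxunicode is 65535, the interpreter's Unicode width does not match the platform's wchar_t, which is the mismatch the PEP 261 quote refers to.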

If UCS-4 cannot become the new default, could a variant be added to the ports of Python >= 2.2 so that they can be compiled with UCS-4 support? Is there a reason not to go UCS-4?

Change History (0)
