Benjamin Peterson
3aca40d3cb
closes bpo-36861: Update Unicode database to 12.1.0. (GH-13214)
Adds ㋿.
7 years ago
Inada Naoki
6fec905de5
bpo-36642: make unicodedata const (GH-12855)
7 years ago
Benjamin Peterson
738c19f4c5
closes bpo-33376: Update to Unicode 12.0.0. (GH-12256)
7 years ago
Benjamin Peterson
7c69c1c0fb
update to Unicode 11.0.0 (closes bpo-33778) (GH-7439)
Also, standardize indentation of generated tables.
8 years ago
Benjamin Peterson
279a96206f
bpo-30736: upgrade to Unicode 10.0 (#2344)
Straightforward. While we're at it, though, strip trailing whitespace from generated tables.
9 years ago
Jon Dufresne
3972628de3
bpo-30296: Remove unnecessary tuples, lists, sets, and dicts (#1489)
* Replaced list(<generator expression>) with list comprehension
* Replaced dict(<generator expression>) with dict comprehension
* Replaced set(<list literal>) with set literal
* Replaced builtin func(<list comprehension>) with func(<generator
expression>) when supported (e.g. any(), all(), tuple(), min(), &
max())
9 years ago
Benjamin Peterson
6775231597
Unicode 9.0.0
Not completely mechanical since support for East Asian Width changes—emoji
codepoints became Wide—had to be added to unicodedata.
10 years ago
Benjamin Peterson
4801383c29
upgrade to Unicode 8.0.0
11 years ago
R David Murray
5f16f90d1b
#18176: Change generic UCD PropList link to version specific link.
12 years ago
R David Murray
5bd62420f4
#18176: fix another reference and add it to the makeunicodedata comment.
12 years ago
R David Murray
7445a383a6
#18176: updated stdtypes UCD link, added reminder to makeunicodedata.
Patch by Alexander Belopolsky.
12 years ago
Benjamin Peterson
3032ed7cb1
upgrade to unicode 7.0.0
12 years ago
Benjamin Peterson
94d08d908b
upgrade unicode db to 6.3.0 (closes #19221)
13 years ago
Ezio Melotti
7c4a7e6f3c
#18803: fix more typos. Patch by Févry Thibault.
13 years ago
Antoine Pitrou
9ed5f27266
Issue #18722: Remove uses of the "register" keyword in C code.
13 years ago
Benjamin Peterson
b8350f1c7d
upgrade to UCD 6.2
14 years ago
Florent Xicluna
c20740109d
Some cleanup in the Tools directory.
14 years ago
Benjamin Peterson
71f660e00f
update to Unicode 6.1
14 years ago
Benjamin Peterson
ad9c569825
delta encoding of upper/lower/title makes a glorious return (#12736)
14 years ago
Benjamin Peterson
d5890c8db5
add str.casefold() (closes #13752)
14 years ago
Benjamin Peterson
b2bf01d824
use full unicode mappings for upper/lower/title case (#12736)
Also broaden the category of characters that count as lowercase/uppercase.
14 years ago
Ezio Melotti
931b8aac80
#12753: Add support for Unicode name aliases and named sequences.
15 years ago
Ezio Melotti
2a1e926d63
Fix ResourceWarnings in makeunicodedata.py.
15 years ago
Ezio Melotti
13925008dc
#11565: Fix several typos. Patch by Piotr Kasprzyk.
15 years ago
Martin v. Löwis
5cbc71e50a
Issue #10459: Update CJK character names to Unicode 6.0.
16 years ago
Martin v. Löwis
baecd7243a
Upgrade to Unicode 6.0.0.
makeunicodedata.py: download all data files from unicode.org,
switch to extracting Unihan data from zip file.
Read linebreakprops and derivednormalizationprops even for
old versions, even though they are not used in delta records.
test_unicode.py: U+11000 is now assigned, use U+14000 instead.
16 years ago
Amaury Forgeot d'Arc
feb7307db4
#9210: remove --with-wctype-functions configure option.
The internal unicode database is now always used.
(after 5 years: see http://mail.python.org/pipermail/python-dev/2004-December/050193.html)
16 years ago
Amaury Forgeot d'Arc
324ac65ceb
#5127: Even on narrow unicode builds, the C functions that access the Unicode
Database (Py_UNICODE_TOLOWER, Py_UNICODE_ISDECIMAL, and others) now accept
and return characters from the full Unicode range (Py_UCS4).
The differences from Python code are few:
- unicodedata.numeric(), unicodedata.decimal() and unicodedata.digit()
now return the correct value for large code points
- repr() may consider more characters as printable.
16 years ago
Florent Xicluna
806d8cf0e8
Merged revisions 79494,79496 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r79494 | florent.xicluna | 2010-03-30 10:24:06 +0200 (mar, 30 mar 2010) | 2 lines
#7643: Unicode codepoints VT (0x0B) and FF (0x0C) are linebreaks according to Unicode Standard Annex #14.
........
r79496 | florent.xicluna | 2010-03-30 18:29:03 +0200 (mar, 30 mar 2010) | 2 lines
Highlight the change of behavior related to r79494. Now VT and FF are linebreaks.
........
16 years ago
Florent Xicluna
22b243809e
#7643: Unicode codepoints VT (0x0B) and FF (0x0C) are linebreaks according to Unicode Standard Annex #14.
16 years ago
Florent Xicluna
f089fd67fc
Merged revisions 78982,78986 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r78982 | florent.xicluna | 2010-03-15 15:00:58 +0100 (lun, 15 mar 2010) | 2 lines
Remove py3k deprecation warnings from these Unicode tools.
........
r78986 | florent.xicluna | 2010-03-15 19:08:58 +0100 (lun, 15 mar 2010) | 3 lines
Issue #7783 and #7787: open_urlresource invalidates the outdated files from the local cache.
Use this feature to fix test_normalization.
........
16 years ago
Florent Xicluna
faa663f03d
Fixed a failure in test_bigmem.
Merged revision 79059 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r79059 | florent.xicluna | 2010-03-18 22:50:06 +0100 (jeu, 18 mar 2010) | 2 lines
Issue #8024: Update the Unicode database to 5.2
........
16 years ago
Florent Xicluna
f1789dee30
Revert Unicode UCD 5.2 upgrade in 3.x. It broke repr() for unicode objects, and gave failures in test_bigmem. Revert 79062, 79065 and 79083.
16 years ago
Florent Xicluna
8c8042734a
Missing update from previous changeset r79062.
16 years ago
Florent Xicluna
2e0a53fdf6
Issue #8024: Update the Unicode database to 5.2
16 years ago
Florent Xicluna
dc36472472
Remove py3k deprecation warnings from these Unicode tools.
16 years ago
Amaury Forgeot d'Arc
919765a095
Merged revisions 75396 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r75396 | amaury.forgeotdarc | 2009-10-13 23:29:34 +0200 (mar., 13 oct. 2009) | 3 lines
#7112: Fix compilation warning in unicodetype_db.h
makeunicodedata now generates double literals
........
17 years ago
Amaury Forgeot d'Arc
5c92d4301d
#7112: Fix compilation warning in unicodetype_db.h
makeunicodedata now generates double literals
17 years ago
Amaury Forgeot d'Arc
7d52079395
Merged revisions 75272-75273 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r75272 | amaury.forgeotdarc | 2009-10-06 21:56:32 +0200 (mar., 06 oct. 2009) | 5 lines
#1571184: makeunicodedata.py now generates the functions _PyUnicode_ToNumeric,
_PyUnicode_IsLinebreak and _PyUnicode_IsWhitespace.
It now also parses the Unihan.txt for numeric values.
........
r75273 | amaury.forgeotdarc | 2009-10-06 22:02:09 +0200 (mar., 06 oct. 2009) | 2 lines
Add Anders Chrigstrom to Misc/ACKS for his work on unicodedata.
........
17 years ago
Amaury Forgeot d'Arc
d0052d17b1
#1571184: makeunicodedata.py now generates the functions _PyUnicode_ToNumeric,
_PyUnicode_IsLinebreak and _PyUnicode_IsWhitespace.
It now also parses the Unihan.txt for numeric values.
17 years ago
Antoine Pitrou
7a0fedfd1d
Merged revisions 72054 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r72054 | antoine.pitrou | 2009-04-27 23:53:26 +0200 (lun., 27 avril 2009) | 5 lines
Issue #1734234: Massively speed up `unicodedata.normalize()` when the
string is already in normalized form, by performing a quick check beforehand.
Original patch by Rauli Ruohonen.
........
17 years ago
Antoine Pitrou
e988e286b2
Issue #1734234: Massively speed up `unicodedata.normalize()` when the
string is already in normalized form, by performing a quick check beforehand.
Original patch by Rauli Ruohonen.
17 years ago
Walter Dörwald
1b08b30743
Merged revisions 71894 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r71894 | walter.doerwald | 2009-04-25 16:03:16 +0200 (Sa, 25 Apr 2009) | 4 lines
Issue #5828 (Invalid behavior of unicode.lower): Fixed bogus logic in
makeunicodedata.py and regenerated the Unicode database (This fixes
u'\u1d79'.lower() == '\x00').
........
17 years ago
Walter Dörwald
6c863d1ab2
Merged revisions 71894 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r71894 | walter.doerwald | 2009-04-25 16:03:16 +0200 (Sa, 25 Apr 2009) | 4 lines
Issue #5828 (Invalid behavior of unicode.lower): Fixed bogus logic in
makeunicodedata.py and regenerated the Unicode database (This fixes
u'\u1d79'.lower() == '\x00').
........
17 years ago
Walter Dörwald
5d98ec76bb
Issue #5828 (Invalid behavior of unicode.lower): Fixed bogus logic in
makeunicodedata.py and regenerated the Unicode database (This fixes
u'\u1d79'.lower() == '\x00').
17 years ago
Benjamin Peterson
09832740d1
fix isprintable() on space characters #5126
17 years ago
Mark Dickinson
a56c467ac3
Issue #1717: Remove cmp. Stage 1: remove all uses of cmp and __cmp__ from
the standard library and tests.
17 years ago
Martin v. Löwis
93cbca33f2
Merged revisions 66362 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r66362 | martin.v.loewis | 2008-09-10 15:38:12 +0200 (Mi, 10 Sep 2008) | 3 lines
Issue #3811: The Unicode database was updated to 5.1.
Reviewed by Fredrik Lundh and Marc-Andre Lemburg.
........
18 years ago
Martin v. Löwis
24329ba176
Issue #3811: The Unicode database was updated to 5.1.
Reviewed by Fredrik Lundh and Marc-Andre Lemburg.
18 years ago
Georg Brandl
d52429fb49
Issue #3282: str.isprintable() should return False for undefined Unicode characters.
18 years ago