Benjamin Peterson
b2bf01d824
use full unicode mappings for upper/lower/title case (#12736)
Also broaden the category of characters that count as lowercase/uppercase.
14 years ago
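A small illustration of what "full" case mappings mean at the Python level, assuming Python 3.3 or later. The characters used here are examples chosen for this note and are not taken from the commit itself.

```python
# Full Unicode case mappings may map one code point to several,
# so the result can be longer than the input.
sharp_s_upper = "\u00df".upper()    # LATIN SMALL LETTER SHARP S
ligature_upper = "\ufb01".upper()   # LATIN SMALL LIGATURE FI

# Some characters have a titlecase form distinct from their uppercase form.
dz_title = "\u01c6".title()         # LATIN SMALL LETTER DZ WITH CARON

print(sharp_s_upper, ligature_upper, dz_title)
```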
Antoine Pitrou
94f6fa62bf
Issue #13738: Simplify implementation of bytes.lower() and bytes.upper().
14 years ago
Victor Stinner
3fe553160c
Add a new PyUnicode_Fill() function
It is faster than the unicode_fill() function which was implemented in
formatter_unicode.c.
14 years ago
Benjamin Peterson
5e458f520c
also decref the right thing
14 years ago
Benjamin Peterson
4c13a4a352
ready the correct string
14 years ago
Benjamin Peterson
22a29708fd
fix some possible refleaks from PyUnicode_READY error conditions
14 years ago
Benjamin Peterson
9ca3ffac94
== -1 is convention
14 years ago
Benjamin Peterson
e157cf1012
make switch more robust
14 years ago
Benjamin Peterson
2199227be4
fix weird indentation
14 years ago
Antoine Pitrou
5b62942074
Issue #13577: Built-in methods and functions now have a __qualname__.
Patch by sbt.
14 years ago
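A quick sketch of the attribute this change exposes, assuming Python 3.3 or later. The objects inspected are arbitrary examples picked for this note.

```python
# Built-in functions and methods carry a qualified name.
append_name = [].append.__qualname__        # bound built-in method
fromkeys_name = dict.fromkeys.__qualname__  # built-in class method
len_name = len.__qualname__                 # plain built-in function

print(append_name, fromkeys_name, len_name)
```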
Benjamin Peterson
c0b95d18fa
4 space indentation
14 years ago
Benjamin Peterson
ead6b53659
fix spacing around switch statements
14 years ago
Benjamin Peterson
53aa1d7c57
fix possible if unlikely leak
14 years ago
Georg Brandl
ac0675cc01
Small clarification in docstring of dict.update(): the positional argument is not required.
14 years ago
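The behaviour the docstring clarification refers to can be checked directly; the dictionary contents below are made up for illustration.

```python
d = {"a": 1}

d.update()               # no positional argument at all: nothing changes
d.update(b=2)            # keyword arguments only
d.update({"c": 3}, d=4)  # positional mapping plus keyword arguments

print(d)
```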
Victor Stinner
6099a03202
Issue #13624: Write a specialized UTF-8 encoder to allow more optimization
The main bottleneck was the PyUnicode_READ() macro.
14 years ago
Victor Stinner
73f53b57d1
Optimize str * n for len(str)==1 and UCS-2 or UCS-4
14 years ago
Victor Stinner
f644110816
Issue #13621: Optimize str.replace(char1, char2)
Use findchar() which is more optimized than a dummy loop using
PyUnicode_READ(). PyUnicode_READ() is a complex and slow macro.
14 years ago
Victor Stinner
f8eac00779
Issue #13623: Fix a performance regression introduced by issue #12170 in
bytes.find() and correctly handle OverflowError (raise the same ValueError as
the error for -1).
14 years ago
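A hedged sketch of the error behaviour described above, as observed on a recent CPython 3: an integer argument outside range(0, 256) is expected to produce ValueError, whether it is slightly out of range or too large for a machine integer. The helper name `find_error` is hypothetical and exists only for this example.

```python
def find_error(value):
    """Return the name of the exception raised by b'abc'.find(value), or None."""
    try:
        b"abc".find(value)
    except (ValueError, OverflowError) as exc:
        return type(exc).__name__
    return None

negative_error = find_error(-1)    # just below the valid byte range
huge_error = find_error(2 ** 100)  # does not fit in a C integer
valid_index = b"abc".find(98)      # 98 is ord("b")

print(negative_error, huge_error, valid_index)
```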
Victor Stinner
bb2e9c477d
Issue #11231: Fix bytes and bytearray docstrings
Patch written by Brice Berna.
14 years ago
Benjamin Peterson
f2fe7f0881
fix possible NULL dereference
14 years ago
Victor Stinner
2f197078fb
The locale decoder raises a UnicodeDecodeError instead of an OSError
Search the invalid character using mbrtowc().
14 years ago
Victor Stinner
1b57967b96
Issue #13560: Locale codec functions use the classic "errors" parameter,
instead of surrogateescape
So it would be possible to support more error handlers later.
14 years ago
Victor Stinner
ab59594326
What's New in Python 3.3: complete the deprecation list
Also add FIXMEs in unicodeobject.c
14 years ago
Victor Stinner
1f33f2b0c3
Issue #13560: os.strerror() now uses the current locale encoding instead of UTF-8
14 years ago
Victor Stinner
f2ea71fcc8
Issue #13560: Add PyUnicode_EncodeLocale()
* Use PyUnicode_EncodeLocale() in time.strftime() if wcsftime() is not
available
* Document my last changes in Misc/NEWS
14 years ago
Victor Stinner
af02e1c85a
Add PyUnicode_DecodeLocaleAndSize() and PyUnicode_DecodeLocale()
* PyUnicode_DecodeLocaleAndSize() and PyUnicode_DecodeLocale() decode a string
from the current locale encoding
* _Py_char2wchar() writes an "error code" in the size argument to indicate
if the function failed because of memory allocation failure or because of a
decoding error. The function doesn't write the error message directly to
stderr.
* Fix time.strftime() (if wcsftime() is missing): decode strftime() result
from the current locale encoding, not from the filesystem encoding.
14 years ago
Antoine Pitrou
093ce9cd8c
Issue #6695: Full garbage collection runs now clear the freelist of set objects.
Initial patch by Matthias Troffaes.
14 years ago
Benjamin Peterson
bfebb7b54a
improve abstract property support (closes #11610)
Thanks to Darren Dale for patch.
14 years ago
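One way the improved support shows up in Python code, assuming Python 3.4 or later for `abc.ABC`: `property` can be stacked on top of `abc.abstractmethod`. The `Shape` and `Square` classes are hypothetical examples, not part of the patch.

```python
import abc

class Shape(abc.ABC):
    @property
    @abc.abstractmethod
    def area(self):
        """Concrete subclasses must provide this property."""

class Square(Shape):
    def __init__(self, side):
        self.side = side

    @property
    def area(self):
        return self.side ** 2

# The abstract base cannot be instantiated while 'area' is abstract.
try:
    Shape()
    base_instantiable = True
except TypeError:
    base_instantiable = False

square_area = Square(3).area
print(base_instantiable, square_area)
```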
Antoine Pitrou
e0e2735f41
Fix OSError.__init__ and OSError.__new__ so that each of them can be
overridden and take additional arguments (followup to issue #12555).
14 years ago
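A minimal sketch of the kind of subclass this fix is meant to allow, assuming Python 3.3 or later. The class `TaggedOSError` and its `tag` argument are hypothetical.

```python
class TaggedOSError(OSError):
    """OSError subclass whose __init__ accepts one extra argument."""

    def __init__(self, errno, strerror, tag):
        # Forward only the standard arguments to OSError.
        super().__init__(errno, strerror)
        self.tag = tag

err = TaggedOSError(2, "No such file or directory", "config-loader")
print(err.errno, err.strerror, err.tag)
```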
Antoine Pitrou
2e872082f6
Fix the fix for issue #12149: it was incorrect, although it had the side
effect of appearing to resolve the issue. Thanks to Mark Shannon for
noticing.
14 years ago
Florent Xicluna
aa6c1d240f
Issue #13575: there is only one class type.
14 years ago
Antoine Pitrou
9d57481f04
Issue #13577: various kinds of descriptors now have a __qualname__ attribute.
Patch by sbt.
14 years ago
Victor Stinner
16e6a80923
PyUnicode_Resize(): warn about canonical representation
Call also directly unicode_resize() in unicodeobject.c
14 years ago
Victor Stinner
b0a82a6a7f
Fix PyUnicode_Resize() for compact string: leave the string unchanged on error
Fix also PyUnicode_Resize() doc
14 years ago
Victor Stinner
bf6e560d0c
Make PyUnicode_Copy() private => _PyUnicode_Copy()
Undocument the function.
Also make decode_utf8_errors() private (static).
14 years ago
Victor Stinner
7a9105a380
resize_copy() now supports legacy ready strings
14 years ago
Victor Stinner
488fa49acf
Rewrite PyUnicode_Append(); unicode_modifiable() is more strict
* Rename unicode_resizable() to unicode_modifiable()
* Rename _PyUnicode_Dirty() to unicode_check_modifiable() to make it clear
that the function is private
* Inline PyUnicode_Concat() and unicode_append_inplace() in PyUnicode_Append()
to simplify the code
* unicode_modifiable() returns 0 if the hash has been computed or if the string
is not an exact unicode string
* Remove _PyUnicode_DIRTY(): no need to reset the hash anymore, because if the
hash has already been computed, you cannot modify a string inplace anymore
* PyUnicode_Concat() checks for integer overflow
14 years ago
Victor Stinner
c4b495497a
Create unicode_result_unchanged() subfunction
14 years ago
Victor Stinner
eaab604829
Fix fixup() for unchanged unicode subtype
If maxchar_new == 0 and self is a unicode subtype, return u instead of duplicating u.
14 years ago
Victor Stinner
e6b2d4407a
unicode_fromascii() doesn't check string content twice in debug mode
_PyUnicode_CheckConsistency() also checks string content.
14 years ago
Victor Stinner
a1d12bb119
Call directly PyUnicode_DecodeUTF8Stateful() instead of PyUnicode_DecodeUTF8()
* Remove micro-optimization from PyUnicode_FromStringAndSize():
PyUnicode_DecodeUTF8Stateful() already has these optimizations (for size=0
and one ascii char).
* Rename utf8_max_char_size_and_char_count() to utf8_scanner(), and remove a
useless variable
14 years ago
Victor Stinner
382955ff4e
Use directly unicode_empty instead of PyUnicode_New(0, 0)
14 years ago
Victor Stinner
785938eebd
Move the slowest UTF-8 decoder to its own subfunction
* Create decode_utf8_errors()
* Reuse unicode_fromascii()
* decode_utf8_errors() doesn't refit at the beginning
* Remove refit_partial_string(), use unicode_adjust_maxchar() instead
14 years ago
Victor Stinner
84def3774d
Fix error handling in resize_compact()
14 years ago
Victor Stinner
8faf8216e4
PyUnicode_FromWideChar() and PyUnicode_FromUnicode() raise a ValueError if a
character is not in range [U+0000; U+10ffff].
14 years ago
Antoine Pitrou
b0e1f8b38b
Issue #13503: Use a more efficient reduction format for bytearrays with
pickle protocol >= 3. The old reduction format is kept with older
protocols in order to allow unpickling under Python 2.
Patch by Irmen de Jong.
14 years ago
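The two reduction formats can be inspected from Python. This is a sketch based on the behaviour of a recent CPython 3, where protocol 2 reduces to a latin-1 decoded str and protocol 3 reduces to bytes. The sample data is arbitrary.

```python
import pickle

data = bytearray(b"\x00\xffabc")

# Round-trip through the older and the newer protocol.
roundtrips = {proto: pickle.loads(pickle.dumps(data, proto)) for proto in (2, 3)}

# Constructor arguments produced by each reduction format.
args_v2 = data.__reduce_ex__(2)[1]  # str-based, readable by Python 2
args_v3 = data.__reduce_ex__(3)[1]  # bytes-based, more compact

print(type(args_v2[0]).__name__, type(args_v3[0]).__name__)
```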
Victor Stinner
0a54cf12a0
Fix PyObject_Repr(): don't call PyUnicode_READY() if res is NULL
14 years ago
Victor Stinner
b37b17423b
Replace PyUnicode_FromUnicode(NULL, 0) by PyUnicode_New(0, 0)
Create an empty string with the new Unicode API.
14 years ago
Victor Stinner
db88ae5d66
PyObject_Repr() ensures that the result is a ready Unicode string
And PyObject_Str() and PyObject_Repr() don't make strings ready in debug
mode to ensure that the caller makes the string ready before using it.
14 years ago
Victor Stinner
551ac95733
Py_UNICODE_HIGH_SURROGATE() and Py_UNICODE_LOW_SURROGATE() macros
And use surrogates macros everywhere in unicodeobject.c
14 years ago
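For readers unfamiliar with surrogate pairs, the arithmetic behind such macros can be sketched in Python. The functions below are illustrative stand-ins written for this note, not the C macros themselves, and they assume a code point in the range U+10000 to U+10FFFF.

```python
def high_surrogate(cp):
    """Leading UTF-16 surrogate for a code point above U+FFFF."""
    return 0xD800 + ((cp - 0x10000) >> 10)

def low_surrogate(cp):
    """Trailing UTF-16 surrogate for a code point above U+FFFF."""
    return 0xDC00 + ((cp - 0x10000) & 0x3FF)

cp = 0x1F600
pair = (high_surrogate(cp), low_surrogate(cp))

# Cross-check against the UTF-16 codec (big-endian, no BOM).
encoded = chr(cp).encode("utf-16-be")
print([hex(unit) for unit in pair], encoded.hex())
```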