Berker Peksag
ced8d4c6eb
Issue #27454 : Use PyDict_SetDefault in PyUnicode_InternInPlace
Patch by INADA Naoki.
10 years ago
Serhiy Storchaka
9305d83425
Issue #26754 : PyUnicode_FSDecoder() accepted a filename argument encoded as
an iterable of integers. Now only strings and byte-like objects are accepted.
10 years ago
Martin Panter
e26da7c03a
Issue #27171 : Fix typos in documentation, comments, and test function names
10 years ago
Serhiy Storchaka
dd40fc3e57
Issue #26765 : Moved common code and docstrings for bytes and bytearray methods
to bytes_methods.c.
10 years ago
Martin Panter
6245cb3c01
Correct “an” → “a” with “Unicode”, “user”, “UTF”, etc.
This affects documentation, code comments, and debugging messages.
10 years ago
Serhiy Storchaka
21a663ea28
Issue #26057 : Got rid of unneeded use of PyUnicode_FromObject().
10 years ago
Serhiy Storchaka
57a01d3a0e
Issue #26200 : Added Py_SETREF and replaced Py_XSETREF with Py_SETREF
in places where Py_DECREF was used.
10 years ago
Serhiy Storchaka
48842714b9
Issue #22570 : Renamed Py_SETREF to Py_XSETREF.
10 years ago
Serhiy Storchaka
fbb1c5ee06
Issue #26494 : Fixed crash on iterating exhausting iterators.
Affected classes are generic sequence iterators, iterators of str, bytes,
bytearray, list, tuple, set, frozenset, dict, OrderedDict, corresponding
views and os.scandir() iterator.
10 years ago
Victor Stinner
337986740f
Issue #26464 : Fix unicode_fast_translate() again
Initialize i variable if the string is non-ASCII.
10 years ago
Victor Stinner
6c9aa8f2bf
Fix str.translate()
Issue #26464 : Fix str.translate() when the string is ASCII and the first
replacement removes a character, but a later replacement uses a non-ASCII
character or a string longer than 1 character. Regression introduced in Python 3.5.0.
10 years ago
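A small Python sketch of the case described in this commit message. The translation table is a hypothetical example, not one taken from the issue: the first mapping deletes a character, and the later mappings produce a non-ASCII character and a multi-character string.

```python
# Hypothetical table: delete "a", map "b" to a non-ASCII character,
# and map "c" to a string longer than one character.
table = {ord("a"): None, ord("b"): "\xe9", ord("c"): "xyz"}

# The input is pure ASCII, which is the case named in the commit message.
result = "abc".translate(table)
print(result)
```

On an interpreter that includes the fix, the deleted character is dropped and the remaining two replacements are applied in order.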
Victor Stinner
5bc03a6d4d
Fix resize_compact()
Issue #26217 : resize_compact() must set wstr_length to 0 after freeing the wstr
string. Otherwise, an assertion fails in _PyUnicode_CheckConsistency().
10 years ago
Serhiy Storchaka
191321d11b
Issue #20440 : More use of Py_SETREF.
This patch is manually crafted and contains changes that couldn't be handled
automatically.
10 years ago
Serhiy Storchaka
ef1585eb9a
Issue #25923 : Added more const qualifiers to signatures of static and private functions.
10 years ago
Serhiy Storchaka
2d06e84455
Issue #25923 : Added the const qualifier to static constant arrays.
10 years ago
Serhiy Storchaka
5a57ade58e
Issue #20440 : Massive replacement of unsafe attribute-setting code with the
special macro Py_SETREF.
10 years ago
Serhiy Storchaka
9b3a2eec1c
Issues #25890 , #25891 , #25892 : Removed unused variables in Windows code.
Reported by Alexander Riccio.
10 years ago
Serhiy Storchaka
31b9410654
Issue #25709 : Fixed problem with in-place string concatenation and utf-8 cache.
10 years ago
Serhiy Storchaka
e800941d66
Issue #25709 : Fixed problem with in-place string concatenation and utf-8 cache.
10 years ago
Serhiy Storchaka
7aa690860e
Issue #25709 : Fixed problem with in-place string concatenation and utf-8 cache.
10 years ago
Benjamin Peterson
a4d33b3428
make the PyUnicode_FSConverter cleanup set the decrefed argument to NULL (closes #25630)
10 years ago
Serhiy Storchaka
413fdcea21
Issue #24821 : Refactor STRINGLIB(fastsearch_memchr_1char) and split it into
STRINGLIB(find_char) and STRINGLIB(rfind_char), which can be used independently
without special preconditions.
10 years ago
Serhiy Storchaka
d65c9496da
Issue #25523 : Further a-to-an corrections.
10 years ago
Victor Stinner
358af13526
Issue #25353 : Optimize unicode escape and raw unicode escape encoders to use
the new _PyBytesWriter API.
10 years ago
Victor Stinner
6c2cdae9e6
Writer APIs: use empty string singletons
Modify _PyBytesWriter_Finish() and _PyUnicodeWriter_Finish() to return the
empty bytes/Unicode string if the string is empty.
10 years ago
Victor Stinner
6bd525b656
Optimize error handlers of ASCII and Latin1 encoders when the replacement
string is pure ASCII: use _PyBytesWriter_WriteBytes() instead of checking
individual characters.
Cleanup unicode_encode_ucs1():
* Rename repunicode to rep
* Clear rep object on error
* Factorize code between bytes and unicode path
10 years ago
Victor Stinner
ce179bf6ba
Add _PyBytesWriter_WriteBytes() to factorize the code
10 years ago
Victor Stinner
ad7715891e
_PyBytesWriter: simplify code to avoid "prealloc" parameters
Subtract preallocated bytes from min_size before calling
_PyBytesWriter_Prepare().
10 years ago
Victor Stinner
3fa36ff5e4
Issue #25318 : Fix backslashreplace()
Fix code to estimate the needed space.
10 years ago
Victor Stinner
797485e101
Issue #25318 : Avoid sprintf() in backslashreplace()
Rewrite backslashreplace() to be closer to PyCodec_BackslashReplaceErrors().
Also add unit tests for non-BMP characters.
10 years ago
Victor Stinner
0016507c16
Issue #25318 : Move _PyBytesWriter to bytesobject.c
Also declare the private API in bytesobject.h.
10 years ago
Victor Stinner
e7bf86cd7d
Optimize backslashreplace error handler
Issue #25318 : Optimize backslashreplace and xmlcharrefreplace error handlers in
the UTF-8 encoder. Also optimize the backslashreplace error handler for the
ASCII and Latin1 encoders.
Use the new _PyBytesWriter API to optimize these error handlers for the
encoders. This avoids creating an exception and calling the slow implementation
of the error handler.
10 years ago
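For reference, the observable behavior of the error handlers named in this commit can be sketched from Python. The sample strings are illustrative assumptions; the sketch shows only the output of the handlers, not the optimized C code path.

```python
# Sample text containing U+00E9 and U+20AC (illustrative values).
text = "caf\xe9 \u20ac"

# ASCII cannot encode either character; Latin1 can encode U+00E9 only.
ascii_backslash = text.encode("ascii", "backslashreplace")
latin1_backslash = text.encode("latin-1", "backslashreplace")

# The UTF-8 encoder fails only on lone surrogates such as U+DC80.
lone = "\udc80"
utf8_backslash = lone.encode("utf-8", "backslashreplace")
utf8_xmlref = lone.encode("utf-8", "xmlcharrefreplace")

print(ascii_backslash, latin1_backslash, utf8_backslash, utf8_xmlref)
```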
Victor Stinner
fdfbf78114
Issue #25318 : Add _PyBytesWriter API
Add a new private API to optimize Unicode encoders. It uses a small buffer
allocated on the stack and supports overallocation.
Use _PyBytesWriter API for UCS1 (ASCII and Latin1) and UTF-8 encoders. Enable
overallocation for the UTF-8 encoder with error handlers.
unicode_encode_ucs1(): initialize collend to collstart+1 so that the current
character is not checked twice; we already know that it is not ASCII.
10 years ago
Victor Stinner
74e8fac3c8
Issue #25301 : Fix compatibility with ISO C90
10 years ago
Victor Stinner
1d65d9192d
Issue #25301 : The UTF-8 decoder is now up to 15 times as fast for error
handlers: ``ignore``, ``replace`` and ``surrogateescape``.
10 years ago
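The three decoder error handlers named in this commit behave as follows from Python. The input bytes are an illustrative assumption: a single invalid byte, 0xFF, embedded in ASCII text.

```python
# 0xFF can never appear in valid UTF-8, so it triggers the error handler.
data = b"abc\xffdef"

ignored = data.decode("utf-8", "ignore")            # drops the bad byte
replaced = data.decode("utf-8", "replace")          # inserts U+FFFD
escaped = data.decode("utf-8", "surrogateescape")   # maps 0xFF to U+DCFF

# surrogateescape is reversible: encoding restores the original bytes.
round_trip = escaped.encode("utf-8", "surrogateescape")

print(repr(ignored), repr(replaced), repr(escaped), round_trip)
```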
Victor Stinner
eb36fdaad8
Fix _PyUnicodeWriter_PrepareKind()
Initialize kind to 0 (PyUnicode_WCHAR_KIND) to ensure that
_PyUnicodeWriter_PrepareKind() correctly handles a read-only buffer by copying
the buffer.
10 years ago
Serhiy Storchaka
28b21e50c8
Issue #24848 : Fixed bugs in UTF-7 decoding of malformed data:
1. Non-ASCII bytes were accepted after a shift sequence.
2. A low surrogate could be emitted in case of an error in a high surrogate.
10 years ago
Victor Stinner
3222da26fe
Make _PyUnicode_TranslateCharmap() symbol private
unicodeobject.h exposes PyUnicode_TranslateCharmap() and PyUnicode_Translate().
10 years ago
Victor Stinner
01ada3996b
Issue #25267 : The UTF-8 encoder is now up to 75 times as fast for error
handlers: ``ignore``, ``replace``, ``surrogateescape``, ``surrogatepass``.
Patch co-written with Serhiy Storchaka.
10 years ago
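On the encoding side, the UTF-8 codec rejects only lone surrogates, so the four handlers listed in this commit can be compared on one such string. The sample value is an illustrative assumption.

```python
# U+DC80 is a lone low surrogate, which strict UTF-8 encoding rejects.
text = "a\udc80b"

ignored = text.encode("utf-8", "ignore")             # drops the surrogate
replaced = text.encode("utf-8", "replace")           # substitutes "?"
escaped = text.encode("utf-8", "surrogateescape")    # emits raw byte 0x80
passed = text.encode("utf-8", "surrogatepass")       # emits the 3-byte form

print(ignored, replaced, escaped, passed)
```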
Victor Stinner
c3713e9706
Optimize ascii/latin1+surrogateescape encoders
Issue #25227 : Optimize ASCII and latin1 encoders with the ``surrogateescape``
error handler: the encoders are now up to 3 times as fast.
Initial patch written by Serhiy Storchaka.
10 years ago
Victor Stinner
0030cd52da
Issue #25227 : Cleanup unicode_encode_ucs1() error handler
* Change limit type from unsigned int to Py_UCS4, to use the same type as the
"ch" variable (a Unicode character).
* Reuse ch variable for _Py_ERROR_XMLCHARREFREPLACE
* Add some newlines for readability
10 years ago
Victor Stinner
54385b206d
Issue #24870 : revert unwanted change
Sorry, I pushed the patch on the UTF-8 decoder by mistake :-(
10 years ago
Victor Stinner
5ebae87628
Issue #25207 , #14626 : Fix my commit.
It doesn't work to use "#define XXX defined(YYY)" and then "#ifdef XXX"
to check YYY.
10 years ago
Victor Stinner
6174474bea
_PyUnicodeWriter_PrepareInternal(): make the assertion more strict
10 years ago
Victor Stinner
ca9381ea01
Issue #24870 : Add _PyUnicodeWriter_PrepareKind() macro
Add a macro which ensures that the writer has at least the requested kind.
10 years ago
Victor Stinner
5014920cb7
Issue #24870 : Reuse the new _Py_error_handler enum
Factorize code with the new get_error_handler() function.
Add some empty lines for readability.
10 years ago
Victor Stinner
f96418de05
Issue #24870 : Optimize the ASCII decoder for error handlers: surrogateescape,
ignore and replace. Initial patch written by Naoki Inada.
The decoder is now up to 60 times as fast for these error handlers.
Also add unit tests for the ASCII decoder.
10 years ago
Zachary Ware
79b98df023
Issue #21279 : Flesh out str.translate docs
Initial patch by Kinga Farkas, Martin Panter, and John Posner.
11 years ago
Raymond Hettinger
ac2ef65c32
Make the unicode equality test an external function rather than in-lining it.
The real benefit of the unicode specialized function comes from
bypassing the overhead of PyObject_RichCompareBool() and not
from being in-lined (especially since there was almost no shared
data between the caller and callee). Also, the in-lining was
having a negative effect on code generation for the callee.
11 years ago
Serhiy Storchaka
d4ea03c785
Issue #24284 : The startswith and endswith methods of the str class no longer
return True when finding the empty string and the indexes are completely out
of range.
11 years ago
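A short Python sketch of the behavior described in this commit, as seen on an interpreter that includes the change. The string and indexes are illustrative assumptions.

```python
s = "abc"

# A start index equal to len(s) is still in range, so the empty
# string is found there.
at_end_start = s.startswith("", 3)
at_end_end = s.endswith("", 3)

# A start index past the end is completely out of range, so even
# the empty string is no longer reported as a match.
past_end_start = s.startswith("", 4)
past_end_end = s.endswith("", 4)

print(at_end_start, at_end_end, past_end_start, past_end_end)
```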