Victor Stinner
0d92c4f667
Issue #16416 : Fix error handling in the _Py_wchar2char() and _Py_char2wchar() functions
14 years ago
Victor Stinner
fc009eff9e
Close #16311 : Use the _PyUnicodeWriter API in text decoders
* Remove unicode_widen(): replaced with _PyUnicodeWriter_Prepare()
* Remove unicode_putchar(): replaced with
_PyUnicodeWriter_Prepare() + PyUnicode_WRITER()
* When handling a decoding error, only overallocate the buffer by +25%
instead of +100%
14 years ago
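The +25% overallocation mentioned in commit fc009eff9e can be illustrated with a minimal sketch. The helper name `grow_by_quarter` is hypothetical and is not taken from CPython; it only shows the arithmetic of reserving a quarter of extra headroom with an overflow guard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not CPython's code): return a capacity that holds
   `needed` items plus 25% headroom, or 0 if the sum would overflow. */
static size_t
grow_by_quarter(size_t needed)
{
    size_t extra = needed / 4;           /* +25% rather than +100% */
    if (needed > SIZE_MAX - extra)
        return 0;                        /* overflow: caller must report an error */
    assert(needed + extra >= needed);    /* cannot wrap after the check above */
    return needed + extra;
}
```

With this sketch, a request for 100 items reserves 125, so a few follow-up writes fit without another reallocation while wasting less memory than doubling would.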
Ezio Melotti
f7ed5d111b
#8271 : the utf-8 decoder now outputs the correct number of U+FFFD characters when used with the "replace" error handler on invalid utf-8 sequences. Patch by Serhiy Storchaka, tests by Ezio Melotti.
14 years ago
Benjamin Peterson
c43112823b
initialize more global type objects ( closes #16369 )
14 years ago
Victor Stinner
e64322e034
Close #14625 : Rewrite the UTF-32 decoder. It is now 3x to 4x faster
Patch written by Serhiy Storchaka.
14 years ago
Victor Stinner
76df43de30
Issue #16330 : Use surrogate-related macros
Patch written by Serhiy Storchaka.
14 years ago
Mark Dickinson
fb90c0934c
Issue #14700 : Fix buggy overflow checks for large precision and width in new-style and old-style formatting.
14 years ago
Victor Stinner
c6cf1ba29e
Replace usage of the deprecated Py_UNICODE_COPY() with Py_MEMCPY() in resize_copy()
14 years ago
Victor Stinner
fe75fb4b3e
Optimize _PyUnicode_HasNULChars(): use findchar() instead of PyUnicode_Contains()
14 years ago
Victor Stinner
6fa627578a
Inline raise_translate_exception(): it is only used once
14 years ago
Victor Stinner
e5567ad236
Optimize PyUnicode_RichCompare() for Py_EQ and Py_NE: always use memcmp()
14 years ago
Christian Heimes
743e0cd6b5
Issue #16166 : Add PY_LITTLE_ENDIAN and PY_BIG_ENDIAN macros and unified
endianness detection and handling.
14 years ago
Chris Jerdonek
83fe2e1c22
Issue #14783 : Improve int() docstring and also str(), range(), and slice().
This commit rewrites the docstring for int() to incorporate the documentation
changes made in issue #16036 . It also switches the docstrings for int(),
str(), range(), and slice() to use multi-line signatures.
14 years ago
Victor Stinner
4c63a972d1
Cleanup PyUnicode_FromFormatV() for zero padding
Skip the "0" instead of parsing it twice: once to detect zero padding and
then again as a digit of the width.
14 years ago
Victor Stinner
15a1136547
Issue #16147 : PyUnicode_FromFormatV() no longer needs to allocate a buffer
on the heap to format numbers.
14 years ago
Victor Stinner
ff5a848db5
Issue #16147 : PyUnicode_FromFormatV() now raises an error if the argument of
'%c' is not in the range(0x110000).
14 years ago
Victor Stinner
3921e90c5a
Issue #16147 : PyUnicode_FromFormatV() now detects integer overflow when parsing
width and precision
14 years ago
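The kind of overflow detection described in commit 3921e90c5a can be sketched as follows. This is an illustration under assumptions, not the code from the commit: `parse_width` is a hypothetical name, and the check simply refuses any value that would exceed `INT_MAX`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (not CPython's code): parse the decimal digits at *p
   into *out and advance *p past them.
   Returns 0 on success, -1 if the value would exceed INT_MAX. */
static int
parse_width(const char **p, int *out)
{
    int value = 0;

    assert(p != NULL && *p != NULL && out != NULL);
    while (**p >= '0' && **p <= '9') {
        int digit = **p - '0';
        /* Test before multiplying so that value * 10 + digit never wraps. */
        if (value > (INT_MAX - digit) / 10)
            return -1;
        value = value * 10 + digit;
        (*p)++;
    }
    *out = value;
    return 0;
}
```

The comparison is done before the multiplication because signed overflow is undefined behaviour in C, so it cannot be detected reliably after the fact.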
Victor Stinner
e215d960be
Issue #16147 : Rewrite PyUnicode_FromFormatV() to use _PyUnicodeWriter API
* Simplify the code: replace 4 steps with a single step using the
_PyUnicodeWriter API. PyUnicode_Format() has the same design. This avoids
storing intermediate results, which required allocating an array of pointers
on the heap.
* Use the _PyUnicodeWriter API for speed (and its convenient API):
overallocate the buffer to reduce the number of "realloc()" calls
* Implement "width" and "precision" in Python, don't rely on sprintf(). This
avoids the need for a temporary buffer allocated on the heap: only a small
buffer allocated on the stack is used.
* Add _PyUnicodeWriter_WriteCstr() function
* Split PyUnicode_FromFormatV() into two functions: add
unicode_fromformat_arg().
* Inline parse_format_flags(): the format of an argument is now only parsed
once, so a subfunction is no longer needed.
* Optimize PyUnicode_FromFormatV() for characters between two "%" arguments:
search the next "%" and copy the substring in one chunk, instead of copying
character per character.
14 years ago
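The last optimization listed in commit e215d960be, copying the literal text between two "%" arguments in one chunk, can be sketched like this. The helper `copy_literal_run` and its buffer-based interface are assumptions made for illustration; the real code writes through the _PyUnicodeWriter API instead of a plain char buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not CPython's code): copy the literal run at the
   start of `fmt`, up to the next '%' or the end of the string, into `out`
   with a single memcpy(). Returns the number of bytes copied, or
   (size_t)-1 if `out` is too small. */
static size_t
copy_literal_run(const char *fmt, char *out, size_t out_size)
{
    const char *end;
    size_t len;

    assert(fmt != NULL && out != NULL);
    end = strchr(fmt, '%');                      /* find the next argument */
    len = end ? (size_t)(end - fmt) : strlen(fmt);
    if (len > out_size)
        return (size_t)-1;
    memcpy(out, fmt, len);                       /* one chunk, no per-char loop */
    return len;
}
```

A caller would invoke this in a loop, alternating between copying a literal run and formatting the argument that follows the "%".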
Mark Dickinson
c04ddff290
Issue #16096 : Fix several occurrences of potential signed integer overflow. Thanks Serhiy Storchaka.
14 years ago
Victor Stinner
8c6db45d3e
In debug mode, unicode_write_cstr() now checks that non-ASCII characters are
not written into an ASCII string
14 years ago
Ezio Melotti
e7f90375b1
#16127 : remove outdated references to narrow builds. Patch by Serhiy Storchaka.
14 years ago
Victor Stinner
1929407406
Fix PyUnicode_Format(): return NULL if PyUnicode_READY(uformat) failed
This error cannot occur in practice: PyUnicode_FromObject() always returns
a "ready" string.
14 years ago
Victor Stinner
770e19e0cc
Optimize unicode_compare(): use memcmp() when comparing two UCS1 strings
14 years ago
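The idea behind commit 770e19e0cc can be sketched as follows, assuming both strings store one byte per character (UCS1). The function name `compare_ucs1` and its signature are hypothetical. memcmp() compares bytes as unsigned char, which matches code point order for one-byte strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not CPython's code): three-way comparison of two
   one-byte-per-character strings. Returns -1, 0 or 1. */
static int
compare_ucs1(const unsigned char *a, size_t alen,
             const unsigned char *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;
    int cmp;

    assert(a != NULL && b != NULL);
    cmp = memcmp(a, b, n);               /* compare the common prefix at once */
    if (cmp != 0)
        return cmp < 0 ? -1 : 1;
    if (alen == blen)
        return 0;
    return alen < blen ? -1 : 1;         /* the shorter string sorts first */
}
```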
Victor Stinner
90db9c47dc
Also enable the ptr==ptr optimization in PyUnicode_Compare()
It was already implemented in PyUnicode_RichCompare()
14 years ago
Victor Stinner
aa7712711d
unicode_result_wchar(): move the assert() to the "#ifdef Py_DEBUG" block
14 years ago
Victor Stinner
a4708231e6
Split the huge PyUnicode_Format() function (+540 lines) into subfunctions
14 years ago
Victor Stinner
a049443fab
PyUnicode_Format(): disable overallocation when we are writing the last part
of the output string
14 years ago
Victor Stinner
afffce489b
Unicode: resize_compact() and resize_inplace() now also fill the Unicode strings
with invalid bytes in debug mode, as done by PyUnicode_New()
14 years ago
Victor Stinner
c89d28fdfc
Issue #15609 : Fix refleak introduced by my last optimization
14 years ago
Victor Stinner
621ef3d84f
Issue #15609 : Optimize str%args for integer argument
- Use _PyLong_FormatWriter() instead of formatlong() when possible, to avoid
a temporary buffer
- Enable the fast path when width is smaller than or equal to the length,
and when the precision is greater than or equal to the length
- Add unit tests!
- formatlong() uses PyUnicode_Resize() instead of _PyUnicode_FromASCII()
to resize the output string
14 years ago
Antoine Pitrou
6f80f5d444
Issue #15379 : Fix passing of non-BMP characters as integers for the charmap decoder (already working as unicode strings).
Patch by Serhiy Storchaka.
14 years ago
Antoine Pitrou
ca8aa4acf6
Issue #15144 : Fix possible integer overflow when handling pointers as integer values, by using Py_uintptr_t instead of size_t.
Patch by Serhiy Storchaka.
14 years ago
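Why an integer type sized for pointers matters in commit ca8aa4acf6 can be shown with a small alignment check. This sketch assumes Py_uintptr_t corresponds to C99's `uintptr_t`, which is the type guaranteed to hold a converted pointer, whereas `size_t` is only guaranteed to hold object sizes. The helper `is_aligned` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not CPython's code): report whether `p` is aligned
   to `alignment`, which must be a non-zero power of two. */
static int
is_aligned(const void *p, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    /* Convert through uintptr_t, not size_t, before doing bit arithmetic. */
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```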
Christian Heimes
5f520f4fed
Issue #15900 : Fixed reference leak in PyUnicode_TranslateCharmap()
14 years ago
Christian Heimes
fd30236494
Fixed memory leak in error branch of formatfloat(). CID 719687
14 years ago
Christian Heimes
f4f9939a96
Fixed memory leak in error branch of formatfloat(). CID 719687
14 years ago
Christian Heimes
bdc7e69f42
Issue #15900 : Fixed reference leak in PyUnicode_TranslateCharmap()
14 years ago
Antoine Pitrou
057119b0b7
Fix C++-style comment (xlc compilation failure)
14 years ago
Benjamin Peterson
28a6cfaefc
use the stricter PyMapping_Check ( closes #15801 )
14 years ago
Stefan Krah
8528c3145e
Issue #15728 : Fix leak in PyUnicode_AsWideCharString(). Found by Coverity.
14 years ago
Nick Coghlan
573b1fd779
Fix str docstring
14 years ago
Antoine Pitrou
b4bbee25b1
Issue #14579 : Fix CVE-2012-2135: vulnerability in the utf-16 decoder after error handling.
Patch by Serhiy Storchaka.
14 years ago
Mark Dickinson
01ac8b6ab1
Use correct types for ASCII_CHAR_MASK integer constants.
14 years ago
Antoine Pitrou
aaefac76dd
Issue #14874 : Restore charmap decoding speed to pre-PEP 393 levels.
Patch by Serhiy Storchaka.
14 years ago
Victor Stinner
f185226244
_copy_characters(): move debug code to the top to avoid noisy #ifdef
Also stop using assert() if check_maxchar is set: return -1 on error
instead.
14 years ago
Victor Stinner
07621338fb
Fix PyUnicode_GetSize(): Don't replace _PyUnicode_Ready() exception
14 years ago
Victor Stinner
8a8b3eaabe
Fix a compiler warning in _copy_characters() and remove debug code
14 years ago
Victor Stinner
24e403bbee
Oops, fix my previous change on _copy_characters()
14 years ago
Victor Stinner
ca439eecea
Fix unicode_adjust_maxchar(): catch PyUnicode_New() failure
14 years ago
Victor Stinner
184252ad3f
Fix "%f" format of str%args if the result is not an ASCII or latin1 string
14 years ago
Victor Stinner
9a77770add
Remove debug code
14 years ago