Mark Dickinson
01ac8b6ab1
Use correct types for ASCII_CHAR_MASK integer constants.
14 years ago
Antoine Pitrou
aaefac76dd
Issue #14874: Restore charmap decoding speed to pre-PEP 393 levels.
Patch by Serhiy Storchaka.
14 years ago
Victor Stinner
f185226244
_copy_characters(): move debug code to the top to avoid noisy #ifdef
And don't use assert() anymore if check_maxchar is set: return -1 on error
instead.
14 years ago
Victor Stinner
07621338fb
Fix PyUnicode_GetSize(): Don't replace _PyUnicode_Ready() exception
14 years ago
Victor Stinner
8a8b3eaabe
Fix a compiler warning in _copy_characters() and remove debug code
14 years ago
Victor Stinner
24e403bbee
Oops, fix my previous change on _copy_characters()
14 years ago
Victor Stinner
ca439eecea
Fix unicode_adjust_maxchar(): catch PyUnicode_New() failure
14 years ago
Victor Stinner
184252ad3f
Fix "%f" format of str%args if the result is not an ASCII or latin1 string
14 years ago
Victor Stinner
9a77770add
Remove debug code
14 years ago
Victor Stinner
c9d369f1bf
Optimize _PyUnicode_FastCopyCharacters() when maxchar(from) > maxchar(to)
14 years ago
Victor Stinner
f05e17ece9
unicodeobject.c: Remove debug code
14 years ago
Antoine Pitrou
27f6a3b0bf
Issue #15026: utf-16 encoding is now significantly faster (up to 10x).
Patch by Serhiy Storchaka.
14 years ago
Kristján Valur Jónsson
55e5dc8371
Rearrange code to beat an optimizer bug affecting Release x64 on windows
with VS2010sp1
14 years ago
Victor Stinner
d7b7c7472b
Issue #14993: Use standard "unsigned char" instead of an unsigned char bitfield
14 years ago
Kristjan Valur Jonsson
85634d7a2e
Issue #14909: A number of places were using PyMem_Realloc() APIs and
PyObject_GC_Resize() with incorrect error handling. In case of errors,
the original object would be leaked. This checkin fixes those cases.
14 years ago
Victor Stinner
3a7d096f2f
Issue #14744: Fix compilation on Windows (part 2)
14 years ago
Victor Stinner
d3f0882dfb
Issue #14744: Use the new _PyUnicodeWriter internal API to speed up str%args and str.format(args)
* Formatting string, int, float and complex use the _PyUnicodeWriter API. It
avoids a temporary buffer in most cases.
* Add _PyUnicodeWriter_WriteStr() to restore the PyAccu optimization: just
keep a reference to the string if the output is only composed of one string
* Disable overallocation when formatting the last argument of str%args and
str.format(args)
* Overallocation allocates at least 100 characters: add min_length attribute
to the _PyUnicodeWriter structure
* Add new private functions: _PyUnicode_FastCopyCharacters(),
_PyUnicode_FastFill() and _PyUnicode_FromASCII()
The speedup is around 20% on average.
14 years ago
Antoine Pitrou
63065d761e
Issue #14624: UTF-16 decoding is now 3x to 4x faster on various inputs.
Patch by Serhiy Storchaka.
14 years ago
Martin v. Löwis
b05c0738d8
Silence VS 2010 signed/unsigned warnings.
14 years ago
Antoine Pitrou
758153badb
Fix refleaks introduced by 83da67651687.
14 years ago
Antoine Pitrou
e45c0c5cef
Fix logic error introduced by 83da67651687.
14 years ago
Benjamin Peterson
1ff2e35e84
simplify by shortcutting when the kind of the needle is larger than the haystack
14 years ago
Antoine Pitrou
ca5f91b888
Issue #14738: Speed-up UTF-8 decoding on non-ASCII data. Patch by Serhiy Storchaka.
14 years ago
Victor Stinner
3b1a74a9c3
Rename unicode_write_t structure and its methods to "_PyUnicodeWriter"
14 years ago
Victor Stinner
ee4544c920
Issue #14744: Inline unicode_writer_write_char() and unicode_write_str()
Also optimize PyUnicode_Format(): call unicode_writer_prepare() only once
per argument.
14 years ago
Victor Stinner
f59c28c930
unicode_writer_finish() checks string consistency
14 years ago
Victor Stinner
106802547c
Backout ab500b297900: the check for integer overflow is wrong
Issue #14716: Change integer overflow check in unicode_writer_prepare()
to compute the limit at compile time instead of runtime. Patch written by Serhiy
Storchaka.
14 years ago
Victor Stinner
0576f9b4cf
Issue #14716: Change integer overflow check in unicode_writer_prepare()
to compute the limit at compile time instead of runtime. Patch written by Serhiy
Storchaka.
14 years ago
Victor Stinner
202fdca133
Close #14716: str.format() now uses the new "unicode writer" API instead of the
PyAccu API. For example, it makes str.format() from 25% to 30% faster on Linux.
14 years ago
Mark Dickinson
99e2e5552a
Issue #14700: Fix two broken and undefined-behaviour-inducing overflow checks in old-style string formatting. Thanks to Serhiy Storchaka for the report and original patch.
14 years ago
Victor Stinner
d0dba6eee8
unicode_writer: don't force inline when it is not necessary
Keep inline for performance critical functions (functions used in loops)
14 years ago
Benjamin Peterson
b63f49f2b4
if the kind of the string to count is larger than the string to search, shortcut to 0
14 years ago
Victor Stinner
a7b654be30
unicode_writer: add finish() method and assertions to write_str() method
* The write_str() method does nothing if the length is zero.
* Replace "struct unicode_writer_t" with "unicode_writer_t"
14 years ago
Victor Stinner
bf4e266397
Issue #14687: Remove redundant length attribute of unicode_write_t
The length can be read directly from the buffer
14 years ago
Victor Stinner
7989157e49
Issue #14687: Cleanup unicode_writer_prepare()
"Inline" PyUnicode_Resize(): call directly resize_compact()
14 years ago
Victor Stinner
f2c76aa6cb
Issue #14687: str%tuple now uses an optimistic "unicode writer" instead of an
accumulator. Directly write characters into the output (don't use a temporary
list): resize and widen the string on demand.
14 years ago
Victor Stinner
1b487b467b
Issue #14624, #14687: Optimize unicode_widen()
Don't convert uninitialized characters. Patch written by Serhiy Storchaka.
14 years ago
Victor Stinner
3a7f7977f1
Remove buggy assertion in PyUnicode_Substring()
Also use unicode_empty directly, instead of PyUnicode_New(0,0).
14 years ago
Victor Stinner
684d5fd420
Fix PyUnicode_Substring() for start >= length and start > end
Remove the fast-path for 1-character string: unicode_fromascii() and
_PyUnicode_FromUCS*() now have their own fast-path for 1-character strings.
14 years ago
Victor Stinner
b6cd014d75
Unicode: optimize creation of 1-character strings
14 years ago
Victor Stinner
bff7c96834
Issue #14687: Optimize str%tuple for the "%(name)s" syntax
Avoid a useless and expensive call to PyUnicode_READ().
14 years ago
Victor Stinner
e6abb488c9
unicodeobject.c: Add MAX_MAXCHAR() macro to (micro-)optimize the computation
of the second argument of PyUnicode_New().
* Also create the align_maxchar() function
* Optimize fix_decimal_and_space_to_ascii(): don't compute the maximum
character when ch <= 127 (it is ASCII)
14 years ago
Victor Stinner
438106b66e
Issue #14687: Cleanup PyUnicode_Format()
14 years ago
Victor Stinner
b5c3ea3af3
Issue #14687: Optimize str%args
* formatfloat() uses unicode_fromascii() instead of PyUnicode_DecodeASCII()
to not have to check characters, we know that it is really ASCII
* Use PyUnicode_FromOrdinal() instead of _PyUnicode_FromUCS4() to format
a character: it avoids a call to ucs4lib_find_max_char() to compute
the maximum character (whereas we already know it, it is just the character
itself)
14 years ago
Victor Stinner
b80e46eca4
Issue #14687: Avoid a useless duplicated string in PyUnicode_Format()
14 years ago
Victor Stinner
aff3cc659b
Issue #14687: Cleanup PyUnicode_Format()
14 years ago
Victor Stinner
b11d91d969
Fix my previous commit: bool is a long, restore the special case for bool
14 years ago
Victor Stinner
d0880d57b0
Simplify and optimize formatlong()
* Remove _PyBytes_FormatLong(): inline it into formatlong()
* the input type is always a long, so remove the code for bool
* don't duplicate the string if the length does not change
* Use PyUnicode_DATA() instead of _PyUnicode_AsString()
14 years ago
Victor Stinner
94d558b063
Optimize _PyUnicode_FindMaxChar() for pure ASCII strings
14 years ago
Victor Stinner
8f825060f1
Check consistency of newly created strings using _PyUnicode_CheckConsistency(str, 1)
* In debug mode, fill the string data with invalid characters
* Also simplify reference counting in PyCodec_BackslashReplaceErrors()
and PyCodec_XMLCharRefReplaceErrors()
14 years ago