 branches/innodb+: Merge revisions 5144:5524 from branches/zip
------------------------------------------------------------------------
r5147 | marko | 2009-05-27 06:55:14 -0400 (Wed, 27 May 2009) | 1 line
branches/zip: ibuf0ibuf.c: Improve a comment.
------------------------------------------------------------------------
r5149 | marko | 2009-05-27 07:46:42 -0400 (Wed, 27 May 2009) | 34 lines
branches/zip: Merge revisions 4994:5148 from branches/5.1:
------------------------------------------------------------------------
r5126 | vasil | 2009-05-26 16:57:12 +0300 (Tue, 26 May 2009) | 9 lines
branches/5.1:
Preparation for the fix of
Bug#45097 Hang during recovery, redo logs for doublewrite buffer pages
Non-functional change: move FSP_* macros from fsp0fsp.h to a new file
fsp0types.h. This is needed in order to be able to use FSP_EXTENT_SIZE
in mtr0log.ic.
------------------------------------------------------------------------
r5127 | vasil | 2009-05-26 17:05:43 +0300 (Tue, 26 May 2009) | 9 lines
branches/5.1:
Preparation for the fix of
Bug#45097 Hang during recovery, redo logs for doublewrite buffer pages
Do not include unnecessary headers mtr0log.h and fut0lst.h in trx0sys.h
and include fsp0fsp.h just before it is needed. This is needed in order
to be able to use TRX_SYS_SPACE in mtr0log.ic.
------------------------------------------------------------------------
r5128 | vasil | 2009-05-26 17:26:37 +0300 (Tue, 26 May 2009) | 7 lines
branches/5.1:
Fix Bug#45097 Hang during recovery, redo logs for doublewrite buffer pages
Do not write redo log for the pages in the doublewrite buffer. Also, do not
make a dummy change to the page because this is not needed.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5169 | marko | 2009-05-28 03:21:55 -0400 (Thu, 28 May 2009) | 1 line
branches/zip: mtr0mtr.h: Add Doxygen comments for the redo log entry types.
------------------------------------------------------------------------
r5176 | marko | 2009-05-28 07:14:02 -0400 (Thu, 28 May 2009) | 1 line
branches/zip: Correct a debug assertion that was added in r5125.
------------------------------------------------------------------------
r5201 | marko | 2009-06-01 06:35:25 -0400 (Mon, 01 Jun 2009) | 2 lines
branches/zip: Clean up some comments.
Make the rec parameter of mlog_open_and_write_index() const.
------------------------------------------------------------------------
r5234 | marko | 2009-06-03 08:26:41 -0400 (Wed, 03 Jun 2009) | 44 lines
branches/zip: Merge revisions 5148:5233 from branches/5.1:
------------------------------------------------------------------------
r5150 | vasil | 2009-05-27 18:56:03 +0300 (Wed, 27 May 2009) | 4 lines
branches/5.1:
Whitespace fixup.
------------------------------------------------------------------------
r5191 | vasil | 2009-05-30 17:46:05 +0300 (Sat, 30 May 2009) | 19 lines
branches/5.1:
Merge a change from MySQL (this fixes the failing innodb_mysql test):
------------------------------------------------------------
revno: 1810.3894.10
committer: Sergey Glukhov <Sergey.Glukhov@sun.com>
branch nick: mysql-5.0-bugteam
timestamp: Tue 2009-05-19 11:32:21 +0500
message:
Bug#39793 Foreign keys not constructed when column has a '#' in a comment or default value
Internal InnoDB FK parser does not recognize '\'' as a quotation symbol.
Suggested fix is to add '\'' symbol check for quotation condition
(dict_strip_comments() function).
modified:
innobase/dict/dict0dict.c
mysql-test/r/innodb_mysql.result
mysql-test/t/innodb_mysql.test
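Illustration of the quotation check described above. This is a minimal sketch and not the code of dict_strip_comments(): the helper name strip_hash_comments() and its interface are hypothetical, and escaped quotes inside a literal are not handled. A '#' starts a comment only when it is outside '...', "..." and `...`.

```c
#include <stddef.h>

/* Hypothetical helper (not InnoDB code): copy src to dst, dropping
"#" comments up to (but not including) the end of the line, unless
the '#' appears inside a quoted string. dst must have room for
strlen(src) + 1 bytes. */
static void
strip_hash_comments(const char *src, char *dst)
{
	char	quote = '\0';	/* currently open quote character, if any */

	while (*src != '\0') {
		if (quote != '\0') {
			/* Inside a quoted string: only the matching
			quote character closes it. */
			if (*src == quote) {
				quote = '\0';
			}
		} else if (*src == '\'' || *src == '"' || *src == '`') {
			quote = *src;
		} else if (*src == '#') {
			/* Unquoted '#': skip to the end of the line. */
			while (*src != '\0' && *src != '\n') {
				src++;
			}
			continue;
		}

		*dst++ = *src++;
	}

	*dst = '\0';
}
```

Without the '\'' case in the second branch, the '#' in a comment or default value such as 'x#y' would be taken as the start of a comment, which is the symptom reported in Bug#39793.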
------------------------------------------------------------------------
r5233 | marko | 2009-06-03 15:12:44 +0300 (Wed, 03 Jun 2009) | 11 lines
branches/5.1: Merge the test case from r5232 from branches/5.0:
------------------------------------------------------------------------
r5232 | marko | 2009-06-03 14:31:04 +0300 (Wed, 03 Jun 2009) | 21 lines
branches/5.0: Merge r3590 from branches/5.1 in order to fix Bug #40565
(Update Query Results in "1 Row Affected" But Should Be "Zero Rows").
Also, add a test case for Bug #40565.
rb://128 approved by Heikki Tuuri
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
r5250 | marko | 2009-06-04 02:58:23 -0400 (Thu, 04 Jun 2009) | 1 line
branches/zip: Add Doxygen comments to the rest of buf0*.
------------------------------------------------------------------------
r5251 | marko | 2009-06-04 02:59:51 -0400 (Thu, 04 Jun 2009) | 1 line
branches/zip: Replace <= in a function comment.
------------------------------------------------------------------------
r5253 | marko | 2009-06-04 06:37:35 -0400 (Thu, 04 Jun 2009) | 1 line
branches/zip: Add missing Doxygen comments for page0zip.
------------------------------------------------------------------------
r5261 | vasil | 2009-06-05 11:13:31 -0400 (Fri, 05 Jun 2009) | 15 lines
branches/zip:
Fix Mantis Issue#244 fix bug in linear read ahead (no check on access pattern)
The changes are:
1) Take into account access pattern when deciding whether or not to do linear
read ahead.
2) Expose a knob innodb_read_ahead_factor = [0-64] default (8), dynamic,
global to control linear read ahead behavior
3) Disable random read ahead. Keep the code for now.
Submitted by: Inaam (rb://122)
Approved by: Heikki (rb://122)
------------------------------------------------------------------------
r5262 | vasil | 2009-06-05 12:04:25 -0400 (Fri, 05 Jun 2009) | 22 lines
branches/zip:
Enable functionality to have multiple background io helper threads.
This patch is based on Percona contributions.
More details about this patch will be written at:
https://svn.innodb.com/innobase/MultipleBackgroundThreads
The patch essentially does the following:
expose following knobs:
innodb_read_io_threads = [1 - 64] default 1
innodb_write_io_threads = [1 - 64] default 1
deprecate innodb_file_io_threads (this parameter was relevant only on Windows)
Internally it allows multiple segments for read and write IO request arrays
where one thread works on one segment.
Submitted by: Inaam (rb://124)
Approved by: Heikki (rb://124)
------------------------------------------------------------------------
r5263 | vasil | 2009-06-05 12:19:37 -0400 (Fri, 05 Jun 2009) | 4 lines
branches/zip:
Whitespace cleanup.
------------------------------------------------------------------------
r5264 | vasil | 2009-06-05 12:26:58 -0400 (Fri, 05 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5261.
------------------------------------------------------------------------
r5265 | vasil | 2009-06-05 12:34:11 -0400 (Fri, 05 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5262.
------------------------------------------------------------------------
r5268 | inaam | 2009-06-08 12:18:21 -0400 (Mon, 08 Jun 2009) | 7 lines
branches/zip
Non-functional change:
Added legal notices acknowledging the Percona contribution to the multiple
IO helper threads patch, i.e. r5262
------------------------------------------------------------------------
r5283 | inaam | 2009-06-09 13:46:29 -0400 (Tue, 09 Jun 2009) | 9 lines
branches/zip
rb://130
Enable Group Commit functionality that was broken in 5.0 when
distributed transactions were introduced.
Reviewed by: Heikki
------------------------------------------------------------------------
r5319 | marko | 2009-06-11 04:40:33 -0400 (Thu, 11 Jun 2009) | 3 lines
branches/zip: Declare os_thread_id_t as unsigned long,
because ulint is wrong on Win64.
Pointed out by Vladislav Vaintroub <wlad@sun.com>.
------------------------------------------------------------------------
r5320 | inaam | 2009-06-11 09:15:41 -0400 (Thu, 11 Jun 2009) | 14 lines
branches/zip rb://131
This patch changes the following defaults:
max_dirty_pages_pct: default from 90 to 75. max allowed from 100 to 99
additional_mem_pool_size: default from 1 to 8 MB
buffer_pool_size: default from 8 to 128 MB
log_buffer_size: default from 1 to 8 MB
read_io_threads/write_io_threads: default from 1 to 4
The log file sizes are untouched because of upgrade issues
Reviewed by: Heikki
------------------------------------------------------------------------
r5330 | marko | 2009-06-16 04:08:59 -0400 (Tue, 16 Jun 2009) | 2 lines
branches/zip: buf_page_get_gen(): Reduce mutex holding time by adjusting
buf_pool->n_pend_unzip while only holding buf_pool_mutex.
------------------------------------------------------------------------
r5331 | marko | 2009-06-16 05:00:48 -0400 (Tue, 16 Jun 2009) | 2 lines
branches/zip: buf_page_get_zip(): Eliminate a buf_page_get_mutex() call.
The function must switch on the block state anyway.
------------------------------------------------------------------------
r5332 | vasil | 2009-06-16 05:03:27 -0400 (Tue, 16 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entries for r5283 and r5320.
------------------------------------------------------------------------
r5333 | marko | 2009-06-16 05:27:46 -0400 (Tue, 16 Jun 2009) | 1 line
branches/zip: buf_page_io_query(): Remove unused function.
------------------------------------------------------------------------
r5335 | marko | 2009-06-16 09:23:10 -0400 (Tue, 16 Jun 2009) | 2 lines
branches/zip: innodb.test: Adjust the tolerance of
innodb_buffer_pool_pages_total for r5320.
------------------------------------------------------------------------
r5342 | marko | 2009-06-17 06:15:32 -0400 (Wed, 17 Jun 2009) | 60 lines
branches/zip: Merge revisions 5233:5341 from branches/5.1:
------------------------------------------------------------------------
r5233 | marko | 2009-06-03 15:12:44 +0300 (Wed, 03 Jun 2009) | 11 lines
branches/5.1: Merge the test case from r5232 from branches/5.0:
------------------------------------------------------------------------
r5232 | marko | 2009-06-03 14:31:04 +0300 (Wed, 03 Jun 2009) | 21 lines
branches/5.0: Merge r3590 from branches/5.1 in order to fix Bug #40565
(Update Query Results in "1 Row Affected" But Should Be "Zero Rows").
Also, add a test case for Bug #40565.
rb://128 approved by Heikki Tuuri
------------------------------------------------------------------------
------------------------------------------------------------------------
r5243 | sunny | 2009-06-04 03:17:14 +0300 (Thu, 04 Jun 2009) | 14 lines
branches/5.1: When the InnoDB and MySQL data dictionaries go out of sync, before
the bug fix we would assert on missing autoinc columns. With this fix we allow
MySQL to open the table but set the next autoinc value for the column to the
MAX value. This effectively disables the next value generation. INSERTs will
fail with a generic AUTOINC failure. However, the user should be able to
read/dump the table, set the column values explicitly, use ALTER TABLE to
set the next autoinc value and/or sync the two data dictionaries to resume
normal operations.
Fix Bug#44030 Error: (1500) Couldn't read the MAX(ID) autoinc value from the
index (PRIMARY)
rb://118
------------------------------------------------------------------------
r5252 | sunny | 2009-06-04 10:16:24 +0300 (Thu, 04 Jun 2009) | 2 lines
branches/5.1: The version of the result file checked in was broken in r5243.
------------------------------------------------------------------------
r5259 | vasil | 2009-06-05 10:29:16 +0300 (Fri, 05 Jun 2009) | 7 lines
branches/5.1:
Remove the word "Error" from the printout because the mysqltest suite
interprets it as an error and thus the innodb-autoinc test fails.
Approved by: Sunny (via IM)
------------------------------------------------------------------------
r5339 | marko | 2009-06-17 11:01:37 +0300 (Wed, 17 Jun 2009) | 2 lines
branches/5.1: Add missing #include "mtr0log.h" so that the code compiles
with -DUNIV_MUST_NOT_INLINE.
(null merge; this had already been committed in branches/zip)
------------------------------------------------------------------------
r5340 | marko | 2009-06-17 12:11:49 +0300 (Wed, 17 Jun 2009) | 4 lines
branches/5.1: row_unlock_for_mysql(): When the clustered index is unknown,
refuse to unlock the record.
(Bug #45357, caused by the fix of Bug #39320).
rb://132 approved by Sunny Bains.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5343 | vasil | 2009-06-17 08:56:12 -0400 (Wed, 17 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5342.
------------------------------------------------------------------------
r5344 | marko | 2009-06-17 09:03:45 -0400 (Wed, 17 Jun 2009) | 1 line
branches/zip: row_merge_read_rec(): Fix a UNIV_DEBUG bug (Bug #45426)
------------------------------------------------------------------------
r5391 | marko | 2009-06-22 05:31:35 -0400 (Mon, 22 Jun 2009) | 2 lines
branches/zip: buf_page_get_zip(): Fix a bogus warning about
block_mutex being possibly uninitialized.
------------------------------------------------------------------------
r5392 | marko | 2009-06-22 07:58:20 -0400 (Mon, 22 Jun 2009) | 4 lines
branches/zip: ha_innobase::check_if_incompatible_data(): When
ROW_FORMAT=DEFAULT, do not compare to get_row_type().
Without this change, fast index creation will be disabled
in recent versions of MySQL 5.1.
------------------------------------------------------------------------
r5393 | pekka | 2009-06-22 09:27:55 -0400 (Mon, 22 Jun 2009) | 4 lines
branches/zip: Minor changes for Hot Backup to build correctly. (The
code bracketed between #ifdef UNIV_HOTBACKUP and #endif /* UNIV_HOTBACKUP */).
This change should not affect !UNIV_HOTBACKUP build.
------------------------------------------------------------------------
r5394 | pekka | 2009-06-22 09:46:34 -0400 (Mon, 22 Jun 2009) | 4 lines
branches/zip: Add functions for checking the format of tablespaces
for Hot Backup build (UNIV_HOTBACKUP defined).
This change should not affect !UNIV_HOTBACKUP build.
------------------------------------------------------------------------
r5397 | calvin | 2009-06-23 16:59:42 -0400 (Tue, 23 Jun 2009) | 7 lines
branches/zip: change the header file path.
Change the header file path from ../storage/innobase/include/
to ../include/. In the planned 5.1 + plugin release, the source
directory of the plugin will not be in storage/innobase.
Approved by: Heikki (IM)
------------------------------------------------------------------------
r5407 | calvin | 2009-06-24 09:51:08 -0400 (Wed, 24 Jun 2009) | 4 lines
branches/zip: remove relative path of header files.
Suggested by Marko.
------------------------------------------------------------------------
r5412 | marko | 2009-06-25 06:27:08 -0400 (Thu, 25 Jun 2009) | 1 line
branches/zip: Replace a DBUG_ASSERT with ut_a to track down Issue #290.
------------------------------------------------------------------------
r5415 | marko | 2009-06-25 06:45:57 -0400 (Thu, 25 Jun 2009) | 3 lines
branches/zip: dict_index_find_cols(): Print diagnostic on name mismatch.
This addresses Bug #44571 but does not fix it.
rb://135 approved by Sunny Bains.
------------------------------------------------------------------------
r5417 | marko | 2009-06-25 08:20:56 -0400 (Thu, 25 Jun 2009) | 1 line
branches/zip: ha_innodb.cc: Move the misplaced Doxygen @file comment.
------------------------------------------------------------------------
r5418 | marko | 2009-06-25 08:55:52 -0400 (Thu, 25 Jun 2009) | 5 lines
branches/zip: Fix a race condition caused by
SET GLOBAL innodb_commit_concurrency=DEFAULT. (Bug #45749)
When innodb_commit_concurrency is initially set nonzero,
DEFAULT would change it back to 0, triggering Bug #42101.
rb://139 approved by Heikki Tuuri.
------------------------------------------------------------------------
r5423 | calvin | 2009-06-26 16:52:52 -0400 (Fri, 26 Jun 2009) | 2 lines
branches/zip: Fix typos.
------------------------------------------------------------------------
r5425 | marko | 2009-06-29 04:52:30 -0400 (Mon, 29 Jun 2009) | 4 lines
branches/zip: ha_innobase::add_index(), ha_innobase::final_drop_index():
Start prebuilt->trx before locking the table. This should fix Issue #293
and could fix Issue #229.
Approved by Sunny (over IM).
------------------------------------------------------------------------
r5426 | marko | 2009-06-29 05:24:27 -0400 (Mon, 29 Jun 2009) | 3 lines
branches/zip: buf_page_get_gen(): Fix a race condition when reading
buf_fix_count. This could explain Issue #156.
Tested by Michael.
------------------------------------------------------------------------
r5427 | marko | 2009-06-29 05:54:53 -0400 (Mon, 29 Jun 2009) | 5 lines
branches/zip: lock_print_info_all_transactions(), buf_read_recv_pages():
Tolerate missing tablespaces (zip_size==ULINT_UNDEFINED).
buf_page_get_gen(): Add ut_ad(ut_is_2pow(zip_size)).
Issue #289, rb://136 approved by Sunny Bains
------------------------------------------------------------------------
r5428 | marko | 2009-06-29 07:06:29 -0400 (Mon, 29 Jun 2009) | 2 lines
branches/zip: row_sel_store_mysql_rec(): Add missing pointer cast.
Do not do arithmetic on void pointers.
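A generic illustration of the rule, not the actual change to row_sel_store_mysql_rec(): the helper name advance_bytes() is hypothetical. ISO C does not define arithmetic on void*, so the pointer is cast to a byte pointer before the offset is added.

```c
#include <stddef.h>

/* Hypothetical helper (not InnoDB code): return base + offset bytes.
The cast is needed because "base + offset" on a void* is only a
compiler extension, not standard C. */
static void *
advance_bytes(void *base, size_t offset)
{
	return((unsigned char *) base + offset);
}
```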
------------------------------------------------------------------------
r5429 | marko | 2009-06-29 09:49:54 -0400 (Mon, 29 Jun 2009) | 13 lines
branches/zip: Do not crash on SET GLOBAL innodb_file_format=DEFAULT
or SET GLOBAL innodb_file_format_check=DEFAULT.
innodb_file_format.test: New test for innodb_file_format and
innodb_file_format_check.
innodb_file_format_name_validate(): Store the string in *save.
innodb_file_format_name_update(): Check the string again.
innodb_file_format_check_validate(): Store the string in *save.
innodb_file_format_check_update(): Check the string again.
Issue #282, rb://140 approved by Heikki Tuuri
------------------------------------------------------------------------
r5430 | marko | 2009-06-29 09:58:07 -0400 (Mon, 29 Jun 2009) | 2 lines
branches/zip: lock_rec_validate_page(): Add another assertion
to track down Issue #289.
------------------------------------------------------------------------
r5431 | marko | 2009-06-29 09:58:40 -0400 (Mon, 29 Jun 2009) | 1 line
branches/zip: Revert an accidentally made change in r5430 to univ.i.
------------------------------------------------------------------------
r5437 | marko | 2009-06-30 05:10:01 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: ibuf_dummy_index_free(): Beautify the comment.
------------------------------------------------------------------------
r5438 | marko | 2009-06-30 05:10:32 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: fseg_free(): Remove this unused function.
------------------------------------------------------------------------
r5439 | marko | 2009-06-30 05:15:22 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: fseg_validate(): Enclose in #ifdef UNIV_DEBUG.
This function is unused, but it could turn out to be a useful debugging aid.
------------------------------------------------------------------------
r5441 | marko | 2009-06-30 06:30:14 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: ha_delete(): Remove this unused function that was
very similar to ha_search_and_delete_if_found().
------------------------------------------------------------------------
r5442 | marko | 2009-06-30 06:45:41 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: lock_is_on_table(), lock_table_unlock(): Unused, remove.
------------------------------------------------------------------------
r5443 | marko | 2009-06-30 07:03:00 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: os_event_create_auto(): Unused, remove.
------------------------------------------------------------------------
r5444 | marko | 2009-06-30 07:19:49 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: que_graph_try_free(): Unused, remove.
------------------------------------------------------------------------
r5445 | marko | 2009-06-30 07:28:11 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: row_build_row_ref_from_row(): Unused, remove.
------------------------------------------------------------------------
r5446 | marko | 2009-06-30 07:35:45 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: srv_que_round_robin(), srv_que_task_enqueue(): Unused, remove.
------------------------------------------------------------------------
r5447 | marko | 2009-06-30 07:37:58 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: srv_que_task_queue_check(): Unused, remove.
------------------------------------------------------------------------
r5448 | marko | 2009-06-30 07:56:36 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: mem_heap_cat(): Unused, remove.
------------------------------------------------------------------------
r5449 | marko | 2009-06-30 08:00:50 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: innobase_start_or_create_for_mysql():
Invoke os_get_os_version() at most once.
------------------------------------------------------------------------
r5450 | marko | 2009-06-30 08:02:20 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: os_file_close_no_error_handling(): Unused, remove.
------------------------------------------------------------------------
r5451 | marko | 2009-06-30 08:09:49 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: page_set_max_trx_id(): Make the code compile
with UNIV_HOTBACKUP.
------------------------------------------------------------------------
r5452 | marko | 2009-06-30 08:10:26 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: os_file_close_no_error_handling(): Restore,
as this function is used within InnoDB Hot Backup.
------------------------------------------------------------------------
r5453 | marko | 2009-06-30 08:14:01 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: os_process_set_priority_boost(): Unused, remove.
------------------------------------------------------------------------
r5454 | marko | 2009-06-30 08:42:52 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: Replace a non-ASCII character
(ISO 8859-1 encoded U+00AD SOFT HYPHEN) with a cheap ASCII substitute.
------------------------------------------------------------------------
r5456 | inaam | 2009-06-30 14:21:09 -0400 (Tue, 30 Jun 2009) | 4 lines
branches/zip
Non-functional change. s/Percona/Percona Inc./
------------------------------------------------------------------------
r5470 | vasil | 2009-07-02 09:12:36 -0400 (Thu, 02 Jul 2009) | 16 lines
branches/zip:
Use PAUSE instruction inside spinloop if it is available.
The patch was originally developed by Mikael Ronstrom <mikael@mysql.com>
and can be found here:
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2768
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2771
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2772
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2774
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2777
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2799
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2800
Approved by: Heikki (rb://137)
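For orientation, a minimal sketch of a spin loop that issues PAUSE between attempts. It is not the patch itself: SPIN_PAUSE() and spin_try_lock() are hypothetical names, the inline assembly assumes GCC or Clang on x86, and C11 atomics stand in for InnoDB's own primitives. On other targets the macro expands to nothing.

```c
#include <stdatomic.h>

/* Assumption: GCC/Clang inline assembly on x86 or x86-64. */
#if defined(__i386__) || defined(__x86_64__)
# define SPIN_PAUSE()	__asm__ __volatile__("pause")
#else
# define SPIN_PAUSE()	((void) 0)
#endif

/* Hypothetical helper (not InnoDB code): try to acquire the flag at
most n_rounds times, pausing after each failed attempt.
Returns 1 if the lock was acquired, 0 otherwise. */
static int
spin_try_lock(atomic_flag *lock, unsigned n_rounds)
{
	unsigned	i;

	for (i = 0; i < n_rounds; i++) {
		if (!atomic_flag_test_and_set_explicit(
			    lock, memory_order_acquire)) {
			return(1);
		}

		/* Hint to the CPU that this is a busy-wait loop. */
		SPIN_PAUSE();
	}

	return(0);
}
```

A caller that gets 0 back would normally fall back to a blocking wait instead of spinning again.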
------------------------------------------------------------------------
r5481 | vasil | 2009-07-06 13:16:32 -0400 (Mon, 06 Jul 2009) | 4 lines
branches/zip:
Remove unnecessary quotes and simplify plug.in.
------------------------------------------------------------------------
r5482 | calvin | 2009-07-06 18:36:35 -0400 (Mon, 06 Jul 2009) | 5 lines
branches/zip: add COPYING files for Percona and Sun Micro.
1.0.4 contains patches based on contributions from Percona
and Sun Microsystems.
------------------------------------------------------------------------
r5483 | calvin | 2009-07-07 05:36:43 -0400 (Tue, 07 Jul 2009) | 3 lines
branches/zip: add IB_HAVE_PAUSE_INSTRUCTION to CMake.
Windows will support PAUSE instruction by default.
------------------------------------------------------------------------
r5484 | inaam | 2009-07-07 18:57:14 -0400 (Tue, 07 Jul 2009) | 13 lines
branches/zip rb://126
Based on contribution from Google Inc.
This patch introduces a new parameter innodb_io_capacity to control the
rate at which the master thread performs various tasks. The default value
is 200 and higher values imply more aggressive flushing and ibuf merges
from within the master thread.
This patch also changes the ibuf merge from synchronous to asynchronous.
Another minor change is not to force the master thread to wait for a
log flush to complete every second.
Approved by: Heikki
------------------------------------------------------------------------
r5485 | inaam | 2009-07-07 19:00:49 -0400 (Tue, 07 Jul 2009) | 18 lines
branches/zip rb://138
The current implementation is to try to flush the neighbors of every
page that we flush. This patch makes the following distinction:
1) If the flush is from flush_list AND
2) If the flush is intended to move the oldest_modification LSN ahead
(this happens when a user thread sees little space in the log file and
attempts to flush pages from the buffer pool so that a checkpoint can
be made)
THEN
Do not try to flush the neighbors. Just focus on flushing dirty pages at
the end of flush_list
Approved by: Heikki
------------------------------------------------------------------------
r5486 | inaam | 2009-07-08 12:11:40 -0400 (Wed, 08 Jul 2009) | 29 lines
branches/zip rb://133
This patch introduces a heuristics-based flushing rate of dirty pages to
avoid IO bursts at checkpoint.
1) log_capacity / log_generated per second gives us number of seconds
in which ALL dirty pages need to be flushed. Based on this rough
assumption we can say that
n_dirty_pages / (log_capacity / log_generation_rate) = desired_flush_rate
2) We use weighted averages (hard coded to 20 seconds) of
log_generation_rate to avoid resonance.
3) From the desired_flush_rate we subtract the number of pages that have
been flushed due to LRU flushing. That gives us pages that we should
flush as part of flush_list cleanup. And that is the number (capped by
maximum io_capacity) that we try to flush from the master thread.
Knobs:
======
innodb_adaptive_flushing: boolean, global, dynamic, default TRUE.
Since this heuristic is very experimental and has the potential to
dramatically change the IO pattern I think it is a good idea to leave a
knob to turn it off.
Approved by: Heikki
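The arithmetic in points 1) and 3) can be sketched as follows. This is an illustration under stated assumptions, not the committed code: the function name flush_list_pages_wanted() and all parameter names are hypothetical, rates are per second, and the 20-second weighted averaging of point 2) is assumed to have been applied to log_generation_rate by the caller.

```c
#include <stdint.h>

/* Hypothetical helper (not InnoDB code): number of pages to flush
from flush_list in the next second. */
static uint64_t
flush_list_pages_wanted(
	uint64_t	n_dirty_pages,		/* dirty pages in the pool */
	uint64_t	log_capacity,		/* bytes of usable redo log */
	uint64_t	log_generation_rate,	/* averaged bytes per second */
	uint64_t	lru_flush_rate,		/* pages/s flushed via LRU */
	uint64_t	io_capacity)		/* cap, pages per second */
{
	uint64_t	desired;

	if (log_capacity == 0) {
		return(0);
	}

	/* n_dirty_pages / (log_capacity / log_generation_rate),
	rearranged to avoid dividing by a zero generation rate. */
	desired = (uint64_t) ((double) n_dirty_pages
			      * (double) log_generation_rate
			      / (double) log_capacity);

	/* Pages already flushed by LRU flushing count toward the goal. */
	if (desired <= lru_flush_rate) {
		return(0);
	}

	desired -= lru_flush_rate;

	/* Never ask for more than io_capacity. */
	return(desired > io_capacity ? io_capacity : desired);
}
```

For example, 10000 dirty pages, a 100000000-byte log capacity and 1000000 bytes of redo per second give a desired rate of 100 pages per second; with 30 pages per second already flushed from the LRU list and a cap of 200, 70 pages remain for flush_list.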
------------------------------------------------------------------------
r5487 | calvin | 2009-07-08 12:42:28 -0400 (Wed, 08 Jul 2009) | 7 lines
branches/zip: fix PAUSE instruction patch on Windows
The original PAUSE instruction patch (r5470) does not
compile on Windows. Also, there is an elegant way of
doing it on Windows - YieldProcessor().
Approved by: Heikki (on IM)
------------------------------------------------------------------------
r5489 | vasil | 2009-07-10 05:02:22 -0400 (Fri, 10 Jul 2009) | 9 lines
branches/zip:
Change the defaults for
innodb_sync_spin_loops: 20 -> 30
innodb_spin_wait_delay: 5 -> 6
This change was proposed by Sun/MySQL based on their performance testing,
see https://svn.innodb.com/innobase/Release_tasks_for_InnoDB_Plugin_V1.0.4
------------------------------------------------------------------------
r5490 | vasil | 2009-07-10 05:04:20 -0400 (Fri, 10 Jul 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5489.
------------------------------------------------------------------------
r5491 | calvin | 2009-07-10 12:19:17 -0400 (Fri, 10 Jul 2009) | 6 lines
branches/zip: add copyright info to files related to PAUSE
instruction patch, contributed by Sun Microsystems.
------------------------------------------------------------------------
r5492 | calvin | 2009-07-10 17:47:34 -0400 (Fri, 10 Jul 2009) | 5 lines
branches/zip: add ChangeLog entries for r5484-r5486.
------------------------------------------------------------------------
r5494 | vasil | 2009-07-13 03:37:35 -0400 (Mon, 13 Jul 2009) | 6 lines
branches/zip:
Restore the original value of innodb_sync_spin_loops at the end. Previously
the test assumed that setting it to 20 would do this, but now the default is
30 and MTR's internal check failed.
------------------------------------------------------------------------
r5495 | inaam | 2009-07-13 11:48:45 -0400 (Mon, 13 Jul 2009) | 5 lines
branches/zip rb://138 (REVERT)
Revert the flush neighbors patch as it shows regression in
the benchmarks run by Michael.
------------------------------------------------------------------------
r5496 | inaam | 2009-07-13 14:04:57 -0400 (Mon, 13 Jul 2009) | 4 lines
branches/zip
Fixed warnings on Windows where ulint != ib_uint64_t
------------------------------------------------------------------------
r5497 | calvin | 2009-07-13 15:01:00 -0400 (Mon, 13 Jul 2009) | 9 lines
branches/zip: fix run-time symbols clash on Solaris.
This patch is from Sergey Vojtovich of Sun Microsystems,
to fix run-time symbols clash on Solaris with older C++
compiler:
- when finding out a way to hide symbols, base the decision on the
compiler, not the operating system.
- Sun Studio supports __hidden declaration specifier for this
purpose.
------------------------------------------------------------------------
r5498 | vasil | 2009-07-14 03:16:18 -0400 (Tue, 14 Jul 2009) | 92 lines
branches/zip: Merge r5341:5497 from branches/5.1, skipping:
c5419 because it is a merge from branches/zip into branches/5.1
c5466 because the source code has been adjusted to match the MySQL
behavior and the innodb-autoinc test does not fail in branches/zip.
If c5466 is merged, innodb-autoinc starts failing, so Sunny suggested
not merging c5466.
and resolving conflicts in c5410, c5440, c5488:
------------------------------------------------------------------------
r5410 | marko | 2009-06-24 22:26:34 +0300 (Wed, 24 Jun 2009) | 2 lines
Changed paths:
M /branches/5.1/include/trx0sys.ic
M /branches/5.1/trx/trx0purge.c
M /branches/5.1/trx/trx0sys.c
M /branches/5.1/trx/trx0undo.c
branches/5.1: Add missing #include "mtr0log.h" to avoid warnings
when compiling with -DUNIV_MUST_NOT_INLINE.
------------------------------------------------------------------------
r5419 | marko | 2009-06-25 16:11:57 +0300 (Thu, 25 Jun 2009) | 18 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb_bug42101-nonzero.result
M /branches/5.1/mysql-test/innodb_bug42101-nonzero.test
M /branches/5.1/mysql-test/innodb_bug42101.result
M /branches/5.1/mysql-test/innodb_bug42101.test
branches/5.1: Merge r5418 from branches/zip:
------------------------------------------------------------------------
r5418 | marko | 2009-06-25 15:55:52 +0300 (Thu, 25 Jun 2009) | 5 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/mysql-test/innodb_bug42101-nonzero.result
M /branches/zip/mysql-test/innodb_bug42101-nonzero.test
M /branches/zip/mysql-test/innodb_bug42101.result
M /branches/zip/mysql-test/innodb_bug42101.test
branches/zip: Fix a race condition caused by
SET GLOBAL innodb_commit_concurrency=DEFAULT. (Bug #45749)
When innodb_commit_concurrency is initially set nonzero,
DEFAULT would change it back to 0, triggering Bug #42101.
rb://139 approved by Heikki Tuuri.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5440 | vasil | 2009-06-30 13:04:29 +0300 (Tue, 30 Jun 2009) | 8 lines
Changed paths:
M /branches/5.1/fil/fil0fil.c
branches/5.1:
Fix Bug#45814 URL reference in InnoDB server errors needs adjusting to match documentation
by changing the URL from
http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html to
http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting-datadict.html
------------------------------------------------------------------------
r5466 | vasil | 2009-07-02 10:46:45 +0300 (Thu, 02 Jul 2009) | 6 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Adjust the failing innodb-autoinc test to conform to the latest behavior
of the MySQL code. The idea and the comment in innodb-autoinc.test come
from Sunny.
------------------------------------------------------------------------
r5488 | vasil | 2009-07-09 19:16:44 +0300 (Thu, 09 Jul 2009) | 13 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
A /branches/5.1/mysql-test/innodb_bug21704.result
A /branches/5.1/mysql-test/innodb_bug21704.test
branches/5.1:
Fix Bug#21704 Renaming column does not update FK definition
by checking whether a column that participates in a FK definition is being
renamed and denying the ALTER in this case.
The patch was originally developed by Davi Arnaut <Davi.Arnaut@Sun.COM>:
http://lists.mysql.com/commits/77714
and was later adjusted to conform to InnoDB coding style by me (Vasil).
I also added some more comments and moved the bug-specific mysql-test to
a separate file to make it more manageable and flexible.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5499 | calvin | 2009-07-14 12:55:10 -0400 (Tue, 14 Jul 2009) | 3 lines
branches/zip: add a missing file in Makefile.am
This change was suggested by MySQL.
------------------------------------------------------------------------
r5500 | calvin | 2009-07-14 13:03:26 -0400 (Tue, 14 Jul 2009) | 3 lines
branches/zip: minor change
Remove an extra "with".
------------------------------------------------------------------------
r5501 | vasil | 2009-07-14 13:58:15 -0400 (Tue, 14 Jul 2009) | 5 lines
branches/zip:
Add @ZLIB_INCLUDES@ so that the InnoDB Plugin picks up the same zlib.h
header file that is eventually used by mysqld.
------------------------------------------------------------------------
r5502 | vasil | 2009-07-14 13:59:59 -0400 (Tue, 14 Jul 2009) | 4 lines
branches/zip:
Add include/ut0auxconf.h to noinst_HEADERS
------------------------------------------------------------------------
r5503 | vasil | 2009-07-14 14:16:11 -0400 (Tue, 14 Jul 2009) | 8 lines
branches/zip:
Non-functional change:
put the files in noinst_HEADERS and libinnobase_a_SOURCES one per line and
sort them alphabetically, so that it is easier to see whether a file is
listed and so that diffs show exactly the added or removed file instead of
the surrounding lines too.
------------------------------------------------------------------------
r5504 | calvin | 2009-07-15 04:58:44 -0400 (Wed, 15 Jul 2009) | 6 lines
branches/zip: fix compile errors on Win64
Both srv_read_ahead_factor and srv_io_capacity should
be defined as ulong.
Approved by: Sunny
------------------------------------------------------------------------
r5508 | calvin | 2009-07-16 09:40:47 -0400 (Thu, 16 Jul 2009) | 16 lines
branches/zip: Support inlining of functions and prefetch with
Sun Studio
These changes were contributed by Sun/MySQL. The patch makes two sets of
changes that take effect when Sun Studio is used:
- Explicit inlining of functions
- Prefetch Support
This patch has been tested by Sunny with the plugin statically
built in. Since we've never built the plugin as a dynamically
loaded module on Solaris, it is a separate task to change
plug.in.
rb://142
Approved by: Heikki
------------------------------------------------------------------------
r5509 | calvin | 2009-07-16 09:45:28 -0400 (Thu, 16 Jul 2009) | 2 lines
branches/zip: add ChangeLog entry for r5508.
------------------------------------------------------------------------
r5512 | sunny | 2009-07-19 19:52:48 -0400 (Sun, 19 Jul 2009) | 2 lines
branches/zip: Remove unused extern ref to timed_mutexes.
------------------------------------------------------------------------
r5513 | sunny | 2009-07-19 19:58:43 -0400 (Sun, 19 Jul 2009) | 2 lines
branches/zip: Undo r5512
------------------------------------------------------------------------
r5514 | sunny | 2009-07-19 20:08:49 -0400 (Sun, 19 Jul 2009) | 2 lines
branches/zip: Only use my_bool when UNIV_HOTBACKUP is not defined.
------------------------------------------------------------------------
r5515 | sunny | 2009-07-20 03:29:14 -0400 (Mon, 20 Jul 2009) | 2 lines
branches/zip: The dict_table_t::autoinc_mutex field is not used in HotBackup.
------------------------------------------------------------------------
r5516 | sunny | 2009-07-20 03:46:05 -0400 (Mon, 20 Jul 2009) | 4 lines
branches/zip: Make this file usable from within HotBackup. A new file has
been introduced called hb_univ.i. This file should have all the HotBackup
specific configuration.
------------------------------------------------------------------------
r5517 | sunny | 2009-07-20 03:55:11 -0400 (Mon, 20 Jul 2009) | 2 lines
Add /* UNIV_HOTBACKUP */
------------------------------------------------------------------------
r5519 | vasil | 2009-07-20 04:45:18 -0400 (Mon, 20 Jul 2009) | 31 lines
branches/zip: Merge r5497:5518 from branches/5.1:
------------------------------------------------------------------------
r5518 | vasil | 2009-07-20 11:29:47 +0300 (Mon, 20 Jul 2009) | 22 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2874.2.1
committer: Anurag Shekhar <anurag.shekhar@sun.com>
branch nick: mysql-5.1-bugteam-windows-warning
timestamp: Wed 2009-05-13 15:41:24 +0530
message:
Bug #39802 On Windows, 32-bit time_t should be enforced
This patch fixes compilation warning, "conversion from 'time_t' to 'ulong',
possible loss of data".
The fix is to typecast time_t to ulong before assigning it to ulong.
Backported this from 6.0-bugteam tree.
modified:
storage/archive/ha_archive.cc
storage/federated/ha_federated.cc
storage/innobase/handler/ha_innodb.cc
storage/myisam/ha_myisam.cc
------------------------------------------------------------------------
------------------------------------------------------------------------
r5520 | vasil | 2009-07-20 04:51:47 -0400 (Mon, 20 Jul 2009) | 4 lines
branches/zip:
Add ChangeLog entries for r5498 and r5519.
------------------------------------------------------------------------
r5524 | inaam | 2009-07-20 12:23:15 -0400 (Mon, 20 Jul 2009) | 9 lines
branches/zip
Change the read ahead parameter name to innodb_read_ahead_threshold.
Change the meaning of this parameter to signify the number of pages
that must be sequentially accessed for InnoDB to trigger a readahead
request.
Suggested by: Ken
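A minimal sketch of the threshold idea described above, assuming that
"sequentially accessed" means a run of consecutive page numbers in the
access trace. The function name and parameters are hypothetical and this
is not the code committed in r5524:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true as soon as `threshold` pages have
been accessed in ascending consecutive order, which is the point where
a read-ahead request would be triggered in this sketch. */
static bool
sketch_read_ahead_due(
	const unsigned long*	page_nos,	/* page numbers in access order */
	size_t			n_accesses,	/* number of entries in page_nos */
	unsigned long		threshold)	/* e.g. innodb_read_ahead_threshold */
{
	unsigned long	run = 0;
	size_t		i;

	assert(threshold > 0);

	for (i = 0; i < n_accesses; i++) {
		if (i > 0 && page_nos[i] == page_nos[i - 1] + 1) {
			run++;		/* the sequential run continues */
		} else {
			run = 1;	/* a new run starts at this page */
		}

		if (run >= threshold) {
			return(true);
		}
	}

	return(false);
}
```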
------------------------------------------------------------------------
branches/innodb+: Merge revisions 5144:5524 from branches/zip
------------------------------------------------------------------------
r5147 | marko | 2009-05-27 06:55:14 -0400 (Wed, 27 May 2009) | 1 line
branches/zip: ibuf0ibuf.c: Improve a comment.
------------------------------------------------------------------------
r5149 | marko | 2009-05-27 07:46:42 -0400 (Wed, 27 May 2009) | 34 lines
branches/zip: Merge revisions 4994:5148 from branches/5.1:
------------------------------------------------------------------------
r5126 | vasil | 2009-05-26 16:57:12 +0300 (Tue, 26 May 2009) | 9 lines
branches/5.1:
Preparation for the fix of
Bug#45097 Hang during recovery, redo logs for doublewrite buffer pages
Non-functional change: move FSP_* macros from fsp0fsp.h to a new file
fsp0types.h. This is needed in order to be able to use FSP_EXTENT_SIZE
in mtr0log.ic.
------------------------------------------------------------------------
r5127 | vasil | 2009-05-26 17:05:43 +0300 (Tue, 26 May 2009) | 9 lines
branches/5.1:
Preparation for the fix of
Bug#45097 Hang during recovery, redo logs for doublewrite buffer pages
Do not include unnecessary headers mtr0log.h and fut0lst.h in trx0sys.h
and include fsp0fsp.h just before it is needed. This is needed in order
to be able to use TRX_SYS_SPACE in mtr0log.ic.
------------------------------------------------------------------------
r5128 | vasil | 2009-05-26 17:26:37 +0300 (Tue, 26 May 2009) | 7 lines
branches/5.1:
Fix Bug#45097 Hang during recovery, redo logs for doublewrite buffer pages
Do not write redo log for the pages in the doublewrite buffer. Also, do not
make a dummy change to the page because this is not needed.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5169 | marko | 2009-05-28 03:21:55 -0400 (Thu, 28 May 2009) | 1 line
branches/zip: mtr0mtr.h: Add Doxygen comments for the redo log entry types.
------------------------------------------------------------------------
r5176 | marko | 2009-05-28 07:14:02 -0400 (Thu, 28 May 2009) | 1 line
branches/zip: Correct a debug assertion that was added in r5125.
------------------------------------------------------------------------
r5201 | marko | 2009-06-01 06:35:25 -0400 (Mon, 01 Jun 2009) | 2 lines
branches/zip: Clean up some comments.
Make the rec parameter of mlog_open_and_write_index() const.
------------------------------------------------------------------------
r5234 | marko | 2009-06-03 08:26:41 -0400 (Wed, 03 Jun 2009) | 44 lines
branches/zip: Merge revisions 5148:5233 from branches/5.1:
------------------------------------------------------------------------
r5150 | vasil | 2009-05-27 18:56:03 +0300 (Wed, 27 May 2009) | 4 lines
branches/5.1:
Whitespace fixup.
------------------------------------------------------------------------
r5191 | vasil | 2009-05-30 17:46:05 +0300 (Sat, 30 May 2009) | 19 lines
branches/5.1:
Merge a change from MySQL (this fixes the failing innodb_mysql test):
------------------------------------------------------------
revno: 1810.3894.10
committer: Sergey Glukhov <Sergey.Glukhov@sun.com>
branch nick: mysql-5.0-bugteam
timestamp: Tue 2009-05-19 11:32:21 +0500
message:
Bug#39793 Foreign keys not constructed when column has a '#' in a comment or default value
Internal InnoDB FK parser does not recognize '\'' as a quotation symbol.
Suggested fix is to add '\'' symbol check for quotation condition
(dict_strip_comments() function).
modified:
innobase/dict/dict0dict.c
mysql-test/r/innodb_mysql.result
mysql-test/t/innodb_mysql.test
------------------------------------------------------------------------
r5233 | marko | 2009-06-03 15:12:44 +0300 (Wed, 03 Jun 2009) | 11 lines
branches/5.1: Merge the test case from r5232 from branches/5.0:
------------------------------------------------------------------------
r5232 | marko | 2009-06-03 14:31:04 +0300 (Wed, 03 Jun 2009) | 21 lines
branches/5.0: Merge r3590 from branches/5.1 in order to fix Bug #40565
(Update Query Results in "1 Row Affected" But Should Be "Zero Rows").
Also, add a test case for Bug #40565.
rb://128 approved by Heikki Tuuri
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
r5250 | marko | 2009-06-04 02:58:23 -0400 (Thu, 04 Jun 2009) | 1 line
branches/zip: Add Doxygen comments to the rest of buf0*.
------------------------------------------------------------------------
r5251 | marko | 2009-06-04 02:59:51 -0400 (Thu, 04 Jun 2009) | 1 line
branches/zip: Replace <= in a function comment.
------------------------------------------------------------------------
r5253 | marko | 2009-06-04 06:37:35 -0400 (Thu, 04 Jun 2009) | 1 line
branches/zip: Add missing Doxygen comments for page0zip.
------------------------------------------------------------------------
r5261 | vasil | 2009-06-05 11:13:31 -0400 (Fri, 05 Jun 2009) | 15 lines
branches/zip:
Fix Mantis Issue#244 fix bug in linear read ahead (no check on access pattern)
The changes are:
1) Take into account access pattern when deciding whether or not to do linear
read ahead.
2) Expose a knob innodb_read_ahead_factor = [0-64] default (8), dynamic,
global to control linear read ahead behavior
3) Disable random read ahead. Keep the code for now.
Submitted by: Inaam (rb://122)
Approved by: Heikki (rb://122)
------------------------------------------------------------------------
r5262 | vasil | 2009-06-05 12:04:25 -0400 (Fri, 05 Jun 2009) | 22 lines
branches/zip:
Enable functionality to have multiple background io helper threads.
This patch is based on Percona contributions.
More details about this patch will be written at:
https://svn.innodb.com/innobase/MultipleBackgroundThreads
The patch essentially does the following:
expose following knobs:
innodb_read_io_threads = [1 - 64] default 1
innodb_write_io_threads = [1 - 64] default 1
deprecate innodb_file_io_threads (this parameter was relevant only on Windows)
Internally it allows multiple segments for read and write IO request arrays
where one thread works on one segment.
Submitted by: Inaam (rb://124)
Approved by: Heikki (rb://124)
------------------------------------------------------------------------
r5263 | vasil | 2009-06-05 12:19:37 -0400 (Fri, 05 Jun 2009) | 4 lines
branches/zip:
Whitespace cleanup.
------------------------------------------------------------------------
r5264 | vasil | 2009-06-05 12:26:58 -0400 (Fri, 05 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5261.
------------------------------------------------------------------------
r5265 | vasil | 2009-06-05 12:34:11 -0400 (Fri, 05 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5262.
------------------------------------------------------------------------
r5268 | inaam | 2009-06-08 12:18:21 -0400 (Mon, 08 Jun 2009) | 7 lines
branches/zip
Non functional change:
Added legal notices acknowledging Percona's contribution to the multiple
IO helper threads patch, i.e. r5262
------------------------------------------------------------------------
r5283 | inaam | 2009-06-09 13:46:29 -0400 (Tue, 09 Jun 2009) | 9 lines
branches/zip
rb://130
Enable Group Commit functionality that was broken in 5.0 when
distributed transactions were introduced.
Reviewed by: Heikki
------------------------------------------------------------------------
r5319 | marko | 2009-06-11 04:40:33 -0400 (Thu, 11 Jun 2009) | 3 lines
branches/zip: Declare os_thread_id_t as unsigned long,
because ulint is wrong on Win64.
Pointed out by Vladislav Vaintroub <wlad@sun.com>.
------------------------------------------------------------------------
r5320 | inaam | 2009-06-11 09:15:41 -0400 (Thu, 11 Jun 2009) | 14 lines
branches/zip rb://131
This patch changes the following defaults:
max_dirty_pages_pct: default from 90 to 75. max allowed from 100 to 99
additional_mem_pool_size: default from 1 to 8 MB
buffer_pool_size: default from 8 to 128 MB
log_buffer_size: default from 1 to 8 MB
read_io_threads/write_io_threads: default from 1 to 4
The log file sizes are untouched because of upgrade issues
Reviewed by: Heikki
------------------------------------------------------------------------
r5330 | marko | 2009-06-16 04:08:59 -0400 (Tue, 16 Jun 2009) | 2 lines
branches/zip: buf_page_get_gen(): Reduce mutex holding time by adjusting
buf_pool->n_pend_unzip while only holding buf_pool_mutex.
------------------------------------------------------------------------
r5331 | marko | 2009-06-16 05:00:48 -0400 (Tue, 16 Jun 2009) | 2 lines
branches/zip: buf_page_get_zip(): Eliminate a buf_page_get_mutex() call.
The function must switch on the block state anyway.
------------------------------------------------------------------------
r5332 | vasil | 2009-06-16 05:03:27 -0400 (Tue, 16 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entries for r5283 and r5320.
------------------------------------------------------------------------
r5333 | marko | 2009-06-16 05:27:46 -0400 (Tue, 16 Jun 2009) | 1 line
branches/zip: buf_page_io_query(): Remove unused function.
------------------------------------------------------------------------
r5335 | marko | 2009-06-16 09:23:10 -0400 (Tue, 16 Jun 2009) | 2 lines
branches/zip: innodb.test: Adjust the tolerance of
innodb_buffer_pool_pages_total for r5320.
------------------------------------------------------------------------
r5342 | marko | 2009-06-17 06:15:32 -0400 (Wed, 17 Jun 2009) | 60 lines
branches/zip: Merge revisions 5233:5341 from branches/5.1:
------------------------------------------------------------------------
r5233 | marko | 2009-06-03 15:12:44 +0300 (Wed, 03 Jun 2009) | 11 lines
branches/5.1: Merge the test case from r5232 from branches/5.0:
------------------------------------------------------------------------
r5232 | marko | 2009-06-03 14:31:04 +0300 (Wed, 03 Jun 2009) | 21 lines
branches/5.0: Merge r3590 from branches/5.1 in order to fix Bug #40565
(Update Query Results in "1 Row Affected" But Should Be "Zero Rows").
Also, add a test case for Bug #40565.
rb://128 approved by Heikki Tuuri
------------------------------------------------------------------------
------------------------------------------------------------------------
r5243 | sunny | 2009-06-04 03:17:14 +0300 (Thu, 04 Jun 2009) | 14 lines
branches/5.1: When the InnoDB and MySQL data dictionaries go out of sync, before
the bug fix we would assert on missing autoinc columns. With this fix we allow
MySQL to open the table but set the next autoinc value for the column to the
MAX value. This effectively disables the next value generation. INSERTs will
fail with a generic AUTOINC failure. However, the user should be able to
read/dump the table, set the column values explicitly, use ALTER TABLE to
set the next autoinc value and/or sync the two data dictionaries to resume
normal operations.
Fix Bug#44030 Error: (1500) Couldn't read the MAX(ID) autoinc value from the
index (PRIMARY)
rb://118
------------------------------------------------------------------------
r5252 | sunny | 2009-06-04 10:16:24 +0300 (Thu, 04 Jun 2009) | 2 lines
branches/5.1: The version of the result file checked in was broken in r5243.
------------------------------------------------------------------------
r5259 | vasil | 2009-06-05 10:29:16 +0300 (Fri, 05 Jun 2009) | 7 lines
branches/5.1:
Remove the word "Error" from the printout because the mysqltest suite
interprets it as an error and thus the innodb-autoinc test fails.
Approved by: Sunny (via IM)
------------------------------------------------------------------------
r5339 | marko | 2009-06-17 11:01:37 +0300 (Wed, 17 Jun 2009) | 2 lines
branches/5.1: Add missing #include "mtr0log.h" so that the code compiles
with -DUNIV_MUST_NOT_INLINE.
(null merge; this had already been committed in branches/zip)
------------------------------------------------------------------------
r5340 | marko | 2009-06-17 12:11:49 +0300 (Wed, 17 Jun 2009) | 4 lines
branches/5.1: row_unlock_for_mysql(): When the clustered index is unknown,
refuse to unlock the record.
(Bug #45357, caused by the fix of Bug #39320).
rb://132 approved by Sunny Bains.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5343 | vasil | 2009-06-17 08:56:12 -0400 (Wed, 17 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5342.
------------------------------------------------------------------------
r5344 | marko | 2009-06-17 09:03:45 -0400 (Wed, 17 Jun 2009) | 1 line
branches/zip: row_merge_read_rec(): Fix a UNIV_DEBUG bug (Bug #45426)
------------------------------------------------------------------------
r5391 | marko | 2009-06-22 05:31:35 -0400 (Mon, 22 Jun 2009) | 2 lines
branches/zip: buf_page_get_zip(): Fix a bogus warning about
block_mutex being possibly uninitialized.
------------------------------------------------------------------------
r5392 | marko | 2009-06-22 07:58:20 -0400 (Mon, 22 Jun 2009) | 4 lines
branches/zip: ha_innobase::check_if_incompatible_data(): When
ROW_FORMAT=DEFAULT, do not compare to get_row_type().
Without this change, fast index creation will be disabled
in recent versions of MySQL 5.1.
------------------------------------------------------------------------
r5393 | pekka | 2009-06-22 09:27:55 -0400 (Mon, 22 Jun 2009) | 4 lines
branches/zip: Minor changes for Hot Backup to build correctly. (The
code bracketed between #ifdef UNIV_HOTBACKUP and #endif /* UNIV_HOTBACKUP */).
This change should not affect !UNIV_HOTBACKUP build.
------------------------------------------------------------------------
r5394 | pekka | 2009-06-22 09:46:34 -0400 (Mon, 22 Jun 2009) | 4 lines
branches/zip: Add functions for checking the format of tablespaces
for Hot Backup build (UNIV_HOTBACKUP defined).
This change should not affect !UNIV_HOTBACKUP build.
------------------------------------------------------------------------
r5397 | calvin | 2009-06-23 16:59:42 -0400 (Tue, 23 Jun 2009) | 7 lines
branches/zip: change the header file path.
Change the header file path from ../storage/innobase/include/
to ../include/. In the planned 5.1 + plugin release, the source
directory of the plugin will not be in storage/innobase.
Approved by: Heikki (IM)
------------------------------------------------------------------------
r5407 | calvin | 2009-06-24 09:51:08 -0400 (Wed, 24 Jun 2009) | 4 lines
branches/zip: remove relative path of header files.
Suggested by Marko.
------------------------------------------------------------------------
r5412 | marko | 2009-06-25 06:27:08 -0400 (Thu, 25 Jun 2009) | 1 line
branches/zip: Replace a DBUG_ASSERT with ut_a to track down Issue #290.
------------------------------------------------------------------------
r5415 | marko | 2009-06-25 06:45:57 -0400 (Thu, 25 Jun 2009) | 3 lines
branches/zip: dict_index_find_cols(): Print diagnostic on name mismatch.
This addresses Bug #44571 but does not fix it.
rb://135 approved by Sunny Bains.
------------------------------------------------------------------------
r5417 | marko | 2009-06-25 08:20:56 -0400 (Thu, 25 Jun 2009) | 1 line
branches/zip: ha_innodb.cc: Move the misplaced Doxygen @file comment.
------------------------------------------------------------------------
r5418 | marko | 2009-06-25 08:55:52 -0400 (Thu, 25 Jun 2009) | 5 lines
branches/zip: Fix a race condition caused by
SET GLOBAL innodb_commit_concurrency=DEFAULT. (Bug #45749)
When innodb_commit_concurrency is initially set nonzero,
DEFAULT would change it back to 0, triggering Bug #42101.
rb://139 approved by Heikki Tuuri.
------------------------------------------------------------------------
r5423 | calvin | 2009-06-26 16:52:52 -0400 (Fri, 26 Jun 2009) | 2 lines
branches/zip: Fix typos.
------------------------------------------------------------------------
r5425 | marko | 2009-06-29 04:52:30 -0400 (Mon, 29 Jun 2009) | 4 lines
branches/zip: ha_innobase::add_index(), ha_innobase::final_drop_index():
Start prebuilt->trx before locking the table. This should fix Issue #293
and could fix Issue #229.
Approved by Sunny (over IM).
------------------------------------------------------------------------
r5426 | marko | 2009-06-29 05:24:27 -0400 (Mon, 29 Jun 2009) | 3 lines
branches/zip: buf_page_get_gen(): Fix a race condition when reading
buf_fix_count. This could explain Issue #156.
Tested by Michael.
------------------------------------------------------------------------
r5427 | marko | 2009-06-29 05:54:53 -0400 (Mon, 29 Jun 2009) | 5 lines
branches/zip: lock_print_info_all_transactions(), buf_read_recv_pages():
Tolerate missing tablespaces (zip_size==ULINT_UNDEFINED).
buf_page_get_gen(): Add ut_ad(ut_is_2pow(zip_size)).
Issue #289, rb://136 approved by Sunny Bains
------------------------------------------------------------------------
r5428 | marko | 2009-06-29 07:06:29 -0400 (Mon, 29 Jun 2009) | 2 lines
branches/zip: row_sel_store_mysql_rec(): Add missing pointer cast.
Do not do arithmetic on void pointers.
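For illustration only: standard C does not define arithmetic on void
pointers (some compilers accept it as an extension), so the pointer is
cast to a byte pointer first. The helper name below is hypothetical and
is not the code changed in row_sel_store_mysql_rec():

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: advance a generic pointer by `offset` bytes.
The cast to unsigned char* makes the addition well defined. */
static void*
sketch_ptr_add(
	void*	base,	/* start of a buffer */
	size_t	offset)	/* offset in bytes, within the buffer */
{
	assert(base != NULL);

	return((unsigned char*) base + offset);
}
```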
------------------------------------------------------------------------
r5429 | marko | 2009-06-29 09:49:54 -0400 (Mon, 29 Jun 2009) | 13 lines
branches/zip: Do not crash on SET GLOBAL innodb_file_format=DEFAULT
or SET GLOBAL innodb_file_format_check=DEFAULT.
innodb_file_format.test: New test for innodb_file_format and
innodb_file_format_check.
innodb_file_format_name_validate(): Store the string in *save.
innodb_file_format_name_update(): Check the string again.
innodb_file_format_check_validate(): Store the string in *save.
innodb_file_format_check_update(): Check the string again.
Issue #282, rb://140 approved by Heikki Tuuri
------------------------------------------------------------------------
r5430 | marko | 2009-06-29 09:58:07 -0400 (Mon, 29 Jun 2009) | 2 lines
branches/zip: lock_rec_validate_page(): Add another assertion
to track down Issue #289.
------------------------------------------------------------------------
r5431 | marko | 2009-06-29 09:58:40 -0400 (Mon, 29 Jun 2009) | 1 line
branches/zip: Revert an accidentally made change in r5430 to univ.i.
------------------------------------------------------------------------
r5437 | marko | 2009-06-30 05:10:01 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: ibuf_dummy_index_free(): Beautify the comment.
------------------------------------------------------------------------
r5438 | marko | 2009-06-30 05:10:32 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: fseg_free(): Remove this unused function.
------------------------------------------------------------------------
r5439 | marko | 2009-06-30 05:15:22 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: fseg_validate(): Enclose in #ifdef UNIV_DEBUG.
This function is unused, but it could turn out to be a useful debugging aid.
------------------------------------------------------------------------
r5441 | marko | 2009-06-30 06:30:14 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: ha_delete(): Remove this unused function that was
very similar to ha_search_and_delete_if_found().
------------------------------------------------------------------------
r5442 | marko | 2009-06-30 06:45:41 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: lock_is_on_table(), lock_table_unlock(): Unused, remove.
------------------------------------------------------------------------
r5443 | marko | 2009-06-30 07:03:00 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: os_event_create_auto(): Unused, remove.
------------------------------------------------------------------------
r5444 | marko | 2009-06-30 07:19:49 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: que_graph_try_free(): Unused, remove.
------------------------------------------------------------------------
r5445 | marko | 2009-06-30 07:28:11 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: row_build_row_ref_from_row(): Unused, remove.
------------------------------------------------------------------------
r5446 | marko | 2009-06-30 07:35:45 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: srv_que_round_robin(), srv_que_task_enqueue(): Unused, remove.
------------------------------------------------------------------------
r5447 | marko | 2009-06-30 07:37:58 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: srv_que_task_queue_check(): Unused, remove.
------------------------------------------------------------------------
r5448 | marko | 2009-06-30 07:56:36 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: mem_heap_cat(): Unused, remove.
------------------------------------------------------------------------
r5449 | marko | 2009-06-30 08:00:50 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: innobase_start_or_create_for_mysql():
Invoke os_get_os_version() at most once.
------------------------------------------------------------------------
r5450 | marko | 2009-06-30 08:02:20 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: os_file_close_no_error_handling(): Unused, remove.
------------------------------------------------------------------------
r5451 | marko | 2009-06-30 08:09:49 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: page_set_max_trx_id(): Make the code compile
with UNIV_HOTBACKUP.
------------------------------------------------------------------------
r5452 | marko | 2009-06-30 08:10:26 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: os_file_close_no_error_handling(): Restore,
as this function is used within InnoDB Hot Backup.
------------------------------------------------------------------------
r5453 | marko | 2009-06-30 08:14:01 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: os_process_set_priority_boost(): Unused, remove.
------------------------------------------------------------------------
r5454 | marko | 2009-06-30 08:42:52 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: Replace a non-ASCII character
(ISO 8859-1 encoded U+00AD SOFT HYPHEN) with a cheap ASCII substitute.
------------------------------------------------------------------------
r5456 | inaam | 2009-06-30 14:21:09 -0400 (Tue, 30 Jun 2009) | 4 lines
branches/zip
Non functional change. s/Percona/Percona Inc./
------------------------------------------------------------------------
r5470 | vasil | 2009-07-02 09:12:36 -0400 (Thu, 02 Jul 2009) | 16 lines
branches/zip:
Use PAUSE instruction inside spinloop if it is available.
The patch was originally developed by Mikael Ronstrom <mikael@mysql.com>
and can be found here:
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2768
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2771
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2772
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2774
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2777
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2799
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2800
Approved by: Heikki (rb://137)
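A minimal sketch of a spin loop that issues PAUSE when it is available.
The macro and function names are hypothetical, the compiler check is an
assumption, and this is not the code from the revisions listed above:

```c
#include <assert.h>

/* Hypothetical macro: emit PAUSE on x86 with GCC-compatible compilers,
and do nothing elsewhere. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
# define SKETCH_RELAX_CPU()	__asm__ __volatile__ ("pause")
#else
# define SKETCH_RELAX_CPU()	((void) 0)
#endif

/* Spin until *lock_word becomes 0 or max_rounds iterations have passed.
Returns the number of rounds spent spinning. */
static unsigned long
sketch_spin_wait(
	volatile const int*	lock_word,	/* nonzero while the lock is held */
	unsigned long		max_rounds)	/* e.g. innodb_sync_spin_loops */
{
	unsigned long	i = 0;

	assert(lock_word != 0);

	while (*lock_word != 0 && i < max_rounds) {
		SKETCH_RELAX_CPU();	/* hint to the CPU that this is a spin-wait */
		i++;
	}

	return(i);
}
```

A caller that exhausts max_rounds would then fall back to a blocking wait.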
------------------------------------------------------------------------
r5481 | vasil | 2009-07-06 13:16:32 -0400 (Mon, 06 Jul 2009) | 4 lines
branches/zip:
Remove unnecessary quotes and simplify plug.in.
------------------------------------------------------------------------
r5482 | calvin | 2009-07-06 18:36:35 -0400 (Mon, 06 Jul 2009) | 5 lines
branches/zip: add COPYING files for Percona and Sun Micro.
1.0.4 contains patches based on contributions from Percona
and Sun Microsystems.
------------------------------------------------------------------------
r5483 | calvin | 2009-07-07 05:36:43 -0400 (Tue, 07 Jul 2009) | 3 lines
branches/zip: add IB_HAVE_PAUSE_INSTRUCTION to CMake.
Windows will support PAUSE instruction by default.
------------------------------------------------------------------------
r5484 | inaam | 2009-07-07 18:57:14 -0400 (Tue, 07 Jul 2009) | 13 lines
branches/zip rb://126
Based on contribution from Google Inc.
This patch introduces a new parameter innodb_io_capacity to control the
rate at which the master thread performs various tasks. The default value
is 200 and higher values imply more aggressive flushing and ibuf merges
from within the master thread.
This patch also changes the ibuf merge from synchronous to asynchronous.
Another minor change is not to force the master thread to wait for a
log flush to complete every second.
Approved by: Heikki
------------------------------------------------------------------------
r5485 | inaam | 2009-07-07 19:00:49 -0400 (Tue, 07 Jul 2009) | 18 lines
branches/zip rb://138
The current implementation is to try to flush the neighbors of every
page that we flush. This patch makes the following distinction:
1) If the flush is from flush_list AND
2) If the flush is intended to move the oldest_modification LSN ahead
(this happens when a user thread sees little space in the log file and
attempts to flush pages from the buffer pool so that a checkpoint can
be made)
THEN
Do not try to flush the neighbors. Just focus on flushing dirty pages at
the end of flush_list
Approved by: Heikki
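The distinction above can be sketched as a single predicate. The enum and
function names are hypothetical and this is not the code committed in
r5485:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flush types for this sketch. */
enum sketch_flush_type {
	SKETCH_FLUSH_LRU,	/* flush from the LRU list */
	SKETCH_FLUSH_LIST	/* flush from flush_list */
};

/* Returns true if the neighbors of a flushed page should be flushed too.
Neighbors are skipped only when both conditions 1) and 2) hold. */
static bool
sketch_should_flush_neighbors(
	enum sketch_flush_type	type,			/* where the flush comes from */
	bool			advancing_oldest_lsn)	/* flush is meant to move the
							oldest_modification LSN ahead */
{
	assert(type == SKETCH_FLUSH_LRU || type == SKETCH_FLUSH_LIST);

	return(!(type == SKETCH_FLUSH_LIST && advancing_oldest_lsn));
}
```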
------------------------------------------------------------------------
r5486 | inaam | 2009-07-08 12:11:40 -0400 (Wed, 08 Jul 2009) | 29 lines
branches/zip rb://133
This patch introduces a heuristics-based flushing rate for dirty pages to
avoid IO bursts at checkpoint.
1) log_capacity / log_generated per second gives us number of seconds
in which ALL dirty pages need to be flushed. Based on this rough
assumption we can say that
n_dirty_pages / (log_capacity / log_generation_rate) = desired_flush_rate
2) We use weighted averages (hard coded to 20 seconds) of
log_generation_rate to avoid resonance.
3) From the desired_flush_rate we subtract the number of pages that have
been flushed due to LRU flushing. That gives us pages that we should
flush as part of flush_list cleanup. And that is the number (capped by
maximum io_capacity) that we try to flush from the master thread.
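The arithmetic in steps 1) and 3) can be sketched as follows. This is an
illustration in Python, not the patch itself: the function names are
hypothetical, and log_generation_rate is assumed to be the already
averaged value from step 2).

```python
def desired_flush_rate(n_dirty_pages: float,
                       log_capacity: float,
                       log_generation_rate: float) -> float:
    """Pages per second needed so that all dirty pages are flushed
    before the log fills up (step 1 of the message)."""
    if log_generation_rate <= 0:
        return 0.0  # no log is being generated, so nothing is urgent
    seconds_until_log_full = log_capacity / log_generation_rate
    return n_dirty_pages / seconds_until_log_full


def pages_to_flush(n_dirty_pages: float,
                   log_capacity: float,
                   log_generation_rate: float,
                   lru_flushed: float,
                   io_capacity: int) -> int:
    """Step 3: subtract the pages already flushed by LRU flushing,
    then cap the result by io_capacity and never go below zero."""
    rate = desired_flush_rate(n_dirty_pages, log_capacity,
                              log_generation_rate)
    return int(max(0, min(rate - lru_flushed, io_capacity)))
```

For example, 10000 dirty pages with a log that fills in 100 seconds give a
desired rate of 100 pages per second; if LRU flushing already wrote 30
pages, 70 remain for flush_list cleanup.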
Knobs:
======
innodb_adaptive_flushing: boolean, global, dynamic, default TRUE.
Since this heuristic is very experimental and has the potential to
dramatically change the IO pattern I think it is a good idea to leave a
knob to turn it off.
Approved by: Heikki
------------------------------------------------------------------------
r5487 | calvin | 2009-07-08 12:42:28 -0400 (Wed, 08 Jul 2009) | 7 lines
branches/zip: fix PAUSE instruction patch on Windows
The original PAUSE instruction patch (r5470) does not
compile on Windows. Also, there is an elegant way of
doing it on Windows - YieldProcessor().
Approved by: Heikki (on IM)
------------------------------------------------------------------------
r5489 | vasil | 2009-07-10 05:02:22 -0400 (Fri, 10 Jul 2009) | 9 lines
branches/zip:
Change the defaults for
innodb_sync_spin_loops: 20 -> 30
innodb_spin_wait_delay: 5 -> 6
This change was proposed by Sun/MySQL based on their performance testing,
see https://svn.innodb.com/innobase/Release_tasks_for_InnoDB_Plugin_V1.0.4
------------------------------------------------------------------------
r5490 | vasil | 2009-07-10 05:04:20 -0400 (Fri, 10 Jul 2009) | 4 lines
branches/zip:
Add ChangeLog entry for 5489.
------------------------------------------------------------------------
r5491 | calvin | 2009-07-10 12:19:17 -0400 (Fri, 10 Jul 2009) | 6 lines
branches/zip: add copyright info to files related to PAUSE
instruction patch, contributed by Sun Microsystems.
------------------------------------------------------------------------
r5492 | calvin | 2009-07-10 17:47:34 -0400 (Fri, 10 Jul 2009) | 5 lines
branches/zip: add ChangeLog entries for r5484-r5486.
------------------------------------------------------------------------
r5494 | vasil | 2009-07-13 03:37:35 -0400 (Mon, 13 Jul 2009) | 6 lines
branches/zip:
Restore the original value of innodb_sync_spin_loops at the end of the
test. Previously the test assumed that setting it to 20 would do this, but
now the default is 30 and MTR's internal check failed.
------------------------------------------------------------------------
r5495 | inaam | 2009-07-13 11:48:45 -0400 (Mon, 13 Jul 2009) | 5 lines
branches/zip rb://138 (REVERT)
Revert the flush neighbors patch as it shows regression in
the benchmarks run by Michael.
------------------------------------------------------------------------
r5496 | inaam | 2009-07-13 14:04:57 -0400 (Mon, 13 Jul 2009) | 4 lines
branches/zip
Fixed warnings on Windows where ulint != ib_uint64_t
------------------------------------------------------------------------
r5497 | calvin | 2009-07-13 15:01:00 -0400 (Mon, 13 Jul 2009) | 9 lines
branches/zip: fix run-time symbols clash on Solaris.
This patch is from Sergey Vojtovich of Sun Microsystems,
to fix run-time symbols clash on Solaris with older C++
compiler:
- when choosing how to hide symbols, base the decision on the
compiler, not the operating system.
- Sun Studio supports __hidden declaration specifier for this
purpose.
------------------------------------------------------------------------
r5498 | vasil | 2009-07-14 03:16:18 -0400 (Tue, 14 Jul 2009) | 92 lines
branches/zip: Merge r5341:5497 from branches/5.1, skipping:
c5419 because it is a merge from branches/zip into branches/5.1
c5466 because the source code has been adjusted to match the MySQL
behavior and the innodb-autoinc test does not fail in branches/zip;
if c5466 is merged, innodb-autoinc starts failing. Sunny suggested
not merging c5466.
and resolving conflicts in c5410, c5440, c5488:
------------------------------------------------------------------------
r5410 | marko | 2009-06-24 22:26:34 +0300 (Wed, 24 Jun 2009) | 2 lines
Changed paths:
M /branches/5.1/include/trx0sys.ic
M /branches/5.1/trx/trx0purge.c
M /branches/5.1/trx/trx0sys.c
M /branches/5.1/trx/trx0undo.c
branches/5.1: Add missing #include "mtr0log.h" to avoid warnings
when compiling with -DUNIV_MUST_NOT_INLINE.
------------------------------------------------------------------------
r5419 | marko | 2009-06-25 16:11:57 +0300 (Thu, 25 Jun 2009) | 18 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb_bug42101-nonzero.result
M /branches/5.1/mysql-test/innodb_bug42101-nonzero.test
M /branches/5.1/mysql-test/innodb_bug42101.result
M /branches/5.1/mysql-test/innodb_bug42101.test
branches/5.1: Merge r5418 from branches/zip:
------------------------------------------------------------------------
r5418 | marko | 2009-06-25 15:55:52 +0300 (Thu, 25 Jun 2009) | 5 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/mysql-test/innodb_bug42101-nonzero.result
M /branches/zip/mysql-test/innodb_bug42101-nonzero.test
M /branches/zip/mysql-test/innodb_bug42101.result
M /branches/zip/mysql-test/innodb_bug42101.test
branches/zip: Fix a race condition caused by
SET GLOBAL innodb_commit_concurrency=DEFAULT. (Bug #45749)
When innodb_commit_concurrency is initially set nonzero,
DEFAULT would change it back to 0, triggering Bug #42101.
rb://139 approved by Heikki Tuuri.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5440 | vasil | 2009-06-30 13:04:29 +0300 (Tue, 30 Jun 2009) | 8 lines
Changed paths:
M /branches/5.1/fil/fil0fil.c
branches/5.1:
Fix Bug#45814 URL reference in InnoDB server errors needs adjusting to match documentation
by changing the URL from
http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html to
http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting-datadict.html
------------------------------------------------------------------------
r5466 | vasil | 2009-07-02 10:46:45 +0300 (Thu, 02 Jul 2009) | 6 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Adjust the failing innodb-autoinc test to conform to the latest behavior
of the MySQL code. The idea and the comment in innodb-autoinc.test come
from Sunny.
------------------------------------------------------------------------
r5488 | vasil | 2009-07-09 19:16:44 +0300 (Thu, 09 Jul 2009) | 13 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
A /branches/5.1/mysql-test/innodb_bug21704.result
A /branches/5.1/mysql-test/innodb_bug21704.test
branches/5.1:
Fix Bug#21704 Renaming column does not update FK definition
by checking whether a column that participates in a FK definition is being
renamed and denying the ALTER in this case.
The patch was originally developed by Davi Arnaut <Davi.Arnaut@Sun.COM>:
http://lists.mysql.com/commits/77714
and was later adjusted to conform to InnoDB coding style by me (Vasil).
I also added some more comments and moved the bug-specific mysql-test to
a separate file to make it more manageable and flexible.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5499 | calvin | 2009-07-14 12:55:10 -0400 (Tue, 14 Jul 2009) | 3 lines
branches/zip: add a missing file in Makefile.am
This change was suggested by MySQL.
------------------------------------------------------------------------
r5500 | calvin | 2009-07-14 13:03:26 -0400 (Tue, 14 Jul 2009) | 3 lines
branches/zip: minor change
Remove an extra "with".
------------------------------------------------------------------------
r5501 | vasil | 2009-07-14 13:58:15 -0400 (Tue, 14 Jul 2009) | 5 lines
branches/zip:
Add @ZLIB_INCLUDES@ so that the InnoDB Plugin picks up the same zlib.h
header file that is eventually used by mysqld.
------------------------------------------------------------------------
r5502 | vasil | 2009-07-14 13:59:59 -0400 (Tue, 14 Jul 2009) | 4 lines
branches/zip:
Add include/ut0auxconf.h to noinst_HEADERS
------------------------------------------------------------------------
r5503 | vasil | 2009-07-14 14:16:11 -0400 (Tue, 14 Jul 2009) | 8 lines
branches/zip:
Non-functional change:
put files in noinst_HEADERS and libinnobase_a_SOURCES one per line and sort
alphabetically, so it is easier to find if a file is there or not and
also diffs show exactly the added or removed file instead of surrounding
lines too.
------------------------------------------------------------------------
r5504 | calvin | 2009-07-15 04:58:44 -0400 (Wed, 15 Jul 2009) | 6 lines
branches/zip: fix compile errors on Win64
Both srv_read_ahead_factor and srv_io_capacity should
be defined as ulong.
Approved by: Sunny
------------------------------------------------------------------------
r5508 | calvin | 2009-07-16 09:40:47 -0400 (Thu, 16 Jul 2009) | 16 lines
branches/zip: Support inlining of functions and prefetch with
Sun Studio
Those changes are contributed by Sun/MySQL. Two sets of changes
in this patch when Sun Studio is used:
- Explicit inlining of functions
- Prefetch Support
This patch has been tested by Sunny with the plugin statically
built in. Since we've never built the plugin as a dynamically
loaded module on Solaris, it is a separate task to change
plug.in.
rb://142
Approved by: Heikki
------------------------------------------------------------------------
r5509 | calvin | 2009-07-16 09:45:28 -0400 (Thu, 16 Jul 2009) | 2 lines
branches/zip: add ChangeLog entry for r5508.
------------------------------------------------------------------------
r5512 | sunny | 2009-07-19 19:52:48 -0400 (Sun, 19 Jul 2009) | 2 lines
branches/zip: Remove unused extern ref to timed_mutexes.
------------------------------------------------------------------------
r5513 | sunny | 2009-07-19 19:58:43 -0400 (Sun, 19 Jul 2009) | 2 lines
branches/zip: Undo r5512
------------------------------------------------------------------------
r5514 | sunny | 2009-07-19 20:08:49 -0400 (Sun, 19 Jul 2009) | 2 lines
branches/zip: Only use my_bool when UNIV_HOTBACKUP is not defined.
------------------------------------------------------------------------
r5515 | sunny | 2009-07-20 03:29:14 -0400 (Mon, 20 Jul 2009) | 2 lines
branches/zip: The dict_table_t::autoinc_mutex field is not used in HotBackup.
------------------------------------------------------------------------
r5516 | sunny | 2009-07-20 03:46:05 -0400 (Mon, 20 Jul 2009) | 4 lines
branches/zip: Make this file usable from within HotBackup. A new file has
been introduced called hb_univ.i. This file should have all the HotBackup
specific configuration.
------------------------------------------------------------------------
r5517 | sunny | 2009-07-20 03:55:11 -0400 (Mon, 20 Jul 2009) | 2 lines
Add /* UNIV_HOTBACK */
------------------------------------------------------------------------
r5519 | vasil | 2009-07-20 04:45:18 -0400 (Mon, 20 Jul 2009) | 31 lines
branches/zip: Merge r5497:5518 from branches/5.1:
------------------------------------------------------------------------
r5518 | vasil | 2009-07-20 11:29:47 +0300 (Mon, 20 Jul 2009) | 22 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2874.2.1
committer: Anurag Shekhar <anurag.shekhar@sun.com>
branch nick: mysql-5.1-bugteam-windows-warning
timestamp: Wed 2009-05-13 15:41:24 +0530
message:
Bug #39802 On Windows, 32-bit time_t should be enforced
This patch fixes compilation warning, "conversion from 'time_t' to 'ulong',
possible loss of data".
The fix is to typecast time_t to ulong before assigning it to ulong.
Backported this from 6.0-bugteam tree.
modified:
storage/archive/ha_archive.cc
storage/federated/ha_federated.cc
storage/innobase/handler/ha_innodb.cc
storage/myisam/ha_myisam.cc
------------------------------------------------------------------------
------------------------------------------------------------------------
r5520 | vasil | 2009-07-20 04:51:47 -0400 (Mon, 20 Jul 2009) | 4 lines
branches/zip:
Add ChangeLog entries for r5498 and r5519.
------------------------------------------------------------------------
r5524 | inaam | 2009-07-20 12:23:15 -0400 (Mon, 20 Jul 2009) | 9 lines
branches/zip
Change the read ahead parameter name to innodb_read_ahead_threshold.
Change the meaning of this parameter to signify the number of pages
that must be sequentially accessed for InnoDB to trigger a readahead
request.
Suggested by: Ken
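One way to read the new meaning of the parameter is sketched below in
Python. This is an assumption-laden illustration, not the InnoDB check:
the real code inspects buffer pool state for an extent, and the helper
names here are hypothetical.

```python
def longest_sequential_run(page_numbers) -> int:
    """Length of the longest run in which each accessed page number
    is exactly one greater than the page accessed before it."""
    longest = current = 0
    previous = None
    for page_no in page_numbers:
        if previous is not None and page_no == previous + 1:
            current += 1
        else:
            current = 1
        longest = max(longest, current)
        previous = page_no
    return longest


def should_trigger_read_ahead(page_numbers, threshold: int) -> bool:
    """Request read-ahead once at least `threshold` pages
    have been accessed sequentially."""
    return longest_sequential_run(page_numbers) >= threshold
```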
------------------------------------------------------------------------
MDEV-14425 Improve the redo log for concurrency
The InnoDB redo log used to be formatted in blocks of 512 bytes.
The log blocks were encrypted and the checksum was calculated while
holding log_sys.mutex, creating a serious scalability bottleneck.
We remove the fixed-size redo log block structure altogether and
essentially turn every mini-transaction into a log block of its own.
This allows encryption and checksum calculations to be performed
on local mtr_t::m_log buffers, before acquiring log_sys.mutex.
The mutex only protects a memcpy() of the data to the shared
log_sys.buf, as well as the padding of the log, in case the
to-be-written part of the log would not end in a block boundary of
the underlying storage. For now, the "padding" consists of writing
a single NUL byte, to allow recovery and mariadb-backup to detect
the end of the circular log faster.
Like the previous implementation, we will overwrite the last log block
over and over again, until it has been completely filled. It would be
possible to write only up to the last completed block (if no more
recent write was requested), or to write dummy FILE_CHECKPOINT records
to fill the incomplete block, by invoking the currently disabled
function log_pad(). This would require adjustments to some logic around
log checkpoints, page flushing, and shutdown.
An upgrade after a crash of any previous version is not supported.
Logically empty log files from a previous version will be upgraded.
An attempt to start up InnoDB without a valid ib_logfile0 will be
refused. Previously, the redo log used to be created automatically
if it was missing. Only with innodb_force_recovery=6, it is
possible to start InnoDB in read-only mode even if the log file
does not exist. This allows the contents of a possibly corrupted
database to be dumped.
Because a prepared backup from an earlier version of mariadb-backup
will create a 0-sized log file, we will allow an upgrade from such
log files, provided that the FIL_PAGE_FILE_FLUSH_LSN in the system
tablespace looks valid.
The 512-byte log checkpoint blocks at 0x200 and 0x600 will be replaced
with 64-byte log checkpoint blocks at 0x1000 and 0x2000.
The start of log records will move from 0x800 to 0x3000. This allows us
to use 4096-byte aligned blocks for all I/O in a future revision.
We extend the MDEV-12353 redo log record format as follows.
(1) Empty mini-transactions or extra NUL bytes will not be allowed.
(2) The end-of-minitransaction marker (a NUL byte) will be replaced
with a 1-bit sequence number, which will be toggled each time when the
circular log file wraps back to the beginning.
(3) After the sequence bit, a CRC-32C checksum of all data
(excluding the sequence bit) will be written.
(4) If the log is encrypted, 8 bytes will be written before
the checksum and included in it. This is part of the
initialization vector (IV) of encrypted log data.
(5) File names, page numbers, and checkpoint information will not be
encrypted. Only the payload bytes of page-level log will be encrypted.
The tablespace ID and page number will form part of the IV.
(6) For padding, arbitrary-length FILE_CHECKPOINT records may be written,
with all-zero payload, and with the normal end marker and checksum.
The minimum size is 7 bytes, or 7+8 with innodb_encrypt_log=ON.
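Items (2) to (4) can be illustrated with a small framing sketch in Python.
It is not the server code. The byte order of the checksum, the function
names, and which sequence bit value the first pass through the file uses
are assumptions made for the example; record parsing and encryption are
left out.

```python
import struct


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def sequence_bit_for(lsn: int, first_lsn: int, capacity: int) -> int:
    """Toggle the bit each time the circular log wraps (item 2).
    Assumption: the first pass through the file uses 1."""
    return 1 - (((lsn - first_lsn) // capacity) & 1)


def frame_mtr(records: bytes, sequence_bit: int, nonce: bytes = b"") -> bytes:
    """records | sequence byte | optional 8-byte nonce | CRC-32C.
    The checksum covers the records and the nonce, not the sequence byte."""
    if not records:
        raise ValueError("empty mini-transactions are not allowed (item 1)")
    if sequence_bit not in (0, 1):
        raise ValueError("sequence bit must be 0 or 1")
    if nonce and len(nonce) != 8:
        raise ValueError("the nonce of an encrypted log is 8 bytes (item 4)")
    checksum = struct.pack(">I", crc32c(records + nonce))  # assumed big-endian
    return records + bytes([sequence_bit]) + nonce + checksum


def verify_mtr(frame: bytes, expected_sequence_bit: int,
               encrypted: bool = False):
    """Return the records if the frame is current and intact, else None."""
    nonce_len = 8 if encrypted else 0
    trailer_len = 1 + nonce_len + 4
    if len(frame) <= trailer_len:
        return None
    records = frame[:-trailer_len]
    sequence_bit = frame[len(records)]
    nonce = frame[len(records) + 1:len(records) + 1 + nonce_len]
    (stored,) = struct.unpack(">I", frame[-4:])
    if sequence_bit != expected_sequence_bit:
        return None  # stale data left over from before the last wrap
    if crc32c(records + nonce) != stored:
        return None  # torn or corrupted mini-transaction
    return records
```

A reader that expects the current sequence bit stops at the first frame
that carries the other value, which is how the end of the log can be
found without a NUL end marker.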
In mariadb-backup and in Galera snapshot transfer (SST) scripts, we will
no longer remove ib_logfile0 or create an empty ib_logfile0. Server startup
will require a valid log file. When resizing the log, we will create
a logically empty ib_logfile101 at the current LSN and use an atomic rename
to replace ib_logfile0 with it. See the test innodb.log_file_size.
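The replace-by-rename pattern described above looks roughly like this in
Python. Only the file names come from this message; the function name and
the file contents are placeholders, and a real implementation would write
a valid, logically empty log at the current LSN.

```python
import os


def replace_log_atomically(datadir: str, new_contents: bytes) -> None:
    """Write the new log under a temporary name, make it durable,
    then rename it over ib_logfile0 in one atomic step."""
    tmp_path = os.path.join(datadir, "ib_logfile101")
    final_path = os.path.join(datadir, "ib_logfile0")
    with open(tmp_path, "wb") as f:
        f.write(new_contents)
        f.flush()
        os.fsync(f.fileno())  # persist the data before exposing the file
    # A concurrent reader sees either the old or the new ib_logfile0,
    # never a partially written file.
    os.replace(tmp_path, final_path)
```

If the process dies before the rename, the old ib_logfile0 is still
intact and the leftover ib_logfile101 can be discarded.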
Because there is no mandatory padding in the log file, we are able
to create a dummy log file as of an arbitrary log sequence number.
See the test mariabackup.huge_lsn.
The parameter innodb_log_write_ahead_size and the
INFORMATION_SCHEMA.INNODB_METRICS counter log_padded will be removed.
The minimum value of innodb_log_buffer_size will be increased to 2MiB
(because log_sys.buf will replace recv_sys.buf) and the increment
adjusted to 4096 bytes (the maximum log block size).
The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed:
os_log_fsyncs
os_log_pending_fsyncs
log_pending_log_flushes
log_pending_checkpoint_writes
The following status variables will be removed:
Innodb_os_log_fsyncs (this is included in Innodb_data_fsyncs)
Innodb_os_log_pending_fsyncs (this was limited to at most 1 by design)
log_sys.get_block_size(): Return the physical block size of the log file.
This is only implemented on Linux and Microsoft Windows for now, and for
the power-of-2 block sizes between 64 and 4096 bytes (the minimum and
maximum size of a checkpoint block). If the block size is anything else,
the traditional 512-byte size will be used via normal file system
buffering.
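The block size rule can be written down as a small function. This Python
sketch only mirrors the sentence above; the function name is hypothetical,
and the boolean merely marks block sizes that qualify for bypassing the
file system buffers. Whether buffering is really bypassed depends on
further settings such as innodb_flush_method.

```python
def effective_log_block_size(physical_block_size: int):
    """Return (block_size, may_bypass_file_system_buffers)."""
    is_power_of_two = (physical_block_size > 0
                       and physical_block_size & (physical_block_size - 1) == 0)
    if is_power_of_two and 64 <= physical_block_size <= 4096:
        return physical_block_size, True
    # Anything else: traditional 512-byte size via normal buffering.
    return 512, False
```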
If the file system buffers can be bypassed, a message like the following
will be issued:
InnoDB: File system buffers for log disabled (block size=512 bytes)
InnoDB: File system buffers for log disabled (block size=4096 bytes)
This has been tested on Linux and Microsoft Windows with both sizes.
On Linux, only enable O_DIRECT on the log for innodb_flush_method=O_DSYNC.
Tests in 3 different environments where the log is stored in a device
with a physical block size of 512 bytes are yielding better throughput
without O_DIRECT. This could be due to the fact that in the event the
last log block is being overwritten (if multiple transactions would
become durable at the same time, and each of them will write a small
number of bytes to the last log block), it should be faster to re-copy
data from log_sys.buf or log_sys.flush_buf to the kernel buffer,
to be finally written at fdatasync() time.
The parameter innodb_flush_method=O_DSYNC will imply O_DIRECT for
data files. This option will enable O_DIRECT on the log file on Linux.
It may be unsafe to use when the storage device does not support
FUA (Force Unit Access) mode.
When the server is compiled WITH_PMEM=ON, we will use memory-mapped
I/O for the log file if the log resides on a "mount -o dax" device.
We will identify PMEM in a start-up message:
InnoDB: log sequence number 0 (memory-mapped); transaction id 3
On Linux, we will also invoke mmap() on any ib_logfile0 that resides
in /dev/shm, effectively treating the log file as persistent memory.
This should speed up "./mtr --mem" and increase the test coverage of
PMEM on non-PMEM hardware. It also allows users to estimate how much
the performance would be improved by installing persistent memory.
On other tmpfs file systems such as /run, we will not use mmap().
mariadb-backup: Eliminated several variables. We will refer
directly to recv_sys and log_sys.
backup_wait_for_lsn(): Detect non-progress of
xtrabackup_copy_logfile(). In this new log format with
arbitrary-sized blocks, we can only detect log file overrun
indirectly, by observing that the scanned log sequence number
is not advancing.
xtrabackup_copy_logfile(): On PMEM, do not modify the sequence bit,
because we are not allowed to modify the server's log file, and our
memory mapping is read-only.
trx_flush_log_if_needed_low(): Do not use the callback on pmem.
Using neither flush_lock nor write_lock around PMEM writes seems
to yield the best performance. The pmem_persist() calls may
still be somewhat slower than the pwrite() and fdatasync() based
interface (PMEM mounted without -o dax).
recv_sys_t::buf: Remove. We will use log_sys.buf for parsing.
recv_sys_t::MTR_SIZE_MAX: Replaces RECV_SCAN_SIZE.
recv_sys_t::file_checkpoint: Renamed from mlog_checkpoint_lsn.
recv_sys_t, log_sys_t: Removed many data members.
recv_sys.lsn: Renamed from recv_sys.recovered_lsn.
recv_sys.offset: Renamed from recv_sys.recovered_offset.
log_sys.buf_size: Replaces srv_log_buffer_size.
recv_buf: A smart pointer that wraps log_sys.buf[recv_sys.offset]
when the buffer is being allocated from the memory heap.
recv_ring: A smart pointer that wraps a circular log_sys.buf[] that is
backed by ib_logfile0. The pointer will wrap from recv_sys.len
(log_sys.file_size) to log_sys.START_OFFSET. For the record that
wraps around, we may copy file name or record payload data to
the auxiliary buffer decrypt_buf in order to have a contiguous
block of memory. The maximum size of a record is less than
innodb_page_size bytes.
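The wrap-around that recv_ring has to handle can be sketched in Python as
a read that copies a record spanning the end of the file into one
contiguous buffer. START_OFFSET is taken from the 0x3000 mentioned earlier
in this message; read_ring is a hypothetical helper, not a server
function.

```python
START_OFFSET = 0x3000  # start of log records in the new file layout


def read_ring(log: bytes, offset: int, length: int) -> bytes:
    """Read `length` bytes starting at `offset`, wrapping from the end
    of the file back to START_OFFSET, and return them contiguously."""
    file_size = len(log)
    capacity = file_size - START_OFFSET
    if not START_OFFSET <= offset < file_size:
        raise ValueError("offset outside the circular area")
    if not 0 <= length <= capacity:
        raise ValueError("length exceeds the circular area")
    out = bytearray()
    while length:
        chunk = min(length, file_size - offset)
        out += log[offset:offset + chunk]
        length -= chunk
        offset += chunk
        if offset == file_size:
            offset = START_OFFSET  # wrap back to the first record byte
    return bytes(out)
```

With an 8-byte circular area holding ABCDEFGH, reading 4 bytes from
position 6 yields GHAB.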
recv_sys_t::parse(): Take the smart pointer as a template parameter.
Do not temporarily add a trailing NUL byte to FILE_ records, because
we are not supposed to modify the memory-mapped log file. (It is
attached in read-write mode already during recovery.)
recv_sys_t::parse_mtr(): Wrapper for recv_sys_t::parse().
recv_sys_t::parse_pmem(): Like parse_mtr(), but if PREMATURE_EOF would be
returned on PMEM, use recv_ring to wrap around the buffer to the start.
mtr_t::finish_write(), log_close(): Do not enforce log_sys.max_buf_free
on PMEM, because it has no meaning on the mmap-based log.
log_sys.write_to_buf: Count writes to log_sys.buf. Replaces
srv_stats.log_write_requests and export_vars.innodb_log_write_requests.
Protected by log_sys.mutex. Updated consistently in log_close().
Previously, mtr_t::commit() conditionally updated the count,
which was inconsistent.
log_sys.write_to_log: Count swaps of log_sys.buf and log_sys.flush_buf,
for writing to log_sys.log (the ib_logfile0). Replaces
srv_stats.log_writes and export_vars.innodb_log_writes.
Protected by log_sys.mutex.
log_sys.waits: Count waits in append_prepare(). Replaces
srv_stats.log_waits and export_vars.innodb_log_waits.
recv_recover_page(): Do not unnecessarily acquire
log_sys.flush_order_mutex. We are inserting the blocks in arbitrary
order anyway, to be adjusted in recv_sys.apply(true).
We will change the definition of flush_lock and write_lock to
avoid potential false sharing. Depending on sizeof(log_sys) and
CPU_LEVEL1_DCACHE_LINESIZE, the flush_lock and write_lock could
share a cache line with each other or with the last data members
of log_sys.
Thanks to Matthias Leich for providing https://rr-project.org traces
for various failures during the development, and to
Thirunarayanan Balathandayuthapani for his help in debugging
some of the recovery code. And thanks to the developers of the
rr debugger for a tool without which extensive changes to InnoDB
would be very challenging to get right.
Thanks to Vladislav Vaintroub for useful feedback and
to him, Axel Schwenke and Krunal Bauskar for testing the performance.
branches/innodb+: Merge revisions 5144:5524 from branches/zip
------------------------------------------------------------------------
r5147 | marko | 2009-05-27 06:55:14 -0400 (Wed, 27 May 2009) | 1 line
branches/zip: ibuf0ibuf.c: Improve a comment.
------------------------------------------------------------------------
r5149 | marko | 2009-05-27 07:46:42 -0400 (Wed, 27 May 2009) | 34 lines
branches/zip: Merge revisions 4994:5148 from branches/5.1:
------------------------------------------------------------------------
r5126 | vasil | 2009-05-26 16:57:12 +0300 (Tue, 26 May 2009) | 9 lines
branches/5.1:
Preparation for the fix of
Bug#45097 Hang during recovery, redo logs for doublewrite buffer pages
Non-functional change: move FSP_* macros from fsp0fsp.h to a new file
fsp0types.h. This is needed in order to be able to use FSP_EXTENT_SIZE
in mtr0log.ic.
------------------------------------------------------------------------
r5127 | vasil | 2009-05-26 17:05:43 +0300 (Tue, 26 May 2009) | 9 lines
branches/5.1:
Preparation for the fix of
Bug#45097 Hang during recovery, redo logs for doublewrite buffer pages
Do not include unnecessary headers mtr0log.h and fut0lst.h in trx0sys.h
and include fsp0fsp.h just before it is needed. This is needed in order
to be able to use TRX_SYS_SPACE in mtr0log.ic.
------------------------------------------------------------------------
r5128 | vasil | 2009-05-26 17:26:37 +0300 (Tue, 26 May 2009) | 7 lines
branches/5.1:
Fix Bug#45097 Hang during recovery, redo logs for doublewrite buffer pages
Do not write redo log for the pages in the doublewrite buffer. Also, do not
make a dummy change to the page because this is not needed.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5169 | marko | 2009-05-28 03:21:55 -0400 (Thu, 28 May 2009) | 1 line
branches/zip: mtr0mtr.h: Add Doxygen comments for the redo log entry types.
------------------------------------------------------------------------
r5176 | marko | 2009-05-28 07:14:02 -0400 (Thu, 28 May 2009) | 1 line
branches/zip: Correct a debug assertion that was added in r5125.
------------------------------------------------------------------------
r5201 | marko | 2009-06-01 06:35:25 -0400 (Mon, 01 Jun 2009) | 2 lines
branches/zip: Clean up some comments.
Make the rec parameter of mlog_open_and_write_index() const.
------------------------------------------------------------------------
r5234 | marko | 2009-06-03 08:26:41 -0400 (Wed, 03 Jun 2009) | 44 lines
branches/zip: Merge revisions 5148:5233 from branches/5.1:
------------------------------------------------------------------------
r5150 | vasil | 2009-05-27 18:56:03 +0300 (Wed, 27 May 2009) | 4 lines
branches/5.1:
Whitespace fixup.
------------------------------------------------------------------------
r5191 | vasil | 2009-05-30 17:46:05 +0300 (Sat, 30 May 2009) | 19 lines
branches/5.1:
Merge a change from MySQL (this fixes the failing innodb_mysql test):
------------------------------------------------------------
revno: 1810.3894.10
committer: Sergey Glukhov <Sergey.Glukhov@sun.com>
branch nick: mysql-5.0-bugteam
timestamp: Tue 2009-05-19 11:32:21 +0500
message:
Bug#39793 Foreign keys not constructed when column has a '#' in a comment or default value
Internal InnoDB FK parser does not recognize '\'' as quotation symbol.
Suggested fix is to add '\'' symbol check for quotation condition
(dict_strip_comments() function).
modified:
innobase/dict/dict0dict.c
mysql-test/r/innodb_mysql.result
mysql-test/t/innodb_mysql.test
------------------------------------------------------------------------
r5233 | marko | 2009-06-03 15:12:44 +0300 (Wed, 03 Jun 2009) | 11 lines
branches/5.1: Merge the test case from r5232 from branches/5.0:
------------------------------------------------------------------------
r5232 | marko | 2009-06-03 14:31:04 +0300 (Wed, 03 Jun 2009) | 21 lines
branches/5.0: Merge r3590 from branches/5.1 in order to fix Bug #40565
(Update Query Results in "1 Row Affected" But Should Be "Zero Rows").
Also, add a test case for Bug #40565.
rb://128 approved by Heikki Tuuri
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
r5250 | marko | 2009-06-04 02:58:23 -0400 (Thu, 04 Jun 2009) | 1 line
branches/zip: Add Doxygen comments to the rest of buf0*.
------------------------------------------------------------------------
r5251 | marko | 2009-06-04 02:59:51 -0400 (Thu, 04 Jun 2009) | 1 line
branches/zip: Replace <= in a function comment.
------------------------------------------------------------------------
r5253 | marko | 2009-06-04 06:37:35 -0400 (Thu, 04 Jun 2009) | 1 line
branches/zip: Add missing Doxygen comments for page0zip.
------------------------------------------------------------------------
r5261 | vasil | 2009-06-05 11:13:31 -0400 (Fri, 05 Jun 2009) | 15 lines
branches/zip:
Fix Mantis Issue#244 fix bug in linear read ahead (no check on access pattern)
The changes are:
1) Take into account access pattern when deciding whether or not to do linear
read ahead.
2) Expose a knob innodb_read_ahead_factor = [0-64] default (8), dynamic,
global to control linear read ahead behavior
3) Disable random read ahead. Keep the code for now.
Submitted by: Inaam (rb://122)
Approved by: Heikki (rb://122)
------------------------------------------------------------------------
r5262 | vasil | 2009-06-05 12:04:25 -0400 (Fri, 05 Jun 2009) | 22 lines
branches/zip:
Enable functionality to have multiple background io helper threads.
This patch is based on percona contributions.
More details about this patch will be written at:
https://svn.innodb.com/innobase/MultipleBackgroundThreads
The patch essentially does the following:
expose following knobs:
innodb_read_io_threads = [1 - 64] default 1
innodb_write_io_threads = [1 - 64] default 1
deprecate innodb_file_io_threads (this parameter was relevant only on Windows)
Internally it allows multiple segments for read and write IO request arrays
where one thread works on one segment.
Submitted by: Inaam (rb://124)
Approved by: Heikki (rb://124)
------------------------------------------------------------------------
r5263 | vasil | 2009-06-05 12:19:37 -0400 (Fri, 05 Jun 2009) | 4 lines
branches/zip:
Whitespace cleanup.
------------------------------------------------------------------------
r5264 | vasil | 2009-06-05 12:26:58 -0400 (Fri, 05 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5261.
------------------------------------------------------------------------
r5265 | vasil | 2009-06-05 12:34:11 -0400 (Fri, 05 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5262.
------------------------------------------------------------------------
r5268 | inaam | 2009-06-08 12:18:21 -0400 (Mon, 08 Jun 2009) | 7 lines
branches/zip
Non-functional change:
Added legal notices acknowledging Percona contribution to the multiple
IO helper threads patch i.e.: r5262
------------------------------------------------------------------------
r5283 | inaam | 2009-06-09 13:46:29 -0400 (Tue, 09 Jun 2009) | 9 lines
branches/zip
rb://130
Enable Group Commit functionality that was broken in 5.0 when
distributed transactions were introduced.
Reviewed by: Heikki
------------------------------------------------------------------------
r5319 | marko | 2009-06-11 04:40:33 -0400 (Thu, 11 Jun 2009) | 3 lines
branches/zip: Declare os_thread_id_t as unsigned long,
because ulint is wrong on Win64.
Pointed out by Vladislav Vaintroub <wlad@sun.com>.
------------------------------------------------------------------------
r5320 | inaam | 2009-06-11 09:15:41 -0400 (Thu, 11 Jun 2009) | 14 lines
branches/zip rb://131
This patch changes the following defaults:
max_dirty_pages_pct: default from 90 to 75. max allowed from 100 to 99
additional_mem_pool_size: default from 1 to 8 MB
buffer_pool_size: default from 8 to 128 MB
log_buffer_size: default from 1 to 8 MB
read_io_threads/write_io_threads: default from 1 to 4
The log file sizes are untouched because of upgrade issues
Reviewed by: Heikki
------------------------------------------------------------------------
r5330 | marko | 2009-06-16 04:08:59 -0400 (Tue, 16 Jun 2009) | 2 lines
branches/zip: buf_page_get_gen(): Reduce mutex holding time by adjusting
buf_pool->n_pend_unzip while only holding buf_pool_mutex.
------------------------------------------------------------------------
r5331 | marko | 2009-06-16 05:00:48 -0400 (Tue, 16 Jun 2009) | 2 lines
branches/zip: buf_page_get_zip(): Eliminate a buf_page_get_mutex() call.
The function must switch on the block state anyway.
------------------------------------------------------------------------
r5332 | vasil | 2009-06-16 05:03:27 -0400 (Tue, 16 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entries for r5283 and r5320.
------------------------------------------------------------------------
r5333 | marko | 2009-06-16 05:27:46 -0400 (Tue, 16 Jun 2009) | 1 line
branches/zip: buf_page_io_query(): Remove unused function.
------------------------------------------------------------------------
r5335 | marko | 2009-06-16 09:23:10 -0400 (Tue, 16 Jun 2009) | 2 lines
branches/zip: innodb.test: Adjust the tolerance of
innodb_buffer_pool_pages_total for r5320.
------------------------------------------------------------------------
r5342 | marko | 2009-06-17 06:15:32 -0400 (Wed, 17 Jun 2009) | 60 lines
branches/zip: Merge revisions 5233:5341 from branches/5.1:
------------------------------------------------------------------------
r5233 | marko | 2009-06-03 15:12:44 +0300 (Wed, 03 Jun 2009) | 11 lines
branches/5.1: Merge the test case from r5232 from branches/5.0:
------------------------------------------------------------------------
r5232 | marko | 2009-06-03 14:31:04 +0300 (Wed, 03 Jun 2009) | 21 lines
branches/5.0: Merge r3590 from branches/5.1 in order to fix Bug #40565
(Update Query Results in "1 Row Affected" But Should Be "Zero Rows").
Also, add a test case for Bug #40565.
rb://128 approved by Heikki Tuuri
------------------------------------------------------------------------
------------------------------------------------------------------------
r5243 | sunny | 2009-06-04 03:17:14 +0300 (Thu, 04 Jun 2009) | 14 lines
branches/5.1: When the InnoDB and MySQL data dictionaries go out of sync, before
the bug fix we would assert on missing autoinc columns. With this fix we allow
MySQL to open the table but set the next autoinc value for the column to the
MAX value. This effectively disables the next value generation. INSERTs will
fail with a generic AUTOINC failure. However, the user should be able to
read/dump the table, set the column values explicitly, use ALTER TABLE to
set the next autoinc value and/or sync the two data dictionaries to resume
normal operations.
Fix Bug#44030 Error: (1500) Couldn't read the MAX(ID) autoinc value from the
index (PRIMARY)
rb://118
------------------------------------------------------------------------
r5252 | sunny | 2009-06-04 10:16:24 +0300 (Thu, 04 Jun 2009) | 2 lines
branches/5.1: The version of the result file checked in was broken in r5243.
------------------------------------------------------------------------
r5259 | vasil | 2009-06-05 10:29:16 +0300 (Fri, 05 Jun 2009) | 7 lines
branches/5.1:
Remove the word "Error" from the printout because the mysqltest suite
interprets it as an error and thus the innodb-autoinc test fails.
Approved by: Sunny (via IM)
------------------------------------------------------------------------
r5339 | marko | 2009-06-17 11:01:37 +0300 (Wed, 17 Jun 2009) | 2 lines
branches/5.1: Add missing #include "mtr0log.h" so that the code compiles
with -DUNIV_MUST_NOT_INLINE.
(null merge; this had already been committed in branches/zip)
------------------------------------------------------------------------
r5340 | marko | 2009-06-17 12:11:49 +0300 (Wed, 17 Jun 2009) | 4 lines
branches/5.1: row_unlock_for_mysql(): When the clustered index is unknown,
refuse to unlock the record.
(Bug #45357, caused by the fix of Bug #39320).
rb://132 approved by Sunny Bains.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5343 | vasil | 2009-06-17 08:56:12 -0400 (Wed, 17 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5342.
------------------------------------------------------------------------
r5344 | marko | 2009-06-17 09:03:45 -0400 (Wed, 17 Jun 2009) | 1 line
branches/zip: row_merge_read_rec(): Fix a UNIV_DEBUG bug (Bug #45426)
------------------------------------------------------------------------
r5391 | marko | 2009-06-22 05:31:35 -0400 (Mon, 22 Jun 2009) | 2 lines
branches/zip: buf_page_get_zip(): Fix a bogus warning about
block_mutex being possibly uninitialized.
------------------------------------------------------------------------
r5392 | marko | 2009-06-22 07:58:20 -0400 (Mon, 22 Jun 2009) | 4 lines
branches/zip: ha_innobase::check_if_incompatible_data(): When
ROW_FORMAT=DEFAULT, do not compare to get_row_type().
Without this change, fast index creation will be disabled
in recent versions of MySQL 5.1.
------------------------------------------------------------------------
r5393 | pekka | 2009-06-22 09:27:55 -0400 (Mon, 22 Jun 2009) | 4 lines
branches/zip: Minor changes for Hot Backup to build correctly. (The
code bracketed between #ifdef UNIV_HOTBACKUP and #endif /* UNIV_HOTBACKUP */).
This change should not affect !UNIV_HOTBACKUP build.
------------------------------------------------------------------------
r5394 | pekka | 2009-06-22 09:46:34 -0400 (Mon, 22 Jun 2009) | 4 lines
branches/zip: Add functions for checking the format of tablespaces
for Hot Backup build (UNIV_HOTBACKUP defined).
This change should not affect !UNIV_HOTBACKUP build.
------------------------------------------------------------------------
r5397 | calvin | 2009-06-23 16:59:42 -0400 (Tue, 23 Jun 2009) | 7 lines
branches/zip: change the header file path.
Change the header file path from ../storage/innobase/include/
to ../include/. In the planned 5.1 + plugin release, the source
directory of the plugin will not be in storage/innobase.
Approved by: Heikki (IM)
------------------------------------------------------------------------
r5407 | calvin | 2009-06-24 09:51:08 -0400 (Wed, 24 Jun 2009) | 4 lines
branches/zip: remove relative path of header files.
Suggested by Marko.
------------------------------------------------------------------------
r5412 | marko | 2009-06-25 06:27:08 -0400 (Thu, 25 Jun 2009) | 1 line
branches/zip: Replace a DBUG_ASSERT with ut_a to track down Issue #290.
------------------------------------------------------------------------
r5415 | marko | 2009-06-25 06:45:57 -0400 (Thu, 25 Jun 2009) | 3 lines
branches/zip: dict_index_find_cols(): Print diagnostic on name mismatch.
This addresses Bug #44571 but does not fix it.
rb://135 approved by Sunny Bains.
------------------------------------------------------------------------
r5417 | marko | 2009-06-25 08:20:56 -0400 (Thu, 25 Jun 2009) | 1 line
branches/zip: ha_innodb.cc: Move the misplaced Doxygen @file comment.
------------------------------------------------------------------------
r5418 | marko | 2009-06-25 08:55:52 -0400 (Thu, 25 Jun 2009) | 5 lines
branches/zip: Fix a race condition caused by
SET GLOBAL innodb_commit_concurrency=DEFAULT. (Bug #45749)
When innodb_commit_concurrency is initially set nonzero,
DEFAULT would change it back to 0, triggering Bug #42101.
rb://139 approved by Heikki Tuuri.
------------------------------------------------------------------------
r5423 | calvin | 2009-06-26 16:52:52 -0400 (Fri, 26 Jun 2009) | 2 lines
branches/zip: Fix typos.
------------------------------------------------------------------------
r5425 | marko | 2009-06-29 04:52:30 -0400 (Mon, 29 Jun 2009) | 4 lines
branches/zip: ha_innobase::add_index(), ha_innobase::final_drop_index():
Start prebuilt->trx before locking the table. This should fix Issue #293
and could fix Issue #229.
Approved by Sunny (over IM).
------------------------------------------------------------------------
r5426 | marko | 2009-06-29 05:24:27 -0400 (Mon, 29 Jun 2009) | 3 lines
branches/zip: buf_page_get_gen(): Fix a race condition when reading
buf_fix_count. This could explain Issue #156.
Tested by Michael.
------------------------------------------------------------------------
r5427 | marko | 2009-06-29 05:54:53 -0400 (Mon, 29 Jun 2009) | 5 lines
branches/zip: lock_print_info_all_transactions(), buf_read_recv_pages():
Tolerate missing tablespaces (zip_size==ULINT_UNDEFINED).
buf_page_get_gen(): Add ut_ad(ut_is_2pow(zip_size)).
Issue #289, rb://136 approved by Sunny Bains
------------------------------------------------------------------------
r5428 | marko | 2009-06-29 07:06:29 -0400 (Mon, 29 Jun 2009) | 2 lines
branches/zip: row_sel_store_mysql_rec(): Add missing pointer cast.
Do not do arithmetic on void pointers.
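As a minimal sketch of the kind of cast this entry refers to (the helper name ptr_add is hypothetical and is not taken from row_sel_store_mysql_rec()), byte-wise pointer arithmetic in portable C goes through a character pointer:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only: return the address
'offset' bytes past 'base'. */
void *
ptr_add(void *base, size_t offset)
{
	/* "base + offset" on a void* is not valid ISO C; GCC accepts it
	only as an extension. Casting to unsigned char* makes the step
	size explicitly one byte. */
	return (unsigned char *) base + offset;
}
```

With a buffer declared as char buf[8], ptr_add(buf, 3) yields the same address as &buf[3].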
------------------------------------------------------------------------
r5429 | marko | 2009-06-29 09:49:54 -0400 (Mon, 29 Jun 2009) | 13 lines
branches/zip: Do not crash on SET GLOBAL innodb_file_format=DEFAULT
or SET GLOBAL innodb_file_format_check=DEFAULT.
innodb_file_format.test: New test for innodb_file_format and
innodb_file_format_check.
innodb_file_format_name_validate(): Store the string in *save.
innodb_file_format_name_update(): Check the string again.
innodb_file_format_check_validate(): Store the string in *save.
innodb_file_format_check_update(): Check the string again.
Issue #282, rb://140 approved by Heikki Tuuri
------------------------------------------------------------------------
r5430 | marko | 2009-06-29 09:58:07 -0400 (Mon, 29 Jun 2009) | 2 lines
branches/zip: lock_rec_validate_page(): Add another assertion
to track down Issue #289.
------------------------------------------------------------------------
r5431 | marko | 2009-06-29 09:58:40 -0400 (Mon, 29 Jun 2009) | 1 line
branches/zip: Revert an accidentally made change in r5430 to univ.i.
------------------------------------------------------------------------
r5437 | marko | 2009-06-30 05:10:01 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: ibuf_dummy_index_free(): Beautify the comment.
------------------------------------------------------------------------
r5438 | marko | 2009-06-30 05:10:32 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: fseg_free(): Remove this unused function.
------------------------------------------------------------------------
r5439 | marko | 2009-06-30 05:15:22 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: fseg_validate(): Enclose in #ifdef UNIV_DEBUG.
This function is unused, but it could turn out to be a useful debugging aid.
------------------------------------------------------------------------
r5441 | marko | 2009-06-30 06:30:14 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: ha_delete(): Remove this unused function that was
very similar to ha_search_and_delete_if_found().
------------------------------------------------------------------------
r5442 | marko | 2009-06-30 06:45:41 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: lock_is_on_table(), lock_table_unlock(): Unused, remove.
------------------------------------------------------------------------
r5443 | marko | 2009-06-30 07:03:00 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: os_event_create_auto(): Unused, remove.
------------------------------------------------------------------------
r5444 | marko | 2009-06-30 07:19:49 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: que_graph_try_free(): Unused, remove.
------------------------------------------------------------------------
r5445 | marko | 2009-06-30 07:28:11 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: row_build_row_ref_from_row(): Unused, remove.
------------------------------------------------------------------------
r5446 | marko | 2009-06-30 07:35:45 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: srv_que_round_robin(), srv_que_task_enqueue(): Unused, remove.
------------------------------------------------------------------------
r5447 | marko | 2009-06-30 07:37:58 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: srv_que_task_queue_check(): Unused, remove.
------------------------------------------------------------------------
r5448 | marko | 2009-06-30 07:56:36 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: mem_heap_cat(): Unused, remove.
------------------------------------------------------------------------
r5449 | marko | 2009-06-30 08:00:50 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: innobase_start_or_create_for_mysql():
Invoke os_get_os_version() at most once.
------------------------------------------------------------------------
r5450 | marko | 2009-06-30 08:02:20 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: os_file_close_no_error_handling(): Unused, remove.
------------------------------------------------------------------------
r5451 | marko | 2009-06-30 08:09:49 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: page_set_max_trx_id(): Make the code compile
with UNIV_HOTBACKUP.
------------------------------------------------------------------------
r5452 | marko | 2009-06-30 08:10:26 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: os_file_close_no_error_handling(): Restore,
as this function is used within InnoDB Hot Backup.
------------------------------------------------------------------------
r5453 | marko | 2009-06-30 08:14:01 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: os_process_set_priority_boost(): Unused, remove.
------------------------------------------------------------------------
r5454 | marko | 2009-06-30 08:42:52 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: Replace a non-ASCII character
(ISO 8859-1 encoded U+00AD SOFT HYPHEN) with a cheap ASCII substitute.
------------------------------------------------------------------------
r5456 | inaam | 2009-06-30 14:21:09 -0400 (Tue, 30 Jun 2009) | 4 lines
branches/zip
Non-functional change. s/Percona/Percona Inc./
------------------------------------------------------------------------
r5470 | vasil | 2009-07-02 09:12:36 -0400 (Thu, 02 Jul 2009) | 16 lines
branches/zip:
Use PAUSE instruction inside spinloop if it is available.
The patch was originally developed by Mikael Ronstrom <mikael@mysql.com>
and can be found here:
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2768
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2771
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2772
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2774
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2777
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2799
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2800
Approved by: Heikki (rb://137)
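A rough sketch of what a spin loop with PAUSE can look like, assuming GCC-style inline assembly on x86. This is not the code from the revisions linked above: SPIN_PAUSE and spin_wait are hypothetical names, and IB_HAVE_PAUSE_INSTRUCTION (the build switch named in r5483) is assumed to be defined when the instruction is usable.

```c
#include <assert.h>

/* Fall back to a no-op when PAUSE is not known to be available. */
#if defined(IB_HAVE_PAUSE_INSTRUCTION) && defined(__GNUC__) \
	&& (defined(__i386__) || defined(__x86_64__))
# define SPIN_PAUSE()	__asm__ __volatile__ ("pause")
#else
# define SPIN_PAUSE()	((void) 0)
#endif

/* Spin until *flag becomes nonzero or max_spins rounds have passed.
Returns the number of rounds actually spent spinning.
Note: volatile only keeps the compiler from caching the load; real
synchronization code needs proper atomic operations. */
unsigned long
spin_wait(volatile int *flag, unsigned long max_spins)
{
	unsigned long	i = 0;

	while (*flag == 0 && i < max_spins) {
		SPIN_PAUSE();	/* hint to the CPU that this is a busy-wait */
		i++;
	}

	return(i);
}
```

If the flag is already set, spin_wait() returns 0 immediately; if it never becomes set, it gives up after max_spins rounds.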
------------------------------------------------------------------------
r5481 | vasil | 2009-07-06 13:16:32 -0400 (Mon, 06 Jul 2009) | 4 lines
branches/zip:
Remove unnecessary quotes and simplify plug.in.
------------------------------------------------------------------------
r5482 | calvin | 2009-07-06 18:36:35 -0400 (Mon, 06 Jul 2009) | 5 lines
branches/zip: add COPYING files for Percona and Sun Micro.
1.0.4 contains patches based on contributions from Percona
and Sun Microsystems.
------------------------------------------------------------------------
r5483 | calvin | 2009-07-07 05:36:43 -0400 (Tue, 07 Jul 2009) | 3 lines
branches/zip: add IB_HAVE_PAUSE_INSTRUCTION to CMake.
Windows will support PAUSE instruction by default.
------------------------------------------------------------------------
r5484 | inaam | 2009-07-07 18:57:14 -0400 (Tue, 07 Jul 2009) | 13 lines
branches/zip rb://126
Based on contribution from Google Inc.
This patch introduces a new parameter innodb_io_capacity to control the
rate at which the master thread performs various tasks. The default value
is 200 and higher values imply more aggressive flushing and ibuf merges
from within the master thread.
This patch also changes the ibuf merge from synchronous to asynchronous.
Another minor change is not to force the master thread to wait for a
log flush to complete every second.
Approved by: Heikki
------------------------------------------------------------------------
r5485 | inaam | 2009-07-07 19:00:49 -0400 (Tue, 07 Jul 2009) | 18 lines
branches/zip rb://138
The current implementation is to try to flush the neighbors of every
page that we flush. This patch makes the following distinction:
1) If the flush is from flush_list AND
2) If the flush is intended to move the oldest_modification LSN ahead
(this happens when a user thread sees little space in the log file and
attempts to flush pages from the buffer pool so that a checkpoint can
be made)
THEN
Do not try to flush the neighbors. Just focus on flushing dirty pages at
the end of flush_list
Approved by: Heikki
------------------------------------------------------------------------
r5486 | inaam | 2009-07-08 12:11:40 -0400 (Wed, 08 Jul 2009) | 29 lines
branches/zip rb://133
This patch introduces a heuristics-based flushing rate for dirty pages to
avoid IO bursts at checkpoint.
1) log_capacity / log_generated per second gives us number of seconds
in which ALL dirty pages need to be flushed. Based on this rough
assumption we can say that
n_dirty_pages / (log_capacity / log_generation_rate) = desired_flush_rate
2) We use weighted averages (hard coded to 20 seconds) of
log_generation_rate to avoid resonance.
3) From the desired_flush_rate we subtract the number of pages that have
been flushed due to LRU flushing. That gives us pages that we should
flush as part of flush_list cleanup. And that is the number (capped by
maximum io_capacity) that we try to flush from the master thread.
Knobs:
======
innodb_adaptive_flushing: boolean, global, dynamic, default TRUE.
Since this heuristic is very experimental and has the potential to
dramatically change the IO pattern I think it is a good idea to leave a
knob to turn it off.
Approved by: Heikki
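The arithmetic in points 1) and 3) can be sketched in C as follows. This only illustrates the formula in this log message and is not the code committed in r5486: the function name desired_flush_pages and all parameter names are hypothetical, and the averaging of log_generation_rate from point 2) is assumed to have been done by the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the InnoDB implementation.
Returns how many flush_list pages to flush in the current second. */
uint64_t
desired_flush_pages(
	uint64_t	n_dirty_pages,	/* dirty pages in the buffer pool */
	uint64_t	log_capacity,	/* redo log capacity, in bytes */
	uint64_t	log_gen_rate,	/* averaged redo bytes per second */
	uint64_t	lru_flushed,	/* pages already flushed via LRU */
	uint64_t	io_capacity)	/* upper bound on pages per second */
{
	uint64_t	rate;

	assert(log_capacity > 0);

	/* n_dirty_pages / (log_capacity / log_gen_rate), rearranged so
	that the integer division happens last. The product is assumed
	to fit in 64 bits. */
	rate = n_dirty_pages * log_gen_rate / log_capacity;

	/* Subtract the work that LRU flushing has already done. */
	rate = rate > lru_flushed ? rate - lru_flushed : 0;

	/* Cap the result by the configured I/O capacity. */
	return(rate < io_capacity ? rate : io_capacity);
}
```

For example, with 1000 dirty pages, a log capacity of 100000 bytes and 5000 bytes of redo generated per second, the log fills in 20 seconds, so 50 pages per second are desired. If LRU flushing already wrote 10 pages and io_capacity is 200, the function returns 40.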
------------------------------------------------------------------------
r5487 | calvin | 2009-07-08 12:42:28 -0400 (Wed, 08 Jul 2009) | 7 lines
branches/zip: fix PAUSE instruction patch on Windows
The original PAUSE instruction patch (r5470) does not
compile on Windows. Also, there is an elegant way of
doing it on Windows - YieldProcessor().
Approved by: Heikki (on IM)
------------------------------------------------------------------------
r5489 | vasil | 2009-07-10 05:02:22 -0400 (Fri, 10 Jul 2009) | 9 lines
branches/zip:
Change the defaults for
innodb_sync_spin_loops: 20 -> 30
innodb_spin_wait_delay: 5 -> 6
This change was proposed by Sun/MySQL based on their performance testing,
see https://svn.innodb.com/innobase/Release_tasks_for_InnoDB_Plugin_V1.0.4
------------------------------------------------------------------------
r5490 | vasil | 2009-07-10 05:04:20 -0400 (Fri, 10 Jul 2009) | 4 lines
branches/zip:
Add ChangeLog entry for 5489.
------------------------------------------------------------------------
r5491 | calvin | 2009-07-10 12:19:17 -0400 (Fri, 10 Jul 2009) | 6 lines
branches/zip: add copyright info to files related to PAUSE
instruction patch, contributed by Sun Microsystems.
------------------------------------------------------------------------
r5492 | calvin | 2009-07-10 17:47:34 -0400 (Fri, 10 Jul 2009) | 5 lines
branches/zip: add ChangeLog entries for r5484-r5486.
------------------------------------------------------------------------
r5494 | vasil | 2009-07-13 03:37:35 -0400 (Mon, 13 Jul 2009) | 6 lines
branches/zip:
Restore the original value of innodb_sync_spin_loops at the end. Previously
the test assumed that setting it to 20 would do this, but now the default is
30 and MTR's internal check failed.
------------------------------------------------------------------------
r5495 | inaam | 2009-07-13 11:48:45 -0400 (Mon, 13 Jul 2009) | 5 lines
branches/zip rb://138 (REVERT)
Revert the flush neighbors patch as it shows regression in
the benchmarks run by Michael.
------------------------------------------------------------------------
r5496 | inaam | 2009-07-13 14:04:57 -0400 (Mon, 13 Jul 2009) | 4 lines
branches/zip
Fixed warnings on Windows where ulint != ib_uint64_t
------------------------------------------------------------------------
r5497 | calvin | 2009-07-13 15:01:00 -0400 (Mon, 13 Jul 2009) | 9 lines
branches/zip: fix run-time symbols clash on Solaris.
This patch is from Sergey Vojtovich of Sun Microsystems,
to fix run-time symbols clash on Solaris with older C++
compiler:
- when finding out a way to hide symbols, base the decision on the
compiler, not the operating system.
- Sun Studio supports __hidden declaration specifier for this
purpose.
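A compiler-based selection along these lines might look like the sketch below. The macro name HIDDEN_SYMBOL and the function internal_answer are hypothetical and are not the names used in the patch; __SUNPRO_C and __SUNPRO_CC are the predefined macros of the Sun Studio C and C++ compilers, and the GCC branch assumes the visibility attribute available since GCC 4.

```c
#include <assert.h>

/* Decide by compiler, not by operating system. */
#if defined(__SUNPRO_C) || defined(__SUNPRO_CC)
# define HIDDEN_SYMBOL	__hidden
#elif defined(__GNUC__) && (__GNUC__ >= 4)
# define HIDDEN_SYMBOL	__attribute__((visibility("hidden")))
#else
# define HIDDEN_SYMBOL	/* no known way to hide symbols */
#endif

/* Callable from anywhere within the same shared object, but not
exported from it when the compiler supports hiding. */
HIDDEN_SYMBOL int
internal_answer(void)
{
	return(42);
}
```

Code inside the same module calls internal_answer() as usual; only the dynamic symbol table of the resulting library is affected.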
------------------------------------------------------------------------
r5498 | vasil | 2009-07-14 03:16:18 -0400 (Tue, 14 Jul 2009) | 92 lines
branches/zip: Merge r5341:5497 from branches/5.1, skipping:
c5419 because it is a merge from branches/zip into branches/5.1
c5466 because the source code has been adjusted to match the MySQL
behavior and the innodb-autoinc test does not fail in branches/zip;
if c5466 is merged, then innodb-autoinc starts failing. Sunny suggested
not to merge c5466.
and resolving conflicts in c5410, c5440, c5488:
------------------------------------------------------------------------
r5410 | marko | 2009-06-24 22:26:34 +0300 (Wed, 24 Jun 2009) | 2 lines
Changed paths:
M /branches/5.1/include/trx0sys.ic
M /branches/5.1/trx/trx0purge.c
M /branches/5.1/trx/trx0sys.c
M /branches/5.1/trx/trx0undo.c
branches/5.1: Add missing #include "mtr0log.h" to avoid warnings
when compiling with -DUNIV_MUST_NOT_INLINE.
------------------------------------------------------------------------
r5419 | marko | 2009-06-25 16:11:57 +0300 (Thu, 25 Jun 2009) | 18 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb_bug42101-nonzero.result
M /branches/5.1/mysql-test/innodb_bug42101-nonzero.test
M /branches/5.1/mysql-test/innodb_bug42101.result
M /branches/5.1/mysql-test/innodb_bug42101.test
branches/5.1: Merge r5418 from branches/zip:
------------------------------------------------------------------------
r5418 | marko | 2009-06-25 15:55:52 +0300 (Thu, 25 Jun 2009) | 5 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/mysql-test/innodb_bug42101-nonzero.result
M /branches/zip/mysql-test/innodb_bug42101-nonzero.test
M /branches/zip/mysql-test/innodb_bug42101.result
M /branches/zip/mysql-test/innodb_bug42101.test
branches/zip: Fix a race condition caused by
SET GLOBAL innodb_commit_concurrency=DEFAULT. (Bug #45749)
When innodb_commit_concurrency is initially set nonzero,
DEFAULT would change it back to 0, triggering Bug #42101.
rb://139 approved by Heikki Tuuri.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5440 | vasil | 2009-06-30 13:04:29 +0300 (Tue, 30 Jun 2009) | 8 lines
Changed paths:
M /branches/5.1/fil/fil0fil.c
branches/5.1:
Fix Bug#45814 URL reference in InnoDB server errors needs adjusting to match documentation
by changing the URL from
http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html to
http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting-datadict.html
------------------------------------------------------------------------
r5466 | vasil | 2009-07-02 10:46:45 +0300 (Thu, 02 Jul 2009) | 6 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Adjust the failing innodb-autoinc test to conform to the latest behavior
of the MySQL code. The idea and the comment in innodb-autoinc.test come
from Sunny.
------------------------------------------------------------------------
r5488 | vasil | 2009-07-09 19:16:44 +0300 (Thu, 09 Jul 2009) | 13 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
A /branches/5.1/mysql-test/innodb_bug21704.result
A /branches/5.1/mysql-test/innodb_bug21704.test
branches/5.1:
Fix Bug#21704 Renaming column does not update FK definition
by checking whether a column that participates in a FK definition is being
renamed and denying the ALTER in this case.
The patch was originally developed by Davi Arnaut <Davi.Arnaut@Sun.COM>:
http://lists.mysql.com/commits/77714
and was later adjusted to conform to InnoDB coding style by me (Vasil).
I also added some more comments and moved the bug specific mysql-test to
a separate file to make it more manageable and flexible.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5499 | calvin | 2009-07-14 12:55:10 -0400 (Tue, 14 Jul 2009) | 3 lines
branches/zip: add a missing file in Makefile.am
This change was suggested by MySQL.
------------------------------------------------------------------------
r5500 | calvin | 2009-07-14 13:03:26 -0400 (Tue, 14 Jul 2009) | 3 lines
branches/zip: minor change
Remove an extra "with".
------------------------------------------------------------------------
r5501 | vasil | 2009-07-14 13:58:15 -0400 (Tue, 14 Jul 2009) | 5 lines
branches/zip:
Add @ZLIB_INCLUDES@ so that the InnoDB Plugin picks up the same zlib.h
header file that is eventually used by mysqld.
------------------------------------------------------------------------
r5502 | vasil | 2009-07-14 13:59:59 -0400 (Tue, 14 Jul 2009) | 4 lines
branches/zip:
Add include/ut0auxconf.h to noinst_HEADERS
------------------------------------------------------------------------
r5503 | vasil | 2009-07-14 14:16:11 -0400 (Tue, 14 Jul 2009) | 8 lines
branches/zip:
Non-functional change:
put files in noinst_HEADERS and libinnobase_a_SOURCES one per line and sort
alphabetically, so it is easier to find if a file is there or not and
also diffs show exactly the added or removed file instead of surrounding
lines too.
------------------------------------------------------------------------
r5504 | calvin | 2009-07-15 04:58:44 -0400 (Wed, 15 Jul 2009) | 6 lines
branches/zip: fix compile errors on Win64
Both srv_read_ahead_factor and srv_io_capacity should
be defined as ulong.
Approved by: Sunny
------------------------------------------------------------------------
r5508 | calvin | 2009-07-16 09:40:47 -0400 (Thu, 16 Jul 2009) | 16 lines
branches/zip: Support inlining of functions and prefetch with
Sun Studio
Those changes are contributed by Sun/MySQL. Two sets of changes
in this patch when Sun Studio is used:
- Explicit inlining of functions
- Prefetch Support
This patch has been tested by Sunny with the plugin statically
built in. Since we've never built the plugin as a dynamically
loaded module on Solaris, it is a separate task to change
plug.in.
rb://142
Approved by: Heikki
------------------------------------------------------------------------
r5509 | calvin | 2009-07-16 09:45:28 -0400 (Thu, 16 Jul 2009) | 2 lines
branches/zip: add ChangeLog entry for r5508.
------------------------------------------------------------------------
r5512 | sunny | 2009-07-19 19:52:48 -0400 (Sun, 19 Jul 2009) | 2 lines
branches/zip: Remove unused extern ref to timed_mutexes.
------------------------------------------------------------------------
r5513 | sunny | 2009-07-19 19:58:43 -0400 (Sun, 19 Jul 2009) | 2 lines
branches/zip: Undo r5512
------------------------------------------------------------------------
r5514 | sunny | 2009-07-19 20:08:49 -0400 (Sun, 19 Jul 2009) | 2 lines
branches/zip: Only use my_bool when UNIV_HOTBACKUP is not defined.
------------------------------------------------------------------------
r5515 | sunny | 2009-07-20 03:29:14 -0400 (Mon, 20 Jul 2009) | 2 lines
branches/zip: The dict_table_t::autoinc_mutex field is not used in HotBackup.
------------------------------------------------------------------------
r5516 | sunny | 2009-07-20 03:46:05 -0400 (Mon, 20 Jul 2009) | 4 lines
branches/zip: Make this file usable from within HotBackup. A new file has
been introduced called hb_univ.i. This file should have all the HotBackup
specific configuration.
------------------------------------------------------------------------
r5517 | sunny | 2009-07-20 03:55:11 -0400 (Mon, 20 Jul 2009) | 2 lines
Add /* UNIV_HOTBACK */
------------------------------------------------------------------------
r5519 | vasil | 2009-07-20 04:45:18 -0400 (Mon, 20 Jul 2009) | 31 lines
branches/zip: Merge r5497:5518 from branches/5.1:
------------------------------------------------------------------------
r5518 | vasil | 2009-07-20 11:29:47 +0300 (Mon, 20 Jul 2009) | 22 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2874.2.1
committer: Anurag Shekhar <anurag.shekhar@sun.com>
branch nick: mysql-5.1-bugteam-windows-warning
timestamp: Wed 2009-05-13 15:41:24 +0530
message:
Bug #39802 On Windows, 32-bit time_t should be enforced
This patch fixes compilation warning, "conversion from 'time_t' to 'ulong',
possible loss of data".
The fix is to typecast time_t to ulong before assigning it to ulong.
Backported this from 6.0-bugteam tree.
modified:
storage/archive/ha_archive.cc
storage/federated/ha_federated.cc
storage/innobase/handler/ha_innodb.cc
storage/myisam/ha_myisam.cc
------------------------------------------------------------------------
------------------------------------------------------------------------
r5520 | vasil | 2009-07-20 04:51:47 -0400 (Mon, 20 Jul 2009) | 4 lines
branches/zip:
Add ChangeLog entries for r5498 and r5519.
------------------------------------------------------------------------
r5524 | inaam | 2009-07-20 12:23:15 -0400 (Mon, 20 Jul 2009) | 9 lines
branches/zip
Change the read ahead parameter name to innodb_read_ahead_threshold.
Change the meaning of this parameter to signify the number of pages
that must be sequentially accessed for InnoDB to trigger a readahead
request.
Suggested by: Ken
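One way to picture the new meaning of the parameter is the sketch below. It assumes that "sequentially accessed" means a run of consecutive page numbers, which this log message does not spell out; struct seq_tracker and seq_access are hypothetical and do not reflect InnoDB's actual read-ahead bookkeeping.

```c
#include <assert.h>

/* Hypothetical tracker, not an InnoDB data structure. */
struct seq_tracker {
	unsigned long	last_page;	/* page number of the previous access */
	unsigned long	run_len;	/* length of the current sequential run */
};

/* Record an access to page_no. Returns 1 when the current run of
consecutive pages has reached 'threshold', i.e. when a read-ahead
request would be triggered under the assumption above; else 0. */
int
seq_access(
	struct seq_tracker*	t,
	unsigned long		page_no,
	unsigned long		threshold)
{
	if (t->run_len > 0 && page_no == t->last_page + 1) {
		t->run_len++;		/* the run continues */
	} else {
		t->run_len = 1;		/* a new run starts here */
	}

	t->last_page = page_no;

	return(t->run_len >= threshold);
}
```

With a threshold of 3, accessing pages 10, 11 and 12 in order triggers on the third access, while a jump to page 20 resets the run.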
------------------------------------------------------------------------
branches/innodb+: Merge revisions 5144:5524 from branches/zip
------------------------------------------------------------------------
r5147 | marko | 2009-05-27 06:55:14 -0400 (Wed, 27 May 2009) | 1 line
branches/zip: ibuf0ibuf.c: Improve a comment.
------------------------------------------------------------------------
r5149 | marko | 2009-05-27 07:46:42 -0400 (Wed, 27 May 2009) | 34 lines
branches/zip: Merge revisions 4994:5148 from branches/5.1:
------------------------------------------------------------------------
r5126 | vasil | 2009-05-26 16:57:12 +0300 (Tue, 26 May 2009) | 9 lines
branches/5.1:
Preparation for the fix of
Bug#45097 Hang during recovery, redo logs for doublewrite buffer pages
Non-functional change: move FSP_* macros from fsp0fsp.h to a new file
fsp0types.h. This is needed in order to be able to use FSP_EXTENT_SIZE
in mtr0log.ic.
------------------------------------------------------------------------
r5127 | vasil | 2009-05-26 17:05:43 +0300 (Tue, 26 May 2009) | 9 lines
branches/5.1:
Preparation for the fix of
Bug#45097 Hang during recovery, redo logs for doublewrite buffer pages
Do not include unnecessary headers mtr0log.h and fut0lst.h in trx0sys.h
and include fsp0fsp.h just before it is needed. This is needed in order
to be able to use TRX_SYS_SPACE in mtr0log.ic.
------------------------------------------------------------------------
r5128 | vasil | 2009-05-26 17:26:37 +0300 (Tue, 26 May 2009) | 7 lines
branches/5.1:
Fix Bug#45097 Hang during recovery, redo logs for doublewrite buffer pages
Do not write redo log for the pages in the doublewrite buffer. Also, do not
make a dummy change to the page because this is not needed.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5169 | marko | 2009-05-28 03:21:55 -0400 (Thu, 28 May 2009) | 1 line
branches/zip: mtr0mtr.h: Add Doxygen comments for the redo log entry types.
------------------------------------------------------------------------
r5176 | marko | 2009-05-28 07:14:02 -0400 (Thu, 28 May 2009) | 1 line
branches/zip: Correct a debug assertion that was added in r5125.
------------------------------------------------------------------------
r5201 | marko | 2009-06-01 06:35:25 -0400 (Mon, 01 Jun 2009) | 2 lines
branches/zip: Clean up some comments.
Make the rec parameter of mlog_open_and_write_index() const.
------------------------------------------------------------------------
r5234 | marko | 2009-06-03 08:26:41 -0400 (Wed, 03 Jun 2009) | 44 lines
branches/zip: Merge revisions 5148:5233 from branches/5.1:
------------------------------------------------------------------------
r5150 | vasil | 2009-05-27 18:56:03 +0300 (Wed, 27 May 2009) | 4 lines
branches/5.1:
Whitespace fixup.
------------------------------------------------------------------------
r5191 | vasil | 2009-05-30 17:46:05 +0300 (Sat, 30 May 2009) | 19 lines
branches/5.1:
Merge a change from MySQL (this fixes the failing innodb_mysql test):
------------------------------------------------------------
revno: 1810.3894.10
committer: Sergey Glukhov <Sergey.Glukhov@sun.com>
branch nick: mysql-5.0-bugteam
timestamp: Tue 2009-05-19 11:32:21 +0500
message:
Bug#39793 Foreign keys not constructed when column has a '#' in a comment or default value
Internal InnoDB FK parser does not recognize '\'' as quotation symbol.
Suggested fix is to add '\'' symbol check for quotation condition
(dict_strip_comments() function).
modified:
innobase/dict/dict0dict.c
mysql-test/r/innodb_mysql.result
mysql-test/t/innodb_mysql.test
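
The quoting rule described above can be sketched as follows. This is a
simplified illustration, not the actual dict_strip_comments() code: the
helper name strip_hash_comments() is hypothetical, and it only handles
'#' comments that run to the end of the line.

```c
#include <stddef.h>

/* Hypothetical, simplified comment stripper: removes '#' comments that
run to the end of the line, but leaves '#' alone when it appears inside
a quoted string. The single quote is treated as a quotation symbol,
which is the check that the fix for Bug#39793 adds.
The output buffer must hold at least strlen(sql) + 1 bytes. */
static void
strip_hash_comments(const char *sql, char *out)
{
	char	quote = '\0';	/* the open quote character, or '\0' */

	while (*sql) {
		if (quote) {
			/* Inside a quoted string: copy verbatim and
			watch for the closing quote. */
			if (*sql == quote) {
				quote = '\0';
			}
			*out++ = *sql++;
		} else if (*sql == '\'' || *sql == '"' || *sql == '`') {
			quote = *sql;
			*out++ = *sql++;
		} else if (*sql == '#') {
			/* Skip the comment, but keep the newline. */
			while (*sql && *sql != '\n') {
				sql++;
			}
		} else {
			*out++ = *sql++;
		}
	}

	*out = '\0';
}
```

Without the '\'' case, the '#' inside DEFAULT '#' would be taken as the
start of a comment and the rest of the line would be lost to the parser.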
------------------------------------------------------------------------
r5233 | marko | 2009-06-03 15:12:44 +0300 (Wed, 03 Jun 2009) | 11 lines
branches/5.1: Merge the test case from r5232 from branches/5.0:
------------------------------------------------------------------------
r5232 | marko | 2009-06-03 14:31:04 +0300 (Wed, 03 Jun 2009) | 21 lines
branches/5.0: Merge r3590 from branches/5.1 in order to fix Bug #40565
(Update Query Results in "1 Row Affected" But Should Be "Zero Rows").
Also, add a test case for Bug #40565.
rb://128 approved by Heikki Tuuri
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
r5250 | marko | 2009-06-04 02:58:23 -0400 (Thu, 04 Jun 2009) | 1 line
branches/zip: Add Doxygen comments to the rest of buf0*.
------------------------------------------------------------------------
r5251 | marko | 2009-06-04 02:59:51 -0400 (Thu, 04 Jun 2009) | 1 line
branches/zip: Replace <= in a function comment.
------------------------------------------------------------------------
r5253 | marko | 2009-06-04 06:37:35 -0400 (Thu, 04 Jun 2009) | 1 line
branches/zip: Add missing Doxygen comments for page0zip.
------------------------------------------------------------------------
r5261 | vasil | 2009-06-05 11:13:31 -0400 (Fri, 05 Jun 2009) | 15 lines
branches/zip:
Fix Mantis Issue#244 fix bug in linear read ahead (no check on access pattern)
The changes are:
1) Take into account access pattern when deciding whether or not to do linear
read ahead.
2) Expose a knob innodb_read_ahead_factor = [0-64] default (8), dynamic,
global to control linear read ahead behavior
3) Disable random read ahead. Keep the code for now.
Submitted by: Inaam (rb://122)
Approved by: Heikki (rb://122)
------------------------------------------------------------------------
r5262 | vasil | 2009-06-05 12:04:25 -0400 (Fri, 05 Jun 2009) | 22 lines
branches/zip:
Enable functionality to have multiple background io helper threads.
This patch is based on percona contributions.
More details about this patch will be written at:
https://svn.innodb.com/innobase/MultipleBackgroundThreads
The patch essentially does the following:
expose following knobs:
innodb_read_io_threads = [1 - 64] default 1
innodb_write_io_threads = [1 - 64] default 1
deprecate innodb_file_io_threads (this parameter was relevant only on Windows)
Internally it allows multiple segments for read and write IO request arrays
where one thread works on one segment.
Submitted by: Inaam (rb://124)
Approved by: Heikki (rb://124)
------------------------------------------------------------------------
r5263 | vasil | 2009-06-05 12:19:37 -0400 (Fri, 05 Jun 2009) | 4 lines
branches/zip:
Whitespace cleanup.
------------------------------------------------------------------------
r5264 | vasil | 2009-06-05 12:26:58 -0400 (Fri, 05 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5261.
------------------------------------------------------------------------
r5265 | vasil | 2009-06-05 12:34:11 -0400 (Fri, 05 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5262.
------------------------------------------------------------------------
r5268 | inaam | 2009-06-08 12:18:21 -0400 (Mon, 08 Jun 2009) | 7 lines
branches/zip
Non functional change:
Added legal notices acknowledging percona contribution to the multiple
IO helper threads patch i.e.: r5262
------------------------------------------------------------------------
r5283 | inaam | 2009-06-09 13:46:29 -0400 (Tue, 09 Jun 2009) | 9 lines
branches/zip
rb://130
Enable Group Commit functionality that was broken in 5.0 when
distributed transactions were introduced.
Reviewed by: Heikki
------------------------------------------------------------------------
r5319 | marko | 2009-06-11 04:40:33 -0400 (Thu, 11 Jun 2009) | 3 lines
branches/zip: Declare os_thread_id_t as unsigned long,
because ulint is wrong on Win64.
Pointed out by Vladislav Vaintroub <wlad@sun.com>.
------------------------------------------------------------------------
r5320 | inaam | 2009-06-11 09:15:41 -0400 (Thu, 11 Jun 2009) | 14 lines
branches/zip rb://131
This patch changes the following defaults:
max_dirty_pages_pct: default from 90 to 75. max allowed from 100 to 99
additional_mem_pool_size: default from 1 to 8 MB
buffer_pool_size: default from 8 to 128 MB
log_buffer_size: default from 1 to 8 MB
read_io_threads/write_io_threads: default from 1 to 4
The log file sizes are untouched because of upgrade issues
Reviewed by: Heikki
------------------------------------------------------------------------
r5330 | marko | 2009-06-16 04:08:59 -0400 (Tue, 16 Jun 2009) | 2 lines
branches/zip: buf_page_get_gen(): Reduce mutex holding time by adjusting
buf_pool->n_pend_unzip while only holding buf_pool_mutex.
------------------------------------------------------------------------
r5331 | marko | 2009-06-16 05:00:48 -0400 (Tue, 16 Jun 2009) | 2 lines
branches/zip: buf_page_get_zip(): Eliminate a buf_page_get_mutex() call.
The function must switch on the block state anyway.
------------------------------------------------------------------------
r5332 | vasil | 2009-06-16 05:03:27 -0400 (Tue, 16 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entries for r5283 and r5320.
------------------------------------------------------------------------
r5333 | marko | 2009-06-16 05:27:46 -0400 (Tue, 16 Jun 2009) | 1 line
branches/zip: buf_page_io_query(): Remove unused function.
------------------------------------------------------------------------
r5335 | marko | 2009-06-16 09:23:10 -0400 (Tue, 16 Jun 2009) | 2 lines
branches/zip: innodb.test: Adjust the tolerance of
innodb_buffer_pool_pages_total for r5320.
------------------------------------------------------------------------
r5342 | marko | 2009-06-17 06:15:32 -0400 (Wed, 17 Jun 2009) | 60 lines
branches/zip: Merge revisions 5233:5341 from branches/5.1:
------------------------------------------------------------------------
r5233 | marko | 2009-06-03 15:12:44 +0300 (Wed, 03 Jun 2009) | 11 lines
branches/5.1: Merge the test case from r5232 from branches/5.0:
------------------------------------------------------------------------
r5232 | marko | 2009-06-03 14:31:04 +0300 (Wed, 03 Jun 2009) | 21 lines
branches/5.0: Merge r3590 from branches/5.1 in order to fix Bug #40565
(Update Query Results in "1 Row Affected" But Should Be "Zero Rows").
Also, add a test case for Bug #40565.
rb://128 approved by Heikki Tuuri
------------------------------------------------------------------------
------------------------------------------------------------------------
r5243 | sunny | 2009-06-04 03:17:14 +0300 (Thu, 04 Jun 2009) | 14 lines
branches/5.1: When the InnoDB and MySQL data dictionaries go out of sync, before
the bug fix we would assert on missing autoinc columns. With this fix we allow
MySQL to open the table but set the next autoinc value for the column to the
MAX value. This effectively disables the next value generation. INSERTs will
fail with a generic AUTOINC failure. However, the user should be able to
read/dump the table, set the column values explicitly, use ALTER TABLE to
set the next autoinc value and/or sync the two data dictionaries to resume
normal operations.
Fix Bug#44030 Error: (1500) Couldn't read the MAX(ID) autoinc value from the
index (PRIMARY)
rb://118
------------------------------------------------------------------------
r5252 | sunny | 2009-06-04 10:16:24 +0300 (Thu, 04 Jun 2009) | 2 lines
branches/5.1: The version of the result file checked in was broken in r5243.
------------------------------------------------------------------------
r5259 | vasil | 2009-06-05 10:29:16 +0300 (Fri, 05 Jun 2009) | 7 lines
branches/5.1:
Remove the word "Error" from the printout because the mysqltest suite
interprets it as an error and thus the innodb-autoinc test fails.
Approved by: Sunny (via IM)
------------------------------------------------------------------------
r5339 | marko | 2009-06-17 11:01:37 +0300 (Wed, 17 Jun 2009) | 2 lines
branches/5.1: Add missing #include "mtr0log.h" so that the code compiles
with -DUNIV_MUST_NOT_INLINE.
(null merge; this had already been committed in branches/zip)
------------------------------------------------------------------------
r5340 | marko | 2009-06-17 12:11:49 +0300 (Wed, 17 Jun 2009) | 4 lines
branches/5.1: row_unlock_for_mysql(): When the clustered index is unknown,
refuse to unlock the record.
(Bug #45357, caused by the fix of Bug #39320).
rb://132 approved by Sunny Bains.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5343 | vasil | 2009-06-17 08:56:12 -0400 (Wed, 17 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5342.
------------------------------------------------------------------------
r5344 | marko | 2009-06-17 09:03:45 -0400 (Wed, 17 Jun 2009) | 1 line
branches/zip: row_merge_read_rec(): Fix a UNIV_DEBUG bug (Bug #45426)
------------------------------------------------------------------------
r5391 | marko | 2009-06-22 05:31:35 -0400 (Mon, 22 Jun 2009) | 2 lines
branches/zip: buf_page_get_zip(): Fix a bogus warning about
block_mutex being possibly uninitialized.
------------------------------------------------------------------------
r5392 | marko | 2009-06-22 07:58:20 -0400 (Mon, 22 Jun 2009) | 4 lines
branches/zip: ha_innobase::check_if_incompatible_data(): When
ROW_FORMAT=DEFAULT, do not compare to get_row_type().
Without this change, fast index creation will be disabled
in recent versions of MySQL 5.1.
------------------------------------------------------------------------
r5393 | pekka | 2009-06-22 09:27:55 -0400 (Mon, 22 Jun 2009) | 4 lines
branches/zip: Minor changes for Hot Backup to build correctly. (The
code bracketed between #ifdef UNIV_HOTBACKUP and #endif /* UNIV_HOTBACKUP */).
This change should not affect !UNIV_HOTBACKUP build.
------------------------------------------------------------------------
r5394 | pekka | 2009-06-22 09:46:34 -0400 (Mon, 22 Jun 2009) | 4 lines
branches/zip: Add functions for checking the format of tablespaces
for Hot Backup build (UNIV_HOTBACKUP defined).
This change should not affect !UNIV_HOTBACKUP build.
------------------------------------------------------------------------
r5397 | calvin | 2009-06-23 16:59:42 -0400 (Tue, 23 Jun 2009) | 7 lines
branches/zip: change the header file path.
Change the header file path from ../storage/innobase/include/
to ../include/. In the planned 5.1 + plugin release, the source
directory of the plugin will not be in storage/innobase.
Approved by: Heikki (IM)
------------------------------------------------------------------------
r5407 | calvin | 2009-06-24 09:51:08 -0400 (Wed, 24 Jun 2009) | 4 lines
branches/zip: remove relative path of header files.
Suggested by Marko.
------------------------------------------------------------------------
r5412 | marko | 2009-06-25 06:27:08 -0400 (Thu, 25 Jun 2009) | 1 line
branches/zip: Replace a DBUG_ASSERT with ut_a to track down Issue #290.
------------------------------------------------------------------------
r5415 | marko | 2009-06-25 06:45:57 -0400 (Thu, 25 Jun 2009) | 3 lines
branches/zip: dict_index_find_cols(): Print diagnostic on name mismatch.
This addresses Bug #44571 but does not fix it.
rb://135 approved by Sunny Bains.
------------------------------------------------------------------------
r5417 | marko | 2009-06-25 08:20:56 -0400 (Thu, 25 Jun 2009) | 1 line
branches/zip: ha_innodb.cc: Move the misplaced Doxygen @file comment.
------------------------------------------------------------------------
r5418 | marko | 2009-06-25 08:55:52 -0400 (Thu, 25 Jun 2009) | 5 lines
branches/zip: Fix a race condition caused by
SET GLOBAL innodb_commit_concurrency=DEFAULT. (Bug #45749)
When innodb_commit_concurrency is initially set nonzero,
DEFAULT would change it back to 0, triggering Bug #42101.
rb://139 approved by Heikki Tuuri.
------------------------------------------------------------------------
r5423 | calvin | 2009-06-26 16:52:52 -0400 (Fri, 26 Jun 2009) | 2 lines
branches/zip: Fix typos.
------------------------------------------------------------------------
r5425 | marko | 2009-06-29 04:52:30 -0400 (Mon, 29 Jun 2009) | 4 lines
branches/zip: ha_innobase::add_index(), ha_innobase::final_drop_index():
Start prebuilt->trx before locking the table. This should fix Issue #293
and could fix Issue #229.
Approved by Sunny (over IM).
------------------------------------------------------------------------
r5426 | marko | 2009-06-29 05:24:27 -0400 (Mon, 29 Jun 2009) | 3 lines
branches/zip: buf_page_get_gen(): Fix a race condition when reading
buf_fix_count. This could explain Issue #156.
Tested by Michael.
------------------------------------------------------------------------
r5427 | marko | 2009-06-29 05:54:53 -0400 (Mon, 29 Jun 2009) | 5 lines
branches/zip: lock_print_info_all_transactions(), buf_read_recv_pages():
Tolerate missing tablespaces (zip_size==ULINT_UNDEFINED).
buf_page_get_gen(): Add ut_ad(ut_is_2pow(zip_size)).
Issue #289, rb://136 approved by Sunny Bains
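
A power-of-two check of the kind asserted above can be sketched like
this. The function is_2pow() is a hypothetical stand-in and is not
claimed to match the definition of ut_is_2pow().

```c
/* Hypothetical stand-in for a power-of-two test. A value with exactly
one bit set has no bits in common with itself minus one. Note that 0
also passes this test. */
static int
is_2pow(unsigned long n)
{
	return((n & (n - 1)) == 0);
}
```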
------------------------------------------------------------------------
r5428 | marko | 2009-06-29 07:06:29 -0400 (Mon, 29 Jun 2009) | 2 lines
branches/zip: row_sel_store_mysql_rec(): Add missing pointer cast.
Do not do arithmetic on void pointers.
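
A minimal illustration of the kind of cast involved, assuming a
hypothetical helper advance() that is not part of InnoDB: ISO C does
not define arithmetic on void pointers (GCC accepts it only as an
extension), so the pointer is cast to a byte pointer first.

```c
#include <stddef.h>

/* Hypothetical helper: return the address that lies offset bytes past
base. Writing "base + offset" directly on a void pointer is not valid
ISO C, hence the cast to a byte pointer. */
static unsigned char *
advance(void *base, size_t offset)
{
	return((unsigned char *) base + offset);
}
```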
------------------------------------------------------------------------
r5429 | marko | 2009-06-29 09:49:54 -0400 (Mon, 29 Jun 2009) | 13 lines
branches/zip: Do not crash on SET GLOBAL innodb_file_format=DEFAULT
or SET GLOBAL innodb_file_format_check=DEFAULT.
innodb_file_format.test: New test for innodb_file_format and
innodb_file_format_check.
innodb_file_format_name_validate(): Store the string in *save.
innodb_file_format_name_update(): Check the string again.
innodb_file_format_check_validate(): Store the string in *save.
innodb_file_format_check_update(): Check the string again.
Issue #282, rb://140 approved by Heikki Tuuri
------------------------------------------------------------------------
r5430 | marko | 2009-06-29 09:58:07 -0400 (Mon, 29 Jun 2009) | 2 lines
branches/zip: lock_rec_validate_page(): Add another assertion
to track down Issue #289.
------------------------------------------------------------------------
r5431 | marko | 2009-06-29 09:58:40 -0400 (Mon, 29 Jun 2009) | 1 line
branches/zip: Revert an accidentally made change in r5430 to univ.i.
------------------------------------------------------------------------
r5437 | marko | 2009-06-30 05:10:01 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: ibuf_dummy_index_free(): Beautify the comment.
------------------------------------------------------------------------
r5438 | marko | 2009-06-30 05:10:32 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: fseg_free(): Remove this unused function.
------------------------------------------------------------------------
r5439 | marko | 2009-06-30 05:15:22 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: fseg_validate(): Enclose in #ifdef UNIV_DEBUG.
This function is unused, but it could turn out to be a useful debugging aid.
------------------------------------------------------------------------
r5441 | marko | 2009-06-30 06:30:14 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: ha_delete(): Remove this unused function that was
very similar to ha_search_and_delete_if_found().
------------------------------------------------------------------------
r5442 | marko | 2009-06-30 06:45:41 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: lock_is_on_table(), lock_table_unlock(): Unused, remove.
------------------------------------------------------------------------
r5443 | marko | 2009-06-30 07:03:00 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: os_event_create_auto(): Unused, remove.
------------------------------------------------------------------------
r5444 | marko | 2009-06-30 07:19:49 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: que_graph_try_free(): Unused, remove.
------------------------------------------------------------------------
r5445 | marko | 2009-06-30 07:28:11 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: row_build_row_ref_from_row(): Unused, remove.
------------------------------------------------------------------------
r5446 | marko | 2009-06-30 07:35:45 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: srv_que_round_robin(), srv_que_task_enqueue(): Unused, remove.
------------------------------------------------------------------------
r5447 | marko | 2009-06-30 07:37:58 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: srv_que_task_queue_check(): Unused, remove.
------------------------------------------------------------------------
r5448 | marko | 2009-06-30 07:56:36 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: mem_heap_cat(): Unused, remove.
------------------------------------------------------------------------
r5449 | marko | 2009-06-30 08:00:50 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: innobase_start_or_create_for_mysql():
Invoke os_get_os_version() at most once.
------------------------------------------------------------------------
r5450 | marko | 2009-06-30 08:02:20 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: os_file_close_no_error_handling(): Unused, remove.
------------------------------------------------------------------------
r5451 | marko | 2009-06-30 08:09:49 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: page_set_max_trx_id(): Make the code compile
with UNIV_HOTBACKUP.
------------------------------------------------------------------------
r5452 | marko | 2009-06-30 08:10:26 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: os_file_close_no_error_handling(): Restore,
as this function is used within InnoDB Hot Backup.
------------------------------------------------------------------------
r5453 | marko | 2009-06-30 08:14:01 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: os_process_set_priority_boost(): Unused, remove.
------------------------------------------------------------------------
r5454 | marko | 2009-06-30 08:42:52 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: Replace a non-ASCII character
(ISO 8859-1 encoded U+00AD SOFT HYPHEN) with a cheap ASCII substitute.
------------------------------------------------------------------------
r5456 | inaam | 2009-06-30 14:21:09 -0400 (Tue, 30 Jun 2009) | 4 lines
branches/zip
Non functional change. s/Percona/Percona Inc./
------------------------------------------------------------------------
r5470 | vasil | 2009-07-02 09:12:36 -0400 (Thu, 02 Jul 2009) | 16 lines
branches/zip:
Use PAUSE instruction inside spinloop if it is available.
The patch was originally developed by Mikael Ronstrom <mikael@mysql.com>
and can be found here:
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2768
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2771
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2772
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2774
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2777
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2799
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2800
Approved by: Heikki (rb://137)
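
A rough sketch of a spin delay that uses PAUSE when it is available.
The names SPIN_PAUSE, SPIN_ITERATIONS_PER_UNIT and spin_delay() are
hypothetical, the multiplier of 50 is an assumption, and the feature
test shown here covers only GCC-compatible compilers on x86; the patch
itself uses a configure-time check.

```c
/* Assumption: detect PAUSE support from compiler and target macros. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
# define SPIN_PAUSE()	__asm__ __volatile__ ("pause")
#else
# define SPIN_PAUSE()	((void) 0)	/* fall back to a plain loop */
#endif

/* Hypothetical number of loop iterations per unit of delay. */
#define SPIN_ITERATIONS_PER_UNIT	50

/* Busy-wait for roughly delay units and return the number of loop
iterations performed. PAUSE hints to the CPU that this is a spin-wait
loop. */
static unsigned
spin_delay(unsigned delay)
{
	unsigned	i;
	unsigned	n = 0;

	for (i = 0; i < delay * SPIN_ITERATIONS_PER_UNIT; i++) {
		SPIN_PAUSE();
		n++;
	}

	return(n);
}
```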
------------------------------------------------------------------------
r5481 | vasil | 2009-07-06 13:16:32 -0400 (Mon, 06 Jul 2009) | 4 lines
branches/zip:
Remove unnecessary quotes and simplify plug.in.
------------------------------------------------------------------------
r5482 | calvin | 2009-07-06 18:36:35 -0400 (Mon, 06 Jul 2009) | 5 lines
branches/zip: add COPYING files for Percona and Sun Micro.
1.0.4 contains patches based on contributions from Percona
and Sun Microsystems.
------------------------------------------------------------------------
r5483 | calvin | 2009-07-07 05:36:43 -0400 (Tue, 07 Jul 2009) | 3 lines
branches/zip: add IB_HAVE_PAUSE_INSTRUCTION to CMake.
Windows will support PAUSE instruction by default.
------------------------------------------------------------------------
r5484 | inaam | 2009-07-07 18:57:14 -0400 (Tue, 07 Jul 2009) | 13 lines
branches/zip rb://126
Based on contribution from Google Inc.
This patch introduces a new parameter innodb_io_capacity to control the
rate at which the master thread performs various tasks. The default value
is 200 and higher values imply more aggressive flushing and ibuf merges
from within the master thread.
This patch also changes the ibuf merge from synchronous to asynchronous.
Another minor change is not to force the master thread to wait for a
log flush to complete every second.
Approved by: Heikki
------------------------------------------------------------------------
r5485 | inaam | 2009-07-07 19:00:49 -0400 (Tue, 07 Jul 2009) | 18 lines
branches/zip rb://138
The current implementation is to try to flush the neighbors of every
page that we flush. This patch makes the following distinction:
1) If the flush is from flush_list AND
2) If the flush is intended to move the oldest_modification LSN ahead
(this happens when a user thread sees little space in the log file and
attempts to flush pages from the buffer pool so that a checkpoint can
be made)
THEN
Do not try to flush the neighbors. Just focus on flushing dirty pages at
the end of flush_list
Approved by: Heikki
------------------------------------------------------------------------
r5486 | inaam | 2009-07-08 12:11:40 -0400 (Wed, 08 Jul 2009) | 29 lines
branches/zip rb://133
This patch introduces a heuristics-based flushing rate for dirty pages to
avoid IO bursts at checkpoint.
1) log_capacity / log_generated per second gives us number of seconds
in which ALL dirty pages need to be flushed. Based on this rough
assumption we can say that
n_dirty_pages / (log_capacity / log_generation_rate) = desired_flush_rate
2) We use weighted averages (hard coded to 20 seconds) of
log_generation_rate to avoid resonance.
3) From the desired_flush_rate we subtract the number of pages that have
been flushed due to LRU flushing. That gives us pages that we should
flush as part of flush_list cleanup. And that is the number (capped by
maximum io_capacity) that we try to flush from the master thread.
Knobs:
======
innodb_adaptive_flushing: boolean, global, dynamic, default TRUE.
Since this heuristic is very experimental and has the potential to
dramatically change the IO pattern I think it is a good idea to leave a
knob to turn it off.
Approved by: Heikki
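
The arithmetic in points 1) and 3) above can be sketched as follows.
This is an illustration under stated assumptions, not the code from the
patch: the function name flush_list_pages_to_flush() and its parameter
names are hypothetical, the averaging from point 2) is left out, and
the multiplication is assumed not to overflow.

```c
#include <stdint.h>

/* Hypothetical sketch of the heuristic.
n_dirty_pages / (log_capacity / log_gen_rate) is rearranged to
n_dirty_pages * log_gen_rate / log_capacity to stay in integer math.
The pages already flushed by LRU flushing are subtracted, and the
result is capped by io_capacity. */
static uint64_t
flush_list_pages_to_flush(
	uint64_t	n_dirty_pages,	/* dirty pages in the buffer pool */
	uint64_t	log_capacity,	/* redo log capacity in bytes */
	uint64_t	log_gen_rate,	/* redo bytes generated per second */
	uint64_t	lru_flush_rate,	/* pages flushed per second by LRU */
	uint64_t	io_capacity)	/* upper bound on pages per second */
{
	uint64_t	desired;
	uint64_t	n_flush;

	if (log_capacity == 0) {
		return(0);
	}

	desired = n_dirty_pages * log_gen_rate / log_capacity;

	if (desired <= lru_flush_rate) {
		/* LRU flushing already covers the desired rate. */
		return(0);
	}

	n_flush = desired - lru_flush_rate;

	return(n_flush > io_capacity ? io_capacity : n_flush);
}
```

For example, with 1000 dirty pages, a log capacity of 1000000 bytes and
10000 bytes of redo generated per second, the log would fill in 100
seconds, so the desired rate is 10 pages per second.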
------------------------------------------------------------------------
r5487 | calvin | 2009-07-08 12:42:28 -0400 (Wed, 08 Jul 2009) | 7 lines
branches/zip: fix PAUSE instruction patch on Windows
The original PAUSE instruction patch (r5470) does not
compile on Windows. Also, there is an elegant way of
doing it on Windows - YieldProcessor().
Approved by: Heikki (on IM)
------------------------------------------------------------------------
r5489 | vasil | 2009-07-10 05:02:22 -0400 (Fri, 10 Jul 2009) | 9 lines
branches/zip:
Change the defaults for
innodb_sync_spin_loops: 20 -> 30
innodb_spin_wait_delay: 5 -> 6
This change was proposed by Sun/MySQL based on their performance testing,
see https://svn.innodb.com/innobase/Release_tasks_for_InnoDB_Plugin_V1.0.4
------------------------------------------------------------------------
r5490 | vasil | 2009-07-10 05:04:20 -0400 (Fri, 10 Jul 2009) | 4 lines
branches/zip:
Add ChangeLog entry for 5489.
------------------------------------------------------------------------
r5491 | calvin | 2009-07-10 12:19:17 -0400 (Fri, 10 Jul 2009) | 6 lines
branches/zip: add copyright info to files related to PAUSE
instruction patch, contributed by Sun Microsystems.
------------------------------------------------------------------------
r5492 | calvin | 2009-07-10 17:47:34 -0400 (Fri, 10 Jul 2009) | 5 lines
branches/zip: add ChangeLog entries for r5484-r5486.
------------------------------------------------------------------------
r5494 | vasil | 2009-07-13 03:37:35 -0400 (Mon, 13 Jul 2009) | 6 lines
branches/zip:
Restore the original value of innodb_sync_spin_loops at the end. Previously
the test assumed that setting it to 20 would do this, but now the default is
30 and MTR's internal check failed.
------------------------------------------------------------------------
r5495 | inaam | 2009-07-13 11:48:45 -0400 (Mon, 13 Jul 2009) | 5 lines
branches/zip rb://138 (REVERT)
Revert the flush neighbors patch as it shows regression in
the benchmarks run by Michael.
------------------------------------------------------------------------
r5496 | inaam | 2009-07-13 14:04:57 -0400 (Mon, 13 Jul 2009) | 4 lines
branches/zip
Fixed warnings on Windows where ulint != ib_uint64_t
------------------------------------------------------------------------
r5497 | calvin | 2009-07-13 15:01:00 -0400 (Mon, 13 Jul 2009) | 9 lines
branches/zip: fix run-time symbols clash on Solaris.
This patch is from Sergey Vojtovich of Sun Microsystems,
to fix run-time symbols clash on Solaris with older C++
compiler:
- when finding out a way to hide symbols, base the decision
on the compiler, not the operating system.
- Sun Studio supports __hidden declaration specifier for this
purpose.
------------------------------------------------------------------------
r5498 | vasil | 2009-07-14 03:16:18 -0400 (Tue, 14 Jul 2009) | 92 lines
branches/zip: Merge r5341:5497 from branches/5.1, skipping:
c5419 because it is merge from branches/zip into branches/5.1
c5466 because the source code has been adjusted to match the MySQL
behavior and the innodb-autoinc test does not fail in branches/zip.
If c5466 is merged, then innodb-autoinc starts failing. Sunny suggested
not to merge c5466.
and resolving conflicts in c5410, c5440, c5488:
------------------------------------------------------------------------
r5410 | marko | 2009-06-24 22:26:34 +0300 (Wed, 24 Jun 2009) | 2 lines
Changed paths:
M /branches/5.1/include/trx0sys.ic
M /branches/5.1/trx/trx0purge.c
M /branches/5.1/trx/trx0sys.c
M /branches/5.1/trx/trx0undo.c
branches/5.1: Add missing #include "mtr0log.h" to avoid warnings
when compiling with -DUNIV_MUST_NOT_INLINE.
------------------------------------------------------------------------
r5419 | marko | 2009-06-25 16:11:57 +0300 (Thu, 25 Jun 2009) | 18 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb_bug42101-nonzero.result
M /branches/5.1/mysql-test/innodb_bug42101-nonzero.test
M /branches/5.1/mysql-test/innodb_bug42101.result
M /branches/5.1/mysql-test/innodb_bug42101.test
branches/5.1: Merge r5418 from branches/zip:
------------------------------------------------------------------------
r5418 | marko | 2009-06-25 15:55:52 +0300 (Thu, 25 Jun 2009) | 5 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/mysql-test/innodb_bug42101-nonzero.result
M /branches/zip/mysql-test/innodb_bug42101-nonzero.test
M /branches/zip/mysql-test/innodb_bug42101.result
M /branches/zip/mysql-test/innodb_bug42101.test
branches/zip: Fix a race condition caused by
SET GLOBAL innodb_commit_concurrency=DEFAULT. (Bug #45749)
When innodb_commit_concurrency is initially set nonzero,
DEFAULT would change it back to 0, triggering Bug #42101.
rb://139 approved by Heikki Tuuri.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5440 | vasil | 2009-06-30 13:04:29 +0300 (Tue, 30 Jun 2009) | 8 lines
Changed paths:
M /branches/5.1/fil/fil0fil.c
branches/5.1:
Fix Bug#45814 URL reference in InnoDB server errors needs adjusting to match documentation
by changing the URL from
http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html to
http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting-datadict.html
------------------------------------------------------------------------
r5466 | vasil | 2009-07-02 10:46:45 +0300 (Thu, 02 Jul 2009) | 6 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Adjust the failing innodb-autoinc test to conform to the latest behavior
of the MySQL code. The idea and the comment in innodb-autoinc.test come
from Sunny.
------------------------------------------------------------------------
r5488 | vasil | 2009-07-09 19:16:44 +0300 (Thu, 09 Jul 2009) | 13 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
A /branches/5.1/mysql-test/innodb_bug21704.result
A /branches/5.1/mysql-test/innodb_bug21704.test
branches/5.1:
Fix Bug#21704 Renaming column does not update FK definition
by checking whether a column that participates in a FK definition is being
renamed and denying the ALTER in this case.
The patch was originally developed by Davi Arnaut <Davi.Arnaut@Sun.COM>:
http://lists.mysql.com/commits/77714
and was later adjusted to conform to InnoDB coding style by me (Vasil),
I also added some more comments and moved the bug specific mysql-test to
a separate file to make it more manageable and flexible.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5499 | calvin | 2009-07-14 12:55:10 -0400 (Tue, 14 Jul 2009) | 3 lines
branches/zip: add a missing file in Makefile.am
This change was suggested by MySQL.
------------------------------------------------------------------------
r5500 | calvin | 2009-07-14 13:03:26 -0400 (Tue, 14 Jul 2009) | 3 lines
branches/zip: minor change
Remove an extra "with".
------------------------------------------------------------------------
r5501 | vasil | 2009-07-14 13:58:15 -0400 (Tue, 14 Jul 2009) | 5 lines
branches/zip:
Add @ZLIB_INCLUDES@ so that the InnoDB Plugin picks up the same zlib.h
header file that is eventually used by mysqld.
------------------------------------------------------------------------
r5502 | vasil | 2009-07-14 13:59:59 -0400 (Tue, 14 Jul 2009) | 4 lines
branches/zip:
Add include/ut0auxconf.h to noinst_HEADERS
------------------------------------------------------------------------
r5503 | vasil | 2009-07-14 14:16:11 -0400 (Tue, 14 Jul 2009) | 8 lines
branches/zip:
Non-functional change:
put files in noinst_HEADERS and libinnobase_a_SOURCES one per line and sort
alphabetically, so it is easier to find if a file is there or not and
also diffs show exactly the added or removed file instead of surrounding
lines too.
------------------------------------------------------------------------
r5504 | calvin | 2009-07-15 04:58:44 -0400 (Wed, 15 Jul 2009) | 6 lines
branches/zip: fix compile errors on Win64
Both srv_read_ahead_factor and srv_io_capacity should
be defined as ulong.
Approved by: Sunny
------------------------------------------------------------------------
r5508 | calvin | 2009-07-16 09:40:47 -0400 (Thu, 16 Jul 2009) | 16 lines
branches/zip: Support inlining of functions and prefetch with
Sun Studio
These changes were contributed by Sun/MySQL. The patch makes two sets
of changes that apply when Sun Studio is used:
- Explicit inlining of functions
- Prefetch Support
This patch has been tested by Sunny with the plugin statically
built in. Since we've never built the plugin as a dynamically
loaded module on Solaris, it is a separate task to change
plug.in.
rb://142
Approved by: Heikki
------------------------------------------------------------------------
r5509 | calvin | 2009-07-16 09:45:28 -0400 (Thu, 16 Jul 2009) | 2 lines
branches/zip: add ChangeLog entry for r5508.
------------------------------------------------------------------------
r5512 | sunny | 2009-07-19 19:52:48 -0400 (Sun, 19 Jul 2009) | 2 lines
branches/zip: Remove unused extern ref to timed_mutexes.
------------------------------------------------------------------------
r5513 | sunny | 2009-07-19 19:58:43 -0400 (Sun, 19 Jul 2009) | 2 lines
branches/zip: Undo r5512
------------------------------------------------------------------------
r5514 | sunny | 2009-07-19 20:08:49 -0400 (Sun, 19 Jul 2009) | 2 lines
branches/zip: Only use my_bool when UNIV_HOTBACKUP is not defined.
------------------------------------------------------------------------
r5515 | sunny | 2009-07-20 03:29:14 -0400 (Mon, 20 Jul 2009) | 2 lines
branches/zip: The dict_table_t::autoinc_mutex field is not used in HotBackup.
------------------------------------------------------------------------
r5516 | sunny | 2009-07-20 03:46:05 -0400 (Mon, 20 Jul 2009) | 4 lines
branches/zip: Make this file usable from within HotBackup. A new file has
been introduced called hb_univ.i. This file should have all the HotBackup
specific configuration.
------------------------------------------------------------------------
r5517 | sunny | 2009-07-20 03:55:11 -0400 (Mon, 20 Jul 2009) | 2 lines
Add /* UNIV_HOTBACK */
------------------------------------------------------------------------
r5519 | vasil | 2009-07-20 04:45:18 -0400 (Mon, 20 Jul 2009) | 31 lines
branches/zip: Merge r5497:5518 from branches/5.1:
------------------------------------------------------------------------
r5518 | vasil | 2009-07-20 11:29:47 +0300 (Mon, 20 Jul 2009) | 22 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2874.2.1
committer: Anurag Shekhar <anurag.shekhar@sun.com>
branch nick: mysql-5.1-bugteam-windows-warning
timestamp: Wed 2009-05-13 15:41:24 +0530
message:
Bug #39802 On Windows, 32-bit time_t should be enforced
This patch fixes compilation warning, "conversion from 'time_t' to 'ulong',
possible loss of data".
The fix is to typecast time_t to ulong before assigning it to ulong.
Backported this from 6.0-bugteam tree.
modified:
storage/archive/ha_archive.cc
storage/federated/ha_federated.cc
storage/innobase/handler/ha_innodb.cc
storage/myisam/ha_myisam.cc
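
A minimal sketch of the kind of explicit cast described above, assuming a
hypothetical helper named time_to_ulong() (the actual change in
ha_innodb.cc is not reproduced here). The cast makes the narrowing
intentional, so the compiler no longer warns about a possible loss of data:

```c
#include <time.h>

/* Hypothetical helper, for illustration only: narrow a time_t value
to unsigned long with an explicit cast. On platforms where time_t is
wider than unsigned long, the high-order bits are discarded. */
static unsigned long
time_to_ulong(time_t t)
{
	return((unsigned long) t);
}
```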
------------------------------------------------------------------------
------------------------------------------------------------------------
r5520 | vasil | 2009-07-20 04:51:47 -0400 (Mon, 20 Jul 2009) | 4 lines
branches/zip:
Add ChangeLog entries for r5498 and r5519.
------------------------------------------------------------------------
r5524 | inaam | 2009-07-20 12:23:15 -0400 (Mon, 20 Jul 2009) | 9 lines
branches/zip
Change the read ahead parameter name to innodb_read_ahead_threshold.
Change the meaning of this parameter to signify the number of pages
that must be sequentially accessed for InnoDB to trigger a readahead
request.
Suggested by: Ken
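
The threshold semantics described above can be sketched roughly as
follows. This is a simplified illustration, not InnoDB's read-ahead code:
struct ra_tracker and ra_note_access() are hypothetical names, and the
real implementation may evaluate accesses differently (for example, per
extent).

```c
/* Hypothetical tracker of sequential page accesses. */
struct ra_tracker {
	unsigned long	last_page;	/* page number of the previous access */
	unsigned long	run_len;	/* length of the current sequential run */
};

/* Record an access to page_no. Returns 1 when the number of
sequentially accessed pages reaches the threshold (a read-ahead request
would be triggered at this point), 0 otherwise. */
static int
ra_note_access(struct ra_tracker* t, unsigned long page_no,
	       unsigned long threshold)
{
	if (t->run_len > 0 && page_no == t->last_page + 1) {
		t->run_len++;		/* the run continues */
	} else {
		t->run_len = 1;		/* a new run starts here */
	}

	t->last_page = page_no;

	if (t->run_len >= threshold) {
		t->run_len = 0;		/* start counting afresh */
		return(1);
	}

	return(0);
}
```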
------------------------------------------------------------------------
branches/innodb+: Merge revisions 4150:4528 from branches/zip:
------------------------------------------------------------------------
r4152 | marko | 2009-02-10 12:52:27 +0200 (Tue, 10 Feb 2009) | 12 lines
branches/zip: When innodb_use_sys_malloc is set, ignore
innodb_additional_mem_pool_size, because nothing will
be allocated from mem_comm_pool.
mem_pool_create(): Remove the assertion about size. The function will
work with any size. However, an assertion would fail in ut_malloc_low()
when size==0.
mem_init(): When srv_use_sys_malloc is set, pass size=1 to mem_pool_create().
mem0mem.c: Add #include "srv0srv.h" that is needed by mem0dbg.c.
------------------------------------------------------------------------
r4153 | vasil | 2009-02-10 22:58:17 +0200 (Tue, 10 Feb 2009) | 14 lines
branches/zip:
(followup to r4145) Non-functional change:
Change the os_atomic_increment() and os_compare_and_swap() functions
to macros to avoid artificial limitations on the types of those
functions' arguments. As a consequence typecasts from the source
code can be removed.
Also remove Google's copyright from os0sync.ic because that file no longer
contains code from Google.
Approved by: Marko (rb://88), also ok from Inaam via IM
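
A sketch of why macros lift the type restrictions, assuming the GCC
__sync builtins as the underlying primitives. The macro names below are
hypothetical and are not the definitions in os0sync.h. Because a macro
has no fixed parameter types, the same name works for operands of
different integer widths without typecasts at the call site:

```c
/* Hypothetical macros, for illustration only. The GCC builtins are
generic over integer types, so no typecasts are needed by callers. */
#define sketch_atomic_increment(ptr, amount)		\
	__sync_add_and_fetch(ptr, amount)

#define sketch_compare_and_swap(ptr, old_val, new_val)	\
	__sync_bool_compare_and_swap(ptr, old_val, new_val)
```

For example, sketch_atomic_increment() can be applied to an unsigned
long counter and sketch_compare_and_swap() to a one-byte flag.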
------------------------------------------------------------------------
r4163 | marko | 2009-02-12 00:14:19 +0200 (Thu, 12 Feb 2009) | 4 lines
branches/zip: Make innodb_thread_concurrency=0 the default.
The old default was 8.
------------------------------------------------------------------------
r4169 | calvin | 2009-02-12 10:37:10 +0200 (Thu, 12 Feb 2009) | 3 lines
branches/zip: Adjust the result file of innodb_thread_concurrency_basic
test. The default value of innodb_thread_concurrency is changed to 0
(from 8) via r4163.
------------------------------------------------------------------------
r4174 | vasil | 2009-02-12 17:38:27 +0200 (Thu, 12 Feb 2009) | 4 lines
branches/zip:
Fix pathname of the file to patch.
------------------------------------------------------------------------
r4176 | vasil | 2009-02-13 10:06:31 +0200 (Fri, 13 Feb 2009) | 7 lines
branches/zip:
Fix the failing mysql-test partition_innodb, which failed only if run after
innodb_trx_weight (or another test that leaves LATEST DEADLOCK ERROR in
the output of SHOW ENGINE INNODB STATUS). Find further explanation for the
failure at the top of the added patch partition_innodb.diff.
------------------------------------------------------------------------
r4198 | vasil | 2009-02-17 09:06:07 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
Add the full text of the GPLv2 license into the root directory of the
plugin. In previous releases this file was copied from an external source
(https://svn.innodb.com/svn/plugin/trunk/support/COPYING) "manually" when
creating the source and binary archives. It is less confusing to have this
present in the root directory of the SVN branch.
------------------------------------------------------------------------
r4199 | vasil | 2009-02-17 09:11:58 +0200 (Tue, 17 Feb 2009) | 4 lines
branches/zip:
Add Google's license into COPYING.Google.
------------------------------------------------------------------------
r4200 | vasil | 2009-02-17 09:56:33 +0200 (Tue, 17 Feb 2009) | 11 lines
branches/zip:
To the files touched by the Google patch from r4144 (excluding
include/os0sync.ic because later we removed Google code from that file):
* Remove the Google license
* Remove old Innobase copyright lines
* Add a reference to the Google license and to the GPLv2 license at the top,
as recommended by the lawyers at Oracle Legal.
------------------------------------------------------------------------
r4201 | vasil | 2009-02-17 10:12:02 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 1/28]
------------------------------------------------------------------------
r4202 | vasil | 2009-02-17 10:15:06 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 2/28]
------------------------------------------------------------------------
r4203 | vasil | 2009-02-17 10:25:45 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 3/28]
------------------------------------------------------------------------
r4204 | vasil | 2009-02-17 10:55:41 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 4/28]
------------------------------------------------------------------------
r4205 | vasil | 2009-02-17 10:59:22 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 5/28]
------------------------------------------------------------------------
r4206 | vasil | 2009-02-17 11:02:27 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 6/28]
------------------------------------------------------------------------
r4207 | vasil | 2009-02-17 11:04:28 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 7/28]
------------------------------------------------------------------------
r4208 | vasil | 2009-02-17 11:06:49 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 8/28]
------------------------------------------------------------------------
r4209 | vasil | 2009-02-17 11:10:18 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 9/28]
------------------------------------------------------------------------
r4210 | vasil | 2009-02-17 11:12:41 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 10/28]
------------------------------------------------------------------------
r4211 | vasil | 2009-02-17 11:14:40 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 11/28]
------------------------------------------------------------------------
r4212 | vasil | 2009-02-17 11:18:35 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 12/28]
------------------------------------------------------------------------
r4213 | vasil | 2009-02-17 11:24:40 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 13/28]
------------------------------------------------------------------------
r4214 | vasil | 2009-02-17 11:27:31 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 14/28]
------------------------------------------------------------------------
r4215 | vasil | 2009-02-17 11:29:55 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 15/28]
------------------------------------------------------------------------
r4216 | vasil | 2009-02-17 11:33:38 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 16/28]
------------------------------------------------------------------------
r4217 | vasil | 2009-02-17 11:36:44 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 17/28]
------------------------------------------------------------------------
r4218 | vasil | 2009-02-17 11:39:11 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 18/28]
------------------------------------------------------------------------
r4219 | vasil | 2009-02-17 11:41:24 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 19/28]
------------------------------------------------------------------------
r4220 | vasil | 2009-02-17 11:43:50 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 20/28]
------------------------------------------------------------------------
r4221 | vasil | 2009-02-17 11:46:52 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 21/28]
------------------------------------------------------------------------
r4222 | vasil | 2009-02-17 11:50:12 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 22/28]
------------------------------------------------------------------------
r4223 | vasil | 2009-02-17 11:53:58 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 23/28]
------------------------------------------------------------------------
r4224 | vasil | 2009-02-17 12:01:41 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 24/28]
------------------------------------------------------------------------
r4225 | vasil | 2009-02-17 12:05:45 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 25/28]
------------------------------------------------------------------------
r4226 | vasil | 2009-02-17 12:09:16 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 26/28]
------------------------------------------------------------------------
r4227 | vasil | 2009-02-17 12:12:56 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 27/28]
------------------------------------------------------------------------
r4228 | vasil | 2009-02-17 12:14:04 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 28/28]
------------------------------------------------------------------------
r4229 | vasil | 2009-02-17 12:30:55 +0200 (Tue, 17 Feb 2009) | 4 lines
branches/zip:
Add the copyright notice to the non C files.
------------------------------------------------------------------------
r4231 | marko | 2009-02-17 14:26:53 +0200 (Tue, 17 Feb 2009) | 12 lines
Minor cleanup of the Google SMP patch.
sync_array_object_signalled(): Add a (void) cast to eliminate a gcc warning
about the return value of os_atomic_increment() being ignored.
rw_lock_create_func(): Properly indent the preprocessor directives.
rw_lock_x_lock_low(), rw_lock_x_lock_func_nowait(): Split lines correctly.
rw_lock_set_writer_id_and_recursion_flag(): Silence a Valgrind warning.
Do not mix statements and variable declarations.
------------------------------------------------------------------------
r4232 | marko | 2009-02-17 14:59:54 +0200 (Tue, 17 Feb 2009) | 3 lines
branches/zip: When assigning lock->recursive = FALSE, also flag
lock->writer_thread invalid, so that Valgrind will catch more errors.
This is related to Issue #175.
------------------------------------------------------------------------
r4242 | marko | 2009-02-18 17:01:09 +0200 (Wed, 18 Feb 2009) | 2 lines
branches/zip: UT_DBG_STOP: Use do{} while(0) to silence a g++-4.3.2 warning
about a while(0); statement. This should fix (part of) Issue #176.
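
The general do {} while (0) idiom can be sketched as follows. The macro
SKETCH_CHECK and the function count_failures() are hypothetical and are
not the actual UT_DBG_STOP definition. Wrapping the body this way makes
the macro expand to a single statement that requires a trailing
semicolon, so it is safe in an unbraced if/else:

```c
/* Hypothetical macro, for illustration only. */
#define SKETCH_CHECK(cond, counter) do {	\
	if (!(cond)) {				\
		(counter)++;			\
	}					\
} while (0)

/* Counts failed checks. The unbraced if/else compiles as intended
because each macro invocation is exactly one statement. */
static int
count_failures(int a, int b)
{
	int	failures = 0;

	if (a > 0)
		SKETCH_CHECK(a < 10, failures);
	else
		SKETCH_CHECK(b < 10, failures);

	return(failures);
}
```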
------------------------------------------------------------------------
r4243 | marko | 2009-02-18 17:04:03 +0200 (Wed, 18 Feb 2009) | 3 lines
branches/zip: buf_buddy_get_slot(): Fix a gcc 4.3.2 warning
about an empty body of a "for" statement.
This fixes part of Issue #176.
------------------------------------------------------------------------
r4244 | marko | 2009-02-18 17:25:45 +0200 (Wed, 18 Feb 2009) | 11 lines
branches/zip: Protect ut_total_allocated_memory with ut_list_mutex.
Unprotected updates to ut_total_allocated_memory in
os_mem_alloc_large() and os_mem_free_large(), called during
fast index creation, may corrupt the variable and cause assertion failures.
Also, add UNIV_MEM_ALLOC() and UNIV_MEM_FREE() instrumentation around
os_mem_alloc_large() and os_mem_free_large(), so that Valgrind can
detect more errors.
rb://90 approved by Heikki Tuuri. This addresses Issue #177.
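
A minimal sketch of the locking pattern, assuming POSIX threads. All
names below are hypothetical; InnoDB uses its own mutex wrappers rather
than calling pthread functions directly. Every update of the shared
counter takes the mutex, so concurrent allocations and frees cannot
corrupt it:

```c
#include <pthread.h>

/* Hypothetical stand-ins for ut_list_mutex and
ut_total_allocated_memory. */
static pthread_mutex_t	sketch_list_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned long	sketch_total_allocated = 0;

/* Account for an allocation of the given size. */
static void
sketch_account_alloc(unsigned long size)
{
	pthread_mutex_lock(&sketch_list_mutex);
	sketch_total_allocated += size;
	pthread_mutex_unlock(&sketch_list_mutex);
}

/* Account for freeing a block of the given size. */
static void
sketch_account_free(unsigned long size)
{
	pthread_mutex_lock(&sketch_list_mutex);
	sketch_total_allocated -= size;
	pthread_mutex_unlock(&sketch_list_mutex);
}

/* Read the counter under the same mutex. */
static unsigned long
sketch_total(void)
{
	unsigned long	total;

	pthread_mutex_lock(&sketch_list_mutex);
	total = sketch_total_allocated;
	pthread_mutex_unlock(&sketch_list_mutex);

	return(total);
}
```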
------------------------------------------------------------------------
r4248 | marko | 2009-02-19 11:52:39 +0200 (Thu, 19 Feb 2009) | 2 lines
branches/zip: page_zip_set_size(): Fix a g++ 4.3.2 warning
about an empty body in a "for" statement. This closes Issue #176.
------------------------------------------------------------------------
r4251 | inaam | 2009-02-19 15:46:27 +0200 (Thu, 19 Feb 2009) | 8 lines
branches/zip: Issue #178 rb://91
Change plug.in to use the same CXXFLAGS as CFLAGS, so that both .c and
.cc files are compiled with the same flags. This fixes the issue where
UNIV_LINUX was defined only in .c files.
Approved by: Marko
------------------------------------------------------------------------
r4258 | vasil | 2009-02-20 11:52:19 +0200 (Fri, 20 Feb 2009) | 7 lines
branches/zip:
Cleanup in ChangeLog:
* Wrap lines at 78 characters
* Changed files are listed alphabetically
* White-space cleanup
------------------------------------------------------------------------
r4259 | vasil | 2009-02-20 11:59:42 +0200 (Fri, 20 Feb 2009) | 6 lines
branches/zip:
ChangeLog: Remove include/os0sync.ic from the entry about the Google patch;
this file was modified later to not include Google's code.
------------------------------------------------------------------------
r4262 | vasil | 2009-02-20 14:56:59 +0200 (Fri, 20 Feb 2009) | 373 lines
branches/zip:
Merge revisions 4035:4261 from branches/5.1:
------------------------------------------------------------------------
r4065 | sunny | 2009-01-29 16:01:36 +0200 (Thu, 29 Jan 2009) | 8 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: In the last round of AUTOINC cleanup we assumed that AUTOINC
is only defined for integer columns. This caused an assertion failure when
we checked for the maximum value of a column type. We now calculate the
max value for floating-point autoinc columns too.
Fix Bug#42400 - InnoDB autoinc code can't handle floating-point columns
rb://84 and Mantis issue://162
------------------------------------------------------------------------
r4111 | sunny | 2009-02-03 22:06:52 +0200 (Tue, 03 Feb 2009) | 2 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Add the ULL suffix otherwise there is an overflow.
------------------------------------------------------------------------
r4128 | vasil | 2009-02-08 21:36:45 +0200 (Sun, 08 Feb 2009) | 18 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2709.20.31
committer: Timothy Smith <timothy.smith@sun.com>
branch nick: 51
timestamp: Fri 2008-12-19 01:28:51 +0100
message:
Disable part of innodb-autoinc.test, because the MySQL server asserts when
compiled --with-debug, due to bug 39828, "autoinc wraps around when offset and
increment > 1". This change should be reverted when that bug is fixed (and a
few other minor changes to the test as described in comments).
modified:
mysql-test/r/innodb-autoinc.result
mysql-test/t/innodb-autoinc.test
------------------------------------------------------------------------
r4129 | vasil | 2009-02-08 21:54:25 +0200 (Sun, 08 Feb 2009) | 310 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Merge a change from MySQL:
[looks like the changes to innodb-autoinc.test were made as part of
the following huge merge, but we are merging only changes to that file]
------------------------------------------------------------
revno: 2546.47.1
committer: Luis Soares <luis.soares@sun.com>
branch nick: 5.1-rpl
timestamp: Fri 2009-01-23 13:22:05 +0100
message:
merge: 5.1 -> 5.1-rpl
conflicts:
Text conflict in client/mysqltest.cc
Text conflict in mysql-test/include/wait_until_connected_again.inc
Text conflict in mysql-test/lib/mtr_report.pm
Text conflict in mysql-test/mysql-test-run.pl
Text conflict in mysql-test/r/events_bugs.result
Text conflict in mysql-test/r/log_state.result
Text conflict in mysql-test/r/myisam_data_pointer_size_func.result
Text conflict in mysql-test/r/mysqlcheck.result
Text conflict in mysql-test/r/query_cache.result
Text conflict in mysql-test/r/status.result
Text conflict in mysql-test/suite/binlog/r/binlog_index.result
Text conflict in mysql-test/suite/binlog/r/binlog_innodb.result
Text conflict in mysql-test/suite/rpl/r/rpl_packet.result
Text conflict in mysql-test/suite/rpl/t/rpl_packet.test
Text conflict in mysql-test/t/disabled.def
Text conflict in mysql-test/t/events_bugs.test
Text conflict in mysql-test/t/log_state.test
Text conflict in mysql-test/t/myisam_data_pointer_size_func.test
Text conflict in mysql-test/t/mysqlcheck.test
Text conflict in mysql-test/t/query_cache.test
Text conflict in mysql-test/t/rpl_init_slave_func.test
Text conflict in mysql-test/t/status.test
removed:
mysql-test/suite/parts/r/partition_bit_ndb.result
mysql-test/suite/parts/t/partition_bit_ndb.test
mysql-test/suite/parts/t/partition_sessions.test
mysql-test/suite/sys_vars/inc/tmp_table_size_basic.inc
mysql-test/suite/sys_vars/r/tmp_table_size_basic_32.result
mysql-test/suite/sys_vars/r/tmp_table_size_basic_64.result
mysql-test/suite/sys_vars/t/tmp_table_size_basic_32.test
mysql-test/suite/sys_vars/t/tmp_table_size_basic_64.test
mysql-test/t/log_bin_trust_function_creators_func-master.opt
mysql-test/t/rpl_init_slave_func-slave.opt
added:
mysql-test/include/check_events_off.inc
mysql-test/include/cleanup_fake_relay_log.inc
mysql-test/include/have_simple_parser.inc
mysql-test/include/no_running_event_scheduler.inc
mysql-test/include/no_running_events.inc
mysql-test/include/running_event_scheduler.inc
mysql-test/include/setup_fake_relay_log.inc
mysql-test/include/wait_condition_sp.inc
mysql-test/r/fulltext_plugin.result
mysql-test/r/have_simple_parser.require
mysql-test/r/innodb_bug38231.result
mysql-test/r/innodb_bug39438.result
mysql-test/r/innodb_mysql_rbk.result
mysql-test/r/partition_innodb_semi_consistent.result
mysql-test/r/query_cache_28249.result
mysql-test/r/status2.result
mysql-test/std_data/bug40482-bin.000001
mysql-test/suite/binlog/r/binlog_innodb_row.result
mysql-test/suite/binlog/t/binlog_innodb_row.test
mysql-test/suite/rpl/r/rpl_binlog_corruption.result
mysql-test/suite/rpl/t/rpl_binlog_corruption-master.opt
mysql-test/suite/rpl/t/rpl_binlog_corruption.test
mysql-test/suite/sys_vars/r/tmp_table_size_basic.result
mysql-test/suite/sys_vars/t/tmp_table_size_basic.test
mysql-test/t/fulltext_plugin-master.opt
mysql-test/t/fulltext_plugin.test
mysql-test/t/innodb_bug38231.test
mysql-test/t/innodb_bug39438-master.opt
mysql-test/t/innodb_bug39438.test
mysql-test/t/innodb_mysql_rbk-master.opt
mysql-test/t/innodb_mysql_rbk.test
mysql-test/t/partition_innodb_semi_consistent-master.opt
mysql-test/t/partition_innodb_semi_consistent.test
mysql-test/t/query_cache_28249.test
mysql-test/t/status2.test
renamed:
mysql-test/suite/funcs_1/r/is_collation_character_set_applicability.result => mysql-test/suite/funcs_1/r/is_coll_char_set_appl.result
mysql-test/suite/funcs_1/t/is_collation_character_set_applicability.test => mysql-test/suite/funcs_1/t/is_coll_char_set_appl.test
modified:
.bzr-mysql/default.conf
CMakeLists.txt
client/mysql.cc
client/mysql_upgrade.c
client/mysqlcheck.c
client/mysqltest.cc
configure.in
extra/resolve_stack_dump.c
extra/yassl/include/openssl/ssl.h
include/config-win.h
include/m_ctype.h
include/my_global.h
mysql-test/extra/binlog_tests/database.test
mysql-test/extra/rpl_tests/rpl_auto_increment.test
mysql-test/include/commit.inc
mysql-test/include/have_32bit.inc
mysql-test/include/have_64bit.inc
mysql-test/include/index_merge1.inc
mysql-test/include/linux_sys_vars.inc
mysql-test/include/windows_sys_vars.inc
mysql-test/lib/mtr_report.pm
mysql-test/mysql-test-run.pl
mysql-test/r/alter_table.result
mysql-test/r/commit_1innodb.result
mysql-test/r/create.result
mysql-test/r/csv.result
mysql-test/r/ctype_ucs.result
mysql-test/r/date_formats.result
mysql-test/r/events_bugs.result
mysql-test/r/events_scheduling.result
mysql-test/r/fulltext.result
mysql-test/r/func_if.result
mysql-test/r/func_in.result
mysql-test/r/func_str.result
mysql-test/r/func_time.result
mysql-test/r/grant.result
mysql-test/r/index_merge_myisam.result
mysql-test/r/information_schema.result
mysql-test/r/innodb-autoinc.result
mysql-test/r/innodb.result
mysql-test/r/innodb_mysql.result
mysql-test/r/log_bin_trust_function_creators_func.result
mysql-test/r/log_state.result
mysql-test/r/myisampack.result
mysql-test/r/mysql.result
mysql-test/r/mysqlcheck.result
mysql-test/r/partition_datatype.result
mysql-test/r/partition_mgm.result
mysql-test/r/partition_pruning.result
mysql-test/r/query_cache.result
mysql-test/r/read_buffer_size_basic.result
mysql-test/r/read_rnd_buffer_size_basic.result
mysql-test/r/rpl_init_slave_func.result
mysql-test/r/select.result
mysql-test/r/status.result
mysql-test/r/strict.result
mysql-test/r/temp_table.result
mysql-test/r/type_bit.result
mysql-test/r/type_date.result
mysql-test/r/type_float.result
mysql-test/r/warnings_engine_disabled.result
mysql-test/r/xml.result
mysql-test/suite/binlog/r/binlog_database.result
mysql-test/suite/binlog/r/binlog_index.result
mysql-test/suite/binlog/r/binlog_innodb.result
mysql-test/suite/binlog/r/binlog_row_mix_innodb_myisam.result
mysql-test/suite/binlog/t/binlog_innodb.test
mysql-test/suite/funcs_1/r/is_columns_is.result
mysql-test/suite/funcs_1/r/is_engines.result
mysql-test/suite/funcs_1/r/storedproc.result
mysql-test/suite/funcs_1/storedproc/param_check.inc
mysql-test/suite/funcs_2/t/disabled.def
mysql-test/suite/ndb/t/disabled.def
mysql-test/suite/parts/r/partition_bit_innodb.result
mysql-test/suite/parts/r/partition_bit_myisam.result
mysql-test/suite/parts/r/partition_special_innodb.result
mysql-test/suite/parts/t/disabled.def
mysql-test/suite/parts/t/partition_special_innodb.test
mysql-test/suite/parts/t/partition_value_innodb.test
mysql-test/suite/parts/t/partition_value_myisam.test
mysql-test/suite/parts/t/partition_value_ndb.test
mysql-test/suite/rpl/r/rpl_auto_increment.result
mysql-test/suite/rpl/r/rpl_packet.result
mysql-test/suite/rpl/r/rpl_row_create_table.result
mysql-test/suite/rpl/r/rpl_slave_skip.result
mysql-test/suite/rpl/r/rpl_trigger.result
mysql-test/suite/rpl/t/disabled.def
mysql-test/suite/rpl/t/rpl_packet.test
mysql-test/suite/rpl/t/rpl_row_create_table.test
mysql-test/suite/rpl/t/rpl_slave_skip.test
mysql-test/suite/rpl/t/rpl_trigger.test
mysql-test/suite/rpl_ndb/t/disabled.def
mysql-test/suite/sys_vars/inc/key_buffer_size_basic.inc
mysql-test/suite/sys_vars/inc/sort_buffer_size_basic.inc
mysql-test/suite/sys_vars/r/key_buffer_size_basic_32.result
mysql-test/suite/sys_vars/r/key_buffer_size_basic_64.result
mysql-test/suite/sys_vars/r/sort_buffer_size_basic_32.result
mysql-test/suite/sys_vars/r/sort_buffer_size_basic_64.result
mysql-test/t/alter_table.test
mysql-test/t/create.test
mysql-test/t/csv.test
mysql-test/t/ctype_ucs.test
mysql-test/t/date_formats.test
mysql-test/t/disabled.def
mysql-test/t/events_bugs.test
mysql-test/t/events_scheduling.test
mysql-test/t/fulltext.test
mysql-test/t/func_if.test
mysql-test/t/func_in.test
mysql-test/t/func_str.test
mysql-test/t/func_time.test
mysql-test/t/grant.test
mysql-test/t/information_schema.test
mysql-test/t/innodb-autoinc.test
mysql-test/t/innodb.test
mysql-test/t/innodb_mysql.test
mysql-test/t/log_bin_trust_function_creators_func.test
mysql-test/t/log_state.test
mysql-test/t/myisam_data_pointer_size_func.test
mysql-test/t/myisampack.test
mysql-test/t/mysql.test
mysql-test/t/mysqlcheck.test
mysql-test/t/partition_innodb_stmt.test
mysql-test/t/partition_mgm.test
mysql-test/t/partition_pruning.test
mysql-test/t/query_cache.test
mysql-test/t/rpl_init_slave_func.test
mysql-test/t/select.test
mysql-test/t/status.test
mysql-test/t/strict.test
mysql-test/t/temp_table.test
mysql-test/t/type_bit.test
mysql-test/t/type_date.test
mysql-test/t/type_float.test
mysql-test/t/warnings_engine_disabled.test
mysql-test/t/xml.test
mysys/my_getopt.c
mysys/my_init.c
scripts/mysql_install_db.sh
sql-common/my_time.c
sql/field.cc
sql/field.h
sql/filesort.cc
sql/ha_partition.cc
sql/ha_partition.h
sql/item.cc
sql/item_cmpfunc.cc
sql/item_func.h
sql/item_strfunc.cc
sql/item_sum.cc
sql/item_timefunc.cc
sql/item_timefunc.h
sql/log.cc
sql/log.h
sql/log_event.cc
sql/log_event.h
sql/mysql_priv.h
sql/mysqld.cc
sql/opt_range.cc
sql/partition_info.cc
sql/repl_failsafe.cc
sql/rpl_constants.h
sql/set_var.cc
sql/slave.cc
sql/spatial.h
sql/sql_acl.cc
sql/sql_base.cc
sql/sql_binlog.cc
sql/sql_class.h
sql/sql_cursor.cc
sql/sql_delete.cc
sql/sql_lex.cc
sql/sql_lex.h
sql/sql_locale.cc
sql/sql_parse.cc
sql/sql_partition.cc
sql/sql_plugin.cc
sql/sql_plugin.h
sql/sql_profile.cc
sql/sql_repl.cc
sql/sql_select.cc
sql/sql_select.h
sql/sql_show.cc
sql/sql_table.cc
sql/sql_trigger.cc
sql/sql_trigger.h
sql/table.cc
sql/table.h
sql/unireg.cc
storage/csv/ha_tina.cc
storage/federated/ha_federated.cc
storage/heap/ha_heap.cc
storage/innobase/Makefile.am
storage/innobase/btr/btr0sea.c
storage/innobase/buf/buf0lru.c
storage/innobase/dict/dict0dict.c
storage/innobase/dict/dict0mem.c
storage/innobase/handler/ha_innodb.cc
storage/innobase/handler/ha_innodb.h
storage/innobase/include/btr0sea.h
storage/innobase/include/dict0dict.h
storage/innobase/include/dict0mem.h
storage/innobase/include/ha_prototypes.h
storage/innobase/include/lock0lock.h
storage/innobase/include/row0mysql.h
storage/innobase/include/sync0sync.ic
storage/innobase/include/ut0ut.h
storage/innobase/lock/lock0lock.c
storage/innobase/os/os0file.c
storage/innobase/plug.in
storage/innobase/row/row0mysql.c
storage/innobase/row/row0sel.c
storage/innobase/srv/srv0srv.c
storage/innobase/srv/srv0start.c
storage/innobase/ut/ut0ut.c
storage/myisam/ft_boolean_search.c
strings/ctype.c
strings/xml.c
tests/mysql_client_test.c
win/configure.js
mysql-test/suite/funcs_1/t/is_coll_char_set_appl.test
------------------------------------------------------------------------
r4165 | calvin | 2009-02-12 01:34:27 +0200 (Thu, 12 Feb 2009) | 1 line
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: minor non-functional changes.
------------------------------------------------------------------------
------------------------------------------------------------------------
r4263 | vasil | 2009-02-20 15:00:46 +0200 (Fri, 20 Feb 2009) | 4 lines
branches/zip:
Add a ChangeLog entry for a change in r4262.
------------------------------------------------------------------------
r4265 | marko | 2009-02-20 22:31:03 +0200 (Fri, 20 Feb 2009) | 5 lines
branches/zip: Make innodb_use_sys_malloc=ON the default.
Replace srv_use_sys_malloc with UNIV_LIKELY(srv_use_sys_malloc)
to improve branch prediction in the default case.
Approved by Ken over the IM.
------------------------------------------------------------------------
r4266 | vasil | 2009-02-20 23:29:32 +0200 (Fri, 20 Feb 2009) | 7 lines
branches/zip:
Add a sentence at the top of COPYING.Google to clarify that this license
does not apply to the whole InnoDB.
Suggested by: Ken
------------------------------------------------------------------------
r4268 | marko | 2009-02-23 12:43:51 +0200 (Mon, 23 Feb 2009) | 9 lines
branches/zip: Initialize ut_list_mutex at startup. Without this fix,
ut_list_mutex would be used uninitialized when innodb_use_sys_malloc=1.
This fix addresses Issue #181.
ut_mem_block_list_init(): Rename to ut_mem_init() and make public.
ut_malloc_low(), ut_free_all_mem(): Add ut_a(ut_mem_block_list_inited).
mem_init(): Call ut_mem_init().
------------------------------------------------------------------------
r4269 | marko | 2009-02-23 15:09:49 +0200 (Mon, 23 Feb 2009) | 7 lines
branches/zip: When freeing an uncompressed BLOB page, tolerate garbage in
FIL_PAGE_TYPE. (Bug #43043, Issue #182)
btr_check_blob_fil_page_type(): New function.
btr_free_externally_stored_field(), btr_copy_blob_prefix():
Call btr_check_blob_fil_page_type() to check FIL_PAGE_TYPE.
------------------------------------------------------------------------
r4272 | marko | 2009-02-23 23:10:18 +0200 (Mon, 23 Feb 2009) | 8 lines
branches/zip: Adjust the fix of Issue #182 in r4269 per Inaam's suggestion.
btr_check_blob_fil_page_type(): Replace the parameter
const char* op
with
ibool read. Do not print anything about page type mismatch
when reading a BLOB page in Antelope format.
Print space id before page number.
------------------------------------------------------------------------
r4273 | marko | 2009-02-24 00:11:11 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: ut_mem_init(): Add the assertion !ut_mem_block_list_inited.
------------------------------------------------------------------------
r4274 | marko | 2009-02-24 00:14:38 +0200 (Tue, 24 Feb 2009) | 12 lines
branches/zip: Fix bugs in the fix of Issue #181. Tested inside and
outside Valgrind, with innodb_use_sys_malloc set to 0 and 1.
mem_init(): Invoke ut_mem_init() before mem_pool_create(), because
the latter one will invoke ut_malloc().
srv_general_init(): Do not initialize the memory subsystem (mem_init()).
innobase_init(): Initialize the memory subsystem (mem_init()) before
calling srv_parse_data_file_paths_and_sizes(), which needs ut_malloc().
Call ut_free_all_mem() in error handling to clean up after the mem_init().
------------------------------------------------------------------------
r4280 | marko | 2009-02-24 15:14:59 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: Remove unused function os_mem_alloc_nocache().
------------------------------------------------------------------------
r4281 | marko | 2009-02-24 16:02:48 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: Remove the unused function dict_index_get_type().
------------------------------------------------------------------------
r4283 | marko | 2009-02-24 23:06:56 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: srv0start.c: Remove unnecessary #include "mem0pool.h".
------------------------------------------------------------------------
r4284 | marko | 2009-02-24 23:26:38 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: mem0mem.c: Remove unnecessary #include "mach0data.h".
------------------------------------------------------------------------
r4288 | vasil | 2009-02-25 10:48:07 +0200 (Wed, 25 Feb 2009) | 21 lines
branches/zip: Merge revisions 4261:4287 from branches/5.1:
------------------------------------------------------------------------
r4287 | sunny | 2009-02-25 05:32:01 +0200 (Wed, 25 Feb 2009) | 10 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Fix Bug#42714 AUTO_INCREMENT errors in 5.1.31. There are two
changes to the autoinc handling.
1. To fix the immediate problem from the bug report, we must ensure that the
value written to the table is always less than the max value stored in
dict_table_t.
2. The second related change is that according to MySQL documentation when
the offset is greater than the increment, we should ignore the offset.
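
The two rules above can be illustrated with a small sketch. It is not the InnoDB implementation: the function name next_autoinc, the choice to treat an ignored offset as 0, and clamping to the maximum are assumptions made for illustration. Only the rule that an offset greater than the increment is ignored comes from the message above.

```python
def next_autoinc(current: int, increment: int, offset: int, max_value: int) -> int:
    """Return the next value of the series offset + N * increment after `current`.

    Hypothetical helper: the real InnoDB code differs in detail.
    """
    if increment < 1:
        raise ValueError("increment must be positive")
    # Rule 2: an offset greater than the increment is ignored (assumed to mean 0).
    if offset > increment:
        offset = 0
    if current < offset:
        nxt = offset
    else:
        # Smallest member of the series that is strictly greater than `current`.
        nxt = offset + ((current - offset) // increment + 1) * increment
    # Rule 1 (simplified): never hand out a value beyond the column's maximum.
    return min(nxt, max_value)
```

For example, with increment 5 and offset 7 the offset is dropped, so the value after 10 is 15.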
------------------------------------------------------------------------
------------------------------------------------------------------------
r4289 | vasil | 2009-02-25 10:53:51 +0200 (Wed, 25 Feb 2009) | 4 lines
branches/zip:
Add ChangeLog entry for the fix in r4288.
------------------------------------------------------------------------
r4290 | vasil | 2009-02-25 11:05:44 +0200 (Wed, 25 Feb 2009) | 11 lines
branches/zip:
Make ChangeLog entries for bugs in bugs.mysql.com in the form:
Fix Bug#12345 bug title
(for bugs after 1.0.2 was released and the ChangeLog published)
There is no need to bloat the ChangeLog with information that is available
via bugs.mysql.com.
Discussed with: Marko
------------------------------------------------------------------------
r4291 | vasil | 2009-02-25 11:08:32 +0200 (Wed, 25 Feb 2009) | 4 lines
branches/zip:
Fix Bug synopsis and remove explanation
------------------------------------------------------------------------
r4292 | marko | 2009-02-25 12:09:15 +0200 (Wed, 25 Feb 2009) | 25 lines
branches/zip: Correct the initialization of the memory subsystem once
again, to finally put Issue #181 to rest.
Revert some parts of r4274. It is best not to call ut_malloc() before
srv_general_init().
mem_init(): Do not call ut_mem_init().
srv_general_init(): Initialize the memory subsystem in two phases:
first ut_mem_init(), then mem_init(). This is because os_sync_init()
and sync_init() depend on ut_mem_init() and mem_init() depends on
os_sync_init() or sync_init().
srv_parse_data_file_paths_and_sizes(),
srv_parse_log_group_home_dirs(): Remove the output parameters. Assign
to the global variables directly. Allocate memory with malloc()
instead of ut_malloc(), because these functions will be called before
srv_general_init().
srv_free_paths_and_sizes(): New function, for cleaning up after
srv_parse_data_file_paths_and_sizes() and
srv_parse_log_group_home_dirs().
rb://92 approved by Sunny Bains
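
The initialization order described above follows from the stated dependencies. As a rough illustration only, the sketch below feeds those dependencies to a topological sort. The function names are taken from the message, but treating "os_sync_init() or sync_init()" as two separate prerequisites of mem_init() is an assumption of this sketch.

```python
from graphlib import TopologicalSorter

# Each key depends on the functions in its value set.
DEPS = {
    "ut_mem_init": set(),
    "os_sync_init": {"ut_mem_init"},
    "sync_init": {"ut_mem_init"},
    "mem_init": {"os_sync_init", "sync_init"},
}


def init_order(deps):
    """Return one call order that satisfies every dependency."""
    return list(TopologicalSorter(deps).static_order())
```

Any order produced this way starts with ut_mem_init and ends with mem_init, which matches the two-phase initialization in srv_general_init().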
------------------------------------------------------------------------
r4297 | vasil | 2009-02-25 17:19:19 +0200 (Wed, 25 Feb 2009) | 4 lines
branches/zip:
White-space cleanup in the ChangeLog
------------------------------------------------------------------------
r4301 | vasil | 2009-02-25 21:33:32 +0200 (Wed, 25 Feb 2009) | 5 lines
branches/zip:
Do not output the commands that restore the environment because they depend
on the state of the environment before the test starts executing.
------------------------------------------------------------------------
r4315 | vasil | 2009-02-26 09:21:20 +0200 (Thu, 26 Feb 2009) | 5 lines
branches/zip:
Apply any necessary patches to the mysql tree at the end of setup.sh
This step was previously done manually (and sometimes forgotten).
------------------------------------------------------------------------
r4319 | marko | 2009-02-26 23:27:51 +0200 (Thu, 26 Feb 2009) | 6 lines
branches/zip: btr_check_blob_fil_page_type(): Do not report
FIL_PAGE_TYPE mismatch even when purging a BLOB.
Heavy users may have large data files created with MySQL 5.0 or earlier,
and they do not want to have the error log flooded with such messages.
This fixes Issue #182.
------------------------------------------------------------------------
r4320 | inaam | 2009-02-27 02:13:19 +0200 (Fri, 27 Feb 2009) | 8 lines
branches/zip
This is to revert the changes made to the plug.in (r4251) as a fix for
issue# 178. Changes to plug.in will not propagate to a plugin
installation unless autotools are rerun which is unacceptable.
A fix for issue# 178 will be committed in a separate commit.
------------------------------------------------------------------------
r4321 | inaam | 2009-02-27 02:16:46 +0200 (Fri, 27 Feb 2009) | 6 lines
branches/zip
This is a fix for issue#178. Instead of using UNIV_LINUX which is
defined through CFLAGS we use compiler generated define __linux__
that is effective for both .c and .cc files.
------------------------------------------------------------------------
r4324 | vasil | 2009-02-27 13:27:18 +0200 (Fri, 27 Feb 2009) | 39 lines
branches/zip:
Add FreeBSD to the list of the operating systems that have
sizeof(pthread_t) == sizeof(void*) (i.e. word size).
On FreeBSD pthread_t is defined like:
/usr/include/sys/_pthreadtypes.h:
typedef struct pthread *pthread_t;
I did the following tests (per Inaam's recommendation):
a) appropriate version of GCC is available on that platform (4.1.2 or
higher for atomics to be available)
On FreeBSD 6.x the default compiler is 3.4.6, on FreeBSD 7.x the default
one is 4.2.1. One can always install the version of choice from the ports
collection. If gcc 3.x is used then HAVE_GCC_ATOMIC_BUILTINS will not be
defined and thus the change I am committing will make no difference.
b) find out if sizeof(pthread_t) == sizeof(long)
On 32 bit both are 4 bytes, on 64 bit both are 8 bytes.
c) find out the compiler generated platform define (e.g.: __aix, __sunos__
etc.)
The macro is __FreeBSD__.
d) patch univ.i with the appropriate platform define
e) build the mysql
f) ensure it is using atomic builtins (look at the err.log message at
system startup. It should say we are using atomics for both mutexes and
rw-locks)
g) do sanity testing (keeping in view the smp changes)
I ran the mysql-test suite. All tests pass.
------------------------------------------------------------------------
r4353 | vasil | 2009-03-05 09:27:29 +0200 (Thu, 05 Mar 2009) | 6 lines
branches/zip:
As suggested by Ken, print a message that says that the Google SMP patch
(GCC atomics) is disabled if it is. Also extend the message when the patch
is partially enabled to make it clear that it is partially enabled.
------------------------------------------------------------------------
r4356 | vasil | 2009-03-05 13:49:51 +0200 (Thu, 05 Mar 2009) | 4 lines
branches/zip:
Fix typo made in r4353.
------------------------------------------------------------------------
r4357 | vasil | 2009-03-05 16:38:59 +0200 (Thu, 05 Mar 2009) | 23 lines
branches/zip:
Implement a check whether pthread_t objects can be used by GCC atomic
builtin functions. This check is implemented in plug.in and defines the
macro HAVE_ATOMIC_PTHREAD_T. This macro is checked in univ.i and the
relevant part of the code enabled (the one that uses GCC atomics against
pthread_t objects).
In addition to this, the same program that is compiled as part of the
plug.in check is added in ut/ut0auxconf.c. In the InnoDB Plugin source
archives that are shipped to the users, a generated Makefile.in is added.
That Makefile.in will be modified to compile ut/ut0auxconf.c and define
the macro HAVE_ATOMIC_PTHREAD_T if the compilation succeeds. I.e.
Makefile.in will emulate the work that is done by plug.in. This is done in
order to make the check happen and HAVE_ATOMIC_PTHREAD_T eventually
defined without regenerating MySQL's ./configure from
./storage/innobase/plug.in. The point is not to ask users to install the
autotools and regenerate ./configure.
rb://95
Approved by: Marko
------------------------------------------------------------------------
r4360 | vasil | 2009-03-05 22:23:17 +0200 (Thu, 05 Mar 2009) | 21 lines
branches/zip: Merge revisions 4287:4357 from branches/5.1:
------------------------------------------------------------------------
r4325 | sunny | 2009-03-02 02:28:52 +0200 (Mon, 02 Mar 2009) | 10 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Bug#43203: Overflow from auto incrementing causes server segv
It was not a SIGSEGV but an assertion failure. The assertion was checking
the invariant that *first_value passed in by MySQL doesn't contain a value
that is greater than the max value for that type. The assertion has been
changed to a check and if the value is greater than the max we report a
generic AUTOINC failure.
rb://93
Approved by Heikki
------------------------------------------------------------------------
------------------------------------------------------------------------
r4361 | vasil | 2009-03-05 22:27:54 +0200 (Thu, 05 Mar 2009) | 30 lines
branches/zip: Merge revision 4358 from branches/5.1 (resolving a conflict):
------------------------------------------------------------------------
r4358 | vasil | 2009-03-05 21:21:10 +0200 (Thu, 05 Mar 2009) | 21 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2728.19.1
committer: Alfranio Correia <alfranio.correia@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Tue 2009-02-03 11:36:46 +0000
message:
BUG#42445 Warning messages in innobase/handler/ha_innodb.cc
There was a type casting problem in the storage/innobase/handler/ha_innodb.cc,
(int ha_innobase::write_row(...)). Innobase has an internal error variable
of type 'ulint' while mysql uses an 'int'.
To fix the problem the function manipulates an error variable of
type 'ulint' and only casts it into 'int' when it needs to return the value.
modified:
storage/innobase/handler/ha_innodb.cc
------------------------------------------------------------------------
------------------------------------------------------------------------
r4362 | vasil | 2009-03-05 22:29:07 +0200 (Thu, 05 Mar 2009) | 23 lines
branches/zip: Merge revision 4359 from branches/5.1:
------------------------------------------------------------------------
r4359 | vasil | 2009-03-05 21:42:01 +0200 (Thu, 05 Mar 2009) | 14 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2747
committer: Timothy Smith <timothy.smith@sun.com>
branch nick: 51
timestamp: Fri 2009-01-16 17:49:07 +0100
message:
Add another cast to ignore int/ulong difference in error types, silence warning on Win64
modified:
storage/innobase/handler/ha_innodb.cc
------------------------------------------------------------------------
------------------------------------------------------------------------
r4363 | vasil | 2009-03-05 22:31:37 +0200 (Thu, 05 Mar 2009) | 4 lines
branches/zip:
Add ChangeLog entry for the bugfix in r4360.
------------------------------------------------------------------------
r4378 | calvin | 2009-03-09 10:10:17 +0200 (Mon, 09 Mar 2009) | 7 lines
branches/zip: remove compile flag MYSQL_SERVER for dynamic plugin
The dynamic plugin on Windows used to be built with MYSQL_SERVER
compile flag, while it is not the case for other platforms.
r3797 assumed MYSQL_SERVER was not defined for dynamic plugin,
which introduced the engine crash during dropping a database.
------------------------------------------------------------------------
r4396 | marko | 2009-03-12 09:22:27 +0200 (Thu, 12 Mar 2009) | 3 lines
branches/zip: btr_store_big_rec_extern_fields(): Initialize FIL_PAGE_TYPE
in a separate redo log entry. This will make ibbackup --apply-log
debugging easier.
------------------------------------------------------------------------
r4397 | marko | 2009-03-12 09:26:11 +0200 (Thu, 12 Mar 2009) | 3 lines
branches/zip: trx_sys_create_doublewrite_buf(): As the dummy change,
initialize FIL_PAGE_TYPE. This will make it easier to write the debug
assertions for ibbackup --apply-log.
------------------------------------------------------------------------
r4401 | marko | 2009-03-12 10:26:40 +0200 (Thu, 12 Mar 2009) | 19 lines
branches/zip: Merge revisions 4359:4400 from branches/5.1:
------------------------------------------------------------------------
r4399 | marko | 2009-03-12 09:38:05 +0200 (Thu, 12 Mar 2009) | 2 lines
branches/5.1: row_sel_get_clust_rec_for_mysql(): Store the cursor position
also for unlock_row(). (Bug #39320)
------------------------------------------------------------------------
r4400 | marko | 2009-03-12 10:06:44 +0200 (Thu, 12 Mar 2009) | 5 lines
branches/5.1: Fix a bug in multi-table semi-consistent reads.
Remember the acquired record locks per table handle (row_prebuilt_t)
rather than per transaction (trx_t), so that unlock_row should successfully
unlock all non-matching rows in multi-table operations.
This deficiency was found while investigating Bug #39320.
------------------------------------------------------------------------
These were submitted as rb://94 and rb://96 and approved by Heikki Tuuri.
------------------------------------------------------------------------
r4455 | marko | 2009-03-16 11:43:34 +0200 (Mon, 16 Mar 2009) | 2 lines
branches/zip: UT_LIST_VALIDATE(): Add the parameter ASSERTION and
adjust all callers.
------------------------------------------------------------------------
r4456 | marko | 2009-03-16 12:59:25 +0200 (Mon, 16 Mar 2009) | 6 lines
branches/zip: UT_LIST_VALIDATE(): Assert that the link is non-NULL
before dereferencing it. In this way, ut_list_node_313 will be
pointing to the last non-NULL list item at the time of the assertion
failure. (gcc-4.3.2 -O3 seems to optimize the common subexpressions
and make the variable NULL, though.)
------------------------------------------------------------------------
r4457 | marko | 2009-03-16 14:12:02 +0200 (Mon, 16 Mar 2009) | 2 lines
branches/zip: sync_thread_add_level(): Make the assertions about
level == SYNC_BUF_BLOCK more readable.
------------------------------------------------------------------------
r4461 | vasil | 2009-03-17 09:38:19 +0200 (Tue, 17 Mar 2009) | 6 lines
branches/zip:
Remove mysql-test/patches/bug32625.diff because that bug was fixed in
the mysql repository (1 year and 4 months after sending them the simple
patch!). See http://bugs.mysql.com/32625
------------------------------------------------------------------------
r4465 | marko | 2009-03-17 12:34:19 +0200 (Tue, 17 Mar 2009) | 1 line
branches/zip: buf0buddy.c: Add and adjust some debug assertions.
------------------------------------------------------------------------
r4473 | vasil | 2009-03-17 15:50:30 +0200 (Tue, 17 Mar 2009) | 5 lines
branches/zip:
Increment the InnoDB Plugin version from 1.0.3 to 1.0.4 now that
1.0.3 has been released.
------------------------------------------------------------------------
r4478 | vasil | 2009-03-18 11:53:53 +0200 (Wed, 18 Mar 2009) | 5 lines
branches/zip:
Remove mysql-test/patches/bug41893.diff because that bug has been fixed
in the MySQL repository, see http://bugs.mysql.com/41893.
------------------------------------------------------------------------
r4479 | marko | 2009-03-18 12:43:54 +0200 (Wed, 18 Mar 2009) | 2 lines
branches/zip: buf_LRU_block_remove_hashed_page(): Add some debug assertions.
------------------------------------------------------------------------
r4480 | marko | 2009-03-18 14:32:13 +0200 (Wed, 18 Mar 2009) | 1 line
branches/zip: buf_buddy_free_low(): Correct the function comment.
------------------------------------------------------------------------
r4482 | marko | 2009-03-19 15:23:32 +0200 (Thu, 19 Mar 2009) | 12 lines
branches/zip: Merge revisions 4400:4481 from branches/5.1:
------------------------------------------------------------------------
r4481 | marko | 2009-03-19 15:01:48 +0200 (Thu, 19 Mar 2009) | 6 lines
branches/5.1: row_unlock_for_mysql(): Do not unlock records that were
modified by the current transaction. This bug was introduced or unmasked
in r4400.
rb://97 approved by Heikki Tuuri
------------------------------------------------------------------------
------------------------------------------------------------------------
r4490 | marko | 2009-03-20 12:33:33 +0200 (Fri, 20 Mar 2009) | 4 lines
branches/zip: Non-functional change for reducing dependencies in InnoDB Hot Backup:
Replace srv_sys->dummy_ind1 and srv_sys->dummy_ind2 with
dict_ind_redundant and dict_ind_compact, initialized in dict_init().
------------------------------------------------------------------------
r4491 | marko | 2009-03-20 12:45:18 +0200 (Fri, 20 Mar 2009) | 2 lines
branches/zip: Add const qualifiers or in/out comments to some function
parameters in log0log.
------------------------------------------------------------------------
r4492 | marko | 2009-03-20 12:52:14 +0200 (Fri, 20 Mar 2009) | 5 lines
branches/zip: page_validate(): Always report the space id and the
name of the index.
In Hot Backup, do not invoke comparison functions, as MySQL collations
will be unavailable.
------------------------------------------------------------------------
r4493 | marko | 2009-03-20 13:24:06 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: Replace fil_get_space_for_id_low() with fil_space_get_by_id().
------------------------------------------------------------------------
r4494 | marko | 2009-03-20 13:51:35 +0200 (Fri, 20 Mar 2009) | 3 lines
branches/zip: fil0fil.c: Refer to fil_system directly, not via local vars.
This eliminates some "unused variable" warnings when building
InnoDB Hot Backup in such a way that all mutex operations are no-ops.
------------------------------------------------------------------------
r4495 | marko | 2009-03-20 14:15:52 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: innobase_get_at_most_n_mbchars(): Declare in ha_prototypes.h.
------------------------------------------------------------------------
r4496 | marko | 2009-03-20 14:48:26 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: recv_recover_page(): Remove compile-time constant parameters.
------------------------------------------------------------------------
r4497 | marko | 2009-03-20 14:56:19 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: recv_sys_init(): Remove a compile-time constant parameter.
------------------------------------------------------------------------
r4498 | marko | 2009-03-20 15:08:05 +0200 (Fri, 20 Mar 2009) | 4 lines
branches/zip: Non-functional change: Add const qualifiers.
log_block_checksum_is_ok_or_old_format(), recv_sys_add_to_parsing_buf():
The log block is read-only. Make it const.
------------------------------------------------------------------------
r4499 | marko | 2009-03-20 15:10:25 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: recv_scan_log_recs(): Remove a compile-time constant parameter.
------------------------------------------------------------------------
r4500 | marko | 2009-03-20 15:47:17 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: fil_init(): Add the parameter hash_size.
------------------------------------------------------------------------
r4501 | vasil | 2009-03-20 16:50:41 +0200 (Fri, 20 Mar 2009) | 4 lines
branches/zip:
Add an entry about the release of 1.0.3 in the ChangeLog.
------------------------------------------------------------------------
r4515 | marko | 2009-03-23 10:49:53 +0200 (Mon, 23 Mar 2009) | 1 line
branches/zip: hash_table_t: adaptive: Remove from UNIV_HOTBACKUP builds.
------------------------------------------------------------------------
r4516 | marko | 2009-03-23 10:57:16 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Define and use ASSERT_HASH_MUTEX_OWN.
Make it a no-op in UNIV_HOTBACKUP builds.
------------------------------------------------------------------------
r4517 | marko | 2009-03-23 11:07:20 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Define and use PAGE_ZIP_MATCH.
In UNIV_HOTBACKUP builds, assume fixed allocation.
------------------------------------------------------------------------
r4521 | marko | 2009-03-23 12:05:47 +0200 (Mon, 23 Mar 2009) | 1 line
branches/zip: buf_page_print(): Clean up the code #ifdef UNIV_HOTBACKUP.
------------------------------------------------------------------------
r4522 | marko | 2009-03-23 12:20:50 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Exclude some operating system interface code
from UNIV_HOTBACKUP builds.
------------------------------------------------------------------------
r4523 | marko | 2009-03-23 13:00:43 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Remove the remaining references to hash_table_t::adaptive
from UNIV_HOTBACKUP builds. This should have been done in r4515.
------------------------------------------------------------------------
r4524 | marko | 2009-03-23 14:05:18 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Enclose recv_recovery_from_backup_on and
recv_recovery_from_backup_is_on() in #ifdef UNIV_LOG_ARCHIVE.
------------------------------------------------------------------------
r4525 | marko | 2009-03-23 14:57:45 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: recv_parse_or_apply_log_rec_body(): Add debug assertions
ensuring that FIL_PAGE_TYPE makes sense when applying log records.
------------------------------------------------------------------------
r4526 | marko | 2009-03-23 16:21:34 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Remove unneeded definitions and dependencies
from UNIV_HOTBACKUP builds.
------------------------------------------------------------------------
r4527 | calvin | 2009-03-23 23:15:33 +0200 (Mon, 23 Mar 2009) | 5 lines
branches/zip: adjust build files on Windows
Adjust the patch positions based on the latest MySQL source.
Also add the patches to the .bat files for vs9.
------------------------------------------------------------------------
branches/innodb+: Merge revisions 5091:5143 from branches/zip:
------------------------------------------------------------------------
r5092 | marko | 2009-05-25 09:54:17 +0300 (Mon, 25 May 2009) | 1 line
branches/zip: Adjust some function comments after r5091.
------------------------------------------------------------------------
r5100 | marko | 2009-05-25 12:09:45 +0300 (Mon, 25 May 2009) | 1 line
branches/zip: Split some long lines that were introduced in r5091.
------------------------------------------------------------------------
r5101 | marko | 2009-05-25 12:42:47 +0300 (Mon, 25 May 2009) | 2 lines
branches/zip: Introduce the macro TEMP_INDEX_PREFIX_STR.
This is to avoid triggering an error in Doxygen.
------------------------------------------------------------------------
r5102 | marko | 2009-05-25 13:47:14 +0300 (Mon, 25 May 2009) | 1 line
branches/zip: Add missing file comments.
------------------------------------------------------------------------
r5103 | marko | 2009-05-25 13:52:29 +0300 (Mon, 25 May 2009) | 10 lines
branches/zip: Add @file comments, and convert decorative
/*********************************
comments to Doxygen /** style like this:
/*****************************//**
This conversion was performed by the following command:
perl -i -e 'while(<ARGV>){if (m|^/\*{30}\**$|) {
s|\*{4}$|//**| if ++$com>1; $_ .= "\@file $ARGV\n" if $com==2}
print; if(eof){$.=0;undef $com}}' */*[ch] include/univ.i
------------------------------------------------------------------------
r5104 | marko | 2009-05-25 14:39:07 +0300 (Mon, 25 May 2009) | 2 lines
branches/zip: Revert ut0auxconf_* to r5102,
that is, make Doxygen ignore these test programs.
------------------------------------------------------------------------
r5105 | marko | 2009-05-25 14:52:20 +0300 (Mon, 25 May 2009) | 2 lines
branches/zip: Enclose some #error checks inside #ifndef DOXYGEN
to prevent bogus Doxygen errors.
------------------------------------------------------------------------
r5106 | marko | 2009-05-25 16:09:24 +0300 (Mon, 25 May 2009) | 2 lines
branches/zip: Add some Doxygen comments, mainly to structs, typedefs,
macros and global variables. Many more to go.
------------------------------------------------------------------------
r5108 | marko | 2009-05-26 00:32:35 +0300 (Tue, 26 May 2009) | 2 lines
branches/zip: lexyy.c: Remove the inadvertently added @file directive.
There is nothing for Doxygen to see in this file, move along.
------------------------------------------------------------------------
r5125 | marko | 2009-05-26 16:28:49 +0300 (Tue, 26 May 2009) | 3 lines
branches/zip: Add some Doxygen comments for many structs, typedefs,
#defines and global variables. Many are still missing.
------------------------------------------------------------------------
r5134 | marko | 2009-05-27 09:08:43 +0300 (Wed, 27 May 2009) | 1 line
branches/zip: Add some Doxygen @return comments.
------------------------------------------------------------------------
r5139 | marko | 2009-05-27 10:01:40 +0300 (Wed, 27 May 2009) | 1 line
branches/zip: Add Doxyfile.
------------------------------------------------------------------------
r5143 | marko | 2009-05-27 10:57:25 +0300 (Wed, 27 May 2009) | 3 lines
branches/zip: buf0buf.h, Doxyfile: Fix the Doxygen translation.
@defgroup is for source code modules, not for field groups.
Tell Doxygen to expand the UT_LIST declarations.
------------------------------------------------------------------------
MDEV-14425 Improve the redo log for concurrency
The InnoDB redo log used to be formatted in blocks of 512 bytes.
The log blocks were encrypted and the checksum was calculated while
holding log_sys.mutex, creating a serious scalability bottleneck.
We remove the fixed-size redo log block structure altogether and
essentially turn every mini-transaction into a log block of its own.
This allows encryption and checksum calculations to be performed
on local mtr_t::m_log buffers, before acquiring log_sys.mutex.
The mutex only protects a memcpy() of the data to the shared
log_sys.buf, as well as the padding of the log, in case the
to-be-written part of the log would not end in a block boundary of
the underlying storage. For now, the "padding" consists of writing
a single NUL byte, to allow recovery and mariadb-backup to detect
the end of the circular log faster.
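
The locking pattern described above can be sketched as follows. This is a toy model, not the InnoDB code: SharedLog and run_writers are hypothetical names, and zlib.crc32 (plain CRC-32) merely stands in for the real checksum and encryption work. The point is that each writer prepares its record privately and holds the mutex only while copying it.

```python
import threading
import zlib


class SharedLog:
    """Toy stand-in for a shared log buffer guarded by a single mutex."""

    def __init__(self):
        self.mutex = threading.Lock()
        self.buf = bytearray()

    def append(self, payload: bytes) -> int:
        # Expensive work runs on the caller's private data, without the mutex.
        record = payload + zlib.crc32(payload).to_bytes(4, "big")
        # The critical section only copies the finished record.
        with self.mutex:
            offset = len(self.buf)
            self.buf += record
        return offset


def run_writers(log: SharedLog, n_threads: int = 4, n_records: int = 100) -> None:
    """Append records from several threads concurrently."""

    def worker(tid: int) -> None:
        for i in range(n_records):
            log.append(f"t{tid}-r{i}".encode())

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Because the records are complete before the lock is taken, they never interleave in the shared buffer, and the time spent under the lock does not grow with the cost of checksumming.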
Like the previous implementation, we will overwrite the last log block
over and over again, until it has been completely filled. It would be
possible to write only up to the last completed block (if no more
recent write was requested), or to write dummy FILE_CHECKPOINT records
to fill the incomplete block, by invoking the currently disabled
function log_pad(). This would require adjustments to some logic around
log checkpoints, page flushing, and shutdown.
An upgrade after a crash of any previous version is not supported.
Logically empty log files from a previous version will be upgraded.
An attempt to start up InnoDB without a valid ib_logfile0 will be
refused. Previously, the redo log used to be created automatically
if it was missing. Only with innodb_force_recovery=6 is it
possible to start InnoDB in read-only mode even if the log file
does not exist. This allows the contents of a possibly corrupted
database to be dumped.
Because a prepared backup from an earlier version of mariadb-backup
will create a 0-sized log file, we will allow an upgrade from such
log files, provided that the FIL_PAGE_FILE_FLUSH_LSN in the system
tablespace looks valid.
The 512-byte log checkpoint blocks at 0x200 and 0x600 will be replaced
with 64-byte log checkpoint blocks at 0x1000 and 0x2000.
The start of log records will move from 0x800 to 0x3000. This allows us
to use 4096-byte aligned blocks for all I/O in a future revision.
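The alignment argument can be checked with a few lines of arithmetic.
The offsets below are the ones quoted above; the constant names and the
helper is_aligned() are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Offsets quoted in the text; the names are illustrative only.
constexpr std::size_t OLD_CHECKPOINT_1 = 0x200;
constexpr std::size_t OLD_CHECKPOINT_2 = 0x600;
constexpr std::size_t OLD_START = 0x800;
constexpr std::size_t NEW_CHECKPOINT_1 = 0x1000;
constexpr std::size_t NEW_CHECKPOINT_2 = 0x2000;
constexpr std::size_t NEW_START = 0x3000;

// True if offset is a multiple of block_size.
constexpr bool is_aligned(std::size_t offset, std::size_t block_size) {
  return offset % block_size == 0;
}

// Every new offset falls on a 4096-byte boundary.
static_assert(is_aligned(NEW_CHECKPOINT_1, 4096), "checkpoint 1");
static_assert(is_aligned(NEW_CHECKPOINT_2, 4096), "checkpoint 2");
static_assert(is_aligned(NEW_START, 4096), "start of records");

// The old offsets were aligned only to the 512-byte block size.
static_assert(is_aligned(OLD_START, 512) && !is_aligned(OLD_START, 4096),
              "old start of records");
```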
We extend the MDEV-12353 redo log record format as follows.
(1) Empty mini-transactions or extra NUL bytes will not be allowed.
(2) The end-of-minitransaction marker (a NUL byte) will be replaced
with a 1-bit sequence number, which will be toggled each time when the
circular log file wraps back to the beginning.
(3) After the sequence bit, a CRC-32C checksum of all data
(excluding the sequence bit) will be written.
(4) If the log is encrypted, 8 bytes will be written before
the checksum and included in it. This is part of the
initialization vector (IV) of encrypted log data.
(5) File names, page numbers, and checkpoint information will not be
encrypted. Only the payload bytes of page-level log will be encrypted.
The tablespace ID and page number will form part of the IV.
(6) For padding, arbitrary-length FILE_CHECKPOINT records may be written,
with all-zero payload, and with the normal end marker and checksum.
The minimum size is 7 bytes, or 7+8 with innodb_encrypt_log=ON.
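Items (2) and (3) can be illustrated with a small sketch of the
unencrypted case. The CRC-32C routine below is the standard bitwise
Castagnoli algorithm. Everything else is an assumption made for this
example: the helper names append_trailer() and trailer_is_valid(), the
encoding of the sequence bit as a whole byte holding 0 or 1, and the
big-endian storage of the checksum. The 8 extra bytes of item (4) are
not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
inline std::uint32_t crc32c(const std::uint8_t* data, std::size_t len) {
  std::uint32_t crc = 0xFFFFFFFFu;
  for (std::size_t i = 0; i < len; i++) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; bit++)
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
  }
  return ~crc;
}

// Appends the assumed trailer: one sequence byte, then the checksum of
// the preceding data (the sequence byte itself is not checksummed).
inline void append_trailer(std::vector<std::uint8_t>& mtr, bool sequence_bit) {
  const std::uint32_t crc = crc32c(mtr.data(), mtr.size());
  mtr.push_back(sequence_bit ? 1 : 0);
  for (int shift = 24; shift >= 0; shift -= 8)
    mtr.push_back(static_cast<std::uint8_t>(crc >> shift));
}

// Checks both the sequence bit and the checksum of a complete record.
inline bool trailer_is_valid(const std::vector<std::uint8_t>& rec,
                             bool expected_sequence_bit) {
  if (rec.size() < 6) return false;  // 1 data byte + 5 trailer bytes
  const std::size_t data_len = rec.size() - 5;
  if (rec[data_len] != (expected_sequence_bit ? 1 : 0)) return false;
  std::uint32_t stored = 0;
  for (std::size_t i = 0; i < 4; i++)
    stored = (stored << 8) | rec[data_len + 1 + i];
  return stored == crc32c(rec.data(), data_len);
}
```

A reader that knows how many times the circular file has wrapped can
thus reject stale records from an earlier pass (wrong sequence bit) as
well as torn or corrupted ones (wrong checksum).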
In mariadb-backup and in Galera snapshot transfer (SST) scripts, we will
no longer remove ib_logfile0 or create an empty ib_logfile0. Server startup
will require a valid log file. When resizing the log, we will create
a logically empty ib_logfile101 at the current LSN and use an atomic rename
to replace ib_logfile0 with it. See the test innodb.log_file_size.
Because there is no mandatory padding in the log file, we are able
to create a dummy log file as of an arbitrary log sequence number.
See the test mariabackup.huge_lsn.
The parameter innodb_log_write_ahead_size and the
INFORMATION_SCHEMA.INNODB_METRICS counter log_padded will be removed.
The minimum value of innodb_log_buffer_size will be increased to 2MiB
(because log_sys.buf will replace recv_sys.buf) and the increment
adjusted to 4096 bytes (the maximum log block size).
The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed:
os_log_fsyncs
os_log_pending_fsyncs
log_pending_log_flushes
log_pending_checkpoint_writes
The following status variables will be removed:
Innodb_os_log_fsyncs (this is included in Innodb_data_fsyncs)
Innodb_os_log_pending_fsyncs (this was limited to at most 1 by design)
log_sys.get_block_size(): Return the physical block size of the log file.
This is only implemented on Linux and Microsoft Windows for now, and for
the power-of-2 block sizes between 64 and 4096 bytes (the minimum and
maximum size of a checkpoint block). If the block size is anything else,
the traditional 512-byte size will be used via normal file system
buffering.
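The size rule above amounts to the following check. This is a sketch
with hypothetical names (LogBlockChoice, choose_log_block_size()); it
models only the rule stated here, not how the physical block size is
queried from the operating system.

```cpp
#include <cassert>

struct LogBlockChoice {
  unsigned block_size;    // block size to use for log writes
  bool may_bypass_cache;  // false: use normal file system buffering
};

// Accept power-of-2 sizes from 64 to 4096 bytes; otherwise fall back
// to the traditional 512 bytes with file system buffering.
inline LogBlockChoice choose_log_block_size(unsigned physical) {
  const bool power_of_2 = physical != 0 && (physical & (physical - 1)) == 0;
  if (power_of_2 && physical >= 64 && physical <= 4096)
    return {physical, true};
  return {512, false};
}
```

Here may_bypass_cache only says that the size rule permits unbuffered
I/O; whether it is actually enabled depends on further conditions.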
If the file system buffers can be bypassed, a message like the following
will be issued:
InnoDB: File system buffers for log disabled (block size=512 bytes)
InnoDB: File system buffers for log disabled (block size=4096 bytes)
This has been tested on Linux and Microsoft Windows with both sizes.
On Linux, only enable O_DIRECT on the log for innodb_flush_method=O_DSYNC.
Tests in 3 different environments where the log is stored in a device
with a physical block size of 512 bytes are yielding better throughput
without O_DIRECT. This could be due to the fact that in the event the
last log block is being overwritten (if multiple transactions would
become durable at the same time, and each of them will write a small
number of bytes to the last log block), it should be faster to re-copy
data from log_sys.buf or log_sys.flush_buf to the kernel buffer,
to be finally written at fdatasync() time.
The parameter innodb_flush_method=O_DSYNC will imply O_DIRECT for
data files. This option will enable O_DIRECT on the log file on Linux.
It may be unsafe to use when the storage device does not support
FUA (Force Unit Access) mode.
When the server is compiled WITH_PMEM=ON, we will use memory-mapped
I/O for the log file if the log resides on a "mount -o dax" device.
We will identify PMEM in a start-up message:
InnoDB: log sequence number 0 (memory-mapped); transaction id 3
On Linux, we will also invoke mmap() on any ib_logfile0 that resides
in /dev/shm, effectively treating the log file as persistent memory.
This should speed up "./mtr --mem" and increase the test coverage of
PMEM on non-PMEM hardware. It also allows users to estimate how much
the performance would be improved by installing persistent memory.
On other tmpfs file systems such as /run, we will not use mmap().
mariadb-backup: Eliminated several variables. We will refer
directly to recv_sys and log_sys.
backup_wait_for_lsn(): Detect non-progress of
xtrabackup_copy_logfile(). In this new log format with
arbitrary-sized blocks, we can only detect log file overrun
indirectly, by observing that the scanned log sequence number
is not advancing.
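The indirect detection can be pictured as a small progress watcher.
This is a sketch under assumptions: the class name ScanProgress and the
"limit" of tolerated consecutive stalls are invented for illustration,
and the text does not say how the real backup_wait_for_lsn() decides
when to give up.

```cpp
#include <cassert>
#include <cstdint>

// Tracks the scanned LSN and reports when it stops advancing.
class ScanProgress {
  std::uint64_t last_lsn_ = 0;
  unsigned stalled_ = 0;  // consecutive observations without progress
  unsigned limit_;        // hypothetical tolerance before giving up
 public:
  explicit ScanProgress(unsigned limit) : limit_(limit) {}

  // Returns true while the scan may still be making progress, false once
  // the LSN has failed to advance on limit_ consecutive observations.
  bool observe(std::uint64_t scanned_lsn) {
    if (scanned_lsn > last_lsn_) {
      last_lsn_ = scanned_lsn;
      stalled_ = 0;
      return true;
    }
    return ++stalled_ < limit_;
  }
};
```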
xtrabackup_copy_logfile(): On PMEM, do not modify the sequence bit,
because we are not allowed to modify the server's log file, and our
memory mapping is read-only.
trx_flush_log_if_needed_low(): Do not use the callback on PMEM.
Using neither flush_lock nor write_lock around PMEM writes seems
to yield the best performance. The pmem_persist() calls may
still be somewhat slower than the pwrite() and fdatasync() based
interface (PMEM mounted without -o dax).
recv_sys_t::buf: Remove. We will use log_sys.buf for parsing.
recv_sys_t::MTR_SIZE_MAX: Replaces RECV_SCAN_SIZE.
recv_sys_t::file_checkpoint: Renamed from mlog_checkpoint_lsn.
recv_sys_t, log_sys_t: Removed many data members.
recv_sys.lsn: Renamed from recv_sys.recovered_lsn.
recv_sys.offset: Renamed from recv_sys.recovered_offset.
log_sys.buf_size: Replaces srv_log_buffer_size.
recv_buf: A smart pointer that wraps log_sys.buf[recv_sys.offset]
when the buffer is being allocated from the memory heap.
recv_ring: A smart pointer that wraps a circular log_sys.buf[] that is
backed by ib_logfile0. The pointer will wrap from recv_sys.len
(log_sys.file_size) to log_sys.START_OFFSET. For the record that
wraps around, we may copy file name or record payload data to
the auxiliary buffer decrypt_buf in order to have a contiguous
block of memory. The maximum size of a record is less than
innodb_page_size bytes.
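The wrap-around behaviour of such a pointer can be sketched as below.
RingPos and copy_contiguous() are hypothetical names, not the recv_ring
implementation; the sketch only shows a position that wraps from the end
of the file back to the start of the record area, and a copy that makes
a wrapping record contiguous (the role the text gives to decrypt_buf).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct RingPos {
  std::size_t offset;        // current position in the file image
  std::size_t start_offset;  // first byte of the record area
  std::size_t file_size;     // one past the last byte of the record area

  // Position n bytes further on, wrapping from file_size to start_offset.
  RingPos advanced(std::size_t n) const {
    const std::size_t capacity = file_size - start_offset;
    const std::size_t rel = (offset - start_offset + n) % capacity;
    return RingPos{start_offset + rel, start_offset, file_size};
  }
};

// Copies len bytes starting at pos into a contiguous buffer, following
// the wrap-around if the record crosses the end of the file.
inline std::vector<std::uint8_t> copy_contiguous(
    const std::vector<std::uint8_t>& file, RingPos pos, std::size_t len) {
  std::vector<std::uint8_t> out;
  out.reserve(len);
  for (std::size_t i = 0; i < len; i++)
    out.push_back(file[pos.advanced(i).offset]);
  return out;
}
```

With a 12-byte file whose record area starts at offset 4, a 4-byte
record starting at offset 10 is read from offsets 10, 11, 4 and 5.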
recv_sys_t::parse(): Take the smart pointer as a template parameter.
Do not temporarily add a trailing NUL byte to FILE_ records, because
we are not supposed to modify the memory-mapped log file. (It is
attached in read-write mode already during recovery.)
recv_sys_t::parse_mtr(): Wrapper for recv_sys_t::parse().
recv_sys_t::parse_pmem(): Like parse_mtr(), but if PREMATURE_EOF would be
returned on PMEM, use recv_ring to wrap around the buffer to the start.
mtr_t::finish_write(), log_close(): Do not enforce log_sys.max_buf_free
on PMEM, because it has no meaning on the mmap-based log.
log_sys.write_to_buf: Count writes to log_sys.buf. Replaces
srv_stats.log_write_requests and export_vars.innodb_log_write_requests.
Protected by log_sys.mutex. Updated consistently in log_close().
Previously, mtr_t::commit() conditionally updated the count,
which was inconsistent.
log_sys.write_to_log: Count swaps of log_sys.buf and log_sys.flush_buf,
for writing to log_sys.log (the ib_logfile0). Replaces
srv_stats.log_writes and export_vars.innodb_log_writes.
Protected by log_sys.mutex.
log_sys.waits: Count waits in append_prepare(). Replaces
srv_stats.log_waits and export_vars.innodb_log_waits.
recv_recover_page(): Do not unnecessarily acquire
log_sys.flush_order_mutex. We are inserting the blocks in arbitrary
order anyway, to be adjusted in recv_sys.apply(true).
We will change the definition of flush_lock and write_lock to
avoid potential false sharing. Depending on sizeof(log_sys) and
CPU_LEVEL1_DCACHE_LINESIZE, the flush_lock and write_lock could
share a cache line with each other or with the last data members
of log_sys.
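One common way to rule out this kind of false sharing is to give each
lock its own cache line. The sketch below shows that general technique
and is not necessarily how the definitions were actually changed: the
types PaddedLock and LogLocks are hypothetical, and a line size of 64
bytes is assumed in place of CPU_LEVEL1_DCACHE_LINESIZE.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache line size for this example.
constexpr std::size_t CACHE_LINE = 64;

// A lock word that is aligned to, and padded out to, one cache line.
struct alignas(CACHE_LINE) PaddedLock {
  std::atomic<std::uint32_t> word{0};
};

struct LogLocks {
  PaddedLock flush_lock;
  PaddedLock write_lock;
};

static_assert(sizeof(PaddedLock) == CACHE_LINE,
              "each lock occupies exactly one cache line");

// True if the two addresses fall into the same cache line.
inline bool share_cache_line(const void* a, const void* b) {
  return reinterpret_cast<std::uintptr_t>(a) / CACHE_LINE ==
         reinterpret_cast<std::uintptr_t>(b) / CACHE_LINE;
}
```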
Thanks to Matthias Leich for providing https://rr-project.org traces
for various failures during the development, and to
Thirunarayanan Balathandayuthapani for his help in debugging
some of the recovery code. And thanks to the developers of the
rr debugger for a tool without which extensive changes to InnoDB
would be very challenging to get right.
Thanks to Vladislav Vaintroub for useful feedback and
to him, Axel Schwenke and Krunal Bauskar for testing the performance.
4 years ago  MDEV-17380 innodb_flush_neighbors=ON should be ignored on SSD
For tablespaces that do not reside on spinning storage, it does
not make sense to attempt to write nearby pages when writing out
dirty pages from the InnoDB buffer pool. It is actually detrimental
to performance and to the life span of flash ROM storage.
With this change, MariaDB will detect whether an InnoDB file resides
on solid-state storage. The detection has been implemented for Linux
and Microsoft Windows. For other systems, we will err on the safe side
and assume that files reside on SSD.
As part of this change, we will reduce the number of fstat() calls
when opening data files on POSIX systems and slightly clean up some
file I/O code.
FIXME: os_is_sparse_file_supported() on POSIX works in a destructive
manner. Thus, we can only invoke it when creating files, not when
opening them.
For diagnostics, we introduce the column ON_SSD to the table
INFORMATION_SCHEMA.INNODB_TABLESPACES_SCRUBBING. The table
INNODB_SYS_TABLESPACES might seem more appropriate, but its purpose
is to reflect the contents of the InnoDB system table SYS_TABLESPACES,
which we would like to remove at some point.
On Microsoft Windows, querying StorageDeviceSeekPenaltyProperty
sometimes returns ERROR_GEN_FAILURE instead of ERROR_INVALID_FUNCTION
or ERROR_NOT_SUPPORTED. We will also silently ignore this error,
and assume that the file does not reside on SSD.
On Linux, the detection will be based on the files
/sys/block/*/queue/rotational and /sys/block/*/dev.
Especially for USB storage, it is possible that
/sys/block/*/queue/rotational will wrongly report 1 instead of 0.
fil_node_t::on_ssd: Whether the InnoDB data file resides on
solid-state storage.
fil_system_t::ssd: Collection of Linux block devices that reside on
non-rotational storage.
fil_system_t::create(): Detect ssd on Linux based on the contents
of /sys/block/*/queue/rotational and /sys/block/*/dev.
fil_system_t::is_ssd(dev_t): Determine if a Linux block device is
non-rotational. Partitions will be identified with the containing
block device by assuming that the least significant 4 bits of the
minor number identify a partition, and that the "partition number"
of the entire device is 0.
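The partition rule for is_ssd() can be sketched as follows. The names
DeviceId, containing_device() and SsdDevices are hypothetical, and plain
integers are used instead of dev_t so that the example does not depend
on Linux headers. The only logic taken from the text is that the low 4
bits of the minor number are treated as the partition number.

```cpp
#include <cassert>
#include <set>
#include <utility>

// (major number, minor number) of a block device.
using DeviceId = std::pair<unsigned, unsigned>;

// Maps a partition to its containing device by clearing the low 4 bits
// of the minor number (partition number 0 denotes the whole device).
inline DeviceId containing_device(unsigned major_no, unsigned minor_no) {
  return DeviceId(major_no, minor_no & ~0xFu);
}

// Collection of non-rotational block devices.
class SsdDevices {
  std::set<DeviceId> devices_;

 public:
  void add(unsigned major_no, unsigned minor_no) {
    devices_.insert(containing_device(major_no, minor_no));
  }
  bool is_ssd(unsigned major_no, unsigned minor_no) const {
    return devices_.count(containing_device(major_no, minor_no)) != 0;
  }
};
```

For example, after registering device (8, 0) as non-rotational, a file
on partition (8, 3) is reported as residing on SSD, while (8, 17) is
attributed to the separate device (8, 16).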
4 years ago  MDEV-11254: innodb-use-trim has no effect in 10.2
The problem was that the implementation merged from 10.1 was incompatible
with InnoDB 5.7.
buf0buf.cc: Add functions that return whether a hole should be punched
and how big it should be.
buf0flu.cc: Add the written page to IORequest.
fil0fil.cc: Remove an unneeded status call. When a tablespace is
created, test whether the file system supports sparse files and
hole punching. Add a call to get the file system block size.
Add the file node in use to IORequest. Add functions to check
whether hole punching is supported and to set
punch hole.
ha_innodb.cc: Remove the unneeded status variables (trim512-32768)
and trim_op_saved. Deprecate innodb_use_trim and
set it ON by default. Add a function to set innodb-use-trim
dynamically.
dberr.h: Add the error code DB_IO_NO_PUNCH_HOLE
for a failed punch hole operation.
fil0fil.h: Add a punch_hole variable to fil_space_t and
a block size to fil_node_t.
os0api.h: Header declaring helper functions in buf0buf.cc and
fil0fil.cc for os0file.h.
os0file.h: Remove the unneeded m_block_size from IORequest.
Add bpage to IORequest to know the actual size of
the block, and m_fil_node to know the file system block size
of the tablespace and whether it supports punch hole.
os0file.cc: Add the function punch_hole() to IORequest
to do the punch hole operation,
get the file system block size, and determine
whether the file system supports sparse files (for punch hole).
page0size.h: Remove the disabling of implicit copying and
use the implicit copy to implement the copy_from()
function.
buf0dblwr.cc, buf0flu.cc, buf0rea.cc, fil0fil.cc, fil0fil.h,
os0file.h, os0file.cc, log0log.cc, log0recv.cc:
Remove the unneeded write_size parameter from fil_io
calls.
srv0mon.h, srv0srv.h, srv0mon.cc: Remove unneeded
trim512-trim32678 status variables. Removed
these from monitor tests.
9 years ago  MDEV-17380 innodb_flush_neighbors=ON should be ignored on SSD
For tablespaces that do not reside on spinning storage, it does
not make sense to attempt to write nearby pages when writing out
dirty pages from the InnoDB buffer pool. It is actually detrimental
to performance and to the life span of flash ROM storage.
With this change, MariaDB will detect whether an InnoDB file resides
on solid-state storage. The detection has been implemented for Linux
and Microsoft Windows. For other systems, we will err on the safe side
and assume that files reside on SSD.
As part of this change, we will reduce the number of fstat() calls
when opening data files on POSIX systems and slightly clean up some
file I/O code.
FIXME: os_is_sparse_file_supported() on POSIX works in a destructive
manner. Thus, we can only invoke it when creating files, not when
opening them.
For diagnostics, we introduce the column ON_SSD to the table
INFORMATION_SCHEMA.INNODB_TABLESPACES_SCRUBBING. The table
INNODB_SYS_TABLESPACES might seem more appropriate, but its purpose
is to reflect the contents of the InnoDB system table SYS_TABLESPACES,
which we would like to remove at some point.
On Microsoft Windows, querying StorageDeviceSeekPenaltyProperty
sometimes returns ERROR_GEN_FAILURE instead of ERROR_INVALID_FUNCTION
or ERROR_NOT_SUPPORTED. We will also silently ignore this error,
and assume that the file does not reside on SSD.
On Linux, the detection will be based on the files
/sys/block/*/queue/rotational and /sys/block/*/dev.
Especially for USB storage, it is possible that
/sys/block/*/queue/rotational will wrongly report 1 instead of 0.
fil_node_t::on_ssd: Whether the InnoDB data file resides on
solid-state storage.
fil_system_t::ssd: Collection of Linux block devices that reside on
non-rotational storage.
fil_system_t::create(): Detect ssd on Linux based on the contents
of /sys/block/*/queue/rotational and /sys/block/*/dev.
fil_system_t::is_ssd(dev_t): Determine if a Linux block device is
non-rotational. Partitions will be identified with the containing
block device by assuming that the least significant 4 bits of the
minor number identify a partition, and that the "partition number"
of the entire device is 0.
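The partition rule above can be sketched as follows. This is only an illustration under the stated assumption that the least significant 4 bits of the minor number hold the partition number; the helper names are hypothetical and are not part of fil_system_t.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: minor number of the containing block device,
// assuming the low 4 bits of the minor number are the partition number.
constexpr std::uint32_t containing_device_minor(std::uint32_t minor_no) {
  return minor_no & ~std::uint32_t{15};
}

// Hypothetical helper: partition number 0 denotes the entire device.
constexpr bool is_entire_device(std::uint32_t minor_no) {
  return (minor_no & 15) == 0;
}

// Sanity check: minors 16 (a whole device) and 19 (its third partition)
// map to the same containing device.
inline void check_partition_mapping() {
  assert(containing_device_minor(19) == containing_device_minor(16));
  assert(is_entire_device(16) && !is_entire_device(19));
}
```

A lookup of a partition would then be done with the masked minor number, so that the result recorded for the whole device also applies to its partitions.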
7 years ago  Merge Google encryption
commit 195158e9889365dc3298f8c1f3bcaa745992f27f
Author: Minli Zhu <minliz@google.com>
Date: Mon Nov 25 11:05:55 2013 -0800
Innodb redo log encryption/decryption.
Use start lsn of a log block as part of AES CTR counter.
Record key version with each checkpoint. Internally key version 0 means no
encryption. Tests done (see test_innodb_log_encryption.sh for detail):
- Verify flag innodb_encrypt_log on or off, combined with various key versions
passed through CLI, and dynamically set after startup, will not corrupt
database. This includes tests from being unencrypted to encrypted, and
encrypted to unencrypted.
- Verify start-up with no redo logs succeeds.
- Verify fresh start-up succeeds.
Change-Id: I4ce4c2afdf3076be2fce90ebbc2a7ce01184b612
commit c1b97273659f07866758c25f4a56f680a1fbad24
Author: Jonas Oreland <jonaso@google.com>
Date: Tue Dec 3 18:47:27 2013 +0100
encryption of aria data&index files
this patch implements encryption of aria data & index files.
this is implemented as
1) add read/write hooks (renamed from callbacks) that encrypt/decrypt
(also add pre_read and post_write hooks)
2) modify page headers for data/index to contain key version
(making the data-page header size different for with/without encryption)
3) modify index page 0 to contain IV (and crypt header)
4) AES CTR crypt functions
5) counter block is implemented using combination of
page no, lsn and table specific id
NOTE:
1) log files are not encrypted, this is not needed if aria is only used
for internal temporary tables and they are not transactional (i.e not logged)
2) all encrypted tables are using PAGE_CHECKSUM (crc)
normal internal temporary tables are (currently) not CHECKSUM:ed
3) This patch adds insert-order semantics to aria block_format.
The default behaviour of aria block-format is best-fit, meaning
that rows get allocated to pages, trying to fill the pages as much
as possible. However, certain sql constructs materialize temporary
result in tmp-tables, and expect that a table scan will later return
the rows in the same order they were inserted. This implementation of
insert-order is only enabled when explicitly requested by sql-layer.
CHANGES:
1) found bug in ma_write that made code try to abort a record that was never written
unsure why this is not exposed
Change-Id: Ia82bbaa92e2c0629c08693c5add2f56b815c0509
commit 89dc1ab651fe0205d55b4eb588f62df550aa65fc
Author: Jonas Oreland <jonaso@google.com>
Date: Mon Feb 17 08:04:50 2014 -0800
Implement encryption of innodb datafiles.
Pages are encrypted before written to disk and decrypted when read from disk.
Each page except first page (page 0) in tablespace is encrypted.
Page 0 is unencrypted and contains IV for the tablespace.
FIL_PAGE_FILE_FLUSH_LSN on each page (except page 0) is used to store a 32-bit
key-version, so that multiple keys can be active in a tablespace simultaneously.
The other 32 bits of the FIL_PAGE_FILE_FLUSH_LSN field contain a checksum that
is computed after encryption. This checksum is used by innochecksum and
when restoring from double-write-buffer.
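A minimal sketch of how one 64-bit field can carry both values. The text does not say which half holds the key version, so placing it in the upper 32 bits is purely an assumption here, and the type and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical view of the two 32-bit values stored in the 64-bit field.
struct crypt_field {
  std::uint32_t key_version;
  std::uint32_t checksum;  // computed after encryption
};

// Assumption: key version in the upper 32 bits, checksum in the lower 32.
constexpr std::uint64_t pack_crypt_field(crypt_field f) {
  return (std::uint64_t{f.key_version} << 32) | f.checksum;
}

constexpr crypt_field unpack_crypt_field(std::uint64_t raw) {
  return {static_cast<std::uint32_t>(raw >> 32),
          static_cast<std::uint32_t>(raw)};
}

// Round-trip check: packing and unpacking preserves both values.
inline void check_round_trip(crypt_field f) {
  const crypt_field g = unpack_crypt_field(pack_crypt_field(f));
  assert(g.key_version == f.key_version && g.checksum == f.checksum);
}
```

Because the key version travels with every page, pages written under different key versions can coexist in one tablespace, which is what the text relies on.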
The encryption is performed using AES CTR.
Monitoring of encryption is enabled using new IS-table INNODB_TABLESPACES_ENCRYPTION.
In addition to that new status variables
innodb_encryption_rotation_{ pages_read_from_cache, pages_read_from_disk,
pages_modified,pages_flushed } have been added.
The following tunables are introduced
- innodb_encrypt_tables
- innodb_encryption_threads
- innodb_encryption_rotate_key_age
- innodb_encryption_rotation_iops
Change-Id: I8f651795a30b52e71b16d6bc9cb7559be349d0b2
commit a17eef2f6948e58219c9e26fc35633d6fd4de1de
Author: Andrew Ford <andrewford@google.com>
Date: Thu Jan 2 15:43:09 2014 -0800
Key management skeleton with debug hooks.
Change-Id: Ifd6aa3743d7ea291c70083f433a059c439aed866
commit 68a399838ad72264fd61b3dc67fecd29bbdb0af1
Author: Andrew Ford <andrewford@google.com>
Date: Mon Oct 28 16:27:44 2013 -0700
Add AES-128 CTR and GCM encryption classes.
Change-Id: I116305eced2a233db15306bc2ef5b9d398d1a3a2
9 years ago  TSAN: unprotected global variable
WARNING: ThreadSanitizer: data race (pid=1510842)
Write of size 8 at 0x0000067b1e98 by main thread:
#0 os_file_pwrite(IORequest const&, int, unsigned char const*, unsigned long, unsigned long, dberr_t*) /storage/innobase/os/os0file.cc:2928:2 (mariadbd+0x234c5ac)
#1 os_file_write_func(IORequest const&, char const*, int, void const*, unsigned long, unsigned long) /storage/innobase/os/os0file.cc:2963:20 (mariadbd+0x234c019)
#2 file_os_io::write(char const*, unsigned long, st_::span<unsigned char const>) /storage/innobase/log/log0log.cc:320:10 (mariadbd+0x22eaa50)
#3 log_file_t::write(unsigned long, st_::span<unsigned char const>) /storage/innobase/log/log0log.cc:434:18 (mariadbd+0x22eb1d8)
#4 log_t::file::write(unsigned long, st_::span<unsigned char>) /storage/innobase/log/log0log.cc:496:29 (mariadbd+0x22ebb55)
#5 log_write_buf(unsigned char*, unsigned long, unsigned long, unsigned long, unsigned long) /storage/innobase/log/log0log.cc:614:14 (mariadbd+0x22f1b51)
#6 log_write(bool) /storage/innobase/log/log0log.cc:755:2 (mariadbd+0x22ed2ec)
#7 log_write_up_to(unsigned long, bool, bool, completion_callback const*) /storage/innobase/log/log0log.cc:817:5 (mariadbd+0x22eca44)
#8 log_checkpoint_low(unsigned long, unsigned long) /storage/innobase/buf/buf0flu.cc:1734:5 (mariadbd+0x20d37c1)
#9 log_checkpoint() /storage/innobase/buf/buf0flu.cc:1787:10 (mariadbd+0x20cd155)
#10 buf_flush_wait_flushed(unsigned long) /storage/innobase/buf/buf0flu.cc:1867:5 (mariadbd+0x20ccf8f)
#11 log_make_checkpoint() /storage/innobase/buf/buf0flu.cc:1793:3 (mariadbd+0x20cc4c9)
#12 buf_dblwr_t::create() /storage/innobase/buf/buf0dblwr.cc:216:3 (mariadbd+0x209076a)
#13 srv_start(bool) /storage/innobase/srv/srv0start.cc:1685:20 (mariadbd+0x256b4aa)
#14 innodb_init(void*) /storage/innobase/handler/ha_innodb.cc:4188:8 (mariadbd+0x1ed40da)
#15 ha_initialize_handlerton(st_plugin_int*) /sql/handler.cc:659:31 (mariadbd+0xf7c2b6)
#16 plugin_initialize(st_mem_root*, st_plugin_int*, int*, char**, bool) /sql/sql_plugin.cc:1463:9 (mariadbd+0x160fedb)
#17 plugin_init(int*, char**, int) /sql/sql_plugin.cc:1756:15 (mariadbd+0x160f53f)
#18 init_server_components() /sql/mysqld.cc:5043:7 (mariadbd+0xd71462)
#19 mysqld_main(int, char**) /sql/mysqld.cc:5655:7 (mariadbd+0xd6ae87)
#20 main /sql/main.cc:34:10 (mariadbd+0xd661c8)
Previous write of size 8 at 0x0000067b1e98 by thread T3:
#0 os_file_pwrite(IORequest const&, int, unsigned char const*, unsigned long, unsigned long, dberr_t*) /storage/innobase/os/os0file.cc:2928:2 (mariadbd+0x234c5ac)
#1 os_file_write_func(IORequest const&, char const*, int, void const*, unsigned long, unsigned long) /storage/innobase/os/os0file.cc:2963:20 (mariadbd+0x234c019)
#2 file_os_io::write(char const*, unsigned long, st_::span<unsigned char const>) /storage/innobase/log/log0log.cc:320:10 (mariadbd+0x22eaa50)
#3 log_file_t::write(unsigned long, st_::span<unsigned char const>) /storage/innobase/log/log0log.cc:434:18 (mariadbd+0x22eb1d8)
#4 log_t::file::write(unsigned long, st_::span<unsigned char>) /storage/innobase/log/log0log.cc:496:29 (mariadbd+0x22ebb55)
#5 log_write_checkpoint_info(unsigned long) /storage/innobase/log/log0log.cc:911:14 (mariadbd+0x22edd4e)
#6 log_checkpoint_low(unsigned long, unsigned long) /storage/innobase/buf/buf0flu.cc:1755:3 (mariadbd+0x20d3a3d)
#7 buf_flush_sync_for_checkpoint(unsigned long) /storage/innobase/buf/buf0flu.cc:1947:7 (mariadbd+0x20d4163)
#8 buf_flush_page_cleaner() /storage/innobase/buf/buf0flu.cc:2186:9 (mariadbd+0x20cdab1)
#9 void std::__invoke_impl<void, void (*)()>(std::__invoke_other, void (*&&)()) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61:14 (mariadbd+0x20c3aaa)
#10 std::__invoke_result<void (*)()>::type std::__invoke<void (*)()>(void (*&&)()) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14 (mariadbd+0x20c39bd)
#11 void std::thread::_Invoker<std::tuple<void (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:253:13 (mariadbd+0x20c3965)
#12 std::thread::_Invoker<std::tuple<void (*)()> >::operator()() /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:260:11 (mariadbd+0x20c3905)
#13 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)()> > >::_M_run() /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211:13 (mariadbd+0x20c37f9)
#14 <null> <null> (libstdc++.so.6+0xd230f)
Location is global 'os_n_file_writes' of size 8 at 0x0000067b1e98 (mariadbd+0x67b1e98)
Make variable atomic.
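A minimal sketch of the fix, assuming the counter only needs to be counted correctly and does not order any other memory accesses (hence relaxed ordering). The names n_file_writes, count_file_write and file_writes are hypothetical stand-ins, not the actual declarations.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a global counter such as os_n_file_writes.
// A plain integer incremented from several threads is a data race;
// std::atomic makes each increment indivisible.
std::atomic<std::uint64_t> n_file_writes{0};

// Called on every write. Relaxed ordering suffices for a statistics counter.
inline void count_file_write() {
  n_file_writes.fetch_add(1, std::memory_order_relaxed);
}

// Read the current value, for example when reporting status variables.
inline std::uint64_t file_writes() {
  return n_file_writes.load(std::memory_order_relaxed);
}

// Sanity check: one call increases the counter by at least one
// (other threads may add more in between).
inline void check_counter_advances() {
  const std::uint64_t before = file_writes();
  count_file_write();
  assert(file_writes() >= before + 1);
}
```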
4 years ago  TSAN: data race on a global counter
WARNING: ThreadSanitizer: data race (pid=1503350)
Write of size 8 at 0x0000067b1f20 by thread T3:
#0 os_file_sync_posix(int) /storage/innobase/os/os0file.cc:895:5 (mariadbd+0x23493f6)
#1 os_file_flush_func(int) /storage/innobase/os/os0file.cc:983:8 (mariadbd+0x2349204)
#2 file_os_io::flush() /storage/innobase/log/log0log.cc:326:10 (mariadbd+0x22eaaa9)
#3 log_file_t::flush() /storage/innobase/log/log0log.cc:440:18 (mariadbd+0x22eb2d0)
#4 log_t::file::flush() /storage/innobase/log/log0log.cc:507:29 (mariadbd+0x22ebe69)
#5 log_write_flush_to_disk_low(unsigned long) /storage/innobase/log/log0log.cc:629:17 (mariadbd+0x22ed3f3)
#6 log_write_up_to(unsigned long, bool, bool, completion_callback const*) /storage/innobase/log/log0log.cc:829:3 (mariadbd+0x22ecb04)
#7 log_checkpoint_low(unsigned long, unsigned long) /storage/innobase/buf/buf0flu.cc:1734:5 (mariadbd+0x20d37f1)
#8 buf_flush_sync_for_checkpoint(unsigned long) /storage/innobase/buf/buf0flu.cc:1947:7 (mariadbd+0x20d4193)
#9 buf_flush_page_cleaner() /storage/innobase/buf/buf0flu.cc:2186:9 (mariadbd+0x20cdad7)
#10 void std::__invoke_impl<void, void (*)()>(std::__invoke_other, void (*&&)()) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61:14 (mariadbd+0x20c3aaa)
#11 std::__invoke_result<void (*)()>::type std::__invoke<void (*)()>(void (*&&)()) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14 (mariadbd+0x20c39bd)
#12 void std::thread::_Invoker<std::tuple<void (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:253:13 (mariadbd+0x20c3965)
#13 std::thread::_Invoker<std::tuple<void (*)()> >::operator()() /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:260:11 (mariadbd+0x20c3905)
#14 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)()> > >::_M_run() /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211:13 (mariadbd+0x20c37f9)
#15 <null> <null> (libstdc++.so.6+0xd230f)
Previous write of size 8 at 0x0000067b1f20 by main thread:
#0 os_file_sync_posix(int) /storage/innobase/os/os0file.cc:895:5 (mariadbd+0x23493f6)
#1 os_file_flush_func(int) /storage/innobase/os/os0file.cc:983:8 (mariadbd+0x2349204)
#2 fil_space_t::flush_low() /storage/innobase/fil/fil0fil.cc:504:5 (mariadbd+0x205cad5)
#3 fil_flush_file_spaces() /storage/innobase/fil/fil0fil.cc:2947:13 (mariadbd+0x206523f)
#4 log_checkpoint() /storage/innobase/buf/buf0flu.cc:1777:5 (mariadbd+0x20cd069)
#5 buf_flush_wait_flushed(unsigned long) /storage/innobase/buf/buf0flu.cc:1867:5 (mariadbd+0x20ccf95)
#6 log_make_checkpoint() /storage/innobase/buf/buf0flu.cc:1793:3 (mariadbd+0x20cc4c9)
#7 buf_dblwr_t::create() /storage/innobase/buf/buf0dblwr.cc:216:3 (mariadbd+0x209076a)
#8 srv_start(bool) /storage/innobase/srv/srv0start.cc:1685:20 (mariadbd+0x256b514)
#9 innodb_init(void*) /storage/innobase/handler/ha_innodb.cc:4188:8 (mariadbd+0x1ed406a)
#10 ha_initialize_handlerton(st_plugin_int*) /sql/handler.cc:659:31 (mariadbd+0xf7c246)
#11 plugin_initialize(st_mem_root*, st_plugin_int*, int*, char**, bool) /sql/sql_plugin.cc:1463:9 (mariadbd+0x160fe6b)
#12 plugin_init(int*, char**, int) /sql/sql_plugin.cc:1756:15 (mariadbd+0x160f4cf)
#13 init_server_components() /sql/mysqld.cc:5043:7 (mariadbd+0xd713f2)
#14 mysqld_main(int, char**) /sql/mysqld.cc:5655:7 (mariadbd+0xd6ae17)
#15 main /sql/main.cc:34:10 (mariadbd+0xd66158)
This is a correct report by TSAN for an obvious case: unprotected global
counter. Fix it by making counter std::atomic.
9 years ago  MDEV-25312 Replace fil_space_t::name with fil_space_t::name()
A consistency check for fil_space_t::name is causing recovery failures
in MDEV-25180 (Atomic ALTER TABLE). So, we'd better remove that field
altogether.
fil_space_t::name was more or less a copy of dict_table_t::name
(except for some special cases), and it was not being used for
anything useful.
There used to be a name_hash, but it had been removed already in
commit a75dbfd7183cc96680f3e3e684fd36500dac8158 (MDEV-12266).
We will also remove os_normalize_path(), OS_PATH_SEPARATOR,
OS_PATH_SEPARATOR_ALT. On Microsoft Windows, we will treat \ and /
roughly in the same way. The intention is that for per-table
tablespaces, the filenames will always follow the pattern
prefix/databasename/tablename.ibd. (Any \ in the prefix must not
be converted.)
ut_basename_noext(): Remove (unused function).
read_link_file(): Replaces RemoteDatafile::read_link_file().
We will ensure that the last two path component separators are
forward slashes (converting up to 2 trailing backslashes on
Microsoft Windows), so that everywhere else we can
assume that data file names end in "/databasename/tablename.ibd".
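The separator handling described for read_link_file() might look roughly like this. It is a sketch under the assumption that only the last two separators are examined and everything before them (the prefix) is left untouched; normalize_last_two_separators is a hypothetical name, not the actual function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: make sure the last two path component separators
// are forward slashes, converting at most two trailing backslashes.
// Any backslash in the prefix before them is not converted.
inline std::string normalize_last_two_separators(std::string path) {
  int seen = 0;  // separators found so far, scanning from the end
  for (std::size_t i = path.size(); i-- > 0 && seen < 2;) {
    if (path[i] == '/') {
      ++seen;  // already a forward slash
    } else if (path[i] == '\\') {
      path[i] = '/';
      ++seen;
    }
  }
  assert(seen <= 2);
  return path;
}
```

For example, a name such as `C:\data\db1\t1.ibd` would become `C:\data/db1/t1.ibd`, so that later code can rely on the "/databasename/tablename.ibd" suffix.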
Note: On Microsoft Windows, path names that start with \\?\ must
not contain / as path component separators. Previously, such paths
did work in the DATA DIRECTORY argument of InnoDB tables.
Reviewed by: Vladislav Vaintroub
5 years ago  MDEV-25312 Replace fil_space_t::name with fil_space_t::name()
A consistency check for fil_space_t::name is causing recovery failures
in MDEV-25180 (Atomic ALTER TABLE). So, we'd better remove that field
altogether.
fil_space_t::name was more or less a copy of dict_table_t::name
(except for some special cases), and it was not being used for
anything useful.
There used to be a name_hash, but it had been removed already in
commit a75dbfd7183cc96680f3e3e684fd36500dac8158 (MDEV-12266).
We will also remove os_normalize_path(), OS_PATH_SEPARATOR,
OS_PATH_SEPATOR_ALT. On Microsoft Windows, we will treat \ and /
roughly in the same way. The intention is that for per-table
tablespaces, the filenames will always follow the pattern
prefix/databasename/tablename.ibd. (Any \ in the prefix must not
be converted.)
ut_basename_noext(): Remove (unused function).
read_link_file(): Replaces RemoteDatafile::read_link_file().
We will ensure that the last two path component separators are
forward slashes (converting up to 2 trailing backslashes on
Microsoft Windows), so that everywhere else we can
assume that data file names end in "/databasename/tablename.ibd".
Note: On Microsoft Windows, path names that start with \\?\ must
not contain / as path component separators. Previously, such paths
did work in the DATA DIRECTORY argument of InnoDB tables.
Reviewed by: Vladislav Vaintroub
5 years ago  MDEV-25312 Replace fil_space_t::name with fil_space_t::name()
A consistency check for fil_space_t::name is causing recovery failures
in MDEV-25180 (Atomic ALTER TABLE). So, we'd better remove that field
altogether.
fil_space_t::name was more or less a copy of dict_table_t::name
(except for some special cases), and it was not being used for
anything useful.
There used to be a name_hash, but it had been removed already in
commit a75dbfd7183cc96680f3e3e684fd36500dac8158 (MDEV-12266).
We will also remove os_normalize_path(), OS_PATH_SEPARATOR,
OS_PATH_SEPATOR_ALT. On Microsoft Windows, we will treat \ and /
roughly in the same way. The intention is that for per-table
tablespaces, the filenames will always follow the pattern
prefix/databasename/tablename.ibd. (Any \ in the prefix must not
be converted.)
ut_basename_noext(): Remove (unused function).
read_link_file(): Replaces RemoteDatafile::read_link_file().
We will ensure that the last two path component separators are
forward slashes (converting up to 2 trailing backslashes on
Microsoft Windows), so that everywhere else we can
assume that data file names end in "/databasename/tablename.ibd".
Note: On Microsoft Windows, path names that start with \\?\ must
not contain / as path component separators. Previously, such paths
did work in the DATA DIRECTORY argument of InnoDB tables.
Reviewed by: Vladislav Vaintroub
5 years ago  MDEV-25312 Replace fil_space_t::name with fil_space_t::name()
A consistency check for fil_space_t::name is causing recovery failures
in MDEV-25180 (Atomic ALTER TABLE). So, we'd better remove that field
altogether.
fil_space_t::name was more or less a copy of dict_table_t::name
(except for some special cases), and it was not being used for
anything useful.
There used to be a name_hash, but it had been removed already in
commit a75dbfd7183cc96680f3e3e684fd36500dac8158 (MDEV-12266).
We will also remove os_normalize_path(), OS_PATH_SEPARATOR,
OS_PATH_SEPATOR_ALT. On Microsoft Windows, we will treat \ and /
roughly in the same way. The intention is that for per-table
tablespaces, the filenames will always follow the pattern
prefix/databasename/tablename.ibd. (Any \ in the prefix must not
be converted.)
ut_basename_noext(): Remove (unused function).
read_link_file(): Replaces RemoteDatafile::read_link_file().
We will ensure that the last two path component separators are
forward slashes (converting up to 2 trailing backslashes on
Microsoft Windows), so that everywhere else we can
assume that data file names end in "/databasename/tablename.ibd".
Note: On Microsoft Windows, path names that start with \\?\ must
not contain / as path component separators. Previously, such paths
did work in the DATA DIRECTORY argument of InnoDB tables.
Reviewed by: Vladislav Vaintroub
5 years ago  MDEV-25312 Replace fil_space_t::name with fil_space_t::name()
A consistency check for fil_space_t::name is causing recovery failures
in MDEV-25180 (Atomic ALTER TABLE). So, we'd better remove that field
altogether.
fil_space_t::name was more or less a copy of dict_table_t::name
(except for some special cases), and it was not being used for
anything useful.
There used to be a name_hash, but it had been removed already in
commit a75dbfd7183cc96680f3e3e684fd36500dac8158 (MDEV-12266).
We will also remove os_normalize_path(), OS_PATH_SEPARATOR,
OS_PATH_SEPATOR_ALT. On Microsoft Windows, we will treat \ and /
roughly in the same way. The intention is that for per-table
tablespaces, the filenames will always follow the pattern
prefix/databasename/tablename.ibd. (Any \ in the prefix must not
be converted.)
ut_basename_noext(): Remove (unused function).
read_link_file(): Replaces RemoteDatafile::read_link_file().
We will ensure that the last two path component separators are
forward slashes (converting up to 2 trailing backslashes on
Microsoft Windows), so that everywhere else we can
assume that data file names end in "/databasename/tablename.ibd".
Note: On Microsoft Windows, path names that start with \\?\ must
not contain / as path component separators. Previously, such paths
did work in the DATA DIRECTORY argument of InnoDB tables.
Reviewed by: Vladislav Vaintroub
5 years ago  MDEV-25312 Replace fil_space_t::name with fil_space_t::name()
A consistency check for fil_space_t::name is causing recovery failures
in MDEV-25180 (Atomic ALTER TABLE). So, we'd better remove that field
altogether.
fil_space_t::name was more or less a copy of dict_table_t::name
(except for some special cases), and it was not being used for
anything useful.
There used to be a name_hash, but it had been removed already in
commit a75dbfd7183cc96680f3e3e684fd36500dac8158 (MDEV-12266).
We will also remove os_normalize_path(), OS_PATH_SEPARATOR,
OS_PATH_SEPATOR_ALT. On Microsoft Windows, we will treat \ and /
roughly in the same way. The intention is that for per-table
tablespaces, the filenames will always follow the pattern
prefix/databasename/tablename.ibd. (Any \ in the prefix must not
be converted.)
ut_basename_noext(): Remove (unused function).
read_link_file(): Replaces RemoteDatafile::read_link_file().
We will ensure that the last two path component separators are
forward slashes (converting up to 2 trailing backslashes on
Microsoft Windows), so that everywhere else we can
assume that data file names end in "/databasename/tablename.ibd".
Note: On Microsoft Windows, path names that start with \\?\ must
not contain / as path component separators. Previously, such paths
did work in the DATA DIRECTORY argument of InnoDB tables.
Reviewed by: Vladislav Vaintroub
5 years ago  MDEV-11254: innodb-use-trim has no effect in 10.2
Problem was that implementation merged from 10.1 was incompatible
with InnoDB 5.7.
buf0buf.cc: Add functions that return whether a hole should be
punched and how large it should be.
buf0flu.cc: Add the written page to IORequest.
fil0fil.cc: Remove an unneeded status call. When a tablespace is
created, test whether the file system supports sparse files and
punch hole. Add a call to get the file system block size. Add the
file node in use to IORequest. Add functions to check whether
punch hole is supported and to set punch hole.
ha_innodb.cc: Remove unneeded status variables (trim512-32768)
and trim_op_saved. Deprecate innodb_use_trim and
set it ON by default. Add function to set innodb-use-trim
dynamically.
dberr.h: Add error code DB_IO_NO_PUNCH_HOLE for the case
that a punch hole operation fails.
fil0fil.h: Add punch_hole variable to fil_space_t and
block size to fil_node_t.
os0api.h: Header for the helper functions in buf0buf.cc and
fil0fil.cc for os0file.h.
os0file.h: Remove the unneeded m_block_size from IORequest.
Add bpage to IORequest to know the actual size of the block,
and m_fil_node to know the file system block size of the
tablespace and whether it supports punch hole.
os0file.cc: Add function punch_hole() to IORequest to perform
the punch hole operation, get the file system block size, and
determine whether the file system supports sparse files
(for punch hole).
page0size.h: Stop disabling the implicit copy, and use the
implicit copy to implement the copy_from() function.
buf0dblwr.cc, buf0flu.cc, buf0rea.cc, fil0fil.cc, fil0fil.h,
os0file.h, os0file.cc, log0log.cc, log0recv.cc:
Remove unneeded write_size parameter from fil_io
calls.
srv0mon.h, srv0srv.h, srv0mon.cc: Remove unneeded
trim512-trim32768 status variables. Remove
these from monitor tests.
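Illustration (not part of the commit): a sketch of the "how big" computation, assuming that the written payload is rounded up to the file system block size and that the rest of the page can be freed. The function punch_hole_size() and its parameters are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of bytes that could be freed by punching a
// hole after writing `payload` bytes of a `page_size`-byte page on a file
// system with the given block size. Returns 0 when nothing can be freed.
std::size_t punch_hole_size(std::size_t page_size, std::size_t payload,
                            std::size_t block_size)
{
  assert(block_size != 0 && payload <= page_size);
  // Round the payload up to a whole number of file system blocks.
  const std::size_t used =
    (payload + block_size - 1) / block_size * block_size;
  return used >= page_size ? 0 : page_size - used;
}
```

For a 16384-byte page with a 5000-byte payload on 4096-byte blocks, this sketch would report 8192 bytes that can be punched.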
9 years ago  MDEV-14425 Improve the redo log for concurrency
The InnoDB redo log used to be formatted in blocks of 512 bytes.
The log blocks were encrypted and the checksum was calculated while
holding log_sys.mutex, creating a serious scalability bottleneck.
We remove the fixed-size redo log block structure altogether and
essentially turn every mini-transaction into a log block of its own.
This allows encryption and checksum calculations to be performed
on local mtr_t::m_log buffers, before acquiring log_sys.mutex.
The mutex only protects a memcpy() of the data to the shared
log_sys.buf, as well as the padding of the log, in case the
to-be-written part of the log would not end in a block boundary of
the underlying storage. For now, the "padding" consists of writing
a single NUL byte, to allow recovery and mariadb-backup to detect
the end of the circular log faster.
Like the previous implementation, we will overwrite the last log block
over and over again, until it has been completely filled. It would be
possible to write only up to the last completed block (if no more
recent write was requested), or to write dummy FILE_CHECKPOINT records
to fill the incomplete block, by invoking the currently disabled
function log_pad(). This would require adjustments to some logic around
log checkpoints, page flushing, and shutdown.
An upgrade after a crash of any previous version is not supported.
Logically empty log files from a previous version will be upgraded.
An attempt to start up InnoDB without a valid ib_logfile0 will be
refused. Previously, the redo log used to be created automatically
if it was missing. Only with innodb_force_recovery=6 is it
possible to start InnoDB in read-only mode even if the log file
does not exist. This allows the contents of a possibly corrupted
database to be dumped.
Because a prepared backup from an earlier version of mariadb-backup
will create a 0-sized log file, we will allow an upgrade from such
log files, provided that the FIL_PAGE_FILE_FLUSH_LSN in the system
tablespace looks valid.
The 512-byte log checkpoint blocks at 0x200 and 0x600 will be replaced
with 64-byte log checkpoint blocks at 0x1000 and 0x2000.
The start of log records will move from 0x800 to 0x3000. This allows us
to use 4096-byte aligned blocks for all I/O in a future revision.
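Illustration (not part of the commit): the offsets above written as constants, with a sketch of how a log sequence number might map to a position in the circular file. The constants come from the text; the function lsn_to_offset() and the parameter first_lsn (the LSN assumed to be stored at START_OFFSET) are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>

// Offsets stated in the commit message.
constexpr std::uint64_t CHECKPOINT_1 = 0x1000;
constexpr std::uint64_t CHECKPOINT_2 = 0x2000;
constexpr std::uint64_t START_OFFSET = 0x3000;

// All three offsets are multiples of 4096, which is what makes
// 4096-byte aligned I/O possible.
static_assert(CHECKPOINT_1 % 4096 == 0, "checkpoint 1 is 4096-aligned");
static_assert(CHECKPOINT_2 % 4096 == 0, "checkpoint 2 is 4096-aligned");
static_assert(START_OFFSET % 4096 == 0, "log records start 4096-aligned");

// Assumed mapping: the log records occupy [START_OFFSET, file_size) and
// wrap around; first_lsn is the LSN that corresponds to START_OFFSET.
std::uint64_t lsn_to_offset(std::uint64_t lsn, std::uint64_t first_lsn,
                            std::uint64_t file_size)
{
  assert(file_size > START_OFFSET && lsn >= first_lsn);
  return START_OFFSET + (lsn - first_lsn) % (file_size - START_OFFSET);
}
```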
We extend the MDEV-12353 redo log record format as follows.
(1) Empty mini-transactions or extra NUL bytes will not be allowed.
(2) The end-of-minitransaction marker (a NUL byte) will be replaced
with a 1-bit sequence number, which will be toggled each time when the
circular log file wraps back to the beginning.
(3) After the sequence bit, a CRC-32C checksum of all data
(excluding the sequence bit) will be written.
(4) If the log is encrypted, 8 bytes will be written before
the checksum and included in it. This is part of the
initialization vector (IV) of encrypted log data.
(5) File names, page numbers, and checkpoint information will not be
encrypted. Only the payload bytes of page-level log will be encrypted.
The tablespace ID and page number will form part of the IV.
(6) For padding, arbitrary-length FILE_CHECKPOINT records may be written,
with all-zero payload, and with the normal end marker and checksum.
The minimum size is 7 bytes, or 7+8 with innodb_encrypt_log=ON.
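Illustration (not part of the commit): a sketch of items (1) to (3) for an unencrypted log, assuming a hypothetical frame_mtr() that appends a one-byte sequence bit and a 4-byte CRC-32C to the records of one mini-transaction. Storing the checksum in big-endian order is an assumption of this example, and the 8 extra bytes of item (4) are left out. The crc32c() function is a plain bitwise implementation, not the hardware-accelerated one a server would use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bitwise CRC-32C (Castagnoli), using the reflected polynomial 0x82F63B78.
std::uint32_t crc32c(const std::uint8_t *data, std::size_t len)
{
  std::uint32_t crc = 0xFFFFFFFFu;
  for (std::size_t i = 0; i < len; i++) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; bit++)
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
  }
  return ~crc;
}

// Hypothetical framing of one mini-transaction: the records, then the
// sequence bit, then the checksum of the records (the sequence bit is
// excluded from the checksum).
std::vector<std::uint8_t> frame_mtr(const std::vector<std::uint8_t> &records,
                                    bool sequence_bit)
{
  assert(!records.empty());  // (1) empty mini-transactions are not allowed
  std::vector<std::uint8_t> out(records);
  const std::uint32_t c = crc32c(records.data(), records.size());
  out.push_back(sequence_bit ? 1 : 0);
  for (int shift = 24; shift >= 0; shift -= 8)  // big-endian: an assumption
    out.push_back(static_cast<std::uint8_t>(c >> shift));
  return out;
}
```

Because the checksum excludes the sequence bit, the same records produce the same checksum bytes before and after the circular log wraps around.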
In mariadb-backup and in Galera snapshot transfer (SST) scripts, we will
no longer remove ib_logfile0 or create an empty ib_logfile0. Server startup
will require a valid log file. When resizing the log, we will create
a logically empty ib_logfile101 at the current LSN and use an atomic rename
to replace ib_logfile0 with it. See the test innodb.log_file_size.
Because there is no mandatory padding in the log file, we are able
to create a dummy log file as of an arbitrary log sequence number.
See the test mariabackup.huge_lsn.
The parameter innodb_log_write_ahead_size and the
INFORMATION_SCHEMA.INNODB_METRICS counter log_padded will be removed.
The minimum value of innodb_log_buffer_size will be increased to 2MiB
(because log_sys.buf will replace recv_sys.buf) and the increment
adjusted to 4096 bytes (the maximum log block size).
The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed:
os_log_fsyncs
os_log_pending_fsyncs
log_pending_log_flushes
log_pending_checkpoint_writes
The following status variables will be removed:
Innodb_os_log_fsyncs (this is included in Innodb_data_fsyncs)
Innodb_os_log_pending_fsyncs (this was limited to at most 1 by design)
log_sys.get_block_size(): Return the physical block size of the log file.
This is only implemented on Linux and Microsoft Windows for now, and for
the power-of-2 block sizes between 64 and 4096 bytes (the minimum and
maximum size of a checkpoint block). If the block size is anything else,
the traditional 512-byte size will be used via normal file system
buffering.
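Illustration (not part of the commit): a sketch of the block size rule described above. The names log_block_choice and choose_log_block_size() are hypothetical; how the physical block size is obtained from the operating system is outside the scope of this example.

```cpp
#include <cstdint>

// Hypothetical result type for this sketch.
struct log_block_choice
{
  std::uint32_t block_size;  // block size to use for log writes
  bool may_bypass_buffers;   // whether unbuffered I/O is a candidate at all
};

// Accept power-of-2 sizes between 64 and 4096 bytes; for anything else,
// fall back to the traditional 512 bytes with file system buffering.
log_block_choice choose_log_block_size(std::uint32_t physical)
{
  const bool power_of_2 = physical != 0 && (physical & (physical - 1)) == 0;
  if (power_of_2 && physical >= 64 && physical <= 4096)
    return {physical, true};
  return {512, false};
}
```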
If the file system buffers can be bypassed, a message like the following
will be issued:
InnoDB: File system buffers for log disabled (block size=512 bytes)
InnoDB: File system buffers for log disabled (block size=4096 bytes)
This has been tested on Linux and Microsoft Windows with both sizes.
On Linux, only enable O_DIRECT on the log for innodb_flush_method=O_DSYNC.
Tests in 3 different environments where the log is stored in a device
with a physical block size of 512 bytes are yielding better throughput
without O_DIRECT. This could be due to the fact that in the event the
last log block is being overwritten (if multiple transactions would
become durable at the same time, and each of them will write a small
number of bytes to the last log block), it should be faster to re-copy
data from log_sys.buf or log_sys.flush_buf to the kernel buffer,
to be finally written at fdatasync() time.
The parameter innodb_flush_method=O_DSYNC will imply O_DIRECT for
data files. This option will enable O_DIRECT on the log file on Linux.
It may be unsafe to use when the storage device does not support
FUA (Force Unit Access) mode.
When the server is compiled WITH_PMEM=ON, we will use memory-mapped
I/O for the log file if the log resides on a "mount -o dax" device.
We will identify PMEM in a start-up message:
InnoDB: log sequence number 0 (memory-mapped); transaction id 3
On Linux, we will also invoke mmap() on any ib_logfile0 that resides
in /dev/shm, effectively treating the log file as persistent memory.
This should speed up "./mtr --mem" and increase the test coverage of
PMEM on non-PMEM hardware. It also allows users to estimate how much
the performance would be improved by installing persistent memory.
On other tmpfs file systems such as /run, we will not use mmap().
mariadb-backup: Eliminated several variables. We will refer
directly to recv_sys and log_sys.
backup_wait_for_lsn(): Detect non-progress of
xtrabackup_copy_logfile(). In this new log format with
arbitrary-sized blocks, we can only detect log file overrun
indirectly, by observing that the scanned log sequence number
is not advancing.
xtrabackup_copy_logfile(): On PMEM, do not modify the sequence bit,
because we are not allowed to modify the server's log file, and our
memory mapping is read-only.
trx_flush_log_if_needed_low(): Do not use the callback on PMEM.
Using neither flush_lock nor write_lock around PMEM writes seems
to yield the best performance. The pmem_persist() calls may
still be somewhat slower than the pwrite() and fdatasync() based
interface (PMEM mounted without -o dax).
recv_sys_t::buf: Remove. We will use log_sys.buf for parsing.
recv_sys_t::MTR_SIZE_MAX: Replaces RECV_SCAN_SIZE.
recv_sys_t::file_checkpoint: Renamed from mlog_checkpoint_lsn.
recv_sys_t, log_sys_t: Removed many data members.
recv_sys.lsn: Renamed from recv_sys.recovered_lsn.
recv_sys.offset: Renamed from recv_sys.recovered_offset.
log_sys.buf_size: Replaces srv_log_buffer_size.
recv_buf: A smart pointer that wraps log_sys.buf[recv_sys.offset]
when the buffer is being allocated from the memory heap.
recv_ring: A smart pointer that wraps a circular log_sys.buf[] that is
backed by ib_logfile0. The pointer will wrap from recv_sys.len
(log_sys.file_size) to log_sys.START_OFFSET. For the record that
wraps around, we may copy file name or record payload data to
the auxiliary buffer decrypt_buf in order to have a contiguous
block of memory. The maximum size of a record is less than
innodb_page_size bytes.
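Illustration (not part of the commit): a sketch of a wrapping pointer in the spirit of recv_ring. The class ring_cursor is a hypothetical stand-in; it walks a circular byte range [start, end) and can copy bytes that straddle the wrap point into contiguous memory, similar to what the text describes for decrypt_buf.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for recv_ring: a read-only cursor over the circular
// byte range [start, end) of a buffer.
class ring_cursor
{
  const std::uint8_t *base_;
  std::size_t start_, end_, pos_;
public:
  ring_cursor(const std::uint8_t *base, std::size_t start, std::size_t end,
              std::size_t pos)
    : base_(base), start_(start), end_(end), pos_(pos)
  {
    assert(start < end && pos >= start && pos < end);
  }

  std::uint8_t operator*() const { return base_[pos_]; }

  // Advance by one byte, wrapping from end back to start.
  ring_cursor &operator++()
  {
    if (++pos_ == end_)
      pos_ = start_;
    return *this;
  }

  std::size_t offset() const { return pos_; }

  // Copy n bytes, which may straddle the wrap point, into contiguous memory.
  std::vector<std::uint8_t> copy(std::size_t n) const
  {
    std::vector<std::uint8_t> out;
    out.reserve(n);
    ring_cursor c(*this);
    for (std::size_t i = 0; i < n; ++i, ++c)
      out.push_back(*c);
    return out;
  }
};
```

In terms of the text, start would play the role of log_sys.START_OFFSET and end the role of the file size.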
recv_sys_t::parse(): Take the smart pointer as a template parameter.
Do not temporarily add a trailing NUL byte to FILE_ records, because
we are not supposed to modify the memory-mapped log file. (It is
attached in read-write mode already during recovery.)
recv_sys_t::parse_mtr(): Wrapper for recv_sys_t::parse().
recv_sys_t::parse_pmem(): Like parse_mtr(), but if PREMATURE_EOF would be
returned on PMEM, use recv_ring to wrap around the buffer to the start.
mtr_t::finish_write(), log_close(): Do not enforce log_sys.max_buf_free
on PMEM, because it has no meaning on the mmap-based log.
log_sys.write_to_buf: Count writes to log_sys.buf. Replaces
srv_stats.log_write_requests and export_vars.innodb_log_write_requests.
Protected by log_sys.mutex. Updated consistently in log_close().
Previously, mtr_t::commit() conditionally updated the count,
which was inconsistent.
log_sys.write_to_log: Count swaps of log_sys.buf and log_sys.flush_buf,
for writing to log_sys.log (the ib_logfile0). Replaces
srv_stats.log_writes and export_vars.innodb_log_writes.
Protected by log_sys.mutex.
log_sys.waits: Count waits in append_prepare(). Replaces
srv_stats.log_waits and export_vars.innodb_log_waits.
recv_recover_page(): Do not unnecessarily acquire
log_sys.flush_order_mutex. We are inserting the blocks in arbitrary
order anyway, to be adjusted in recv_sys.apply(true).
We will change the definition of flush_lock and write_lock to
avoid potential false sharing. Depending on sizeof(log_sys) and
CPU_LEVEL1_DCACHE_LINESIZE, the flush_lock and write_lock could
share a cache line with each other or with the last data members
of log_sys.
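Illustration (not part of the commit): a sketch of one common way to avoid false sharing, by aligning each lock to its own cache line. The 64-byte CACHE_LINE value and the types padded_lock and log_locks are assumptions for this example; InnoDB uses CPU_LEVEL1_DCACHE_LINESIZE, and its actual lock types differ.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumed cache line size for this sketch.
constexpr std::size_t CACHE_LINE = 64;

// Each lock is aligned to, and padded to, a full cache line.
struct alignas(CACHE_LINE) padded_lock
{
  std::atomic<std::uint32_t> word{0};
};

struct log_locks
{
  padded_lock flush_lock;
  padded_lock write_lock;
};

static_assert(sizeof(padded_lock) == CACHE_LINE, "one lock per cache line");

// Returns whether two addresses fall into the same cache line.
inline bool same_cache_line(const void *a, const void *b)
{
  return reinterpret_cast<std::uintptr_t>(a) / CACHE_LINE ==
         reinterpret_cast<std::uintptr_t>(b) / CACHE_LINE;
}
```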
Thanks to Matthias Leich for providing https://rr-project.org traces
for various failures during the development, and to
Thirunarayanan Balathandayuthapani for his help in debugging
some of the recovery code. And thanks to the developers of the
rr debugger for a tool without which extensive changes to InnoDB
would be very challenging to get right.
Thanks to Vladislav Vaintroub for useful feedback and
to him, Axel Schwenke and Krunal Bauskar for testing the performance.
4 years ago  MDEV-14425 Improve the redo log for concurrency
The InnoDB redo log used to be formatted in blocks of 512 bytes.
The log blocks were encrypted and the checksum was calculated while
holding log_sys.mutex, creating a serious scalability bottleneck.
We remove the fixed-size redo log block structure altogether and
essentially turn every mini-transaction into a log block of its own.
This allows encryption and checksum calculations to be performed
on local mtr_t::m_log buffers, before acquiring log_sys.mutex.
The mutex only protects a memcpy() of the data to the shared
log_sys.buf, as well as the padding of the log, in case the
to-be-written part of the log would not end in a block boundary of
the underlying storage. For now, the "padding" consists of writing
a single NUL byte, to allow recovery and mariadb-backup to detect
the end of the circular log faster.
Like the previous implementation, we will overwrite the last log block
over and over again, until it has been completely filled. It would be
possible to write only up to the last completed block (if no more
recent write was requested), or to write dummy FILE_CHECKPOINT records
to fill the incomplete block, by invoking the currently disabled
function log_pad(). This would require adjustments to some logic around
log checkpoints, page flushing, and shutdown.
An upgrade after a crash of any previous version is not supported.
Logically empty log files from a previous version will be upgraded.
An attempt to start up InnoDB without a valid ib_logfile0 will be
refused. Previously, the redo log used to be created automatically
if it was missing. Only with with innodb_force_recovery=6, it is
possible to start InnoDB in read-only mode even if the log file
does not exist. This allows the contents of a possibly corrupted
database to be dumped.
Because a prepared backup from an earlier version of mariadb-backup
will create a 0-sized log file, we will allow an upgrade from such
log files, provided that the FIL_PAGE_FILE_FLUSH_LSN in the system
tablespace looks valid.
The 512-byte log checkpoint blocks at 0x200 and 0x600 will be replaced
with 64-byte log checkpoint blocks at 0x1000 and 0x2000.
The start of log records will move from 0x800 to 0x3000. This allows us
to use 4096-byte aligned blocks for all I/O in a future revision.
We extend the MDEV-12353 redo log record format as follows.
(1) Empty mini-transactions or extra NUL bytes will not be allowed.
(2) The end-of-minitransaction marker (a NUL byte) will be replaced
with a 1-bit sequence number, which will be toggled each time when the
circular log file wraps back to the beginning.
(3) After the sequence bit, a CRC-32C checksum of all data
(excluding the sequence bit) will written.
(4) If the log is encrypted, 8 bytes will be written before
the checksum and included in it. This is part of the
initialization vector (IV) of encrypted log data.
(5) File names, page numbers, and checkpoint information will not be
encrypted. Only the payload bytes of page-level log will be encrypted.
The tablespace ID and page number will form part of the IV.
(6) For padding, arbitrary-length FILE_CHECKPOINT records may be written,
with all-zero payload, and with the normal end marker and checksum.
The minimum size is 7 bytes, or 7+8 with innodb_encrypt_log=ON.
In mariadb-backup and in Galera snapshot transfer (SST) scripts, we will
no longer remove ib_logfile0 or create an empty ib_logfile0. Server startup
will require a valid log file. When resizing the log, we will create
a logically empty ib_logfile101 at the current LSN and use an atomic rename
to replace ib_logfile0 with it. See the test innodb.log_file_size.
Because there is no mandatory padding in the log file, we are able
to create a dummy log file as of an arbitrary log sequence number.
See the test mariabackup.huge_lsn.
The parameter innodb_log_write_ahead_size and the
INFORMATION_SCHEMA.INNODB_METRICS counter log_padded will be removed.
The minimum value of innodb_log_buffer_size will be increased to 2MiB
(because log_sys.buf will replace recv_sys.buf) and the increment
adjusted to 4096 bytes (the maximum log block size).
The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed:
os_log_fsyncs
os_log_pending_fsyncs
log_pending_log_flushes
log_pending_checkpoint_writes
The following status variables will be removed:
Innodb_os_log_fsyncs (this is included in Innodb_data_fsyncs)
Innodb_os_log_pending_fsyncs (this was limited to at most 1 by design)
log_sys.get_block_size(): Return the physical block size of the log file.
This is only implemented on Linux and Microsoft Windows for now, and for
the power-of-2 block sizes between 64 and 4096 bytes (the minimum and
maximum size of a checkpoint block). If the block size is anything else,
the traditional 512-byte size will be used via normal file system
buffering.
If the file system buffers can be bypassed, a message like the following
will be issued:
InnoDB: File system buffers for log disabled (block size=512 bytes)
InnoDB: File system buffers for log disabled (block size=4096 bytes)
This has been tested on Linux and Microsoft Windows with both sizes.
On Linux, only enable O_DIRECT on the log for innodb_flush_method=O_DSYNC.
Tests in 3 different environments where the log is stored in a device
with a physical block size of 512 bytes are yielding better throughput
without O_DIRECT. This could be due to the fact that in the event the
last log block is being overwritten (if multiple transactions would
become durable at the same time, and each of will write a small
number of bytes to the last log block), it should be faster to re-copy
data from log_sys.buf or log_sys.flush_buf to the kernel buffer,
to be finally written at fdatasync() time.
The parameter innodb_flush_method=O_DSYNC will imply O_DIRECT for
data files. This option will enable O_DIRECT on the log file on Linux.
It may be unsafe to use when the storage device does not support
FUA (Force Unit Access) mode.
When the server is compiled WITH_PMEM=ON, we will use memory-mapped
I/O for the log file if the log resides on a "mount -o dax" device.
We will identify PMEM in a start-up message:
InnoDB: log sequence number 0 (memory-mapped); transaction id 3
On Linux, we will also invoke mmap() on any ib_logfile0 that resides
in /dev/shm, effectively treating the log file as persistent memory.
This should speed up "./mtr --mem" and increase the test coverage of
PMEM on non-PMEM hardware. It also allows users to estimate how much
the performance would be improved by installing persistent memory.
On other tmpfs file systems such as /run, we will not use mmap().
mariadb-backup: Eliminated several variables. We will refer
directly to recv_sys and log_sys.
backup_wait_for_lsn(): Detect non-progress of
xtrabackup_copy_logfile(). In this new log format with
arbitrary-sized blocks, we can only detect log file overrun
indirectly, by observing that the scanned log sequence number
is not advancing.
xtrabackup_copy_logfile(): On PMEM, do not modify the sequence bit,
because we are not allowed to modify the server's log file, and our
memory mapping is read-only.
trx_flush_log_if_needed_low(): Do not use the callback on pmem.
Using neither flush_lock nor write_lock around PMEM writes seems
to yield the best performance. The pmem_persist() calls may
still be somewhat slower than the pwrite() and fdatasync() based
interface (PMEM mounted without -o dax).
recv_sys_t::buf: Remove. We will use log_sys.buf for parsing.
recv_sys_t::MTR_SIZE_MAX: Replaces RECV_SCAN_SIZE.
recv_sys_t::file_checkpoint: Renamed from mlog_checkpoint_lsn.
recv_sys_t, log_sys_t: Removed many data members.
recv_sys.lsn: Renamed from recv_sys.recovered_lsn.
recv_sys.offset: Renamed from recv_sys.recovered_offset.
log_sys.buf_size: Replaces srv_log_buffer_size.
recv_buf: A smart pointer that wraps log_sys.buf[recv_sys.offset]
when the buffer is being allocated from the memory heap.
recv_ring: A smart pointer that wraps a circular log_sys.buf[] that is
backed by ib_logfile0. The pointer will wrap from recv_sys.len
(log_sys.file_size) to log_sys.START_OFFSET. For the record that
wraps around, we may copy file name or record payload data to
the auxiliary buffer decrypt_buf in order to have a contiguous
block of memory. The maximum size of a record is less than
innodb_page_size bytes.
recv_sys_t::parse(): Take the smart pointer as a template parameter.
Do not temporarily add a trailing NUL byte to FILE_ records, because
we are not supposed to modify the memory-mapped log file. (It is
attached in read-write mode already during recovery.)
recv_sys_t::parse_mtr(): Wrapper for recv_sys_t::parse().
recv_sys_t::parse_pmem(): Like parse_mtr(), but if PREMATURE_EOF would be
returned on PMEM, use recv_ring to wrap around the buffer to the start.
mtr_t::finish_write(), log_close(): Do not enforce log_sys.max_buf_free
on PMEM, because it has no meaning on the mmap-based log.
log_sys.write_to_buf: Count writes to log_sys.buf. Replaces
srv_stats.log_write_requests and export_vars.innodb_log_write_requests.
Protected by log_sys.mutex. Updated consistently in log_close().
Previously, mtr_t::commit() conditionally updated the count,
which was inconsistent.
log_sys.write_to_log: Count swaps of log_sys.buf and log_sys.flush_buf,
for writing to log_sys.log (the ib_logfile0). Replaces
srv_stats.log_writes and export_vars.innodb_log_writes.
Protected by log_sys.mutex.
log_sys.waits: Count waits in append_prepare(). Replaces
srv_stats.log_waits and export_vars.innodb_log_waits.
recv_recover_page(): Do not unnecessarily acquire
log_sys.flush_order_mutex. We are inserting the blocks in arbitary
order anyway, to be adjusted in recv_sys.apply(true).
We will change the definition of flush_lock and write_lock to
avoid potential false sharing. Depending on sizeof(log_sys) and
CPU_LEVEL1_DCACHE_LINESIZE, the flush_lock and write_lock could
share a cache line with each other or with the last data members
of log_sys.
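
One common way to rule out this kind of false sharing is to align each
lock to its own cache line. The following is a minimal sketch, not the
actual log_sys definition: log_like, cache_line and same_cache_line()
are hypothetical names, and the value 64 merely stands in for
CPU_LEVEL1_DCACHE_LINESIZE.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache line size; stands in for CPU_LEVEL1_DCACHE_LINESIZE.
constexpr size_t cache_line = 64;

// Hypothetical stand-in for a structure with two hot locks at the end.
struct log_like
{
  char other_members[100]; // placeholder for unrelated data members
  alignas(cache_line) std::atomic<unsigned> flush_lock{0};
  alignas(cache_line) std::atomic<unsigned> write_lock{0};
};

// The alignment of the members propagates to the whole structure,
// and the size is padded to a multiple of that alignment.
static_assert(alignof(log_like) >= cache_line, "structure is line-aligned");
static_assert(sizeof(log_like) % cache_line == 0, "size is padded");

// Return whether two addresses fall into the same cache line.
inline bool same_cache_line(const void *a, const void *b)
{
  return reinterpret_cast<uintptr_t>(a) / cache_line ==
    reinterpret_cast<uintptr_t>(b) / cache_line;
}
```

With this layout, neither lock can share a line with the other or with
the preceding members, regardless of the size of those members.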
Thanks to Matthias Leich for providing https://rr-project.org traces
for various failures during the development, and to
Thirunarayanan Balathandayuthapani for his help in debugging
some of the recovery code. And thanks to the developers of the
rr debugger for a tool without which extensive changes to InnoDB
would be very challenging to get right.
Thanks to Vladislav Vaintroub for useful feedback and
to him, Axel Schwenke and Krunal Bauskar for testing the performance.
4 years ago  MDEV-33379 innodb_log_file_buffering=OFF causes corruption on bcachefs
Apparently, invoking fcntl(fd, F_SETFL, O_DIRECT) will lead to
unexpected behaviour on Linux bcachefs and possibly other file systems,
depending on the operating system version. So, let us avoid doing that,
and instead just attempt to pass the O_DIRECT flag to open(). This should
make us compatible with NetBSD, IBM AIX, as well as Solaris and its
derivatives.
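
The approach of passing O_DIRECT to open() and falling back to a
buffered open could look roughly like this. It is a sketch under
assumptions: open_maybe_direct() is a hypothetical helper, not one of
the functions changed by this commit, and treating EINVAL as "O_DIRECT
not supported here" reflects typical Linux behaviour rather than a
guarantee for every file system.

```cpp
#ifndef _GNU_SOURCE
# define _GNU_SOURCE // glibc exposes O_DIRECT only with this defined
#endif
#include <fcntl.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>

// Hypothetical helper: try O_DIRECT at open() time instead of
// enabling it afterwards with fcntl(fd, F_SETFL, O_DIRECT).
// Returns the file descriptor, or -1 on failure; *direct reports
// whether the descriptor was opened with O_DIRECT.
int open_maybe_direct(const char *path, int flags, bool *direct)
{
  assert(path && direct);
#ifdef O_DIRECT
  int fd = open(path, flags | O_DIRECT, 0660);
  if (fd >= 0)
  {
    *direct = true;
    return fd;
  }
  if (errno != EINVAL)
    return -1; // a genuine error such as ENOENT or EACCES
#endif
  // Either O_DIRECT is not defined on this platform, or the
  // file system rejected it; retry as a normal buffered open.
  *direct = false;
  return open(path, flags, 0660);
}
```

On platforms whose headers do not define O_DIRECT, the helper compiles
down to a plain open(), so no platform-specific fcntl() call is needed.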
This fix does not change the fact that we had only implemented
innodb_log_file_buffering=OFF on systems where we can determine the
physical block size (typically 512 or 4096 bytes).
Currently, those operating systems are Linux and Microsoft Windows.
HAVE_FCNTL_DIRECT, os_file_set_nocache(): Remove.
OS_FILE_OVERWRITE, OS_FILE_CREATE_PATH: Remove (never used parameters).
os_file_log_buffered(), os_file_log_maybe_unbuffered(): Helper functions.
os_file_create_simple_func(): When applicable, initially attempt to
open files in O_DIRECT mode.
os_file_create_func(): When applicable, initially attempt to
open files in O_DIRECT mode.
For type==OS_LOG_FILE && create_mode != OS_FILE_CREATE
we will first invoke stat(2) on the file name to find out if the size
is compatible with O_DIRECT. If create_mode == OS_FILE_CREATE, we will
invoke fstat(2) on the created log file afterwards, and may close and
reopen the file in O_DIRECT mode if applicable.
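
The stat(2) check before opening an existing log file might be sketched
as below. The criterion used here (a regular file whose size is a
multiple of a power-of-two block size) is an assumption made for
illustration; size_allows_direct() and path_allows_direct() are
hypothetical names and do not reproduce the logic of
os_file_log_maybe_unbuffered().

```cpp
#include <sys/stat.h>
#include <sys/types.h>
#include <cassert>
#include <cstddef>

// Assumed criterion: the block size is a power of two and the
// file size is a positive multiple of it.
bool size_allows_direct(off_t size, size_t block_size)
{
  if (!block_size || (block_size & (block_size - 1)))
    return false;
  return size > 0 && size % off_t(block_size) == 0;
}

// Check an existing file by name before deciding on the open() flags.
bool path_allows_direct(const char *path, size_t block_size)
{
  assert(path);
  struct stat st;
  return stat(path, &st) == 0 && S_ISREG(st.st_mode) &&
    size_allows_direct(st.st_size, block_size);
}
```

For a newly created file the same size check could be applied to the
result of fstat(2) on the open descriptor, followed by a close and
reopen as described above.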
create_temp_file(): Support O_DIRECT. This is only used if O_TMPFILE is
available and innodb_disable_sort_file_cache=ON (non-default value).
Notably, that setting never worked on Microsoft Windows.
row_merge_file_create_mode(): Split from row_merge_file_create_low().
Create a temporary file in the specified mode.
Reviewed by: Vladislav Vaintroub
2 years ago  MDEV-33379 innodb_log_file_buffering=OFF causes corruption on bcachefs
Apparently, invoking fcntl(fd, F_SETFL, O_DIRECT) will lead to
unexpected behaviour on Linux bcachefs and possibly other file systems,
depending on the operating system version. So, let us avoid doing that,
and instead just attempt to pass the O_DIRECT flag to open(). This should
make us compatible with NetBSD, IBM AIX, as well as Solaris and its
derivatives.
This fix does not change the fact that we had only implemented
innodb_log_file_buffering=OFF on systems where we can determine the
physical block size (typically 512 or 4096 bytes).
Currently, those operating systems are Linux and Microsoft Windows.
HAVE_FCNTL_DIRECT, os_file_set_nocache(): Remove.
OS_FILE_OVERWRITE, OS_FILE_CREATE_PATH: Remove (never used parameters).
os_file_log_buffered(), os_file_log_maybe_unbuffered(): Helper functions.
os_file_create_simple_func(): When applicable, initially attempt to
open files in O_DIRECT mode.
os_file_create_func(): When applicable, initially attempt to
open files in O_DIRECT mode.
For type==OS_LOG_FILE && create_mode != OS_FILE_CREATE
we will first invoke stat(2) on the file name to find out if the size
is compatible with O_DIRECT. If create_mode == OS_FILE_CREATE, we will
invoke fstat(2) on the created log file afterwards, and may close and
reopen the file in O_DIRECT mode if applicable.
create_temp_file(): Support O_DIRECT. This is only used if O_TMPFILE is
available and innodb_disable_sort_file_cache=ON (non-default value).
Notably, that setting never worked on Microsoft Windows.
row_merge_file_create_mode(): Split from row_merge_file_create_low().
Create a temporary file in the specified mode.
Reviewed by: Vladislav Vaintroub
2 years ago  MDEV-33379 innodb_log_file_buffering=OFF causes corruption on bcachefs
Apparently, invoking fcntl(fd, F_SETFL, O_DIRECT) will lead to
unexpected behaviour on Linux bcachefs and possibly other file systems,
depending on the operating system version. So, let us avoid doing that,
and instead just attempt to pass the O_DIRECT flag to open(). This should
make us compatible with NetBSD, IBM AIX, as well as Solaris and its
derivatives.
This fix does not change the fact that we had only implemented
innodb_log_file_buffering=OFF on systems where we can determine the
physical block size (typically 512 or 4096 bytes).
Currently, those operating systems are Linux and Microsoft Windows.
HAVE_FCNTL_DIRECT, os_file_set_nocache(): Remove.
OS_FILE_OVERWRITE, OS_FILE_CREATE_PATH: Remove (never used parameters).
os_file_log_buffered(), os_file_log_maybe_unbuffered(): Helper functions.
os_file_create_simple_func(): When applicable, initially attempt to
open files in O_DIRECT mode.
os_file_create_func(): When applicable, initially attempt to
open files in O_DIRECT mode.
For type==OS_LOG_FILE && create_mode != OS_FILE_CREATE
we will first invoke stat(2) on the file name to find out if the size
is compatible with O_DIRECT. If create_mode == OS_FILE_CREATE, we will
invoke fstat(2) on the created log file afterwards, and may close and
reopen the file in O_DIRECT mode if applicable.
create_temp_file(): Support O_DIRECT. This is only used if O_TMPFILE is
available and innodb_disable_sort_file_cache=ON (non-default value).
Notably, that setting never worked on Microsoft Windows.
row_merge_file_create_mode(): Split from row_merge_file_create_low().
Create a temporary file in the specified mode.
Reviewed by: Vladislav Vaintroub
2 years ago  MDEV-33379 innodb_log_file_buffering=OFF causes corruption on bcachefs
Apparently, invoking fcntl(fd, F_SETFL, O_DIRECT) will lead to
unexpected behaviour on Linux bcachefs and possibly other file systems,
depending on the operating system version. So, let us avoid doing that,
and instead just attempt to pass the O_DIRECT flag to open(). This should
make us compatible with NetBSD, IBM AIX, as well as Solaris and its
derivatives.
This fix does not change the fact that we had only implemented
innodb_log_file_buffering=OFF on systems where we can determine the
physical block size (typically 512 or 4096 bytes).
Currently, those operating systems are Linux and Microsoft Windows.
HAVE_FCNTL_DIRECT, os_file_set_nocache(): Remove.
OS_FILE_OVERWRITE, OS_FILE_CREATE_PATH: Remove (never used parameters).
os_file_log_buffered(), os_file_log_maybe_unbuffered(): Helper functions.
os_file_create_simple_func(): When applicable, initially attempt to
open files in O_DIRECT mode.
os_file_create_func(): When applicable, initially attempt to
open files in O_DIRECT mode.
For type==OS_LOG_FILE && create_mode != OS_FILE_CREATE
we will first invoke stat(2) on the file name to find out if the size
is compatible with O_DIRECT. If create_mode == OS_FILE_CREATE, we will
invoke fstat(2) on the created log file afterwards, and may close and
reopen the file in O_DIRECT mode if applicable.
create_temp_file(): Support O_DIRECT. This is only used if O_TMPFILE is
available and innodb_disable_sort_file_cache=ON (non-default value).
Notably, that setting never worked on Microsoft Windows.
row_merge_file_create_mode(): Split from row_merge_file_create_low().
Create a temporary file in the specified mode.
Reviewed by: Vladislav Vaintroub
2 years ago  MDEV-33379 innodb_log_file_buffering=OFF causes corruption on bcachefs
Apparently, invoking fcntl(fd, F_SETFL, O_DIRECT) will lead to
unexpected behaviour on Linux bcachefs and possibly other file systems,
depending on the operating system version. So, let us avoid doing that,
and instead just attempt to pass the O_DIRECT flag to open(). This should
make us compatible with NetBSD, IBM AIX, as well as Solaris and its
derivatives.
This fix does not change the fact that we had only implemented
innodb_log_file_buffering=OFF on systems where we can determine the
physical block size (typically 512 or 4096 bytes).
Currently, those operating systems are Linux and Microsoft Windows.
HAVE_FCNTL_DIRECT, os_file_set_nocache(): Remove.
OS_FILE_OVERWRITE, OS_FILE_CREATE_PATH: Remove (never used parameters).
os_file_log_buffered(), os_file_log_maybe_unbuffered(): Helper functions.
os_file_create_simple_func(): When applicable, initially attempt to
open files in O_DIRECT mode.
os_file_create_func(): When applicable, initially attempt to
open files in O_DIRECT mode.
For type==OS_LOG_FILE && create_mode != OS_FILE_CREATE
we will first invoke stat(2) on the file name to find out if the size
is compatible with O_DIRECT. If create_mode == OS_FILE_CREATE, we will
invoke fstat(2) on the created log file afterwards, and may close and
reopen the file in O_DIRECT mode if applicable.
create_temp_file(): Support O_DIRECT. This is only used if O_TMPFILE is
available and innodb_disable_sort_file_cache=ON (non-default value).
Notably, that setting never worked on Microsoft Windows.
row_merge_file_create_mode(): Split from row_merge_file_create_low().
Create a temporary file in the specified mode.
Reviewed by: Vladislav Vaintroub
2 years ago  MDEV-14425 Improve the redo log for concurrency
The InnoDB redo log used to be formatted in blocks of 512 bytes.
The log blocks were encrypted and the checksum was calculated while
holding log_sys.mutex, creating a serious scalability bottleneck.
We remove the fixed-size redo log block structure altogether and
essentially turn every mini-transaction into a log block of its own.
This allows encryption and checksum calculations to be performed
on local mtr_t::m_log buffers, before acquiring log_sys.mutex.
The mutex only protects a memcpy() of the data to the shared
log_sys.buf, as well as the padding of the log, in case the
to-be-written part of the log would not end in a block boundary of
the underlying storage. For now, the "padding" consists of writing
a single NUL byte, to allow recovery and mariadb-backup to detect
the end of the circular log faster.
Like the previous implementation, we will overwrite the last log block
over and over again, until it has been completely filled. It would be
possible to write only up to the last completed block (if no more
recent write was requested), or to write dummy FILE_CHECKPOINT records
to fill the incomplete block, by invoking the currently disabled
function log_pad(). This would require adjustments to some logic around
log checkpoints, page flushing, and shutdown.
An upgrade after a crash of any previous version is not supported.
Logically empty log files from a previous version will be upgraded.
An attempt to start up InnoDB without a valid ib_logfile0 will be
refused. Previously, the redo log used to be created automatically
if it was missing. Only with innodb_force_recovery=6 is it
possible to start InnoDB in read-only mode even if the log file
does not exist. This allows the contents of a possibly corrupted
database to be dumped.
Because a prepared backup from an earlier version of mariadb-backup
will create a 0-sized log file, we will allow an upgrade from such
log files, provided that the FIL_PAGE_FILE_FLUSH_LSN in the system
tablespace looks valid.
The 512-byte log checkpoint blocks at 0x200 and 0x600 will be replaced
with 64-byte log checkpoint blocks at 0x1000 and 0x2000.
The start of log records will move from 0x800 to 0x3000. This allows us
to use 4096-byte aligned blocks for all I/O in a future revision.
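
The new layout can be written down as constants. The offsets below come from the text; the function lsn_to_offset is only a sketch that assumes LSNs advance by one per byte and that first_lsn is the LSN stored at the start of the record area.

```python
CHECKPOINT_1 = 0x1000    # first 64-byte checkpoint block
CHECKPOINT_2 = 0x2000    # second 64-byte checkpoint block
CHECKPOINT_SIZE = 64
START_OFFSET = 0x3000    # first byte of log records
IO_BLOCK = 4096          # alignment that the new offsets make possible

def lsn_to_offset(lsn, first_lsn, file_size):
    """Map an LSN to a file offset in the circular record area.

    Assumes first_lsn is the LSN stored at START_OFFSET and that LSNs
    advance by one per byte; both are assumptions of this sketch.
    """
    capacity = file_size - START_OFFSET
    return START_OFFSET + (lsn - first_lsn) % capacity
```

All three offsets are multiples of 4096, which is what allows 4096-byte aligned I/O.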
We extend the MDEV-12353 redo log record format as follows.
(1) Empty mini-transactions or extra NUL bytes will not be allowed.
(2) The end-of-minitransaction marker (a NUL byte) will be replaced
with a 1-bit sequence number, which will be toggled each time the
circular log file wraps back to the beginning.
(3) After the sequence bit, a CRC-32C checksum of all data
(excluding the sequence bit) will be written.
(4) If the log is encrypted, 8 bytes will be written before
the checksum and included in it. This is part of the
initialization vector (IV) of encrypted log data.
(5) File names, page numbers, and checkpoint information will not be
encrypted. Only the payload bytes of page-level log will be encrypted.
The tablespace ID and page number will form part of the IV.
(6) For padding, arbitrary-length FILE_CHECKPOINT records may be written,
with all-zero payload, and with the normal end marker and checksum.
The minimum size is 7 bytes, or 7+8 with innodb_encrypt_log=ON.
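
Items (2) and (3) can be sketched as below for the unencrypted case. The CRC-32C routine is the standard Castagnoli checksum; the framing itself is an assumption of this sketch: the sequence bit is stored as a whole byte (0 or 1) and the checksum as 4 big-endian bytes. The names frame_mtr and check_mtr are hypothetical.

```python
import struct

def _make_table():
    # Table for the reflected Castagnoli polynomial 0x1EDC6F41.
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0x82F63B78 if c & 1 else c >> 1
        table.append(c)
    return table

_TABLE = _make_table()

def crc32c(data):
    """Compute the CRC-32C checksum of a bytes object."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = _TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF

def frame_mtr(records, sequence_bit):
    """Append a sequence byte and a CRC-32C of the records (assumed layout)."""
    if not records:
        raise ValueError("empty mini-transactions are not allowed")
    return records + bytes([sequence_bit & 1]) + struct.pack(">I", crc32c(records))

def check_mtr(frame, expected_bit):
    """Validate the sequence byte and the checksum of a framed mini-transaction."""
    records, marker, stored = frame[:-5], frame[-5], frame[-4:]
    return marker == expected_bit and struct.unpack(">I", stored)[0] == crc32c(records)
```

A reader that expects the current sequence bit rejects stale data left over from the previous pass through the circular file, because that data carries the opposite bit.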
In mariadb-backup and in Galera snapshot transfer (SST) scripts, we will
no longer remove ib_logfile0 or create an empty ib_logfile0. Server startup
will require a valid log file. When resizing the log, we will create
a logically empty ib_logfile101 at the current LSN and use an atomic rename
to replace ib_logfile0 with it. See the test innodb.log_file_size.
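
The replace-by-rename step can be illustrated as follows. This is a sketch only: the contents of a "logically empty" log are not reproduced here, so the caller passes placeholder bytes, and the function name replace_log is hypothetical.

```python
import os

def replace_log(directory, contents):
    """Write a new log under a temporary name, then rename it into place."""
    tmp = os.path.join(directory, "ib_logfile101")
    final = os.path.join(directory, "ib_logfile0")
    with open(tmp, "wb") as f:
        f.write(contents)
        f.flush()
        os.fsync(f.fileno())   # make the new file durable before the rename
    os.replace(tmp, final)     # atomic on POSIX when both names share a file system
    return final
```

Because the rename either happens or does not, a crash leaves either the old or the new ib_logfile0 in place, never a partly written one.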
Because there is no mandatory padding in the log file, we are able
to create a dummy log file as of an arbitrary log sequence number.
See the test mariabackup.huge_lsn.
The parameter innodb_log_write_ahead_size and the
INFORMATION_SCHEMA.INNODB_METRICS counter log_padded will be removed.
The minimum value of innodb_log_buffer_size will be increased to 2MiB
(because log_sys.buf will replace recv_sys.buf) and the increment
adjusted to 4096 bytes (the maximum log block size).
The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed:
os_log_fsyncs
os_log_pending_fsyncs
log_pending_log_flushes
log_pending_checkpoint_writes
The following status variables will be removed:
Innodb_os_log_fsyncs (this is included in Innodb_data_fsyncs)
Innodb_os_log_pending_fsyncs (this was limited to at most 1 by design)
log_sys.get_block_size(): Return the physical block size of the log file.
This is only implemented on Linux and Microsoft Windows for now, and for
the power-of-2 block sizes between 64 and 4096 bytes (the minimum and
maximum size of a checkpoint block). If the block size is anything else,
the traditional 512-byte size will be used via normal file system
buffering.
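
The block size rule above amounts to the following check. The function name and the tuple it returns are hypothetical; the second element only says whether bypassing the file system buffers may be attempted at all.

```python
def choose_log_block_size(physical):
    """Return (block_size, may_bypass_buffers) following the rule above."""
    is_pow2 = physical > 0 and (physical & (physical - 1)) == 0
    if is_pow2 and 64 <= physical <= 4096:
        return physical, True
    # Anything else: traditional 512-byte size via normal buffering.
    return 512, False
```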
If the file system buffers can be bypassed, a message like the following
will be issued:
InnoDB: File system buffers for log disabled (block size=512 bytes)
InnoDB: File system buffers for log disabled (block size=4096 bytes)
This has been tested on Linux and Microsoft Windows with both sizes.
On Linux, only enable O_DIRECT on the log for innodb_flush_method=O_DSYNC.
Tests in 3 different environments where the log is stored in a device
with a physical block size of 512 bytes are yielding better throughput
without O_DIRECT. This could be due to the fact that in the event the
last log block is being overwritten (if multiple transactions would
become durable at the same time, and each of them will write a small
number of bytes to the last log block), it should be faster to re-copy
data from log_sys.buf or log_sys.flush_buf to the kernel buffer,
to be finally written at fdatasync() time.
The parameter innodb_flush_method=O_DSYNC will imply O_DIRECT for
data files. This option will enable O_DIRECT on the log file on Linux.
It may be unsafe to use when the storage device does not support
FUA (Force Unit Access) mode.
When the server is compiled WITH_PMEM=ON, we will use memory-mapped
I/O for the log file if the log resides on a "mount -o dax" device.
We will identify PMEM in a start-up message:
InnoDB: log sequence number 0 (memory-mapped); transaction id 3
On Linux, we will also invoke mmap() on any ib_logfile0 that resides
in /dev/shm, effectively treating the log file as persistent memory.
This should speed up "./mtr --mem" and increase the test coverage of
PMEM on non-PMEM hardware. It also allows users to estimate how much
the performance would be improved by installing persistent memory.
On other tmpfs file systems such as /run, we will not use mmap().
mariadb-backup: Eliminated several variables. We will refer
directly to recv_sys and log_sys.
backup_wait_for_lsn(): Detect non-progress of
xtrabackup_copy_logfile(). In this new log format with
arbitrary-sized blocks, we can only detect log file overrun
indirectly, by observing that the scanned log sequence number
is not advancing.
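
The indirect detection can be sketched as a check on a series of observed LSNs. The function name copy_is_stalled and the threshold max_idle_polls are hypothetical; the text does not say how many polls without progress are tolerated.

```python
def copy_is_stalled(scanned_lsns, max_idle_polls=3):
    """Return True if the scanned LSN failed to advance for max_idle_polls polls."""
    idle = 0
    last = None
    for lsn in scanned_lsns:
        if last is not None and lsn <= last:
            idle += 1              # no progress since the previous poll
            if idle >= max_idle_polls:
                return True
        else:
            idle = 0               # progress was made; start counting again
        last = lsn if last is None else max(last, lsn)
    return False
```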
xtrabackup_copy_logfile(): On PMEM, do not modify the sequence bit,
because we are not allowed to modify the server's log file, and our
memory mapping is read-only.
trx_flush_log_if_needed_low(): Do not use the callback on PMEM.
Using neither flush_lock nor write_lock around PMEM writes seems
to yield the best performance. The pmem_persist() calls may
still be somewhat slower than the pwrite() and fdatasync() based
interface (PMEM mounted without -o dax).
recv_sys_t::buf: Remove. We will use log_sys.buf for parsing.
recv_sys_t::MTR_SIZE_MAX: Replaces RECV_SCAN_SIZE.
recv_sys_t::file_checkpoint: Renamed from mlog_checkpoint_lsn.
recv_sys_t, log_sys_t: Removed many data members.
recv_sys.lsn: Renamed from recv_sys.recovered_lsn.
recv_sys.offset: Renamed from recv_sys.recovered_offset.
log_sys.buf_size: Replaces srv_log_buffer_size.
recv_buf: A smart pointer that wraps log_sys.buf[recv_sys.offset]
when the buffer is being allocated from the memory heap.
recv_ring: A smart pointer that wraps a circular log_sys.buf[] that is
backed by ib_logfile0. The pointer will wrap from recv_sys.len
(log_sys.file_size) to log_sys.START_OFFSET. For the record that
wraps around, we may copy file name or record payload data to
the auxiliary buffer decrypt_buf in order to have a contiguous
block of memory. The maximum size of a record is less than
innodb_page_size bytes.
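
The wrap-around behaviour of recv_ring can be illustrated with a plain function over a byte buffer. This is a sketch, not the smart pointer itself: read_wrapped is a hypothetical name, and the contiguous copy it returns plays the role that the auxiliary buffer plays in the text.

```python
def read_wrapped(buf, offset, length, start_offset):
    """Read length bytes at offset from a circular buffer.

    The record area is buf[start_offset:len(buf)]; a read that reaches the
    end continues at start_offset. A wrapped read is copied into a new
    contiguous bytes object.
    """
    end = len(buf)
    if not (start_offset <= offset < end) or length > end - start_offset:
        raise ValueError("read outside the circular area")
    first = min(length, end - offset)
    if first == length:
        return bytes(buf[offset:offset + length])   # no wrap needed
    rest = length - first
    return bytes(buf[offset:end]) + bytes(buf[start_offset:start_offset + rest])
```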
recv_sys_t::parse(): Take the smart pointer as a template parameter.
Do not temporarily add a trailing NUL byte to FILE_ records, because
we are not supposed to modify the memory-mapped log file. (It is
attached in read-write mode already during recovery.)
recv_sys_t::parse_mtr(): Wrapper for recv_sys_t::parse().
recv_sys_t::parse_pmem(): Like parse_mtr(), but if PREMATURE_EOF would be
returned on PMEM, use recv_ring to wrap around the buffer to the start.
mtr_t::finish_write(), log_close(): Do not enforce log_sys.max_buf_free
on PMEM, because it has no meaning on the mmap-based log.
log_sys.write_to_buf: Count writes to log_sys.buf. Replaces
srv_stats.log_write_requests and export_vars.innodb_log_write_requests.
Protected by log_sys.mutex. Updated consistently in log_close().
Previously, mtr_t::commit() conditionally updated the count,
which was inconsistent.
log_sys.write_to_log: Count swaps of log_sys.buf and log_sys.flush_buf,
for writing to log_sys.log (the ib_logfile0). Replaces
srv_stats.log_writes and export_vars.innodb_log_writes.
Protected by log_sys.mutex.
log_sys.waits: Count waits in append_prepare(). Replaces
srv_stats.log_waits and export_vars.innodb_log_waits.
recv_recover_page(): Do not unnecessarily acquire
log_sys.flush_order_mutex. We are inserting the blocks in arbitrary
order anyway, to be adjusted in recv_sys.apply(true).
We will change the definition of flush_lock and write_lock to
avoid potential false sharing. Depending on sizeof(log_sys) and
CPU_LEVEL1_DCACHE_LINESIZE, the flush_lock and write_lock could
share a cache line with each other or with the last data members
of log_sys.
Thanks to Matthias Leich for providing https://rr-project.org traces
for various failures during the development, and to
Thirunarayanan Balathandayuthapani for his help in debugging
some of the recovery code. And thanks to the developers of the
rr debugger for a tool without which extensive changes to InnoDB
would be very challenging to get right.
Thanks to Vladislav Vaintroub for useful feedback and
to him, Axel Schwenke and Krunal Bauskar for testing the performance.
2 years ago  MDEV-14425 Improve the redo log for concurrency
The InnoDB redo log used to be formatted in blocks of 512 bytes.
The log blocks were encrypted and the checksum was calculated while
holding log_sys.mutex, creating a serious scalability bottleneck.
We remove the fixed-size redo log block structure altogether and
essentially turn every mini-transaction into a log block of its own.
This allows encryption and checksum calculations to be performed
on local mtr_t::m_log buffers, before acquiring log_sys.mutex.
The mutex only protects a memcpy() of the data to the shared
log_sys.buf, as well as the padding of the log, in case the
to-be-written part of the log would not end in a block boundary of
the underlying storage. For now, the "padding" consists of writing
a single NUL byte, to allow recovery and mariadb-backup to detect
the end of the circular log faster.
Like the previous implementation, we will overwrite the last log block
over and over again, until it has been completely filled. It would be
possible to write only up to the last completed block (if no more
recent write was requested), or to write dummy FILE_CHECKPOINT records
to fill the incomplete block, by invoking the currently disabled
function log_pad(). This would require adjustments to some logic around
log checkpoints, page flushing, and shutdown.
An upgrade after a crash of any previous version is not supported.
Logically empty log files from a previous version will be upgraded.
An attempt to start up InnoDB without a valid ib_logfile0 will be
refused. Previously, the redo log used to be created automatically
if it was missing. Only with with innodb_force_recovery=6, it is
possible to start InnoDB in read-only mode even if the log file
does not exist. This allows the contents of a possibly corrupted
database to be dumped.
Because a prepared backup from an earlier version of mariadb-backup
will create a 0-sized log file, we will allow an upgrade from such
log files, provided that the FIL_PAGE_FILE_FLUSH_LSN in the system
tablespace looks valid.
The 512-byte log checkpoint blocks at 0x200 and 0x600 will be replaced
with 64-byte log checkpoint blocks at 0x1000 and 0x2000.
The start of log records will move from 0x800 to 0x3000. This allows us
to use 4096-byte aligned blocks for all I/O in a future revision.
We extend the MDEV-12353 redo log record format as follows.
(1) Empty mini-transactions or extra NUL bytes will not be allowed.
(2) The end-of-minitransaction marker (a NUL byte) will be replaced
with a 1-bit sequence number, which will be toggled each time when the
circular log file wraps back to the beginning.
(3) After the sequence bit, a CRC-32C checksum of all data
(excluding the sequence bit) will written.
(4) If the log is encrypted, 8 bytes will be written before
the checksum and included in it. This is part of the
initialization vector (IV) of encrypted log data.
(5) File names, page numbers, and checkpoint information will not be
encrypted. Only the payload bytes of page-level log will be encrypted.
The tablespace ID and page number will form part of the IV.
(6) For padding, arbitrary-length FILE_CHECKPOINT records may be written,
with all-zero payload, and with the normal end marker and checksum.
The minimum size is 7 bytes, or 7+8 with innodb_encrypt_log=ON.
In mariadb-backup and in Galera snapshot transfer (SST) scripts, we will
no longer remove ib_logfile0 or create an empty ib_logfile0. Server startup
will require a valid log file. When resizing the log, we will create
a logically empty ib_logfile101 at the current LSN and use an atomic rename
to replace ib_logfile0 with it. See the test innodb.log_file_size.
Because there is no mandatory padding in the log file, we are able
to create a dummy log file as of an arbitrary log sequence number.
See the test mariabackup.huge_lsn.
The parameter innodb_log_write_ahead_size and the
INFORMATION_SCHEMA.INNODB_METRICS counter log_padded will be removed.
The minimum value of innodb_log_buffer_size will be increased to 2MiB
(because log_sys.buf will replace recv_sys.buf) and the increment
adjusted to 4096 bytes (the maximum log block size).
The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed:
os_log_fsyncs
os_log_pending_fsyncs
log_pending_log_flushes
log_pending_checkpoint_writes
The following status variables will be removed:
Innodb_os_log_fsyncs (this is included in Innodb_data_fsyncs)
Innodb_os_log_pending_fsyncs (this was limited to at most 1 by design)
log_sys.get_block_size(): Return the physical block size of the log file.
This is only implemented on Linux and Microsoft Windows for now, and for
the power-of-2 block sizes between 64 and 4096 bytes (the minimum and
maximum size of a checkpoint block). If the block size is anything else,
the traditional 512-byte size will be used via normal file system
buffering.
If the file system buffers can be bypassed, a message like the following
will be issued:
InnoDB: File system buffers for log disabled (block size=512 bytes)
InnoDB: File system buffers for log disabled (block size=4096 bytes)
This has been tested on Linux and Microsoft Windows with both sizes.
On Linux, only enable O_DIRECT on the log for innodb_flush_method=O_DSYNC.
Tests in 3 different environments where the log is stored in a device
with a physical block size of 512 bytes are yielding better throughput
without O_DIRECT. This could be due to the fact that in the event the
last log block is being overwritten (if multiple transactions would
become durable at the same time, and each of will write a small
number of bytes to the last log block), it should be faster to re-copy
data from log_sys.buf or log_sys.flush_buf to the kernel buffer,
to be finally written at fdatasync() time.
The parameter innodb_flush_method=O_DSYNC will imply O_DIRECT for
data files. This option will enable O_DIRECT on the log file on Linux.
It may be unsafe to use when the storage device does not support
FUA (Force Unit Access) mode.
When the server is compiled WITH_PMEM=ON, we will use memory-mapped
I/O for the log file if the log resides on a "mount -o dax" device.
We will identify PMEM in a start-up message:
InnoDB: log sequence number 0 (memory-mapped); transaction id 3
On Linux, we will also invoke mmap() on any ib_logfile0 that resides
in /dev/shm, effectively treating the log file as persistent memory.
This should speed up "./mtr --mem" and increase the test coverage of
PMEM on non-PMEM hardware. It also allows users to estimate how much
the performance would be improved by installing persistent memory.
On other tmpfs file systems such as /run, we will not use mmap().
mariadb-backup: Eliminated several variables. We will refer
directly to recv_sys and log_sys.
backup_wait_for_lsn(): Detect non-progress of
xtrabackup_copy_logfile(). In this new log format with
arbitrary-sized blocks, we can only detect log file overrun
indirectly, by observing that the scanned log sequence number
is not advancing.
xtrabackup_copy_logfile(): On PMEM, do not modify the sequence bit,
because we are not allowed to modify the server's log file, and our
memory mapping is read-only.
trx_flush_log_if_needed_low(): Do not use the callback on pmem.
Using neither flush_lock nor write_lock around PMEM writes seems
to yield the best performance. The pmem_persist() calls may
still be somewhat slower than the pwrite() and fdatasync() based
interface (PMEM mounted without -o dax).
recv_sys_t::buf: Remove. We will use log_sys.buf for parsing.
recv_sys_t::MTR_SIZE_MAX: Replaces RECV_SCAN_SIZE.
recv_sys_t::file_checkpoint: Renamed from mlog_checkpoint_lsn.
recv_sys_t, log_sys_t: Removed many data members.
recv_sys.lsn: Renamed from recv_sys.recovered_lsn.
recv_sys.offset: Renamed from recv_sys.recovered_offset.
log_sys.buf_size: Replaces srv_log_buffer_size.
recv_buf: A smart pointer that wraps log_sys.buf[recv_sys.offset]
when the buffer is being allocated from the memory heap.
recv_ring: A smart pointer that wraps a circular log_sys.buf[] that is
backed by ib_logfile0. The pointer will wrap from recv_sys.len
(log_sys.file_size) to log_sys.START_OFFSET. For the record that
wraps around, we may copy file name or record payload data to
the auxiliary buffer decrypt_buf in order to have a contiguous
block of memory. The maximum size of a record is less than
innodb_page_size bytes.
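The wrap-around behaviour of recv_ring might look roughly like the sketch below. The type ring_ptr and its members are hypothetical stand-ins; the real smart pointer operates on log_sys.buf and differs in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for recv_ring: an offset into a circular buffer
// whose usable region is [start, len). Offsets below start model the
// log file header, which log records never occupy.
struct ring_ptr
{
  const uint8_t *buf;
  size_t start, len, pos;

  ring_ptr(const uint8_t *buf, size_t start, size_t len, size_t pos)
    : buf(buf), start(start), len(len), pos(pos)
  { assert(start < len && pos >= start && pos < len); }

  uint8_t operator*() const { return buf[pos]; }

  // Advance by one byte, wrapping from len back to start.
  ring_ptr &operator++()
  {
    if (++pos == len)
      pos= start;
    return *this;
  }

  // Copy n bytes, possibly across the wrap-around point, into a
  // contiguous destination (the role played by decrypt_buf above).
  void copy_to(uint8_t *dest, size_t n) const
  {
    ring_ptr p= *this;
    for (size_t i= 0; i < n; i++, ++p)
      dest[i]= *p;
  }
};
```

Here start corresponds to log_sys.START_OFFSET and len to log_sys.file_size; only a record that straddles the end of the file needs the extra copy.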
recv_sys_t::parse(): Take the smart pointer as a template parameter.
Do not temporarily add a trailing NUL byte to FILE_ records, because
we are not supposed to modify the memory-mapped log file. (It is
attached in read-write mode already during recovery.)
recv_sys_t::parse_mtr(): Wrapper for recv_sys_t::parse().
recv_sys_t::parse_pmem(): Like parse_mtr(), but if PREMATURE_EOF would be
returned on PMEM, use recv_ring to wrap around the buffer to the start.
mtr_t::finish_write(), log_close(): Do not enforce log_sys.max_buf_free
on PMEM, because it has no meaning on the mmap-based log.
log_sys.write_to_buf: Count writes to log_sys.buf. Replaces
srv_stats.log_write_requests and export_vars.innodb_log_write_requests.
Protected by log_sys.mutex. Updated consistently in log_close().
Previously, mtr_t::commit() conditionally updated the count,
which was inconsistent.
log_sys.write_to_log: Count swaps of log_sys.buf and log_sys.flush_buf,
for writing to log_sys.log (the ib_logfile0). Replaces
srv_stats.log_writes and export_vars.innodb_log_writes.
Protected by log_sys.mutex.
log_sys.waits: Count waits in append_prepare(). Replaces
srv_stats.log_waits and export_vars.innodb_log_waits.
recv_recover_page(): Do not unnecessarily acquire
log_sys.flush_order_mutex. We are inserting the blocks in arbitrary
order anyway, to be adjusted in recv_sys.apply(true).
We will change the definition of flush_lock and write_lock to
avoid potential false sharing. Depending on sizeof(log_sys) and
CPU_LEVEL1_DCACHE_LINESIZE, the flush_lock and write_lock could
share a cache line with each other or with the last data members
of log_sys.
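One common way to rule out such sharing is to align each lock to its own cache line. The following is a minimal sketch under assumptions: the 64-byte CACHE_LINE constant stands in for CPU_LEVEL1_DCACHE_LINESIZE, std::mutex stands in for the actual lock type, and log_locks is a hypothetical struct, not the definition used in log_sys.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Assumed cache line size; the real value depends on the build target.
constexpr size_t CACHE_LINE= 64;

// Hypothetical layout: each lock starts on its own cache line, so that
// contention on one cannot invalidate the line holding the other.
struct log_locks
{
  alignas(CACHE_LINE) std::mutex flush_lock;
  alignas(CACHE_LINE) std::mutex write_lock;
};

static_assert(alignof(log_locks) == CACHE_LINE, "struct is line-aligned");
static_assert(sizeof(log_locks) >= 2 * CACHE_LINE, "one line per lock");

// Helper for checking whether two objects start in the same cache line.
inline bool same_cache_line(const void *a, const void *b)
{
  return reinterpret_cast<uintptr_t>(a) / CACHE_LINE ==
    reinterpret_cast<uintptr_t>(b) / CACHE_LINE;
}
```

The cost is some padding; the benefit is that the layout no longer depends on what happens to precede the locks.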
Thanks to Matthias Leich for providing https://rr-project.org traces
for various failures during the development, and to
Thirunarayanan Balathandayuthapani for his help in debugging
some of the recovery code. And thanks to the developers of the
rr debugger for a tool without which extensive changes to InnoDB
would be very challenging to get right.
Thanks to Vladislav Vaintroub for useful feedback and
to him, Axel Schwenke and Krunal Bauskar for testing the performance.
4 years ago  MDEV-33379 innodb_log_file_buffering=OFF causes corruption on bcachefs
Apparently, invoking fcntl(fd, F_SETFL, O_DIRECT) will lead to
unexpected behaviour on Linux bcachefs and possibly other file systems,
depending on the operating system version. So, let us avoid doing that,
and instead just attempt to pass the O_DIRECT flag to open(). This should
make us compatible with NetBSD, IBM AIX, as well as Solaris and its
derivatives.
This fix does not change the fact that we had only implemented
innodb_log_file_buffering=OFF on systems where we can determine the
physical block size (typically 512 or 4096 bytes).
Currently, those operating systems are Linux and Microsoft Windows.
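The open()-based approach can be sketched as follows. This is not the code of os_file_create_func(); the helper name open_maybe_direct() is hypothetical, and the sketch assumes the Linux behaviour that open() fails with EINVAL when the file system does not support O_DIRECT.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: request O_DIRECT at open() time and fall back to
// a buffered open if the file system refuses the flag. It never calls
// fcntl(fd, F_SETFL, O_DIRECT) on an already open descriptor.
inline int open_maybe_direct(const char *path, int flags, bool *direct)
{
  assert(path && direct);
  *direct= false;
#ifdef O_DIRECT
  const int fd= open(path, flags | O_DIRECT, 0600);
  if (fd >= 0 || errno != EINVAL)
  {
    *direct= fd >= 0;
    return fd;              // success, or a failure unrelated to O_DIRECT
  }
#endif
  return open(path, flags, 0600);
}
```

On platforms that do not define O_DIRECT at all, the sketch degrades to a plain buffered open.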
HAVE_FCNTL_DIRECT, os_file_set_nocache(): Remove.
OS_FILE_OVERWRITE, OS_FILE_CREATE_PATH: Remove (never used parameters).
os_file_log_buffered(), os_file_log_maybe_unbuffered(): Helper functions.
os_file_create_simple_func(): When applicable, initially attempt to
open files in O_DIRECT mode.
os_file_create_func(): When applicable, initially attempt to
open files in O_DIRECT mode.
For type==OS_LOG_FILE && create_mode != OS_FILE_CREATE
we will first invoke stat(2) on the file name to find out if the size
is compatible with O_DIRECT. If create_mode == OS_FILE_CREATE, we will
invoke fstat(2) on the created log file afterwards, and may close and
reopen the file in O_DIRECT mode if applicable.
create_temp_file(): Support O_DIRECT. This is only used if O_TMPFILE is
available and innodb_disable_sort_file_cache=ON (non-default value).
Notably, that setting never worked on Microsoft Windows.
row_merge_file_create_mode(): Split from row_merge_file_create_low().
Create a temporary file in the specified mode.
Reviewed by: Vladislav Vaintroub
2 years ago  MDEV-25506 (3 of 3): Do not delete .ibd files before commit
This is a complete rewrite of DROP TABLE, also as part of other DDL,
such as ALTER TABLE, CREATE TABLE...SELECT, TRUNCATE TABLE.
The background DROP TABLE queue hack is removed.
If a transaction needs to drop and create a table by the same name
(like TRUNCATE TABLE does), it must first rename the table to an
internal #sql-ib name. No committed version of the data dictionary
will include any #sql-ib tables, because whenever a transaction
renames a table to a #sql-ib name, it will also drop that table.
Either the rename will be rolled back, or the drop will be committed.
Data files will be unlinked after the transaction has been committed
and a FILE_RENAME record has been durably written. The file will
actually be deleted when the detached file handle returned by
fil_delete_tablespace() is closed, after the latches have been
released. It is possible that a purge of the delete of the SYS_INDEXES
record for the clustered index will execute fil_delete_tablespace()
concurrently with the DDL transaction. In that case, the thread that
arrives later will wait for the other thread to finish.
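The "unlink first, delete on close" behaviour relies on POSIX file semantics, which can be illustrated with the sketch below. The type detached_file is hypothetical and has nothing to do with the handle returned by fil_delete_tablespace(); on Microsoft Windows the semantics differ.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical illustration of "detach, unlink, close later" on POSIX.
// The directory entry disappears at unlink(), but the storage is only
// released once the last open descriptor is closed.
struct detached_file
{
  int fd;

  explicit detached_file(const char *path) : fd(-1)
  {
    assert(path);
    fd= open(path, O_RDWR);
    if (fd >= 0)
      unlink(path);          // the name is gone; data still reachable via fd
  }

  ~detached_file()
  {
    if (fd >= 0)
      close(fd);             // the file is actually deleted here
  }

  detached_file(const detached_file&)= delete;
  detached_file &operator=(const detached_file&)= delete;
};
```

Deferring the close until after the latches have been released keeps the potentially slow deallocation out of the critical section.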
HTON_TRUNCATE_REQUIRES_EXCLUSIVE_USE: A new handler flag.
ha_innobase::truncate() now requires that all other references to
the table be released in advance. This was implemented by Monty.
ha_innobase::delete_table(): If CREATE TABLE..SELECT is detected,
we will "hijack" the current transaction, drop the table in
the current transaction and commit the current transaction.
This essentially fixes MDEV-21602. There is a FIXME comment about
making the check less failure-prone.
ha_innobase::truncate(), ha_innobase::delete_table():
Implement a fast path for temporary tables. We will no longer allow
temporary tables to use the adaptive hash index.
dict_table_t::mdl_name: The original table name for the purpose of
acquiring MDL in purge, to prevent a race condition between a
DDL transaction that is dropping a table, and purge processing
undo log records of DML that had executed before the DDL operation.
For #sql-backup- tables during ALTER TABLE...ALGORITHM=COPY, the
dict_table_t::mdl_name will differ from dict_table_t::name.
dict_table_t::parse_name(): Use mdl_name instead of name.
dict_table_rename_in_cache(): Update mdl_name.
For the internal FTS_ tables of FULLTEXT INDEX, purge would
acquire MDL on the FTS_ table name, but not on the main table,
and therefore it would be able to run concurrently with a
DDL transaction that is dropping the table. Previously, the
DROP TABLE queue hack prevented a race between purge and DDL.
For now, we introduce purge_sys.stop_FTS() to prevent purge from
opening any table, while a DDL transaction that may drop FTS_
tables is in progress. The function fts_lock_table(), which will
be invoked before the dictionary is locked, will wait for
purge to release any table handles.
trx_t::drop_table_statistics(): Drop statistics for the table.
This replaces dict_stats_drop_index(). We will drop or rename
persistent statistics atomically as part of DDL transactions.
On lock conflict for dropping statistics, we will fail instantly
with DB_LOCK_WAIT_TIMEOUT, because we will be holding the
exclusive data dictionary latch.
trx_t::commit_cleanup(): Separated from trx_t::commit_in_memory().
Relax an assertion around fts_commit() and allow DB_LOCK_WAIT_TIMEOUT
in addition to DB_DUPLICATE_KEY. The call to fts_commit() is
entirely misplaced here and may obviously break the consistency
of transactions that affect FULLTEXT INDEX. It needs to be fixed
separately.
dict_table_t::n_foreign_key_checks_running: Remove (MDEV-21175).
The counter was a work-around for missing meta-data locking (MDL)
on the SQL layer, and not really needed in MariaDB.
ER_TABLE_IN_FK_CHECK: Replaced with ER_UNUSED_28.
HA_ERR_TABLE_IN_FK_CHECK: Remove.
row_ins_check_foreign_constraints(): Do not acquire
dict_sys.latch either. The SQL-layer MDL will protect us.
This was reviewed by Thirunarayanan Balathandayuthapani
and tested by Matthias Leich.
4 years ago  MDEV-14425 Improve the redo log for concurrency
The InnoDB redo log used to be formatted in blocks of 512 bytes.
The log blocks were encrypted and the checksum was calculated while
holding log_sys.mutex, creating a serious scalability bottleneck.
We remove the fixed-size redo log block structure altogether and
essentially turn every mini-transaction into a log block of its own.
This allows encryption and checksum calculations to be performed
on local mtr_t::m_log buffers, before acquiring log_sys.mutex.
The mutex only protects a memcpy() of the data to the shared
log_sys.buf, as well as the padding of the log, in case the
to-be-written part of the log would not end in a block boundary of
the underlying storage. For now, the "padding" consists of writing
a single NUL byte, to allow recovery and mariadb-backup to detect
the end of the circular log faster.
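The division of work can be sketched as below. This is a toy model under assumptions: shared_log is a hypothetical type, std::mutex stands in for log_sys.mutex, and a one-byte sum stands in for the CRC-32C and the encryption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <mutex>
#include <vector>

// Hypothetical miniature of the scheme: all per-record work happens on a
// private buffer, and the shared mutex covers only the memcpy().
struct shared_log
{
  std::mutex mutex;
  std::vector<uint8_t> buf;
  size_t used= 0;

  explicit shared_log(size_t size) : buf(size) {}

  // Placeholder checksum (a byte sum), standing in for CRC-32C.
  static uint8_t checksum(const std::vector<uint8_t> &data)
  {
    uint8_t sum= 0;
    for (uint8_t b : data)
      sum= uint8_t(sum + b);
    return sum;
  }

  // Appends the record and returns the offset at which it was stored.
  size_t append(std::vector<uint8_t> local)
  {
    local.push_back(checksum(local));   // done without holding the mutex
    std::lock_guard<std::mutex> g(mutex);
    assert(used + local.size() <= buf.size());
    std::memcpy(buf.data() + used, local.data(), local.size());
    const size_t offset= used;
    used+= local.size();
    return offset;
  }
};
```

With fixed-size blocks, the checksum would cover data from several mini-transactions and so could only be computed under the mutex; per-record checksums remove that dependency.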
Like the previous implementation, we will overwrite the last log block
over and over again, until it has been completely filled. It would be
possible to write only up to the last completed block (if no more
recent write was requested), or to write dummy FILE_CHECKPOINT records
to fill the incomplete block, by invoking the currently disabled
function log_pad(). This would require adjustments to some logic around
log checkpoints, page flushing, and shutdown.
An upgrade after a crash of any previous version is not supported.
Logically empty log files from a previous version will be upgraded.
An attempt to start up InnoDB without a valid ib_logfile0 will be
refused. Previously, the redo log used to be created automatically
if it was missing. Only with innodb_force_recovery=6 is it
possible to start InnoDB in read-only mode even if the log file
does not exist. This allows the contents of a possibly corrupted
database to be dumped.
Because a prepared backup from an earlier version of mariadb-backup
will create a 0-sized log file, we will allow an upgrade from such
log files, provided that the FIL_PAGE_FILE_FLUSH_LSN in the system
tablespace looks valid.
The 512-byte log checkpoint blocks at 0x200 and 0x600 will be replaced
with 64-byte log checkpoint blocks at 0x1000 and 0x2000.
The start of log records will move from 0x800 to 0x3000. This allows us
to use 4096-byte aligned blocks for all I/O in a future revision.
We extend the MDEV-12353 redo log record format as follows.
(1) Empty mini-transactions or extra NUL bytes will not be allowed.
(2) The end-of-minitransaction marker (a NUL byte) will be replaced
with a 1-bit sequence number, which will be toggled each time when the
circular log file wraps back to the beginning.
(3) After the sequence bit, a CRC-32C checksum of all data
(excluding the sequence bit) will be written.
(4) If the log is encrypted, 8 bytes will be written before
the checksum and included in it. This is part of the
initialization vector (IV) of encrypted log data.
(5) File names, page numbers, and checkpoint information will not be
encrypted. Only the payload bytes of page-level log will be encrypted.
The tablespace ID and page number will form part of the IV.
(6) For padding, arbitrary-length FILE_CHECKPOINT records may be written,
with all-zero payload, and with the normal end marker and checksum.
The minimum size is 7 bytes, or 7+8 with innodb_encrypt_log=ON.
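Points (2) and (3) can be illustrated with a simplified framing sketch. The CRC-32C routine is the standard bitwise algorithm, but the functions frame() and valid(), the encoding of the marker byte, and the big-endian checksum are assumptions of this sketch; the real records also carry a type and length header, which is omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
inline uint32_t crc32c(const uint8_t *data, size_t len)
{
  uint32_t crc= ~uint32_t{0};
  for (size_t i= 0; i < len; i++)
  {
    crc^= data[i];
    for (int bit= 0; bit < 8; bit++)
      crc= (crc >> 1) ^ (0x82F63B78U & (0U - (crc & 1)));
  }
  return ~crc;
}

// Hypothetical framing: payload, one marker byte carrying the sequence
// bit, then a CRC-32C that covers the payload but not the marker.
inline std::vector<uint8_t> frame(const std::vector<uint8_t> &payload,
                                  bool sequence_bit)
{
  assert(!payload.empty());   // rule (1): no empty mini-transactions
  std::vector<uint8_t> out(payload);
  out.push_back(sequence_bit ? 1 : 0);
  const uint32_t crc= crc32c(payload.data(), payload.size());
  for (int shift= 24; shift >= 0; shift-= 8)
    out.push_back(uint8_t(crc >> shift));   // byte order is assumed
  return out;
}

// A frame is accepted only if its sequence bit matches the current
// generation of the circular log and the checksum verifies.
inline bool valid(const std::vector<uint8_t> &f, bool expected_bit)
{
  if (f.size() < 6)
    return false;
  const size_t n= f.size() - 5;
  if (f[n] != (expected_bit ? 1 : 0))
    return false;
  uint32_t stored= 0;
  for (size_t i= n + 1; i < f.size(); i++)
    stored= stored << 8 | f[i];
  return stored == crc32c(f.data(), n);
}
```

Because the bit is toggled on every wrap, a stale record left over from the previous pass through the file carries the wrong bit and is rejected even though its checksum is intact.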
In mariadb-backup and in Galera snapshot transfer (SST) scripts, we will
no longer remove ib_logfile0 or create an empty ib_logfile0. Server startup
will require a valid log file. When resizing the log, we will create
a logically empty ib_logfile101 at the current LSN and use an atomic rename
to replace ib_logfile0 with it. See the test innodb.log_file_size.
Because there is no mandatory padding in the log file, we are able
to create a dummy log file as of an arbitrary log sequence number.
See the test mariabackup.huge_lsn.
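The replacement step can be sketched with the standard rename call. The helper install_new_log() is hypothetical; the sketch assumes POSIX semantics, where rename() atomically replaces an existing target, and it leaves out the fsync() calls that a durable implementation would also need.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: install a fully written replacement log by
// renaming it over the old one. With POSIX rename(), a crash leaves
// either the old or the new ib_logfile0, never a mixture.
inline bool install_new_log(const std::string &dir)
{
  assert(!dir.empty());
  const std::string tmp= dir + "/ib_logfile101";
  const std::string log= dir + "/ib_logfile0";
  return std::rename(tmp.c_str(), log.c_str()) == 0;
}
```

A leftover ib_logfile101 after a crash then simply means that the rename had not happened yet.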
The parameter innodb_log_write_ahead_size and the
INFORMATION_SCHEMA.INNODB_METRICS counter log_padded will be removed.
The minimum value of innodb_log_buffer_size will be increased to 2MiB
(because log_sys.buf will replace recv_sys.buf) and the increment
adjusted to 4096 bytes (the maximum log block size).
The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed:
os_log_fsyncs
os_log_pending_fsyncs
log_pending_log_flushes
log_pending_checkpoint_writes
The following status variables will be removed:
Innodb_os_log_fsyncs (this is included in Innodb_data_fsyncs)
Innodb_os_log_pending_fsyncs (this was limited to at most 1 by design)
log_sys.get_block_size(): Return the physical block size of the log file.
This is only implemented on Linux and Microsoft Windows for now, and for
the power-of-2 block sizes between 64 and 4096 bytes (the minimum and
maximum size of a checkpoint block). If the block size is anything else,
the traditional 512-byte size will be used via normal file system
buffering.
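The block size policy described above amounts to a small validation step, sketched here. The names log_io_mode and choose_log_io() are hypothetical, and the sketch covers only the size check, not how the physical block size is detected or when O_DIRECT is actually enabled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: the block size to use for log I/O, and
// whether normal file system buffering remains in effect.
struct log_io_mode
{
  uint32_t block_size;
  bool buffered;
};

// Accept a detected physical block size only if it is a power of 2 in
// the range [64, 4096]; otherwise use 512 bytes with buffering.
inline log_io_mode choose_log_io(uint32_t detected)
{
  const bool power_of_2= detected && !(detected & (detected - 1));
  if (power_of_2 && detected >= 64 && detected <= 4096)
    return {detected, false};
  return {512, true};
}
```

For example, a detected size of 4096 is used as is, while 520 or 8192 falls back to buffered 512-byte I/O.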
If the file system buffers can be bypassed, a message like the following
will be issued:
InnoDB: File system buffers for log disabled (block size=512 bytes)
InnoDB: File system buffers for log disabled (block size=4096 bytes)
This has been tested on Linux and Microsoft Windows with both sizes.
On Linux, only enable O_DIRECT on the log for innodb_flush_method=O_DSYNC.
Tests in 3 different environments where the log is stored in a device
with a physical block size of 512 bytes are yielding better throughput
without O_DIRECT. This could be due to the fact that in the event the
last log block is being overwritten (if multiple transactions would
become durable at the same time, and each of them will write a small
number of bytes to the last log block), it should be faster to re-copy
data from log_sys.buf or log_sys.flush_buf to the kernel buffer,
to be finally written at fdatasync() time.
The parameter innodb_flush_method=O_DSYNC will imply O_DIRECT for
data files. This option will enable O_DIRECT on the log file on Linux.
It may be unsafe to use when the storage device does not support
FUA (Force Unit Access) mode.
When the server is compiled WITH_PMEM=ON, we will use memory-mapped
I/O for the log file if the log resides on a "mount -o dax" device.
We will identify PMEM in a start-up message:
InnoDB: log sequence number 0 (memory-mapped); transaction id 3
On Linux, we will also invoke mmap() on any ib_logfile0 that resides
in /dev/shm, effectively treating the log file as persistent memory.
This should speed up "./mtr --mem" and increase the test coverage of
PMEM on non-PMEM hardware. It also allows users to estimate how much
the performance would be improved by installing persistent memory.
On other tmpfs file systems such as /run, we will not use mmap().
mariadb-backup: Eliminated several variables. We will refer
directly to recv_sys and log_sys.
backup_wait_for_lsn(): Detect non-progress of
xtrabackup_copy_logfile(). In this new log format with
arbitrary-sized blocks, we can only detect log file overrun
indirectly, by observing that the scanned log sequence number
is not advancing.
xtrabackup_copy_logfile(): On PMEM, do not modify the sequence bit,
because we are not allowed to modify the server's log file, and our
memory mapping is read-only.
trx_flush_log_if_needed_low(): Do not use the callback on PMEM.
Using neither flush_lock nor write_lock around PMEM writes seems
to yield the best performance. The pmem_persist() calls may
still be somewhat slower than the pwrite() and fdatasync() based
interface (PMEM mounted without -o dax).
recv_sys_t::buf: Remove. We will use log_sys.buf for parsing.
recv_sys_t::MTR_SIZE_MAX: Replaces RECV_SCAN_SIZE.
recv_sys_t::file_checkpoint: Renamed from mlog_checkpoint_lsn.
recv_sys_t, log_sys_t: Removed many data members.
recv_sys.lsn: Renamed from recv_sys.recovered_lsn.
recv_sys.offset: Renamed from recv_sys.recovered_offset.
log_sys.buf_size: Replaces srv_log_buffer_size.
recv_buf: A smart pointer that wraps log_sys.buf[recv_sys.offset]
when the buffer is being allocated from the memory heap.
recv_ring: A smart pointer that wraps a circular log_sys.buf[] that is
backed by ib_logfile0. The pointer will wrap from recv_sys.len
(log_sys.file_size) to log_sys.START_OFFSET. For the record that
wraps around, we may copy file name or record payload data to
the auxiliary buffer decrypt_buf in order to have a contiguous
block of memory. The maximum size of a record is less than
innodb_page_size bytes.
recv_sys_t::parse(): Take the smart pointer as a template parameter.
Do not temporarily add a trailing NUL byte to FILE_ records, because
we are not supposed to modify the memory-mapped log file. (It is
attached in read-write mode already during recovery.)
recv_sys_t::parse_mtr(): Wrapper for recv_sys_t::parse().
recv_sys_t::parse_pmem(): Like parse_mtr(), but if PREMATURE_EOF would be
returned on PMEM, use recv_ring to wrap around the buffer to the start.
mtr_t::finish_write(), log_close(): Do not enforce log_sys.max_buf_free
on PMEM, because it has no meaning on the mmap-based log.
log_sys.write_to_buf: Count writes to log_sys.buf. Replaces
srv_stats.log_write_requests and export_vars.innodb_log_write_requests.
Protected by log_sys.mutex. Updated consistently in log_close().
Previously, mtr_t::commit() conditionally updated the count,
which was inconsistent.
log_sys.write_to_log: Count swaps of log_sys.buf and log_sys.flush_buf,
for writing to log_sys.log (the ib_logfile0). Replaces
srv_stats.log_writes and export_vars.innodb_log_writes.
Protected by log_sys.mutex.
log_sys.waits: Count waits in append_prepare(). Replaces
srv_stats.log_waits and export_vars.innodb_log_waits.
recv_recover_page(): Do not unnecessarily acquire
log_sys.flush_order_mutex. We are inserting the blocks in arbitrary
order anyway, to be adjusted in recv_sys.apply(true).
We will change the definition of flush_lock and write_lock to
avoid potential false sharing. Depending on sizeof(log_sys) and
CPU_LEVEL1_DCACHE_LINESIZE, the flush_lock and write_lock could
share a cache line with each other or with the last data members
of log_sys.
Thanks to Matthias Leich for providing https://rr-project.org traces
for various failures during the development, and to
Thirunarayanan Balathandayuthapani for his help in debugging
some of the recovery code. And thanks to the developers of the
rr debugger for a tool without which extensive changes to InnoDB
would be very challenging to get right.
Thanks to Vladislav Vaintroub for useful feedback and
to him, Axel Schwenke and Krunal Bauskar for testing the performance.
4 years ago  MDEV-14425 Improve the redo log for concurrency
The InnoDB redo log used to be formatted in blocks of 512 bytes.
The log blocks were encrypted and the checksum was calculated while
holding log_sys.mutex, creating a serious scalability bottleneck.
We remove the fixed-size redo log block structure altogether and
essentially turn every mini-transaction into a log block of its own.
This allows encryption and checksum calculations to be performed
on local mtr_t::m_log buffers, before acquiring log_sys.mutex.
The mutex only protects a memcpy() of the data to the shared
log_sys.buf, as well as the padding of the log, in case the
to-be-written part of the log would not end in a block boundary of
the underlying storage. For now, the "padding" consists of writing
a single NUL byte, to allow recovery and mariadb-backup to detect
the end of the circular log faster.
Like the previous implementation, we will overwrite the last log block
over and over again, until it has been completely filled. It would be
possible to write only up to the last completed block (if no more
recent write was requested), or to write dummy FILE_CHECKPOINT records
to fill the incomplete block, by invoking the currently disabled
function log_pad(). This would require adjustments to some logic around
log checkpoints, page flushing, and shutdown.
An upgrade after a crash of any previous version is not supported.
Logically empty log files from a previous version will be upgraded.
An attempt to start up InnoDB without a valid ib_logfile0 will be
refused. Previously, the redo log used to be created automatically
if it was missing. Only with innodb_force_recovery=6, it is
possible to start InnoDB in read-only mode even if the log file
does not exist. This allows the contents of a possibly corrupted
database to be dumped.
Because a prepared backup from an earlier version of mariadb-backup
will create a 0-sized log file, we will allow an upgrade from such
log files, provided that the FIL_PAGE_FILE_FLUSH_LSN in the system
tablespace looks valid.
The 512-byte log checkpoint blocks at 0x200 and 0x600 will be replaced
with 64-byte log checkpoint blocks at 0x1000 and 0x2000.
The start of log records will move from 0x800 to 0x3000. This allows us
to use 4096-byte aligned blocks for all I/O in a future revision.
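The layout change can be checked with a small sketch. The constant and
function names below are hypothetical labels for the offsets quoted
above; they are not identifiers from the source tree.

```python
# Offsets quoted in the commit message; the names are illustrative only.
OLD_CHECKPOINTS = (0x200, 0x600)    # 512-byte checkpoint blocks
OLD_LOG_START = 0x800
NEW_CHECKPOINTS = (0x1000, 0x2000)  # 64-byte checkpoint blocks
NEW_LOG_START = 0x3000
MAX_BLOCK_SIZE = 4096               # largest block size mentioned in the message


def is_aligned(offset: int, block_size: int = MAX_BLOCK_SIZE) -> bool:
    """Return True if offset is a multiple of block_size."""
    return offset % block_size == 0


def layout_is_block_aligned(checkpoints, log_start,
                            block_size: int = MAX_BLOCK_SIZE) -> bool:
    """Return True if every header offset starts on a block boundary."""
    return all(is_aligned(o, block_size) for o in (*checkpoints, log_start))
```

Under these assumptions, only the new offsets are multiples of 4096,
which is what makes block-aligned I/O possible later.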
We extend the MDEV-12353 redo log record format as follows.
(1) Empty mini-transactions or extra NUL bytes will not be allowed.
(2) The end-of-minitransaction marker (a NUL byte) will be replaced
with a 1-bit sequence number, which will be toggled each time when the
circular log file wraps back to the beginning.
(3) After the sequence bit, a CRC-32C checksum of all data
(excluding the sequence bit) will be written.
(4) If the log is encrypted, 8 bytes will be written before
the checksum and included in it. This is part of the
initialization vector (IV) of encrypted log data.
(5) File names, page numbers, and checkpoint information will not be
encrypted. Only the payload bytes of page-level log will be encrypted.
The tablespace ID and page number will form part of the IV.
(6) For padding, arbitrary-length FILE_CHECKPOINT records may be written,
with all-zero payload, and with the normal end marker and checksum.
The minimum size is 7 bytes, or 7+8 with innodb_encrypt_log=ON.
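Items (1) to (3) can be illustrated with a rough sketch for the
unencrypted case. The helper names, the one-byte encoding of the
sequence bit, its starting value, and the big-endian checksum order
are assumptions made for illustration, not the server's actual code.

```python
def crc32c(data: bytes, crc: int = 0) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def sequence_bit(wraps: int) -> int:
    """1-bit sequence number that flips each time the circular log wraps.

    Starting at 1 for the first pass is an assumption.
    """
    return (wraps + 1) & 1


def frame_mtr(payload: bytes, wraps: int) -> bytes:
    """Append the sequence bit and checksum to a mini-transaction payload."""
    if not payload:
        raise ValueError("empty mini-transactions are not allowed")
    # The checksum covers the payload only, not the sequence bit.
    return payload + bytes([sequence_bit(wraps)]) + crc32c(payload).to_bytes(4, "big")


def check_mtr(frame: bytes, wraps: int) -> bool:
    """Validate a frame produced by frame_mtr() for the given wrap count."""
    if len(frame) < 6:
        return False
    payload, marker, stored = frame[:-5], frame[-5], frame[-4:]
    return marker == sequence_bit(wraps) and stored == crc32c(payload).to_bytes(4, "big")
```

A record left over from the previous pass through the circular file
carries the opposite sequence bit, so check_mtr() rejects it even though
its checksum is still intact.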
In mariadb-backup and in Galera snapshot transfer (SST) scripts, we will
no longer remove ib_logfile0 or create an empty ib_logfile0. Server startup
will require a valid log file. When resizing the log, we will create
a logically empty ib_logfile101 at the current LSN and use an atomic rename
to replace ib_logfile0 with it. See the test innodb.log_file_size.
Because there is no mandatory padding in the log file, we are able
to create a dummy log file as of an arbitrary log sequence number.
See the test mariabackup.huge_lsn.
The parameter innodb_log_write_ahead_size and the
INFORMATION_SCHEMA.INNODB_METRICS counter log_padded will be removed.
The minimum value of innodb_log_buffer_size will be increased to 2MiB
(because log_sys.buf will replace recv_sys.buf) and the increment
adjusted to 4096 bytes (the maximum log block size).
The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed:
os_log_fsyncs
os_log_pending_fsyncs
log_pending_log_flushes
log_pending_checkpoint_writes
The following status variables will be removed:
Innodb_os_log_fsyncs (this is included in Innodb_data_fsyncs)
Innodb_os_log_pending_fsyncs (this was limited to at most 1 by design)
log_sys.get_block_size(): Return the physical block size of the log file.
This is only implemented on Linux and Microsoft Windows for now, and for
the power-of-2 block sizes between 64 and 4096 bytes (the minimum and
maximum size of a checkpoint block). If the block size is anything else,
the traditional 512-byte size will be used via normal file system
buffering.
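The fallback rule can be sketched as follows. The function name and the
returned tuple are hypothetical, and the sketch ignores the additional
innodb_flush_method condition that applies on Linux.

```python
from typing import Tuple


def choose_log_block_size(physical: int) -> Tuple[int, bool]:
    """Return (block_size, may_bypass_buffers) for a reported physical size.

    Power-of-2 sizes from 64 to 4096 bytes are accepted as they are.
    Anything else falls back to 512 bytes with normal file system buffering.
    """
    is_power_of_2 = physical > 0 and physical & (physical - 1) == 0
    if is_power_of_2 and 64 <= physical <= 4096:
        return physical, True
    return 512, False
```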
If the file system buffers can be bypassed, a message like the following
will be issued:
InnoDB: File system buffers for log disabled (block size=512 bytes)
InnoDB: File system buffers for log disabled (block size=4096 bytes)
This has been tested on Linux and Microsoft Windows with both sizes.
On Linux, only enable O_DIRECT on the log for innodb_flush_method=O_DSYNC.
Tests in 3 different environments where the log is stored in a device
with a physical block size of 512 bytes are yielding better throughput
without O_DIRECT. This could be due to the fact that in the event the
last log block is being overwritten (if multiple transactions would
become durable at the same time, and each of them will write a small
number of bytes to the last log block), it should be faster to re-copy
data from log_sys.buf or log_sys.flush_buf to the kernel buffer,
to be finally written at fdatasync() time.
The parameter innodb_flush_method=O_DSYNC will imply O_DIRECT for
data files. This option will enable O_DIRECT on the log file on Linux.
It may be unsafe to use when the storage device does not support
FUA (Force Unit Access) mode.
When the server is compiled WITH_PMEM=ON, we will use memory-mapped
I/O for the log file if the log resides on a "mount -o dax" device.
We will identify PMEM in a start-up message:
InnoDB: log sequence number 0 (memory-mapped); transaction id 3
On Linux, we will also invoke mmap() on any ib_logfile0 that resides
in /dev/shm, effectively treating the log file as persistent memory.
This should speed up "./mtr --mem" and increase the test coverage of
PMEM on non-PMEM hardware. It also allows users to estimate how much
the performance would be improved by installing persistent memory.
On other tmpfs file systems such as /run, we will not use mmap().
mariadb-backup: Eliminated several variables. We will refer
directly to recv_sys and log_sys.
backup_wait_for_lsn(): Detect non-progress of
xtrabackup_copy_logfile(). In this new log format with
arbitrary-sized blocks, we can only detect log file overrun
indirectly, by observing that the scanned log sequence number
is not advancing.
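The indirect detection can be sketched like this. The polling callback
and the max_stalls threshold are hypothetical; a real implementation
would also sleep between polls.

```python
def wait_for_lsn(target: int, poll_scanned_lsn, max_stalls: int = 3) -> bool:
    """Poll the scanned LSN until it reaches target.

    Return False if the scanned LSN fails to advance for max_stalls
    consecutive polls, which we treat as a sign of log file overrun.
    """
    last = None
    stalls = 0
    while True:
        lsn = poll_scanned_lsn()
        if lsn >= target:
            return True
        if last is not None and lsn <= last:
            stalls += 1
            if stalls >= max_stalls:
                return False
        else:
            stalls = 0
        last = lsn
```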
xtrabackup_copy_logfile(): On PMEM, do not modify the sequence bit,
because we are not allowed to modify the server's log file, and our
memory mapping is read-only.
trx_flush_log_if_needed_low(): Do not use the callback on PMEM.
Using neither flush_lock nor write_lock around PMEM writes seems
to yield the best performance. The pmem_persist() calls may
still be somewhat slower than the pwrite() and fdatasync() based
interface (PMEM mounted without -o dax).
recv_sys_t::buf: Remove. We will use log_sys.buf for parsing.
recv_sys_t::MTR_SIZE_MAX: Replaces RECV_SCAN_SIZE.
recv_sys_t::file_checkpoint: Renamed from mlog_checkpoint_lsn.
recv_sys_t, log_sys_t: Removed many data members.
recv_sys.lsn: Renamed from recv_sys.recovered_lsn.
recv_sys.offset: Renamed from recv_sys.recovered_offset.
log_sys.buf_size: Replaces srv_log_buffer_size.
recv_buf: A smart pointer that wraps log_sys.buf[recv_sys.offset]
when the buffer is being allocated from the memory heap.
recv_ring: A smart pointer that wraps a circular log_sys.buf[] that is
backed by ib_logfile0. The pointer will wrap from recv_sys.len
(log_sys.file_size) to log_sys.START_OFFSET. For the record that
wraps around, we may copy file name or record payload data to
the auxiliary buffer decrypt_buf in order to have a contiguous
block of memory. The maximum size of a record is less than
innodb_page_size bytes.
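The wrap-around behaviour of recv_ring can be sketched as below. The
class and method names are hypothetical; start_offset stands in for
log_sys.START_OFFSET and len(buf) for log_sys.file_size.

```python
class RingReader:
    """Read from a circular buffer whose payload area is buf[start_offset:]."""

    def __init__(self, buf: bytes, start_offset: int):
        if not 0 <= start_offset < len(buf):
            raise ValueError("start_offset must lie inside the buffer")
        self.buf = buf
        self.start = start_offset

    def advance(self, pos: int, n: int) -> int:
        """Move pos forward by n bytes, wrapping from the end to start_offset."""
        capacity = len(self.buf) - self.start
        return self.start + (pos - self.start + n) % capacity

    def read(self, pos: int, n: int) -> bytes:
        """Return n contiguous bytes, copying across the wrap point if needed."""
        end = pos + n
        if end <= len(self.buf):
            return bytes(self.buf[pos:end])
        # The record wraps: join the tail of the file with the bytes that
        # follow the header, much like copying into an auxiliary buffer.
        overflow = end - len(self.buf)
        return bytes(self.buf[pos:]) + bytes(self.buf[self.start:self.start + overflow])
```

The copy in read() is only needed for the one record that straddles the
end of the file; every other record can be parsed in place.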
recv_sys_t::parse(): Take the smart pointer as a template parameter.
Do not temporarily add a trailing NUL byte to FILE_ records, because
we are not supposed to modify the memory-mapped log file. (It is
attached in read-write mode already during recovery.)
recv_sys_t::parse_mtr(): Wrapper for recv_sys_t::parse().
recv_sys_t::parse_pmem(): Like parse_mtr(), but if PREMATURE_EOF would be
returned on PMEM, use recv_ring to wrap around the buffer to the start.
mtr_t::finish_write(), log_close(): Do not enforce log_sys.max_buf_free
on PMEM, because it has no meaning on the mmap-based log.
log_sys.write_to_buf: Count writes to log_sys.buf. Replaces
srv_stats.log_write_requests and export_vars.innodb_log_write_requests.
Protected by log_sys.mutex. Updated consistently in log_close().
Previously, mtr_t::commit() conditionally updated the count,
which was inconsistent.
log_sys.write_to_log: Count swaps of log_sys.buf and log_sys.flush_buf,
for writing to log_sys.log (the ib_logfile0). Replaces
srv_stats.log_writes and export_vars.innodb_log_writes.
Protected by log_sys.mutex.
log_sys.waits: Count waits in append_prepare(). Replaces
srv_stats.log_waits and export_vars.innodb_log_waits.
recv_recover_page(): Do not unnecessarily acquire
log_sys.flush_order_mutex. We are inserting the blocks in arbitrary
order anyway, to be adjusted in recv_sys.apply(true).
We will change the definition of flush_lock and write_lock to
avoid potential false sharing. Depending on sizeof(log_sys) and
CPU_LEVEL1_DCACHE_LINESIZE, the flush_lock and write_lock could
share a cache line with each other or with the last data members
of log_sys.
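Whether two objects can suffer from false sharing depends only on their
offsets, their sizes and the cache line size. A minimal sketch, assuming
a 64-byte line and hypothetical offsets:

```python
def cache_lines(offset: int, size: int, line: int = 64) -> range:
    """Return the range of cache line indexes touched by an object."""
    return range(offset // line, (offset + size - 1) // line + 1)


def share_cache_line(off_a: int, size_a: int,
                     off_b: int, size_b: int, line: int = 64) -> bool:
    """Return True if the two objects touch at least one common cache line."""
    a = cache_lines(off_a, size_a, line)
    b = cache_lines(off_b, size_b, line)
    return a.start < b.stop and b.start < a.stop


def align_up(offset: int, line: int = 64) -> int:
    """Round offset up to the next cache line boundary."""
    return (offset + line - 1) // line * line
```

Placing each lock at an align_up() boundary and padding it to a full
line keeps it from sharing a line with its neighbours.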
Thanks to Matthias Leich for providing https://rr-project.org traces
for various failures during the development, and to
Thirunarayanan Balathandayuthapani for his help in debugging
some of the recovery code. And thanks to the developers of the
rr debugger for a tool without which extensive changes to InnoDB
would be very challenging to get right.
Thanks to Vladislav Vaintroub for useful feedback and
to him, Axel Schwenke and Krunal Bauskar for testing the performance.
4 years ago  MDEV-14425 Improve the redo log for concurrency
The InnoDB redo log used to be formatted in blocks of 512 bytes.
The log blocks were encrypted and the checksum was calculated while
holding log_sys.mutex, creating a serious scalability bottleneck.
We remove the fixed-size redo log block structure altogether and
essentially turn every mini-transaction into a log block of its own.
This allows encryption and checksum calculations to be performed
on local mtr_t::m_log buffers, before acquiring log_sys.mutex.
The mutex only protects a memcpy() of the data to the shared
log_sys.buf, as well as the padding of the log, in case the
to-be-written part of the log would not end in a block boundary of
the underlying storage. For now, the "padding" consists of writing
a single NUL byte, to allow recovery and mariadb-backup to detect
the end of the circular log faster.
Like the previous implementation, we will overwrite the last log block
over and over again, until it has been completely filled. It would be
possible to write only up to the last completed block (if no more
recent write was requested), or to write dummy FILE_CHECKPOINT records
to fill the incomplete block, by invoking the currently disabled
function log_pad(). This would require adjustments to some logic around
log checkpoints, page flushing, and shutdown.
An upgrade after a crash of any previous version is not supported.
Logically empty log files from a previous version will be upgraded.
An attempt to start up InnoDB without a valid ib_logfile0 will be
refused. Previously, the redo log used to be created automatically
if it was missing. Only with innodb_force_recovery=6, it is
possible to start InnoDB in read-only mode even if the log file
does not exist. This allows the contents of a possibly corrupted
database to be dumped.
Because a prepared backup from an earlier version of mariadb-backup
will create a 0-sized log file, we will allow an upgrade from such
log files, provided that the FIL_PAGE_FILE_FLUSH_LSN in the system
tablespace looks valid.
The 512-byte log checkpoint blocks at 0x200 and 0x600 will be replaced
with 64-byte log checkpoint blocks at 0x1000 and 0x2000.
The start of log records will move from 0x800 to 0x3000. This allows us
to use 4096-byte aligned blocks for all I/O in a future revision.
We extend the MDEV-12353 redo log record format as follows.
(1) Empty mini-transactions or extra NUL bytes will not be allowed.
(2) The end-of-minitransaction marker (a NUL byte) will be replaced
with a 1-bit sequence number, which will be toggled each time when the
circular log file wraps back to the beginning.
(3) After the sequence bit, a CRC-32C checksum of all data
(excluding the sequence bit) will be written.
(4) If the log is encrypted, 8 bytes will be written before
the checksum and included in it. This is part of the
initialization vector (IV) of encrypted log data.
(5) File names, page numbers, and checkpoint information will not be
encrypted. Only the payload bytes of page-level log will be encrypted.
The tablespace ID and page number will form part of the IV.
(6) For padding, arbitrary-length FILE_CHECKPOINT records may be written,
with all-zero payload, and with the normal end marker and checksum.
The minimum size is 7 bytes, or 7+8 with innodb_encrypt_log=ON.
In mariadb-backup and in Galera snapshot transfer (SST) scripts, we will
no longer remove ib_logfile0 or create an empty ib_logfile0. Server startup
will require a valid log file. When resizing the log, we will create
a logically empty ib_logfile101 at the current LSN and use an atomic rename
to replace ib_logfile0 with it. See the test innodb.log_file_size.
Because there is no mandatory padding in the log file, we are able
to create a dummy log file as of an arbitrary log sequence number.
See the test mariabackup.huge_lsn.
The parameter innodb_log_write_ahead_size and the
INFORMATION_SCHEMA.INNODB_METRICS counter log_padded will be removed.
The minimum value of innodb_log_buffer_size will be increased to 2MiB
(because log_sys.buf will replace recv_sys.buf) and the increment
adjusted to 4096 bytes (the maximum log block size).
The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed:
os_log_fsyncs
os_log_pending_fsyncs
log_pending_log_flushes
log_pending_checkpoint_writes
The following status variables will be removed:
Innodb_os_log_fsyncs (this is included in Innodb_data_fsyncs)
Innodb_os_log_pending_fsyncs (this was limited to at most 1 by design)
log_sys.get_block_size(): Return the physical block size of the log file.
This is only implemented on Linux and Microsoft Windows for now, and for
the power-of-2 block sizes between 64 and 4096 bytes (the minimum and
maximum size of a checkpoint block). If the block size is anything else,
the traditional 512-byte size will be used via normal file system
buffering.
If the file system buffers can be bypassed, a message like the following
will be issued:
InnoDB: File system buffers for log disabled (block size=512 bytes)
InnoDB: File system buffers for log disabled (block size=4096 bytes)
This has been tested on Linux and Microsoft Windows with both sizes.
On Linux, only enable O_DIRECT on the log for innodb_flush_method=O_DSYNC.
Tests in 3 different environments where the log is stored in a device
with a physical block size of 512 bytes are yielding better throughput
without O_DIRECT. This could be due to the fact that in the event the
last log block is being overwritten (if multiple transactions would
become durable at the same time, and each of them will write a small
number of bytes to the last log block), it should be faster to re-copy
data from log_sys.buf or log_sys.flush_buf to the kernel buffer,
to be finally written at fdatasync() time.
The parameter innodb_flush_method=O_DSYNC will imply O_DIRECT for
data files. This option will enable O_DIRECT on the log file on Linux.
It may be unsafe to use when the storage device does not support
FUA (Force Unit Access) mode.
When the server is compiled WITH_PMEM=ON, we will use memory-mapped
I/O for the log file if the log resides on a "mount -o dax" device.
We will identify PMEM in a start-up message:
InnoDB: log sequence number 0 (memory-mapped); transaction id 3
On Linux, we will also invoke mmap() on any ib_logfile0 that resides
in /dev/shm, effectively treating the log file as persistent memory.
This should speed up "./mtr --mem" and increase the test coverage of
PMEM on non-PMEM hardware. It also allows users to estimate how much
the performance would be improved by installing persistent memory.
On other tmpfs file systems such as /run, we will not use mmap().
mariadb-backup: Eliminated several variables. We will refer
directly to recv_sys and log_sys.
backup_wait_for_lsn(): Detect non-progress of
xtrabackup_copy_logfile(). In this new log format with
arbitrary-sized blocks, we can only detect log file overrun
indirectly, by observing that the scanned log sequence number
is not advancing.
xtrabackup_copy_logfile(): On PMEM, do not modify the sequence bit,
because we are not allowed to modify the server's log file, and our
memory mapping is read-only.
trx_flush_log_if_needed_low(): Do not use the callback on pmem.
Using neither flush_lock nor write_lock around PMEM writes seems
to yield the best performance. The pmem_persist() calls may
still be somewhat slower than the pwrite() and fdatasync() based
interface (PMEM mounted without -o dax).
recv_sys_t::buf: Remove. We will use log_sys.buf for parsing.
recv_sys_t::MTR_SIZE_MAX: Replaces RECV_SCAN_SIZE.
recv_sys_t::file_checkpoint: Renamed from mlog_checkpoint_lsn.
recv_sys_t, log_sys_t: Removed many data members.
recv_sys.lsn: Renamed from recv_sys.recovered_lsn.
recv_sys.offset: Renamed from recv_sys.recovered_offset.
log_sys.buf_size: Replaces srv_log_buffer_size.
recv_buf: A smart pointer that wraps log_sys.buf[recv_sys.offset]
when the buffer is being allocated from the memory heap.
recv_ring: A smart pointer that wraps a circular log_sys.buf[] that is
backed by ib_logfile0. The pointer will wrap from recv_sys.len
(log_sys.file_size) to log_sys.START_OFFSET. For the record that
wraps around, we may copy file name or record payload data to
the auxiliary buffer decrypt_buf in order to have a contiguous
block of memory. The maximum size of a record is less than
innodb_page_size bytes.
recv_sys_t::parse(): Take the smart pointer as a template parameter.
Do not temporarily add a trailing NUL byte to FILE_ records, because
we are not supposed to modify the memory-mapped log file. (It is
attached in read-write mode already during recovery.)
recv_sys_t::parse_mtr(): Wrapper for recv_sys_t::parse().
recv_sys_t::parse_pmem(): Like parse_mtr(), but if PREMATURE_EOF would be
returned on PMEM, use recv_ring to wrap around the buffer to the start.
mtr_t::finish_write(), log_close(): Do not enforce log_sys.max_buf_free
on PMEM, because it has no meaning on the mmap-based log.
log_sys.write_to_buf: Count writes to log_sys.buf. Replaces
srv_stats.log_write_requests and export_vars.innodb_log_write_requests.
Protected by log_sys.mutex. Updated consistently in log_close().
Previously, mtr_t::commit() conditionally updated the count,
which was inconsistent.
log_sys.write_to_log: Count swaps of log_sys.buf and log_sys.flush_buf,
for writing to log_sys.log (the ib_logfile0). Replaces
srv_stats.log_writes and export_vars.innodb_log_writes.
Protected by log_sys.mutex.
log_sys.waits: Count waits in append_prepare(). Replaces
srv_stats.log_waits and export_vars.innodb_log_waits.
recv_recover_page(): Do not unnecessarily acquire
log_sys.flush_order_mutex. We are inserting the blocks in arbitrary
order anyway, to be adjusted in recv_sys.apply(true).
We will change the definition of flush_lock and write_lock to
avoid potential false sharing. Depending on sizeof(log_sys) and
CPU_LEVEL1_DCACHE_LINESIZE, the flush_lock and write_lock could
share a cache line with each other or with the last data members
of log_sys.
Thanks to Matthias Leich for providing https://rr-project.org traces
for various failures during the development, and to
Thirunarayanan Balathandayuthapani for his help in debugging
some of the recovery code. And thanks to the developers of the
rr debugger for a tool without which extensive changes to InnoDB
would be very challenging to get right.
Thanks to Vladislav Vaintroub for useful feedback and
to him, Axel Schwenke and Krunal Bauskar for testing the performance.
4 years ago  MDEV-14425 Improve the redo log for concurrency
The InnoDB redo log used to be formatted in blocks of 512 bytes.
The log blocks were encrypted and the checksum was calculated while
holding log_sys.mutex, creating a serious scalability bottleneck.
We remove the fixed-size redo log block structure altogether and
essentially turn every mini-transaction into a log block of its own.
This allows encryption and checksum calculations to be performed
on local mtr_t::m_log buffers, before acquiring log_sys.mutex.
The mutex only protects a memcpy() of the data to the shared
log_sys.buf, as well as the padding of the log, in case the
to-be-written part of the log would not end in a block boundary of
the underlying storage. For now, the "padding" consists of writing
a single NUL byte, to allow recovery and mariadb-backup to detect
the end of the circular log faster.
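
The division of work described above can be illustrated with a minimal
sketch. It is not the InnoDB code: shared_log, its members and append()
are hypothetical stand-ins for log_sys, and the record passed in is assumed
to have been encrypted and checksummed by the caller already, so that only
the copy happens while the mutex is held.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <mutex>
#include <vector>

// Hypothetical stand-in for the shared log buffer (not the real log_sys_t).
struct shared_log
{
  std::mutex mutex;          // plays the role of log_sys.mutex
  std::vector<uint8_t> buf;  // plays the role of log_sys.buf
  size_t used = 0;           // number of bytes appended so far

  explicit shared_log(size_t size) : buf(size) {}

  // Append a finished record. All expensive work (encryption, checksum)
  // is assumed to have been done on the caller's local buffer beforehand.
  // Returns false for an empty record or when the record does not fit;
  // a real implementation would wait for the buffer to be written out.
  bool append(const std::vector<uint8_t> &record)
  {
    std::lock_guard<std::mutex> guard(mutex);
    if (record.empty() || record.size() > buf.size() - used)
      return false;
    std::memcpy(buf.data() + used, record.data(), record.size());
    used += record.size();
    return true;
  }
};
```

The critical section contains nothing but a bounds check and a memcpy(),
which is the property that the new format is meant to make possible.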
Like the previous implementation, we will overwrite the last log block
over and over again, until it has been completely filled. It would be
possible to write only up to the last completed block (if no more
recent write was requested), or to write dummy FILE_CHECKPOINT records
to fill the incomplete block, by invoking the currently disabled
function log_pad(). This would require adjustments to some logic around
log checkpoints, page flushing, and shutdown.
An upgrade after a crash of any previous version is not supported.
Logically empty log files from a previous version will be upgraded.
An attempt to start up InnoDB without a valid ib_logfile0 will be
refused. Previously, the redo log used to be created automatically
if it was missing. Only with innodb_force_recovery=6, it is
possible to start InnoDB in read-only mode even if the log file
does not exist. This allows the contents of a possibly corrupted
database to be dumped.
Because a prepared backup from an earlier version of mariadb-backup
will create a 0-sized log file, we will allow an upgrade from such
log files, provided that the FIL_PAGE_FILE_FLUSH_LSN in the system
tablespace looks valid.
The 512-byte log checkpoint blocks at 0x200 and 0x600 will be replaced
with 64-byte log checkpoint blocks at 0x1000 and 0x2000.
The start of log records will move from 0x800 to 0x3000. This allows us
to use 4096-byte aligned blocks for all I/O in a future revision.
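
As a quick check of the alignment claim, the offsets quoted above can be
written down as constants. The constant and function names below are
hypothetical; only the numeric values come from this message.

```cpp
#include <cassert>
#include <cstdint>

// Offsets quoted from the commit message; the names are hypothetical.
constexpr uint64_t CHECKPOINT_1 = 0x1000;        // first checkpoint block
constexpr uint64_t CHECKPOINT_2 = 0x2000;        // second checkpoint block
constexpr uint64_t CHECKPOINT_SIZE = 64;         // bytes per checkpoint block
constexpr uint64_t FIRST_RECORD_OFFSET = 0x3000; // start of log records
constexpr uint64_t MAX_BLOCK_SIZE = 4096;        // largest supported block

constexpr bool is_block_aligned(uint64_t offset)
{
  return offset % MAX_BLOCK_SIZE == 0;
}

// Every region of the new layout starts on a 4096-byte boundary.
static_assert(is_block_aligned(CHECKPOINT_1), "checkpoint 1 is aligned");
static_assert(is_block_aligned(CHECKPOINT_2), "checkpoint 2 is aligned");
static_assert(is_block_aligned(FIRST_RECORD_OFFSET), "records are aligned");
static_assert(CHECKPOINT_1 + CHECKPOINT_SIZE <= CHECKPOINT_2, "no overlap");

// The old offsets 0x200, 0x600 and 0x800 were only 512-byte aligned.
static_assert(!is_block_aligned(0x200) && !is_block_aligned(0x600) &&
              !is_block_aligned(0x800), "old layout is not 4096-aligned");
```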
We extend the MDEV-12353 redo log record format as follows.
(1) Empty mini-transactions or extra NUL bytes will not be allowed.
(2) The end-of-minitransaction marker (a NUL byte) will be replaced
with a 1-bit sequence number, which will be toggled each time when the
circular log file wraps back to the beginning.
(3) After the sequence bit, a CRC-32C checksum of all data
(excluding the sequence bit) will be written.
(4) If the log is encrypted, 8 bytes will be written before
the checksum and included in it. This is part of the
initialization vector (IV) of encrypted log data.
(5) File names, page numbers, and checkpoint information will not be
encrypted. Only the payload bytes of page-level log will be encrypted.
The tablespace ID and page number will form part of the IV.
(6) For padding, arbitrary-length FILE_CHECKPOINT records may be written,
with all-zero payload, and with the normal end marker and checksum.
The minimum size is 7 bytes, or 7+8 with innodb_encrypt_log=ON.
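
Items (2) and (3) can be illustrated with a small sketch. The crc32c()
function is a plain bitwise CRC-32C (Castagnoli polynomial), not the
hardware-accelerated routine that a server would normally use.
sequence_bit() and its parameters (first_lsn, capacity) are hypothetical,
and which bit value denotes the first pass over the file is an assumption
of this sketch, not something stated in this message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
inline uint32_t crc32c(const void *data, size_t size)
{
  const unsigned char *p = static_cast<const unsigned char *>(data);
  uint32_t crc = 0xFFFFFFFFU;
  for (size_t i = 0; i < size; i++)
  {
    crc ^= p[i];
    for (int bit = 0; bit < 8; bit++)
      crc = (crc >> 1) ^ (0x82F63B78U & (0U - (crc & 1U)));
  }
  return ~crc;
}

// Hypothetical helper: a 1-bit sequence number that toggles each time the
// position wraps around a circular log holding `capacity` bytes of records.
// In this sketch the first pass yields 0 (an assumption).
inline unsigned sequence_bit(uint64_t lsn, uint64_t first_lsn,
                             uint64_t capacity)
{
  return static_cast<unsigned>(((lsn - first_lsn) / capacity) & 1U);
}
```

A reader of the log can compare the sequence bit of a record with the
value expected at that position: a mismatch suggests that the bytes were
left over from the previous pass, and the checksum guards against records
that were only partially written.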
In mariadb-backup and in Galera snapshot transfer (SST) scripts, we will
no longer remove ib_logfile0 or create an empty ib_logfile0. Server startup
will require a valid log file. When resizing the log, we will create
a logically empty ib_logfile101 at the current LSN and use an atomic rename
to replace ib_logfile0 with it. See the test innodb.log_file_size.
Because there is no mandatory padding in the log file, we are able
to create a dummy log file as of an arbitrary log sequence number.
See the test mariabackup.huge_lsn.
The parameter innodb_log_write_ahead_size and the
INFORMATION_SCHEMA.INNODB_METRICS counter log_padded will be removed.
The minimum value of innodb_log_buffer_size will be increased to 2MiB
(because log_sys.buf will replace recv_sys.buf) and the increment
adjusted to 4096 bytes (the maximum log block size).
The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed:
os_log_fsyncs
os_log_pending_fsyncs
log_pending_log_flushes
log_pending_checkpoint_writes
The following status variables will be removed:
Innodb_os_log_fsyncs (this is included in Innodb_data_fsyncs)
Innodb_os_log_pending_fsyncs (this was limited to at most 1 by design)
log_sys.get_block_size(): Return the physical block size of the log file.
This is only implemented on Linux and Microsoft Windows for now, and for
the power-of-2 block sizes between 64 and 4096 bytes (the minimum and
maximum size of a checkpoint block). If the block size is anything else,
the traditional 512-byte size will be used via normal file system
buffering.
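
The fallback rule above can be sketched as follows. The type log_io_mode
and the function choose_log_io_mode() are hypothetical; how the physical
block size is actually queried from the operating system is not shown.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: the block size to use for log I/O, and whether
// bypassing the file system cache may be attempted at all.
struct log_io_mode
{
  uint32_t block_size;
  bool may_bypass_cache;
};

constexpr bool is_power_of_2(uint32_t n)
{
  return n != 0 && (n & (n - 1)) == 0;
}

// Accept power-of-2 physical block sizes between 64 and 4096 bytes;
// for anything else fall back to 512 bytes with normal buffering.
inline log_io_mode choose_log_io_mode(uint32_t physical_block_size)
{
  if (is_power_of_2(physical_block_size) && physical_block_size >= 64 &&
      physical_block_size <= 4096)
    return {physical_block_size, true};
  return {512, false};
}
```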
If the file system buffers can be bypassed, a message like the following
will be issued:
InnoDB: File system buffers for log disabled (block size=512 bytes)
InnoDB: File system buffers for log disabled (block size=4096 bytes)
This has been tested on Linux and Microsoft Windows with both sizes.
On Linux, only enable O_DIRECT on the log for innodb_flush_method=O_DSYNC.
Tests in 3 different environments where the log is stored in a device
with a physical block size of 512 bytes are yielding better throughput
without O_DIRECT. This could be due to the fact that in the event the
last log block is being overwritten (if multiple transactions would
become durable at the same time, and each of them will write a small
number of bytes to the last log block), it should be faster to re-copy
data from log_sys.buf or log_sys.flush_buf to the kernel buffer,
to be finally written at fdatasync() time.
The parameter innodb_flush_method=O_DSYNC will imply O_DIRECT for
data files. This option will enable O_DIRECT on the log file on Linux.
It may be unsafe to use when the storage device does not support
FUA (Force Unit Access) mode.
When the server is compiled WITH_PMEM=ON, we will use memory-mapped
I/O for the log file if the log resides on a "mount -o dax" device.
We will identify PMEM in a start-up message:
InnoDB: log sequence number 0 (memory-mapped); transaction id 3
On Linux, we will also invoke mmap() on any ib_logfile0 that resides
in /dev/shm, effectively treating the log file as persistent memory.
This should speed up "./mtr --mem" and increase the test coverage of
PMEM on non-PMEM hardware. It also allows users to estimate how much
the performance would be improved by installing persistent memory.
On other tmpfs file systems such as /run, we will not use mmap().
mariadb-backup: Eliminated several variables. We will refer
directly to recv_sys and log_sys.
backup_wait_for_lsn(): Detect non-progress of
xtrabackup_copy_logfile(). In this new log format with
arbitrary-sized blocks, we can only detect log file overrun
indirectly, by observing that the scanned log sequence number
is not advancing.
xtrabackup_copy_logfile(): On PMEM, do not modify the sequence bit,
because we are not allowed to modify the server's log file, and our
memory mapping is read-only.
trx_flush_log_if_needed_low(): Do not use the callback on pmem.
Using neither flush_lock nor write_lock around PMEM writes seems
to yield the best performance. The pmem_persist() calls may
still be somewhat slower than the pwrite() and fdatasync() based
interface (PMEM mounted without -o dax).
recv_sys_t::buf: Remove. We will use log_sys.buf for parsing.
recv_sys_t::MTR_SIZE_MAX: Replaces RECV_SCAN_SIZE.
recv_sys_t::file_checkpoint: Renamed from mlog_checkpoint_lsn.
recv_sys_t, log_sys_t: Removed many data members.
recv_sys.lsn: Renamed from recv_sys.recovered_lsn.
recv_sys.offset: Renamed from recv_sys.recovered_offset.
log_sys.buf_size: Replaces srv_log_buffer_size.
recv_buf: A smart pointer that wraps log_sys.buf[recv_sys.offset]
when the buffer is being allocated from the memory heap.
recv_ring: A smart pointer that wraps a circular log_sys.buf[] that is
backed by ib_logfile0. The pointer will wrap from recv_sys.len
(log_sys.file_size) to log_sys.START_OFFSET. For the record that
wraps around, we may copy file name or record payload data to
the auxiliary buffer decrypt_buf in order to have a contiguous
block of memory. The maximum size of a record is less than
innodb_page_size bytes.
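
The wrap-around arithmetic of such a circular pointer can be sketched as
below. ring_advance() and ring_copy() are hypothetical helpers, not members
of recv_ring; start stands for the offset of the first log record and
file_size for the end of the file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Advance an offset by n bytes inside the circular area [start, file_size),
// wrapping from file_size back to start.
inline uint64_t ring_advance(uint64_t offset, uint64_t n, uint64_t start,
                             uint64_t file_size)
{
  assert(start <= offset && offset < file_size);
  const uint64_t capacity = file_size - start;
  return start + (offset - start + n) % capacity;
}

// Copy n bytes that may wrap around the end of the file into a contiguous
// destination buffer, similar in spirit to copying a wrapped record into
// an auxiliary buffer.
inline void ring_copy(uint8_t *dest, const uint8_t *file, uint64_t offset,
                      size_t n, uint64_t start, uint64_t file_size)
{
  for (size_t i = 0; i < n; i++)
  {
    dest[i] = file[offset];
    offset = ring_advance(offset, 1, start, file_size);
  }
}
```

Because a record is smaller than a page, a single bounded auxiliary buffer
is enough to hold any record that straddles the end of the file.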
recv_sys_t::parse(): Take the smart pointer as a template parameter.
Do not temporarily add a trailing NUL byte to FILE_ records, because
we are not supposed to modify the memory-mapped log file. (It is
attached in read-write mode already during recovery.)
recv_sys_t::parse_mtr(): Wrapper for recv_sys_t::parse().
recv_sys_t::parse_pmem(): Like parse_mtr(), but if PREMATURE_EOF would be
returned on PMEM, use recv_ring to wrap around the buffer to the start.
mtr_t::finish_write(), log_close(): Do not enforce log_sys.max_buf_free
on PMEM, because it has no meaning on the mmap-based log.
log_sys.write_to_buf: Count writes to log_sys.buf. Replaces
srv_stats.log_write_requests and export_vars.innodb_log_write_requests.
Protected by log_sys.mutex. Updated consistently in log_close().
Previously, mtr_t::commit() conditionally updated the count,
which was inconsistent.
log_sys.write_to_log: Count swaps of log_sys.buf and log_sys.flush_buf,
for writing to log_sys.log (the ib_logfile0). Replaces
srv_stats.log_writes and export_vars.innodb_log_writes.
Protected by log_sys.mutex.
log_sys.waits: Count waits in append_prepare(). Replaces
srv_stats.log_waits and export_vars.innodb_log_waits.
recv_recover_page(): Do not unnecessarily acquire
log_sys.flush_order_mutex. We are inserting the blocks in arbitrary
order anyway, to be adjusted in recv_sys.apply(true).
We will change the definition of flush_lock and write_lock to
avoid potential false sharing. Depending on sizeof(log_sys) and
CPU_LEVEL1_DCACHE_LINESIZE, the flush_lock and write_lock could
share a cache line with each other or with the last data members
of log_sys.
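
One common way to rule out such sharing is to give each lock its own cache
line with alignas. The sketch below is only an illustration of that
technique: the struct is hypothetical, plain integers stand in for the
actual lock objects, and the 64-byte line size is an assumption rather than
the value of CPU_LEVEL1_DCACHE_LINESIZE on any particular platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr size_t CACHE_LINE = 64; // assumed cache line size

// Hypothetical layout: each member starts on its own cache line, so that
// threads spinning on one lock do not invalidate the line of the other.
struct padded_locks
{
  alignas(CACHE_LINE) uint32_t flush_lock;
  alignas(CACHE_LINE) uint32_t write_lock;
};

static_assert(offsetof(padded_locks, write_lock) -
                  offsetof(padded_locks, flush_lock) >= CACHE_LINE,
              "the two locks must not share a cache line");
static_assert(sizeof(padded_locks) % CACHE_LINE == 0,
              "whatever follows the struct starts on a fresh cache line");
```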
Thanks to Matthias Leich for providing https://rr-project.org traces
for various failures during the development, and to
Thirunarayanan Balathandayuthapani for his help in debugging
some of the recovery code. And thanks to the developers of the
rr debugger for a tool without which extensive changes to InnoDB
would be very challenging to get right.
Thanks to Vladislav Vaintroub for useful feedback and
to him, Axel Schwenke and Krunal Bauskar for testing the performance.
4 years ago  MDEV-28766: SET GLOBAL innodb_log_file_buffering
In commit c4c88307091cb16886562e9e7b77f5fd077d34b5 (MDEV-28111) we disabled
the file system cache on the InnoDB write-ahead log file (ib_logfile0)
by default on Linux.
It turns out that with innodb_flush_log_at_trx_commit=2 in particular,
writing to the log via the file system cache typically improves throughput,
especially on slow storage or with a small number of concurrent transactions.
For other values of innodb_flush_log_at_trx_commit, direct writes were
observed to be mostly but not always faster. Whether it pays off to
disable the file system cache on the log may depend on the type of storage,
the workload, and the operating system kernel version.
On Linux and Microsoft Windows, we will introduce the settable Boolean
global variable innodb_log_file_buffering that indicates whether the
file system cache on the redo log file is enabled. The default value is
innodb_log_file_buffering=OFF. If the server is started up with
innodb_flush_log_at_trx_commit=2, the value will be changed to
innodb_log_file_buffering=ON.
When a persistent memory interface is being used for the log,
the value cannot be changed from innodb_log_file_buffering=OFF.
On Linux, when the physical block size cannot be determined
to be a power of 2 between 64 and 4096 bytes, the file system cache
cannot be disabled, and innodb_log_file_buffering=ON cannot be changed.
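
The rules above can be summarized in a small decision sketch. The type
log_file_state and both functions are hypothetical; in particular, the
precedence between the persistent memory case and
innodb_flush_log_at_trx_commit=2 is an assumption of this sketch.

```cpp
#include <cassert>

// Hypothetical description of the log file as seen at startup.
struct log_file_state
{
  bool pmem;             // a persistent memory interface is in use
  bool block_size_known; // physical block size is a power of 2 in [64,4096]
};

// Initial value of innodb_log_file_buffering (true means ON).
inline bool initial_log_file_buffering(unsigned flush_log_at_trx_commit,
                                       const log_file_state &state)
{
  if (state.pmem)
    return false; // no file system cache is involved
  if (!state.block_size_known)
    return true;  // the cache cannot be disabled
  return flush_log_at_trx_commit == 2;
}

// Whether SET GLOBAL innodb_log_file_buffering=new_value could be honored.
inline bool can_set_log_file_buffering(const log_file_state &state,
                                       bool new_value)
{
  if (state.pmem)
    return !new_value; // must remain OFF
  if (!state.block_size_known)
    return new_value;  // must remain ON
  return true;
}
```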
Server log messages will indicate whether the file system cache is
enabled for the redo log:
[Note] InnoDB: Buffered log writes (block size=512 bytes)
[Note] InnoDB: File system buffers for log disabled (block size=512 bytes)
After this change, the startup parameter innodb_flush_method will no
longer control whether O_DIRECT will be set on the redo log on Linux.
On other operating systems that support O_DIRECT, no interface has been
implemented for controlling the file system cache for the redo log.
The innodb_flush_method values O_DIRECT, O_DIRECT_NO_FSYNC, O_DSYNC
will enable O_DIRECT for data files, not the log.
Tested by: Matthias Leich, Axel Schwenke
3 years ago  MDEV-14425 Improve the redo log for concurrency
The InnoDB redo log used to be formatted in blocks of 512 bytes.
The log blocks were encrypted and the checksum was calculated while
holding log_sys.mutex, creating a serious scalability bottleneck.
We remove the fixed-size redo log block structure altogether and
essentially turn every mini-transaction into a log block of its own.
This allows encryption and checksum calculations to be performed
on local mtr_t::m_log buffers, before acquiring log_sys.mutex.
The mutex only protects a memcpy() of the data to the shared
log_sys.buf, as well as the padding of the log, in case the
to-be-written part of the log would not end in a block boundary of
the underlying storage. For now, the "padding" consists of writing
a single NUL byte, to allow recovery and mariadb-backup to detect
the end of the circular log faster.
Like the previous implementation, we will overwrite the last log block
over and over again, until it has been completely filled. It would be
possible to write only up to the last completed block (if no more
recent write was requested), or to write dummy FILE_CHECKPOINT records
to fill the incomplete block, by invoking the currently disabled
function log_pad(). This would require adjustments to some logic around
log checkpoints, page flushing, and shutdown.
An upgrade after a crash of any previous version is not supported.
Logically empty log files from a previous version will be upgraded.
An attempt to start up InnoDB without a valid ib_logfile0 will be
refused. Previously, the redo log used to be created automatically
if it was missing. Only with innodb_force_recovery=6, it is
possible to start InnoDB in read-only mode even if the log file
does not exist. This allows the contents of a possibly corrupted
database to be dumped.
Because a prepared backup from an earlier version of mariadb-backup
will create a 0-sized log file, we will allow an upgrade from such
log files, provided that the FIL_PAGE_FILE_FLUSH_LSN in the system
tablespace looks valid.
The 512-byte log checkpoint blocks at 0x200 and 0x600 will be replaced
with 64-byte log checkpoint blocks at 0x1000 and 0x2000.
The start of log records will move from 0x800 to 0x3000. This allows us
to use 4096-byte aligned blocks for all I/O in a future revision.
We extend the MDEV-12353 redo log record format as follows.
(1) Empty mini-transactions or extra NUL bytes will not be allowed.
(2) The end-of-minitransaction marker (a NUL byte) will be replaced
with a 1-bit sequence number, which will be toggled each time when the
circular log file wraps back to the beginning.
(3) After the sequence bit, a CRC-32C checksum of all data
(excluding the sequence bit) will be written.
(4) If the log is encrypted, 8 bytes will be written before
the checksum and included in it. This is part of the
initialization vector (IV) of encrypted log data.
(5) File names, page numbers, and checkpoint information will not be
encrypted. Only the payload bytes of page-level log will be encrypted.
The tablespace ID and page number will form part of the IV.
(6) For padding, arbitrary-length FILE_CHECKPOINT records may be written,
with all-zero payload, and with the normal end marker and checksum.
The minimum size is 7 bytes, or 7+8 with innodb_encrypt_log=ON.
In mariadb-backup and in Galera snapshot transfer (SST) scripts, we will
no longer remove ib_logfile0 or create an empty ib_logfile0. Server startup
will require a valid log file. When resizing the log, we will create
a logically empty ib_logfile101 at the current LSN and use an atomic rename
to replace ib_logfile0 with it. See the test innodb.log_file_size.
Because there is no mandatory padding in the log file, we are able
to create a dummy log file as of an arbitrary log sequence number.
See the test mariabackup.huge_lsn.
The parameter innodb_log_write_ahead_size and the
INFORMATION_SCHEMA.INNODB_METRICS counter log_padded will be removed.
The minimum value of innodb_log_buffer_size will be increased to 2MiB
(because log_sys.buf will replace recv_sys.buf) and the increment
adjusted to 4096 bytes (the maximum log block size).
The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed:
os_log_fsyncs
os_log_pending_fsyncs
log_pending_log_flushes
log_pending_checkpoint_writes
The following status variables will be removed:
Innodb_os_log_fsyncs (this is included in Innodb_data_fsyncs)
Innodb_os_log_pending_fsyncs (this was limited to at most 1 by design)
log_sys.get_block_size(): Return the physical block size of the log file.
This is only implemented on Linux and Microsoft Windows for now, and for
the power-of-2 block sizes between 64 and 4096 bytes (the minimum and
maximum size of a checkpoint block). If the block size is anything else,
the traditional 512-byte size will be used via normal file system
buffering.
If the file system buffers can be bypassed, a message like the following
will be issued:
InnoDB: File system buffers for log disabled (block size=512 bytes)
InnoDB: File system buffers for log disabled (block size=4096 bytes)
This has been tested on Linux and Microsoft Windows with both sizes.
On Linux, only enable O_DIRECT on the log for innodb_flush_method=O_DSYNC.
Tests in 3 different environments where the log is stored in a device
with a physical block size of 512 bytes are yielding better throughput
without O_DIRECT. This could be due to the fact that in the event the
last log block is being overwritten (if multiple transactions would
become durable at the same time, and each of them will write a small
number of bytes to the last log block), it should be faster to re-copy
data from log_sys.buf or log_sys.flush_buf to the kernel buffer,
to be finally written at fdatasync() time.
The parameter innodb_flush_method=O_DSYNC will imply O_DIRECT for
data files. This option will enable O_DIRECT on the log file on Linux.
It may be unsafe to use when the storage device does not support
FUA (Force Unit Access) mode.
When the server is compiled WITH_PMEM=ON, we will use memory-mapped
I/O for the log file if the log resides on a "mount -o dax" device.
We will identify PMEM in a start-up message:
InnoDB: log sequence number 0 (memory-mapped); transaction id 3
On Linux, we will also invoke mmap() on any ib_logfile0 that resides
in /dev/shm, effectively treating the log file as persistent memory.
This should speed up "./mtr --mem" and increase the test coverage of
PMEM on non-PMEM hardware. It also allows users to estimate how much
the performance would be improved by installing persistent memory.
On other tmpfs file systems such as /run, we will not use mmap().
mariadb-backup: Eliminated several variables. We will refer
directly to recv_sys and log_sys.
backup_wait_for_lsn(): Detect non-progress of
xtrabackup_copy_logfile(). In this new log format with
arbitrary-sized blocks, we can only detect log file overrun
indirectly, by observing that the scanned log sequence number
is not advancing.
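A minimal sketch of such indirect detection, assuming that the scanned LSN is polled periodically. The structure copy_progress, the function copy_is_stuck() and the max_stalled threshold are hypothetical and are not part of mariadb-backup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical progress tracker for the log copying thread. */
struct copy_progress {
  uint64_t last_lsn; /* scanned LSN observed at the previous poll */
  unsigned stalled;  /* consecutive polls without any progress */
};

/* Record one poll of the scanned LSN. Return true once the LSN has
failed to advance for max_stalled consecutive polls. */
static bool copy_is_stuck(struct copy_progress *p, uint64_t scanned_lsn,
                          unsigned max_stalled)
{
  assert(max_stalled > 0);
  if (scanned_lsn > p->last_lsn) {
    /* Progress was made: remember it and reset the stall counter. */
    p->last_lsn = scanned_lsn;
    p->stalled = 0;
    return false;
  }
  return ++p->stalled >= max_stalled;
}
```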
xtrabackup_copy_logfile(): On PMEM, do not modify the sequence bit,
because we are not allowed to modify the server's log file, and our
memory mapping is read-only.
trx_flush_log_if_needed_low(): Do not use the callback on pmem.
Using neither flush_lock nor write_lock around PMEM writes seems
to yield the best performance. The pmem_persist() calls may
still be somewhat slower than the pwrite() and fdatasync() based
interface (PMEM mounted without -o dax).
recv_sys_t::buf: Remove. We will use log_sys.buf for parsing.
recv_sys_t::MTR_SIZE_MAX: Replaces RECV_SCAN_SIZE.
recv_sys_t::file_checkpoint: Renamed from mlog_checkpoint_lsn.
recv_sys_t, log_sys_t: Removed many data members.
recv_sys.lsn: Renamed from recv_sys.recovered_lsn.
recv_sys.offset: Renamed from recv_sys.recovered_offset.
log_sys.buf_size: Replaces srv_log_buffer_size.
recv_buf: A smart pointer that wraps log_sys.buf[recv_sys.offset]
when the buffer is being allocated from the memory heap.
recv_ring: A smart pointer that wraps a circular log_sys.buf[] that is
backed by ib_logfile0. The pointer will wrap from recv_sys.len
(log_sys.file_size) to log_sys.START_OFFSET. For the record that
wraps around, we may copy file name or record payload data to
the auxiliary buffer decrypt_buf in order to have a contiguous
block of memory. The maximum size of a record is less than
innodb_page_size bytes.
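The wrap-around can be illustrated with a plain C sketch. This is not the recv_ring implementation; struct ring, ring_advance() and ring_contiguous() are hypothetical names, and the scratch buffer only plays the role that decrypt_buf has in the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view of a circular log file: the valid offsets are
[start, size), and advancing past size wraps around to start. */
struct ring {
  const unsigned char *buf;
  size_t start; /* first usable offset (stands in for START_OFFSET) */
  size_t size;  /* total size (stands in for the file size) */
};

/* Advance an offset by n bytes, wrapping from size back to start. */
static size_t ring_advance(const struct ring *r, size_t offset, size_t n)
{
  assert(offset >= r->start && offset < r->size);
  size_t capacity = r->size - r->start;
  return r->start + (offset - r->start + n) % capacity;
}

/* Return a pointer to len contiguous bytes starting at offset.
If the range wraps around, assemble it in the scratch buffer. */
static const unsigned char *ring_contiguous(const struct ring *r,
                                            size_t offset, size_t len,
                                            unsigned char *scratch)
{
  assert(offset >= r->start && offset < r->size);
  assert(len <= r->size - r->start);
  size_t tail = r->size - offset;
  if (len <= tail)
    return r->buf + offset; /* no wrap: use the buffer directly */
  memcpy(scratch, r->buf + offset, tail);
  memcpy(scratch + tail, r->buf + r->start, len - tail);
  return scratch;
}
```

Copying only the records that wrap around keeps the common case free of extra memory traffic.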
recv_sys_t::parse(): Take the smart pointer as a template parameter.
Do not temporarily add a trailing NUL byte to FILE_ records, because
we are not supposed to modify the memory-mapped log file. (It is
attached in read-write mode already during recovery.)
recv_sys_t::parse_mtr(): Wrapper for recv_sys_t::parse().
recv_sys_t::parse_pmem(): Like parse_mtr(), but if PREMATURE_EOF would be
returned on PMEM, use recv_ring to wrap around the buffer to the start.
mtr_t::finish_write(), log_close(): Do not enforce log_sys.max_buf_free
on PMEM, because it has no meaning on the mmap-based log.
log_sys.write_to_buf: Count writes to log_sys.buf. Replaces
srv_stats.log_write_requests and export_vars.innodb_log_write_requests.
Protected by log_sys.mutex. Updated consistently in log_close().
Previously, mtr_t::commit() conditionally updated the count,
which was inconsistent.
log_sys.write_to_log: Count swaps of log_sys.buf and log_sys.flush_buf,
for writing to log_sys.log (the ib_logfile0). Replaces
srv_stats.log_writes and export_vars.innodb_log_writes.
Protected by log_sys.mutex.
log_sys.waits: Count waits in append_prepare(). Replaces
srv_stats.log_waits and export_vars.innodb_log_waits.
recv_recover_page(): Do not unnecessarily acquire
log_sys.flush_order_mutex. We are inserting the blocks in arbitrary
order anyway, to be adjusted in recv_sys.apply(true).
We will change the definition of flush_lock and write_lock to
avoid potential false sharing. Depending on sizeof(log_sys) and
CPU_LEVEL1_DCACHE_LINESIZE, the flush_lock and write_lock could
share a cache line with each other or with the last data members
of log_sys.
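One common way to avoid such false sharing is to align each lock to its own cache line. The following C11 sketch assumes a 64-byte cache line; CACHE_LINE, struct lock_word, struct log_locks and same_cache_line() are hypothetical stand-ins and not the actual definitions.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed line size; the text above refers to CPU_LEVEL1_DCACHE_LINESIZE. */
#define CACHE_LINE 64

/* Stand-in for the real lock type. */
struct lock_word { unsigned state; };

/* Each lock starts on its own cache line, so a writer spinning on one
lock does not invalidate the line that holds the other. */
struct log_locks {
  alignas(CACHE_LINE) struct lock_word flush_lock;
  alignas(CACHE_LINE) struct lock_word write_lock;
};

static_assert(offsetof(struct log_locks, write_lock) >= CACHE_LINE,
              "locks must not share a cache line");

/* Check whether two addresses fall into the same cache line. */
static int same_cache_line(const void *a, const void *b)
{
  return (uintptr_t) a / CACHE_LINE == (uintptr_t) b / CACHE_LINE;
}
```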
Thanks to Matthias Leich for providing https://rr-project.org traces
for various failures during the development, and to
Thirunarayanan Balathandayuthapani for his help in debugging
some of the recovery code. And thanks to the developers of the
rr debugger for a tool without which extensive changes to InnoDB
would be very challenging to get right.
Thanks to Vladislav Vaintroub for useful feedback and
to him, Axel Schwenke and Krunal Bauskar for testing the performance.
4 years ago  MDEV-28766: SET GLOBAL innodb_log_file_buffering
In commit c4c88307091cb16886562e9e7b77f5fd077d34b5 (MDEV-28111) we disabled
the file system cache on the InnoDB write-ahead log file (ib_logfile0)
by default on Linux.
It turns out that especially with innodb_flush_log_at_trx_commit=2,
writing to the log via the file system cache typically improves throughput,
especially on slow storage or at a small number of concurrent transactions.
For other values of innodb_flush_log_at_trx_commit, direct writes were
observed to be mostly but not always faster. Whether it pays off to
disable the file system cache on the log may depend on the type of storage,
the workload, and the operating system kernel version.
On Linux and Microsoft Windows, we will introduce the settable Boolean
global variable innodb_log_file_buffering that indicates whether the
file system cache on the redo log file is enabled. The default value is
innodb_log_file_buffering=OFF. If the server is started up with
innodb_flush_log_at_trx_commit=2, the value will be changed to
innodb_log_file_buffering=ON.
When a persistent memory interface is being used for the log,
the value cannot be changed from innodb_log_file_buffering=OFF.
On Linux, when the physical block size cannot be determined
to be a power of 2 between 64 and 4096 bytes, the file system cache
cannot be disabled, and innodb_log_file_buffering=ON cannot be changed.
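The rules above for the initial value can be summarized in a small sketch. It is not the server code: struct log_buffering and initial_log_buffering() are hypothetical, and the order in which the persistent memory and block size checks are applied is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of the start-up state of the variable. */
struct log_buffering {
  bool enabled;    /* value of innodb_log_file_buffering */
  bool changeable; /* whether SET GLOBAL may change the value */
};

static struct log_buffering initial_log_buffering(bool pmem,
                                                  bool block_size_known,
                                                  int flush_log_at_trx_commit)
{
  /* Persistent memory interface: fixed at OFF. */
  if (pmem)
    return (struct log_buffering){false, false};
  /* Block size could not be determined: the file system cache
  cannot be disabled, so the value is fixed at ON. */
  if (!block_size_known)
    return (struct log_buffering){true, false};
  /* Default OFF, or ON when started with
  innodb_flush_log_at_trx_commit=2. */
  return (struct log_buffering){flush_log_at_trx_commit == 2, true};
}
```

In the changeable case, the value could later be adjusted with SET GLOBAL innodb_log_file_buffering, as the title of this commit suggests.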
Server log messages will indicate whether the file system cache is
enabled for the redo log:
[Note] InnoDB: Buffered log writes (block size=512 bytes)
[Note] InnoDB: File system buffers for log disabled (block size=512 bytes)
After this change, the startup parameter innodb_flush_method will no
longer control whether O_DIRECT will be set on the redo log on Linux.
On other operating systems that support O_DIRECT, no interface has been
implemented for controlling the file system cache for the redo log.
The innodb_flush_method values O_DIRECT, O_DIRECT_NO_FSYNC, O_DSYNC
will enable O_DIRECT for data files, not the log.
Tested by: Matthias Leich, Axel Schwenke
3 years ago  branches/innodb+: Merge revisions r5971:6130 from branches/zip.
------------------------------------------------------------------------
r5971 | marko | 2009-09-23 23:03:51 +1000 (Wed, 23 Sep 2009) | 2 lines
branches/zip: os_file_pwrite(): Make the code compile in InnoDB Hot Backup
when the pwrite system call is not available.
------------------------------------------------------------------------
r5972 | marko | 2009-09-24 05:44:52 +1000 (Thu, 24 Sep 2009) | 5 lines
branches/zip: fil_node_open_file(): In InnoDB Hot Backup,
determine the page size of single-file tablespaces before computing
the file node size. Otherwise, the space->size of compressed tablespaces
would be computed with UNIV_PAGE_SIZE instead of key_block_size.
This should fix Issue #313.
------------------------------------------------------------------------
r5973 | marko | 2009-09-24 05:53:21 +1000 (Thu, 24 Sep 2009) | 2 lines
branches/zip: recv_add_to_hash_table():
Simplify obfuscated pointer arithmetics.
------------------------------------------------------------------------
r5978 | marko | 2009-09-24 17:47:56 +1000 (Thu, 24 Sep 2009) | 1 line
branches/zip: Fix warnings and errors when UNIV_HOTBACKUP is defined.
------------------------------------------------------------------------
r5979 | marko | 2009-09-24 20:16:10 +1000 (Thu, 24 Sep 2009) | 4 lines
branches/zip: ha_innodb.cc: Define MYSQL_PLUGIN_IMPORT when necessary.
This preprocessor symbol has been recently introduced in MySQL 5.1.
The InnoDB Plugin should remain source compatible with MySQL 5.1.24
and later.
------------------------------------------------------------------------
r5988 | calvin | 2009-09-26 05:14:43 +1000 (Sat, 26 Sep 2009) | 8 lines
branches/zip: fix bug#47055 unconditional exit(1) on ERROR_WORKING_SET_QUOTA
1453 (0x5AD) for InnoDB backend
When error ERROR_WORKING_SET_QUOTA or ERROR_NO_SYSTEM_RESOURCES
occurs, yield for 100 ms and retry the operation.
Approved by: Heikki (on IM)
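The retry described above can be sketched in portable C. This is not the committed Windows code: the status codes, the callback types and the max_attempts bound are assumptions added for illustration, and flaky_op() and fake_sleep() exist only to exercise the loop.

```c
#include <assert.h>

/* Hypothetical status codes; IO_TRANSIENT stands in for errors such as
ERROR_WORKING_SET_QUOTA or ERROR_NO_SYSTEM_RESOURCES. */
enum io_status { IO_OK, IO_TRANSIENT, IO_FATAL };

typedef enum io_status (*io_op)(void *ctx);
typedef void (*sleep_fn)(unsigned ms);

/* Retry op while it reports a transient failure, pausing 100 ms between
attempts. The bound on attempts is an assumption of this sketch. */
static enum io_status retry_transient(io_op op, void *ctx, sleep_fn pause,
                                      unsigned max_attempts)
{
  assert(max_attempts > 0);
  enum io_status st = IO_FATAL;
  for (unsigned i = 0; i < max_attempts; i++) {
    st = op(ctx);
    if (st != IO_TRANSIENT)
      break;
    if (i + 1 < max_attempts)
      pause(100);
  }
  return st;
}

/* Demo operation: fails transiently until the counter reaches zero. */
static enum io_status flaky_op(void *ctx)
{
  int *remaining = ctx;
  if (*remaining > 0) {
    --*remaining;
    return IO_TRANSIENT;
  }
  return IO_OK;
}

/* Demo sleep: only records the requested time instead of sleeping. */
static unsigned slept_ms;
static void fake_sleep(unsigned ms) { slept_ms += ms; }
```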
------------------------------------------------------------------------
r5992 | vasil | 2009-09-28 17:10:29 +1000 (Mon, 28 Sep 2009) | 4 lines
branches/zip:
Add ChangeLog entry for c5988.
------------------------------------------------------------------------
r5994 | marko | 2009-09-28 18:33:59 +1000 (Mon, 28 Sep 2009) | 17 lines
branches/zip: Try to prevent the reuse of tablespace identifiers after
InnoDB has crashed during table creation. Also, refuse to start if
files with duplicate tablespace identifiers are encountered.
fil_node_create(): Update fil_system->max_assigned_id. This should
prevent the reuse of a space->id when InnoDB does a full crash
recovery and invokes fil_load_single_table_tablespaces(). Normally,
fil_system->max_assigned_id is initialized from
SELECT MAX(ID) FROM SYS_TABLES.
fil_open_single_table_tablespace(): Return FALSE when
fil_space_create() fails.
fil_load_single_table_tablespace(): Exit if fil_space_create() fails
and innodb_force_recovery=0.
rb://173 approved by Heikki Tuuri. This addresses Issue #335.
------------------------------------------------------------------------
r5995 | marko | 2009-09-28 18:52:25 +1000 (Mon, 28 Sep 2009) | 17 lines
branches/zip: Do not write to PAGE_INDEX_ID after page creation,
not even when restoring an uncompressed page after a compression failure.
btr_page_reorganize_low(): On compression failure, do not restore
those page header fields that should not be affected by the
reorganization. Instead, compare the fields.
page_zip_decompress(): Add the parameter ibool all, for copying all
page header fields. Pass the parameter all=TRUE on block read
completion, redo log application, and page_zip_validate(); pass
all=FALSE in all other cases.
page_zip_reorganize(): Do not restore the uncompressed page on
failure. It will be restored (to pre-modification state) by the
caller anyway.
rb://167, Issue #346
------------------------------------------------------------------------
r5996 | marko | 2009-09-28 22:46:02 +1000 (Mon, 28 Sep 2009) | 4 lines
branches/zip: Address Issue #350 in comments.
lock_rec_queue_validate(), lock_rec_queue_validate(): Note that
this debug code may violate the latching order and cause deadlocks.
------------------------------------------------------------------------
r5997 | marko | 2009-09-28 23:03:58 +1000 (Mon, 28 Sep 2009) | 12 lines
branches/zip: Remove an assertion failure when the InnoDB data dictionary
is inconsistent with the MySQL .frm file.
ha_innobase::index_read(): When the index cannot be found,
return an error.
ha_innobase::change_active_index(): When prebuilt->index == NULL,
set also prebuilt->index_usable = FALSE. This is not needed for
correctness, because prebuilt->index_usable is only checked by
row_search_for_mysql(), which requires prebuilt->index != NULL.
This addresses Issue #349. Approved by Heikki Tuuri over IM.
------------------------------------------------------------------------
r6005 | vasil | 2009-09-29 18:09:52 +1000 (Tue, 29 Sep 2009) | 4 lines
branches/zip:
ChangeLog: wrap around 78th column, not earlier.
------------------------------------------------------------------------
r6006 | vasil | 2009-09-29 20:15:25 +1000 (Tue, 29 Sep 2009) | 4 lines
branches/zip:
Add ChangeLog entry for the release of 1.0.4.
------------------------------------------------------------------------
r6007 | vasil | 2009-09-29 23:19:59 +1000 (Tue, 29 Sep 2009) | 6 lines
branches/zip:
Fix the year, should be 2009.
Pointed out by: Calvin
------------------------------------------------------------------------
r6026 | marko | 2009-09-30 17:18:24 +1000 (Wed, 30 Sep 2009) | 1 line
branches/zip: Add some debug assertions for checking FSEG_MAGIC_N.
------------------------------------------------------------------------
r6028 | marko | 2009-09-30 23:55:23 +1000 (Wed, 30 Sep 2009) | 3 lines
branches/zip: recv_no_log_write: New debug flag for tracking down
Mantis Issue #347. No modifications should be made to the database
while recv_apply_hashed_log_recs() is about to complete.
------------------------------------------------------------------------
r6029 | calvin | 2009-10-01 06:32:02 +1000 (Thu, 01 Oct 2009) | 4 lines
branches/zip: non-functional changes
Fix typo.
------------------------------------------------------------------------
r6031 | marko | 2009-10-01 21:24:33 +1000 (Thu, 01 Oct 2009) | 49 lines
branches/zip: Clean up after a crash during DROP INDEX.
When InnoDB crashes while dropping an index, ensure that
the index will be completely dropped during crash recovery.
row_merge_drop_index(): Before dropping an index, rename the index to
start with TEMP_INDEX_PREFIX_STR and commit the change, so that
row_merge_drop_temp_indexes() will drop the index after crash
recovery if the server crashes while dropping the index.
fseg_inode_try_get(): New function, forked from fseg_inode_get().
Return NULL if the file segment index node is free.
fseg_inode_get(): Assert that the file segment index node is not free.
fseg_free_step(): If the file segment index node is already free,
print a diagnostic message and return TRUE.
fsp_free_seg_inode(): Write a nonzero number to FSEG_MAGIC_N, so that
allocated-and-freed file segment index nodes can be better
distinguished from uninitialized ones.
This is rb://174, addressing Issue #348.
Tested by restarting mysqld upon the completion of the added
log_write_up_to() invocation below, during DROP INDEX. The index was
dropped after crash recovery, and re-issuing the DROP INDEX did not
crash the server.
Index: btr/btr0btr.c
===================================================================
--- btr/btr0btr.c (revision 6026)
+++ btr/btr0btr.c (working copy)
@@ -42,6 +42,7 @@ Created 6/2/1994 Heikki Tuuri
#include "ibuf0ibuf.h"
#include "trx0trx.h"
+#include "log0log.h"
/*
Latching strategy of the InnoDB B-tree
--------------------------------------
@@ -873,6 +874,8 @@ leaf_loop:
goto leaf_loop;
}
+
+ log_write_up_to(mtr.end_lsn, LOG_WAIT_ALL_GROUPS, TRUE);
top_loop:
mtr_start(&mtr);
------------------------------------------------------------------------
r6033 | calvin | 2009-10-02 06:19:46 +1000 (Fri, 02 Oct 2009) | 4 lines
branches/zip: fix a typo in error message
Reported as bug#47763.
------------------------------------------------------------------------
r6043 | inaam | 2009-10-06 01:45:35 +1100 (Tue, 06 Oct 2009) | 12 lines
branches/zip rb://176
Do not invalidate buffer pool while an LRU batch is active. Added
code to buf_pool_invalidate() to wait for the running batches to finish.
This patch also resets the state of buf_pool struct at invalidation. This
addresses the concern where buf_pool->freed_page_clock becomes non-zero
because we read in a system tablespace page for file format info at
startup.
Approved by: Marko
------------------------------------------------------------------------
r6044 | pekka | 2009-10-07 01:44:54 +1100 (Wed, 07 Oct 2009) | 5 lines
branches/zip:
Add os_file_is_same() function for Hot Backup (inside ifdef UNIV_HOTBACKUP).
This is part of the fix for Issue #186.
Note! The Windows implementation is incomplete.
------------------------------------------------------------------------
r6046 | pekka | 2009-10-08 20:24:56 +1100 (Thu, 08 Oct 2009) | 3 lines
branches/zip: Revert r6044 which added os_file_is_same() function
(issue#186). This functionality is moved to Hot Backup source tree.
------------------------------------------------------------------------
r6048 | vasil | 2009-10-09 16:42:55 +1100 (Fri, 09 Oct 2009) | 16 lines
branches/zip:
When scanning a directory, readdir() is called and then stat().
If a file is deleted between the two calls, stat() will fail and the
whole procedure will fail. Change this behavior to continue with the
next entry if stat() fails because the file does not exist. This is a
transparent change, as it will make it look as if the file was deleted
before the readdir() call.
This change is needed in order to fix
https://svn.innodb.com/mantis/view.php?id=174
in which we need to abort if os_file_readdir_next_file()
encounters "real" errors.
Approved by: Marko, Pekka (rb://177)
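A POSIX sketch of the behavior described above, assuming that a vanished file is reported by stat() as ENOENT. The function count_entries() is hypothetical and is not os_file_readdir_next_file().

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>

/* Count the entries of a directory that can be examined with stat().
Return -1 on a "real" error. An entry that disappears between readdir()
and stat() is skipped, as if it had been deleted before readdir(). */
static long count_entries(const char *dir)
{
  assert(dir != NULL);
  DIR *d = opendir(dir);
  if (d == NULL)
    return -1;
  long n = 0;
  for (;;) {
    errno = 0; /* lets us tell end-of-directory from a readdir() error */
    struct dirent *e = readdir(d);
    if (e == NULL)
      break;
    char path[4096];
    int len = snprintf(path, sizeof path, "%s/%s", dir, e->d_name);
    if (len < 0 || (size_t) len >= sizeof path) {
      closedir(d);
      return -1;
    }
    struct stat st;
    if (stat(path, &st) != 0) {
      if (errno == ENOENT)
        continue; /* deleted after readdir(): move on to the next entry */
      closedir(d);
      return -1;
    }
    n++;
  }
  int err = errno;
  closedir(d);
  return err != 0 ? -1 : n;
}
```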
------------------------------------------------------------------------
r6049 | vasil | 2009-10-10 03:05:26 +1100 (Sat, 10 Oct 2009) | 7 lines
branches/zip:
Fix compilation warning in Hot Backup:
innodb/fil/fil0fil.c: In function 'fil_load_single_table_tablespace':
innodb/fil/fil0fil.c:3253: warning: format '%lld' expects type 'long long int', but argument 6 has type 'ib_int64_t'
------------------------------------------------------------------------
r6064 | calvin | 2009-10-14 02:23:35 +1100 (Wed, 14 Oct 2009) | 4 lines
branches/zip: non-functional changes
Changes from MySQL to fix build issue.
------------------------------------------------------------------------
r6065 | inaam | 2009-10-14 04:43:13 +1100 (Wed, 14 Oct 2009) | 7 lines
branches/zip rb://182
Call fsync() on datafiles after a batch of pages is written to disk
even when skip_innodb_doublewrite is set.
Approved by: Heikki
------------------------------------------------------------------------
r6080 | sunny | 2009-10-15 09:29:01 +1100 (Thu, 15 Oct 2009) | 3 lines
branches/zip: Change page_mem_alloc_free() to inline.
Fix Bug #47058 - Failure to compile innodb_plugin on solaris 10u7 + spro cc/CC 5.10
------------------------------------------------------------------------
r6084 | vasil | 2009-10-15 16:21:17 +1100 (Thu, 15 Oct 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r6080.
------------------------------------------------------------------------
r6095 | vasil | 2009-10-20 00:04:59 +1100 (Tue, 20 Oct 2009) | 7 lines
branches/zip:
Fix Bug#47808 innodb_information_schema.test fails when run under valgrind
by using the wait_until_rows_count macro that loops until the number of
rows becomes 14 instead of sleep 0.1, which is obviously very fragile.
------------------------------------------------------------------------
r6096 | vasil | 2009-10-20 00:06:09 +1100 (Tue, 20 Oct 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r6095.
------------------------------------------------------------------------
r6099 | jyang | 2009-10-22 13:58:39 +1100 (Thu, 22 Oct 2009) | 7 lines
branches/zip: Port bug #46000 related changes from 5.1 to zip
branch. Due to a different code path for creating indexes in the zip
branch (compared to 5.1), the index reserved name check function
is extended to be used in ha_innobase::add_index().
rb://190 Approved by: Marko
------------------------------------------------------------------------
r6100 | jyang | 2009-10-22 14:51:07 +1100 (Thu, 22 Oct 2009) | 6 lines
branches/zip: At the request of MySQL, WARN_LEVEL_ERROR cannot
be used for push_warning_* calls any more. Switch to
WARN_LEVEL_WARN. Bug #47233.
rb://172 approved by Sunny Bains and Marko.
------------------------------------------------------------------------
r6101 | jyang | 2009-10-23 19:45:50 +1100 (Fri, 23 Oct 2009) | 7 lines
branches/zip: Update test result with the WARN_LEVEL_ERROR
to WARN_LEVEL_WARN change. This is the same result as
submitted in rb://172 review, which was approved by Sunny Bains
and Marko.
------------------------------------------------------------------------
r6102 | marko | 2009-10-26 18:32:23 +1100 (Mon, 26 Oct 2009) | 1 line
branches/zip: row_prebuilt_struct::prebuilts: Unused field, remove.
------------------------------------------------------------------------
r6103 | marko | 2009-10-27 00:46:18 +1100 (Tue, 27 Oct 2009) | 4 lines
branches/zip: row_ins_alloc_sys_fields(): Zero out the system columns
DB_TRX_ID, DB_ROLL_PTR and DB_ROW_ID, in order to avoid harmless
Valgrind warnings about uninitialized data. (The warnings were
harmless, because the fields would be initialized at a later stage.)
------------------------------------------------------------------------
r6105 | calvin | 2009-10-28 09:05:52 +1100 (Wed, 28 Oct 2009) | 6 lines
branches/zip: backport r3848 from 6.0 branch
----
branches/6.0: innobase_start_or_create_for_mysql(): Make the 10 MB
minimum tablespace limit independent of UNIV_PAGE_SIZE. (Bug #41490)
------------------------------------------------------------------------
r6107 | marko | 2009-10-29 01:10:34 +1100 (Thu, 29 Oct 2009) | 5 lines
branches/zip: buf_page_set_old(): Improve UNIV_LRU_DEBUG diagnostics
in order to catch the buf_pool->LRU_old corruption reported in Issue #381.
buf_LRU_old_init(): Set the property from the tail towards the front
of the buf_pool->LRU list, in order not to trip the debug check.
------------------------------------------------------------------------
r6108 | calvin | 2009-10-29 16:58:04 +1100 (Thu, 29 Oct 2009) | 5 lines
branches/zip: close file handle when building with UNIV_HOTBACKUP
The change does not affect regular InnoDB engine. Confirmed by
Marko.
------------------------------------------------------------------------
r6109 | jyang | 2009-10-29 19:37:32 +1100 (Thu, 29 Oct 2009) | 7 lines
branches/zip: In os_mem_alloc_large(), if we fail to attach
the shared memory, reset memory pointer ptr to NULL, and
allocate memory from conventional pool.
Bug #48237 Error handling in os_mem_alloc_large appears to be incorrect
rb://198 Approved by: Marko
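The error handling described above can be sketched as follows. It assumes an attach function that, like shmat(), signals failure with (void *) -1. The names attach_fn, ATTACH_FAILED, failing_attach() and alloc_large() are hypothetical and are not the os_mem_alloc_large() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Failure value of the hypothetical attach function, mirroring shmat(). */
#define ATTACH_FAILED ((void *) -1)

typedef void *(*attach_fn)(size_t n);

/* Demo attach function that always fails. */
static void *failing_attach(size_t n)
{
  (void) n;
  return ATTACH_FAILED;
}

/* Try the large-page path first. On failure, reset the pointer to NULL
so that the failure value is never mistaken for valid memory, and fall
back to the conventional allocator. */
static void *alloc_large(size_t n, attach_fn attach, bool *from_heap)
{
  assert(attach != NULL && from_heap != NULL);
  void *ptr = attach(n);
  if (ptr == ATTACH_FAILED)
    ptr = NULL;
  *from_heap = (ptr == NULL);
  if (ptr == NULL)
    ptr = malloc(n);
  return ptr;
}
```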
------------------------------------------------------------------------
r6110 | marko | 2009-10-29 21:44:57 +1100 (Thu, 29 Oct 2009) | 2 lines
branches/zip: Makefile.am (INCLUDES): Merge a change from MySQL:
Use $(srcdir)/include instead of $(top_srcdir)/storage/innobase/include.
------------------------------------------------------------------------
r6111 | marko | 2009-10-29 22:04:11 +1100 (Thu, 29 Oct 2009) | 33 lines
branches/zip: Fix corruption of buf_pool->LRU_old and improve debug assertions.
This was reported as Issue #381.
buf_page_set_old(): Assert that blocks may only be set old if
buf_pool->LRU_old is initialized and buf_pool->LRU_old_len is nonzero.
Assert that buf_pool->LRU_old points to the block at the old/new boundary.
buf_LRU_old_adjust_len(): Invoke buf_page_set_old() after adjusting
buf_pool->LRU_old and buf_pool->LRU_old_len, in order not to violate
the added assertions.
buf_LRU_old_init(): Replace buf_page_set_old() with a direct
assignment to bpage->old, because these loops that initialize all the
blocks would temporarily violate the assertions about
buf_pool->LRU_old.
buf_LRU_remove_block(): When setting buf_pool->LRU_old = NULL, also
clear all bpage->old flags and set buf_pool->LRU_old_len = 0.
buf_LRU_add_block_to_end_low(), buf_LRU_add_block_low(): Move the
buf_page_set_old() call later in order not to violate the debug
assertions. If buf_pool->LRU_old is NULL, set old=FALSE.
buf_LRU_free_block(): Replace the UNIV_LRU_DEBUG assertion with a
dummy buf_page_set_old() call that performs more thorough checks.
buf_LRU_validate(): Do not tolerate garbage in buf_pool->LRU_old_len
even if buf_pool->LRU_old is NULL. Check that bpage->old is monotonic.
buf_relocate(): Make the UNIV_LRU_DEBUG checks stricter.
buf0buf.h: Revise the documentation of buf_page_t::old and
buf_pool_t::LRU_old_len.
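The monotonicity check mentioned for buf_LRU_validate() can be illustrated on a plain array of flags. This sketch assumes that the list is walked from the newest block towards the oldest, so that once an old block is seen, every following block must also be old. The function lru_old_is_consistent() is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* old[i] is the "old" flag of the i-th block, walking from the front of
the list to its tail. The flags are consistent when they never switch
back from old to not-old and when the number of old blocks matches
old_len (standing in for LRU_old_len). */
static bool lru_old_is_consistent(const bool *old, size_t n, size_t old_len)
{
  assert(old != NULL || n == 0);
  size_t count = 0;
  bool seen_old = false;
  for (size_t i = 0; i < n; i++) {
    if (old[i]) {
      seen_old = true;
      count++;
    } else if (seen_old) {
      return false; /* a new block after an old one: not monotonic */
    }
  }
  return count == old_len;
}
```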
------------------------------------------------------------------------
r6112 | calvin | 2009-10-30 01:21:15 +1100 (Fri, 30 Oct 2009) | 4 lines
branches/zip: consideration for icc compilers
Proposed by MySQL, and approved by Marko.
------------------------------------------------------------------------
r6113 | vasil | 2009-10-30 03:15:50 +1100 (Fri, 30 Oct 2009) | 93 lines
branches/zip: Merge r5912:6112 from branches/5.1:
(after this merge the innodb-autoinc test starts to fail, but
I commit anyway because it would be easier to investigate the
failure this way)
------------------------------------------------------------------------
r5952 | calvin | 2009-09-22 19:45:07 +0300 (Tue, 22 Sep 2009) | 7 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: fix bug#42383: Can't create table 'test.bug39438'
For embedded server, MySQL may pass in full path, which is
currently disallowed. It is needed to relax the condition by
accepting full paths in the embedded case.
Approved by: Heikki (on IM)
------------------------------------------------------------------------
r6032 | vasil | 2009-10-01 15:55:49 +0300 (Thu, 01 Oct 2009) | 8 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Fix Bug#38996 Race condition in ANALYZE TABLE
by serializing ANALYZE TABLE inside InnoDB.
Approved by: Heikki (rb://175)
------------------------------------------------------------------------
r6045 | jyang | 2009-10-08 02:27:08 +0300 (Thu, 08 Oct 2009) | 7 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
A /branches/5.1/mysql-test/innodb_bug47777.result
A /branches/5.1/mysql-test/innodb_bug47777.test
branches/5.1: Fix bug #47777. Treat the Geometry data same as
Binary BLOB in ha_innobase::store_key_val_for_row(), since the
Geometry data is stored as Binary BLOB in Innodb.
Review: rb://180 approved by Marko Makela.
------------------------------------------------------------------------
r6051 | sunny | 2009-10-12 07:05:00 +0300 (Mon, 12 Oct 2009) | 6 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Ignore negative values supplied by the user when calculating the
next value to store in dict_table_t. Setting autoincrement columns to negative
values is undefined behavior and this change should bring the behavior of
InnoDB closer to what users expect. Added several tests to check.
rb://162
------------------------------------------------------------------------
r6052 | sunny | 2009-10-12 07:09:56 +0300 (Mon, 12 Oct 2009) | 4 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Reset the statement level autoinc counter on ROLLBACK. Fix
the test results too.
rb://164
------------------------------------------------------------------------
r6053 | sunny | 2009-10-12 07:37:49 +0300 (Mon, 12 Oct 2009) | 6 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Copy the maximum AUTOINC value from the old table to the new
table when MySQL does a CREATE INDEX ON T. This is required because MySQL
does a table copy, rename and drops the old table.
Fix Bug#47125: auto_increment start value is ignored if an index is created and engine=innodb
rb://168
------------------------------------------------------------------------
r6076 | vasil | 2009-10-14 19:30:12 +0300 (Wed, 14 Oct 2009) | 4 lines
Changed paths:
M /branches/5.1/row/row0mysql.c
branches/5.1:
Fix typo.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6114 | vasil | 2009-10-30 03:43:51 +1100 (Fri, 30 Oct 2009) | 6 lines
branches/zip:
* Add ChangeLog entries for latest changes
* Obey alphabetical order in the list of the files
* White-space fixup
------------------------------------------------------------------------
r6121 | sunny | 2009-10-30 10:42:11 +1100 (Fri, 30 Oct 2009) | 7 lines
branches/zip: This test has been problematic for some time now. The underlying
bug is that the data dictionaries get out of sync. In the AUTOINC code we
try to apply salve to the symptoms. In the past MySQL made some unrelated
change and the dictionaries stopped getting out of sync and this test started
to fail. Now, it seems they have reverted that change and the test is
passing again. I suspect this is not the last time that this test will change.
------------------------------------------------------------------------
r6124 | jyang | 2009-10-30 19:02:31 +1100 (Fri, 30 Oct 2009) | 5 lines
branches/zip: Correct the bug number for -r6109 change
from # 48273 to #48237
------------------------------------------------------------------------
r6126 | vasil | 2009-10-30 19:36:07 +1100 (Fri, 30 Oct 2009) | 45 lines
branches/zip: Merge r6112:6125 from branches/5.1:
(skipping r6122 and r6123, Jimmy says these are already present and need
not be merged):
------------------------------------------------------------------------
r6122 | jyang | 2009-10-30 05:18:38 +0200 (Fri, 30 Oct 2009) | 7 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb_bug44369.result
M /branches/5.1/mysql-test/innodb_bug44369.test
M /branches/5.1/mysql-test/innodb_bug46000.result
M /branches/5.1/mysql-test/innodb_bug46000.test
branches/5.1: Change WARN_LEVEL_ERROR to WARN_LEVEL_WARN
for push_warning_printf() call in innodb.
Fix Bug#47233: Innodb calls push_warning(MYSQL_ERROR::WARN_LEVEL_ERROR)
rb://170 approved by Marko.
------------------------------------------------------------------------
r6123 | jyang | 2009-10-30 05:43:06 +0200 (Fri, 30 Oct 2009) | 8 lines
Changed paths:
M /branches/5.1/os/os0proc.c
branches/5.1: In os_mem_alloc_large(), if we fail to attach
the shared memory, reset memory pointer ptr to NULL, and
allocate memory from conventional pool. This is a port
from branches/zip.
Bug #48237 Error handling in os_mem_alloc_large appears to be incorrect
rb://198 Approved by: Marko
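The fallback described above can be sketched roughly as follows. This is a simplified illustration, not the InnoDB code: alloc_large() and free_large() are hypothetical names, huge-page flags are omitted, and malloc() stands in for the conventional pool. The point is that shmat() reports failure as (void*) -1, which must be reset to NULL before the fallback check.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper: try System V shared memory first and fall back
to malloc() if the segment cannot be created or attached. */
void*
alloc_large(size_t size, int* used_shm)
{
	void*	ptr = NULL;
	int	shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);

	*used_shm = 0;

	if (shmid >= 0) {
		ptr = shmat(shmid, NULL, 0);
		/* Mark the segment for removal once it is detached. */
		shmctl(shmid, IPC_RMID, NULL);

		if (ptr == (void*) -1) {
			/* Attach failed: reset the pointer so that the
			fallback below is taken. */
			ptr = NULL;
		} else {
			*used_shm = 1;
		}
	}

	if (ptr == NULL) {
		ptr = malloc(size);
	}

	return(ptr);
}

/* Release memory obtained from alloc_large(). */
void
free_large(void* ptr, int used_shm)
{
	if (used_shm) {
		shmdt(ptr);
	} else {
		free(ptr);
	}
}
```

The caller has to remember which path was taken (used_shm here), because the two kinds of memory are released differently.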
------------------------------------------------------------------------
r6125 | vasil | 2009-10-30 10:31:23 +0200 (Fri, 30 Oct 2009) | 4 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
White-space fixup.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6130 | marko | 2009-11-02 20:42:56 +1100 (Mon, 02 Nov 2009) | 9 lines
branches/zip: Free all resources at shutdown. Set pointers to NULL, so
that Valgrind will not complain about freed data structures that are
reachable via pointers. This addresses Bug #45992 and Bug #46656.
This patch is mostly based on changes copied from branches/embedded-1.0,
mainly c5432, c3439, c3134, c2994, c2978, but also some other code was
copied. Some added cleanup code is specific to MySQL/InnoDB.
rb://199 approved by Sunny Bains
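As a minimal sketch of the free-and-reset idiom mentioned above: cache_t, cache_sys, cache_init() and cache_shutdown() are hypothetical names used only for illustration and do not correspond to any InnoDB structure.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical subsystem that owns one heap-allocated buffer. */
typedef struct {
	unsigned char*	buf;
	size_t		size;
} cache_t;

cache_t*	cache_sys = NULL;

/* Allocate the subsystem; returns 0 on success, -1 on failure. */
int
cache_init(size_t size)
{
	cache_sys = malloc(sizeof *cache_sys);
	if (cache_sys == NULL) {
		return(-1);
	}

	cache_sys->buf = malloc(size);
	if (cache_sys->buf == NULL) {
		free(cache_sys);
		cache_sys = NULL;
		return(-1);
	}

	cache_sys->size = size;
	return(0);
}

/* Free everything and reset the pointers, so that no freed block
remains reachable and a repeated shutdown is harmless. */
void
cache_shutdown(void)
{
	if (cache_sys == NULL) {
		return;
	}

	free(cache_sys->buf);
	cache_sys->buf = NULL;
	free(cache_sys);
	cache_sys = NULL;
}
```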
------------------------------------------------------------------------
16 years ago  MDEV-28766: SET GLOBAL innodb_log_file_buffering
In commit c4c88307091cb16886562e9e7b77f5fd077d34b5 (MDEV-28111) we disabled
the file system cache on the InnoDB write-ahead log file (ib_logfile0)
by default on Linux.
It turns out that especially with innodb_flush_log_at_trx_commit=2,
writing to the log via the file system cache typically improves throughput,
especially on slow storage or at a small number of concurrent transactions.
For other values of innodb_flush_log_at_trx_commit, direct writes were
observed to be mostly but not always faster. Whether it pays off to
disable the file system cache on the log may depend on the type of storage,
the workload, and the operating system kernel version.
On Linux and Microsoft Windows, we will introduce the settable Boolean
global variable innodb_log_file_buffering that indicates whether the
file system cache on the redo log file is enabled. The default value is
innodb_log_file_buffering=OFF. If the server is started up with
innodb_flush_log_at_trx_commit=2, the value will be changed to
innodb_log_file_buffering=ON.
When a persistent memory interface is being used for the log,
the value cannot be changed from innodb_log_file_buffering=OFF.
On Linux, when the physical block size cannot be determined
to be a power of 2 between 64 and 4096 bytes, the file system cache
cannot be disabled, and innodb_log_file_buffering=ON cannot be changed.
Server log messages will indicate whether the file system cache is
enabled for the redo log:
[Note] InnoDB: Buffered log writes (block size=512 bytes)
[Note] InnoDB: File system buffers for log disabled (block size=512 bytes)
After this change, the startup parameter innodb_flush_method will no
longer control whether O_DIRECT will be set on the redo log on Linux.
On other operating systems that support O_DIRECT, no interface has been
implemented for controlling the file system cache for the redo log.
The innodb_flush_method values O_DIRECT, O_DIRECT_NO_FSYNC, O_DSYNC
will enable O_DIRECT for data files, not the log.
Tested by: Matthias Leich, Axel Schwenke
3 years ago  MDEV-14425 Improve the redo log for concurrency
The InnoDB redo log used to be formatted in blocks of 512 bytes.
The log blocks were encrypted and the checksum was calculated while
holding log_sys.mutex, creating a serious scalability bottleneck.
We remove the fixed-size redo log block structure altogether and
essentially turn every mini-transaction into a log block of its own.
This allows encryption and checksum calculations to be performed
on local mtr_t::m_log buffers, before acquiring log_sys.mutex.
The mutex only protects a memcpy() of the data to the shared
log_sys.buf, as well as the padding of the log, in case the
to-be-written part of the log would not end in a block boundary of
the underlying storage. For now, the "padding" consists of writing
a single NUL byte, to allow recovery and mariadb-backup to detect
the end of the circular log faster.
Like the previous implementation, we will overwrite the last log block
over and over again, until it has been completely filled. It would be
possible to write only up to the last completed block (if no more
recent write was requested), or to write dummy FILE_CHECKPOINT records
to fill the incomplete block, by invoking the currently disabled
function log_pad(). This would require adjustments to some logic around
log checkpoints, page flushing, and shutdown.
An upgrade after a crash of any previous version is not supported.
Logically empty log files from a previous version will be upgraded.
An attempt to start up InnoDB without a valid ib_logfile0 will be
refused. Previously, the redo log used to be created automatically
if it was missing. Only with innodb_force_recovery=6, it is
possible to start InnoDB in read-only mode even if the log file
does not exist. This allows the contents of a possibly corrupted
database to be dumped.
Because a prepared backup from an earlier version of mariadb-backup
will create a 0-sized log file, we will allow an upgrade from such
log files, provided that the FIL_PAGE_FILE_FLUSH_LSN in the system
tablespace looks valid.
The 512-byte log checkpoint blocks at 0x200 and 0x600 will be replaced
with 64-byte log checkpoint blocks at 0x1000 and 0x2000.
The start of log records will move from 0x800 to 0x3000. This allows us
to use 4096-byte aligned blocks for all I/O in a future revision.
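To illustrate the new layout, here is a small sketch in C. The three offsets are the ones quoted above; the function lsn_to_offset() and its parameters first_lsn and file_size are assumptions made for illustration, showing one plausible way to map a log sequence number to a position in a circular file whose records start at 0x3000.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets quoted in the text above. */
enum {
	CHECKPOINT_1 = 0x1000,	/* first 64-byte checkpoint block */
	CHECKPOINT_2 = 0x2000,	/* second 64-byte checkpoint block */
	START_OFFSET = 0x3000	/* first byte of log records */
};

/* Assumed mapping: the area between START_OFFSET and file_size is
treated as a ring, and first_lsn is the LSN stored at START_OFFSET. */
uint64_t
lsn_to_offset(uint64_t lsn, uint64_t first_lsn, uint64_t file_size)
{
	const uint64_t	capacity = file_size - START_OFFSET;

	return(START_OFFSET + (lsn - first_lsn) % capacity);
}
```

All three offsets are multiples of 4096, which is what makes 4096-byte aligned I/O possible for both the checkpoint blocks and the start of the records.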
We extend the MDEV-12353 redo log record format as follows.
(1) Empty mini-transactions or extra NUL bytes will not be allowed.
(2) The end-of-minitransaction marker (a NUL byte) will be replaced
with a 1-bit sequence number, which will be toggled each time when the
circular log file wraps back to the beginning.
(3) After the sequence bit, a CRC-32C checksum of all data
(excluding the sequence bit) will be written.
(4) If the log is encrypted, 8 bytes will be written before
the checksum and included in it. This is part of the
initialization vector (IV) of encrypted log data.
(5) File names, page numbers, and checkpoint information will not be
encrypted. Only the payload bytes of page-level log will be encrypted.
The tablespace ID and page number will form part of the IV.
(6) For padding, arbitrary-length FILE_CHECKPOINT records may be written,
with all-zero payload, and with the normal end marker and checksum.
The minimum size is 7 bytes, or 7+8 with innodb_encrypt_log=ON.
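The sequence bit of item (2) can be sketched as a function of how many times the circular log has wrapped. The names sequence_bit(), first_lsn and capacity are assumptions, and so is the choice that the first pass through the file uses the value 1; the text above only states that the bit is toggled on each wrap.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed derivation: count the completed passes over a ring of
`capacity` bytes that starts at first_lsn, and flip the bit on each
pass. The first pass yields 1 in this sketch. */
unsigned
sequence_bit(uint64_t lsn, uint64_t first_lsn, uint64_t capacity)
{
	return(!(((lsn - first_lsn) / capacity) & 1));
}
```

A reader that knows the expected bit for the current pass can treat a record carrying the other value as stale data from the previous pass, which is how the end of the log can be detected without a NUL terminator.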
In mariadb-backup and in Galera snapshot transfer (SST) scripts, we will
no longer remove ib_logfile0 or create an empty ib_logfile0. Server startup
will require a valid log file. When resizing the log, we will create
a logically empty ib_logfile101 at the current LSN and use an atomic rename
to replace ib_logfile0 with it. See the test innodb.log_file_size.
Because there is no mandatory padding in the log file, we are able
to create a dummy log file as of an arbitrary log sequence number.
See the test mariabackup.huge_lsn.
The parameter innodb_log_write_ahead_size and the
INFORMATION_SCHEMA.INNODB_METRICS counter log_padded will be removed.
The minimum value of innodb_log_buffer_size will be increased to 2MiB
(because log_sys.buf will replace recv_sys.buf) and the increment
adjusted to 4096 bytes (the maximum log block size).
The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed:
os_log_fsyncs
os_log_pending_fsyncs
log_pending_log_flushes
log_pending_checkpoint_writes
The following status variables will be removed:
Innodb_os_log_fsyncs (this is included in Innodb_data_fsyncs)
Innodb_os_log_pending_fsyncs (this was limited to at most 1 by design)
log_sys.get_block_size(): Return the physical block size of the log file.
This is only implemented on Linux and Microsoft Windows for now, and for
the power-of-2 block sizes between 64 and 4096 bytes (the minimum and
maximum size of a checkpoint block). If the block size is anything else,
the traditional 512-byte size will be used via normal file system
buffering.
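The block size rule above can be sketched as follows. effective_log_block_size() is a hypothetical helper, not the actual log_sys.get_block_size(); it only encodes the stated rule that power-of-2 sizes between 64 and 4096 bytes are accepted and everything else falls back to 512 bytes with file system buffering.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: decide which block size to use for log I/O.
Sets *can_bypass_cache to 1 if the physical block size is a power of 2
between 64 and 4096 bytes, else to 0 and returns the traditional 512. */
size_t
effective_log_block_size(size_t physical, int* can_bypass_cache)
{
	const int	is_pow2 = physical != 0
		&& (physical & (physical - 1)) == 0;

	if (is_pow2 && physical >= 64 && physical <= 4096) {
		*can_bypass_cache = 1;
		return(physical);
	}

	*can_bypass_cache = 0;
	return(512);
}
```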
If the file system buffers can be bypassed, a message like the following
will be issued:
InnoDB: File system buffers for log disabled (block size=512 bytes)
InnoDB: File system buffers for log disabled (block size=4096 bytes)
This has been tested on Linux and Microsoft Windows with both sizes.
On Linux, only enable O_DIRECT on the log for innodb_flush_method=O_DSYNC.
Tests in 3 different environments where the log is stored in a device
with a physical block size of 512 bytes are yielding better throughput
without O_DIRECT. This could be due to the fact that in the event the
last log block is being overwritten (if multiple transactions would
become durable at the same time, and each of them will write a small
number of bytes to the last log block), it should be faster to re-copy
data from log_sys.buf or log_sys.flush_buf to the kernel buffer,
to be finally written at fdatasync() time.
The parameter innodb_flush_method=O_DSYNC will imply O_DIRECT for
data files. This option will enable O_DIRECT on the log file on Linux.
It may be unsafe to use when the storage device does not support
FUA (Force Unit Access) mode.
When the server is compiled WITH_PMEM=ON, we will use memory-mapped
I/O for the log file if the log resides on a "mount -o dax" device.
We will identify PMEM in a start-up message:
InnoDB: log sequence number 0 (memory-mapped); transaction id 3
On Linux, we will also invoke mmap() on any ib_logfile0 that resides
in /dev/shm, effectively treating the log file as persistent memory.
This should speed up "./mtr --mem" and increase the test coverage of
PMEM on non-PMEM hardware. It also allows users to estimate how much
the performance would be improved by installing persistent memory.
On other tmpfs file systems such as /run, we will not use mmap().
mariadb-backup: Eliminated several variables. We will refer
directly to recv_sys and log_sys.
backup_wait_for_lsn(): Detect non-progress of
xtrabackup_copy_logfile(). In this new log format with
arbitrary-sized blocks, we can only detect log file overrun
indirectly, by observing that the scanned log sequence number
is not advancing.
xtrabackup_copy_logfile(): On PMEM, do not modify the sequence bit,
because we are not allowed to modify the server's log file, and our
memory mapping is read-only.
trx_flush_log_if_needed_low(): Do not use the callback on pmem.
Using neither flush_lock nor write_lock around PMEM writes seems
to yield the best performance. The pmem_persist() calls may
still be somewhat slower than the pwrite() and fdatasync() based
interface (PMEM mounted without -o dax).
recv_sys_t::buf: Remove. We will use log_sys.buf for parsing.
recv_sys_t::MTR_SIZE_MAX: Replaces RECV_SCAN_SIZE.
recv_sys_t::file_checkpoint: Renamed from mlog_checkpoint_lsn.
recv_sys_t, log_sys_t: Removed many data members.
recv_sys.lsn: Renamed from recv_sys.recovered_lsn.
recv_sys.offset: Renamed from recv_sys.recovered_offset.
log_sys.buf_size: Replaces srv_log_buffer_size.
recv_buf: A smart pointer that wraps log_sys.buf[recv_sys.offset]
when the buffer is being allocated from the memory heap.
recv_ring: A smart pointer that wraps a circular log_sys.buf[] that is
backed by ib_logfile0. The pointer will wrap from recv_sys.len
(log_sys.file_size) to log_sys.START_OFFSET. For the record that
wraps around, we may copy file name or record payload data to
the auxiliary buffer decrypt_buf in order to have a contiguous
block of memory. The maximum size of a record is less than
innodb_page_size bytes.
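The wrap-around copy into an auxiliary buffer can be sketched as below. ring_copy() and its parameters are hypothetical; in terms of the text above, `start` would play the role of log_sys.START_OFFSET, `end` the role of the file size, and `out` the role of decrypt_buf.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy len bytes that begin at buf[pos] into the contiguous buffer
out, wrapping from `end` back to `start`.
Preconditions: start <= pos < end and len <= end - start. */
void
ring_copy(unsigned char* out, const unsigned char* buf,
	  size_t start, size_t end, size_t pos, size_t len)
{
	const size_t	tail = end - pos;

	if (len <= tail) {
		/* The record does not wrap. */
		memcpy(out, buf + pos, len);
		return;
	}

	/* First the bytes up to the end, then the rest from the start. */
	memcpy(out, buf + pos, tail);
	memcpy(out + tail, buf + start, len - tail);
}
```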
recv_sys_t::parse(): Take the smart pointer as a template parameter.
Do not temporarily add a trailing NUL byte to FILE_ records, because
we are not supposed to modify the memory-mapped log file. (It is
attached in read-write mode already during recovery.)
recv_sys_t::parse_mtr(): Wrapper for recv_sys_t::parse().
recv_sys_t::parse_pmem(): Like parse_mtr(), but if PREMATURE_EOF would be
returned on PMEM, use recv_ring to wrap around the buffer to the start.
mtr_t::finish_write(), log_close(): Do not enforce log_sys.max_buf_free
on PMEM, because it has no meaning on the mmap-based log.
log_sys.write_to_buf: Count writes to log_sys.buf. Replaces
srv_stats.log_write_requests and export_vars.innodb_log_write_requests.
Protected by log_sys.mutex. Updated consistently in log_close().
Previously, mtr_t::commit() conditionally updated the count,
which was inconsistent.
log_sys.write_to_log: Count swaps of log_sys.buf and log_sys.flush_buf,
for writing to log_sys.log (the ib_logfile0). Replaces
srv_stats.log_writes and export_vars.innodb_log_writes.
Protected by log_sys.mutex.
log_sys.waits: Count waits in append_prepare(). Replaces
srv_stats.log_waits and export_vars.innodb_log_waits.
recv_recover_page(): Do not unnecessarily acquire
log_sys.flush_order_mutex. We are inserting the blocks in arbitrary
order anyway, to be adjusted in recv_sys.apply(true).
We will change the definition of flush_lock and write_lock to
avoid potential false sharing. Depending on sizeof(log_sys) and
CPU_LEVEL1_DCACHE_LINESIZE, the flush_lock and write_lock could
share a cache line with each other or with the last data members
of log_sys.
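One common way to avoid this kind of false sharing is to give each lock its own cache line. The sketch below uses C11 alignas for that; it is not necessarily the technique chosen in this commit. The 64-byte CACHE_LINE value and the types lock_word_t and log_like_t are assumptions for illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumed line size; the text above refers to
CPU_LEVEL1_DCACHE_LINESIZE for the real value. */
#define CACHE_LINE 64

typedef struct {
	int	locked;
} lock_word_t;

/* Hypothetical structure: each lock starts on its own cache line, so
that neither shares a line with the other or with earlier members. */
typedef struct {
	char				other_members[100];
	alignas(CACHE_LINE) lock_word_t	flush_lock;
	alignas(CACHE_LINE) lock_word_t	write_lock;
} log_like_t;
```

The cost is padding: with the assumed sizes, the structure grows to a multiple of 64 bytes.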
Thanks to Matthias Leich for providing https://rr-project.org traces
for various failures during the development, and to
Thirunarayanan Balathandayuthapani for his help in debugging
some of the recovery code. And thanks to the developers of the
rr debugger for a tool without which extensive changes to InnoDB
would be very challenging to get right.
Thanks to Vladislav Vaintroub for useful feedback and
to him, Axel Schwenke and Krunal Bauskar for testing the performance.
4 years ago  branches/innodb+: Merge revisions r5971:6130 from branches/zip.
------------------------------------------------------------------------
r5971 | marko | 2009-09-23 23:03:51 +1000 (Wed, 23 Sep 2009) | 2 lines
branches/zip: os_file_pwrite(): Make the code compile in InnoDB Hot Backup
when the pwrite system call is not available.
------------------------------------------------------------------------
r5972 | marko | 2009-09-24 05:44:52 +1000 (Thu, 24 Sep 2009) | 5 lines
branches/zip: fil_node_open_file(): In InnoDB Hot Backup,
determine the page size of single-file tablespaces before computing
the file node size. Otherwise, the space->size of compressed tablespaces
would be computed with UNIV_PAGE_SIZE instead of key_block_size.
This should fix Issue #313.
------------------------------------------------------------------------
r5973 | marko | 2009-09-24 05:53:21 +1000 (Thu, 24 Sep 2009) | 2 lines
branches/zip: recv_add_to_hash_table():
Simplify obfuscated pointer arithmetic.
------------------------------------------------------------------------
r5978 | marko | 2009-09-24 17:47:56 +1000 (Thu, 24 Sep 2009) | 1 line
branches/zip: Fix warnings and errors when UNIV_HOTBACKUP is defined.
------------------------------------------------------------------------
r5979 | marko | 2009-09-24 20:16:10 +1000 (Thu, 24 Sep 2009) | 4 lines
branches/zip: ha_innodb.cc: Define MYSQL_PLUGIN_IMPORT when necessary.
This preprocessor symbol has been recently introduced in MySQL 5.1.
The InnoDB Plugin should remain source compatible with MySQL 5.1.24
and later.
------------------------------------------------------------------------
r5988 | calvin | 2009-09-26 05:14:43 +1000 (Sat, 26 Sep 2009) | 8 lines
branches/zip: fix bug#47055 unconditional exit(1) on ERROR_WORKING_SET_QUOTA
1453 (0x5AD) for InnoDB backend
When error ERROR_WORKING_SET_QUOTA or ERROR_NO_SYSTEM_RESOURCES
occurs, yield for 100 ms and retry the operation.
Approved by: Heikki (on IM)
------------------------------------------------------------------------
r5992 | vasil | 2009-09-28 17:10:29 +1000 (Mon, 28 Sep 2009) | 4 lines
branches/zip:
Add ChangeLog entry for c5988.
------------------------------------------------------------------------
r5994 | marko | 2009-09-28 18:33:59 +1000 (Mon, 28 Sep 2009) | 17 lines
branches/zip: Try to prevent the reuse of tablespace identifiers after
InnoDB has crashed during table creation. Also, refuse to start if
files with duplicate tablespace identifiers are encountered.
fil_node_create(): Update fil_system->max_assigned_id. This should
prevent the reuse of a space->id when InnoDB does a full crash
recovery and invokes fil_load_single_table_tablespaces(). Normally,
fil_system->max_assigned_id is initialized from
SELECT MAX(ID) FROM SYS_TABLES.
fil_open_single_table_tablespace(): Return FALSE when
fil_space_create() fails.
fil_load_single_table_tablespace(): Exit if fil_space_create() fails
and innodb_force_recovery=0.
rb://173 approved by Heikki Tuuri. This addresses Issue #335.
------------------------------------------------------------------------
r5995 | marko | 2009-09-28 18:52:25 +1000 (Mon, 28 Sep 2009) | 17 lines
branches/zip: Do not write to PAGE_INDEX_ID after page creation,
not even when restoring an uncompressed page after a compression failure.
btr_page_reorganize_low(): On compression failure, do not restore
those page header fields that should not be affected by the
reorganization. Instead, compare the fields.
page_zip_decompress(): Add the parameter ibool all, for copying all
page header fields. Pass the parameter all=TRUE on block read
completion, redo log application, and page_zip_validate(); pass
all=FALSE in all other cases.
page_zip_reorganize(): Do not restore the uncompressed page on
failure. It will be restored (to pre-modification state) by the
caller anyway.
rb://167, Issue #346
------------------------------------------------------------------------
r5996 | marko | 2009-09-28 22:46:02 +1000 (Mon, 28 Sep 2009) | 4 lines
branches/zip: Address Issue #350 in comments.
lock_rec_queue_validate(), lock_rec_queue_validate(): Note that
this debug code may violate the latching order and cause deadlocks.
------------------------------------------------------------------------
r5997 | marko | 2009-09-28 23:03:58 +1000 (Mon, 28 Sep 2009) | 12 lines
branches/zip: Remove an assertion failure when the InnoDB data dictionary
is inconsistent with the MySQL .frm file.
ha_innobase::index_read(): When the index cannot be found,
return an error.
ha_innobase::change_active_index(): When prebuilt->index == NULL,
set also prebuilt->index_usable = FALSE. This is not needed for
correctness, because prebuilt->index_usable is only checked by
row_search_for_mysql(), which requires prebuilt->index != NULL.
This addresses Issue #349. Approved by Heikki Tuuri over IM.
------------------------------------------------------------------------
r6005 | vasil | 2009-09-29 18:09:52 +1000 (Tue, 29 Sep 2009) | 4 lines
branches/zip:
ChangeLog: wrap around 78th column, not earlier.
------------------------------------------------------------------------
r6006 | vasil | 2009-09-29 20:15:25 +1000 (Tue, 29 Sep 2009) | 4 lines
branches/zip:
Add ChangeLog entry for the release of 1.0.4.
------------------------------------------------------------------------
r6007 | vasil | 2009-09-29 23:19:59 +1000 (Tue, 29 Sep 2009) | 6 lines
branches/zip:
Fix the year, should be 2009.
Pointed out by: Calvin
------------------------------------------------------------------------
r6026 | marko | 2009-09-30 17:18:24 +1000 (Wed, 30 Sep 2009) | 1 line
branches/zip: Add some debug assertions for checking FSEG_MAGIC_N.
------------------------------------------------------------------------
r6028 | marko | 2009-09-30 23:55:23 +1000 (Wed, 30 Sep 2009) | 3 lines
branches/zip: recv_no_log_write: New debug flag for tracking down
Mantis Issue #347. No modifications should be made to the database
while recv_apply_hashed_log_recs() is about to complete.
------------------------------------------------------------------------
r6029 | calvin | 2009-10-01 06:32:02 +1000 (Thu, 01 Oct 2009) | 4 lines
branches/zip: non-functional changes
Fix typo.
------------------------------------------------------------------------
r6031 | marko | 2009-10-01 21:24:33 +1000 (Thu, 01 Oct 2009) | 49 lines
branches/zip: Clean up after a crash during DROP INDEX.
When InnoDB crashes while dropping an index, ensure that
the index will be completely dropped during crash recovery.
row_merge_drop_index(): Before dropping an index, rename the index to
start with TEMP_INDEX_PREFIX_STR and commit the change, so that
row_merge_drop_temp_indexes() will drop the index after crash
recovery if the server crashes while dropping the index.
fseg_inode_try_get(): New function, forked from fseg_inode_get().
Return NULL if the file segment index node is free.
fseg_inode_get(): Assert that the file segment index node is not free.
fseg_free_step(): If the file segment index node is already free,
print a diagnostic message and return TRUE.
fsp_free_seg_inode(): Write a nonzero number to FSEG_MAGIC_N, so that
allocated-and-freed file segment index nodes can be better
distinguished from uninitialized ones.
This is rb://174, addressing Issue #348.
Tested by restarting mysqld upon the completion of the added
log_write_up_to() invocation below, during DROP INDEX. The index was
dropped after crash recovery, and re-issuing the DROP INDEX did not
crash the server.
Index: btr/btr0btr.c
===================================================================
--- btr/btr0btr.c (revision 6026)
+++ btr/btr0btr.c (working copy)
@@ -42,6 +42,7 @@ Created 6/2/1994 Heikki Tuuri
#include "ibuf0ibuf.h"
#include "trx0trx.h"
+#include "log0log.h"
/*
Latching strategy of the InnoDB B-tree
--------------------------------------
@@ -873,6 +874,8 @@ leaf_loop:
goto leaf_loop;
}
+
+ log_write_up_to(mtr.end_lsn, LOG_WAIT_ALL_GROUPS, TRUE);
top_loop:
mtr_start(&mtr);
------------------------------------------------------------------------
r6033 | calvin | 2009-10-02 06:19:46 +1000 (Fri, 02 Oct 2009) | 4 lines
branches/zip: fix a typo in error message
Reported as bug#47763.
------------------------------------------------------------------------
r6043 | inaam | 2009-10-06 01:45:35 +1100 (Tue, 06 Oct 2009) | 12 lines
branches/zip rb://176
Do not invalidate buffer pool while an LRU batch is active. Added
code to buf_pool_invalidate() to wait for the running batches to finish.
This patch also resets the state of buf_pool struct at invalidation. This
addresses the concern where buf_pool->freed_page_clock becomes non-zero
because we read in a system tablespace page for file format info at
startup.
Approved by: Marko
------------------------------------------------------------------------
r6044 | pekka | 2009-10-07 01:44:54 +1100 (Wed, 07 Oct 2009) | 5 lines
branches/zip:
Add os_file_is_same() function for Hot Backup (inside ifdef UNIV_HOTBACKUP).
This is part of the fix for Issue #186.
Note! The Windows implementation is incomplete.
------------------------------------------------------------------------
r6046 | pekka | 2009-10-08 20:24:56 +1100 (Thu, 08 Oct 2009) | 3 lines
branches/zip: Revert r6044 which added os_file_is_same() function
(issue#186). This functionality is moved to Hot Backup source tree.
------------------------------------------------------------------------
r6048 | vasil | 2009-10-09 16:42:55 +1100 (Fri, 09 Oct 2009) | 16 lines
branches/zip:
When scanning a directory, readdir() is called and then stat().
If a file is deleted between the two calls, stat() will fail and the
whole procedure will fail. Change this behavior to continue with the
next entry if stat() fails because of a nonexistent file. This is a
transparent change, as it will make it look as if the file was deleted
before the readdir() call.
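The behavior described above can be sketched with plain POSIX calls. count_regular_files() is a hypothetical function written for illustration, not the InnoDB os_file_readdir_next_file(); it skips entries that vanish between readdir() and stat() and treats every other stat() failure as a real error.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>

/* Count the regular files in dir. Returns -1 on a "real" error. */
long
count_regular_files(const char* dir)
{
	DIR*		d = opendir(dir);
	struct dirent*	ent;
	long		n = 0;

	if (d == NULL) {
		return(-1);
	}

	while ((ent = readdir(d)) != NULL) {
		char		path[4096];
		struct stat	st;
		int		len = snprintf(path, sizeof path, "%s/%s",
					       dir, ent->d_name);

		if (len < 0 || (size_t) len >= sizeof path) {
			closedir(d);
			return(-1);
		}

		if (stat(path, &st) != 0) {
			if (errno == ENOENT) {
				/* Deleted between readdir() and stat():
				behave as if it was never listed. */
				continue;
			}

			closedir(d);
			return(-1);
		}

		if (S_ISREG(st.st_mode)) {
			n++;
		}
	}

	closedir(d);
	return(n);
}
```

Note that a dangling symbolic link also makes stat() fail with ENOENT, so this sketch skips those as well.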
This change is needed in order to fix
https://svn.innodb.com/mantis/view.php?id=174
in which we need to abort if os_file_readdir_next_file()
encounters "real" errors.
Approved by: Marko, Pekka (rb://177)
------------------------------------------------------------------------
r6049 | vasil | 2009-10-10 03:05:26 +1100 (Sat, 10 Oct 2009) | 7 lines
branches/zip:
Fix compilation warning in Hot Backup:
innodb/fil/fil0fil.c: In function 'fil_load_single_table_tablespace':
innodb/fil/fil0fil.c:3253: warning: format '%lld' expects type 'long long int', but argument 6 has type 'ib_int64_t'
------------------------------------------------------------------------
r6064 | calvin | 2009-10-14 02:23:35 +1100 (Wed, 14 Oct 2009) | 4 lines
branches/zip: non-functional changes
Changes from MySQL to fix build issue.
------------------------------------------------------------------------
r6065 | inaam | 2009-10-14 04:43:13 +1100 (Wed, 14 Oct 2009) | 7 lines
branches/zip rb://182
Call fsync() on datafiles after a batch of pages is written to disk
even when skip_innodb_doublewrite is set.
Approved by: Heikki
------------------------------------------------------------------------
r6080 | sunny | 2009-10-15 09:29:01 +1100 (Thu, 15 Oct 2009) | 3 lines
branches/zip: Change page_mem_alloc_free() to inline.
Fix Bug #47058 - Failure to compile innodb_plugin on solaris 10u7 + spro cc/CC 5.10
------------------------------------------------------------------------
r6084 | vasil | 2009-10-15 16:21:17 +1100 (Thu, 15 Oct 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r6080.
------------------------------------------------------------------------
r6095 | vasil | 2009-10-20 00:04:59 +1100 (Tue, 20 Oct 2009) | 7 lines
branches/zip:
Fix Bug#47808 innodb_information_schema.test fails when run under valgrind
by using the wait_until_rows_count macro that loops until the number of
rows becomes 14 instead of sleep 0.1, which is obviously very fragile.
------------------------------------------------------------------------
r6096 | vasil | 2009-10-20 00:06:09 +1100 (Tue, 20 Oct 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r6095.
------------------------------------------------------------------------
r6099 | jyang | 2009-10-22 13:58:39 +1100 (Thu, 22 Oct 2009) | 7 lines
branches/zip: Port bug #46000 related changes from 5.1 to zip
branch. Because the zip branch uses a different code path for creating
an index than 5.1, the index reserved name check function
is extended to be used in ha_innobase::add_index().
rb://190 Approved by: Marko
------------------------------------------------------------------------
r6100 | jyang | 2009-10-22 14:51:07 +1100 (Thu, 22 Oct 2009) | 6 lines
branches/zip: At the request of MySQL, WARN_LEVEL_ERROR can no longer
be used for push_warning_* calls. Switch to
WARN_LEVEL_WARN. Bug #47233.
rb://172 approved by Sunny Bains and Marko.
------------------------------------------------------------------------
r6101 | jyang | 2009-10-23 19:45:50 +1100 (Fri, 23 Oct 2009) | 7 lines
branches/zip: Update test result with the WARN_LEVEL_ERROR
to WARN_LEVEL_WARN change. This is the same result as
submitted in rb://172 review, which was approved by Sunny Bains
and Marko.
------------------------------------------------------------------------
r6102 | marko | 2009-10-26 18:32:23 +1100 (Mon, 26 Oct 2009) | 1 line
branches/zip: row_prebuilt_struct::prebuilts: Unused field, remove.
------------------------------------------------------------------------
r6103 | marko | 2009-10-27 00:46:18 +1100 (Tue, 27 Oct 2009) | 4 lines
branches/zip: row_ins_alloc_sys_fields(): Zero out the system columns
DB_TRX_ID, DB_ROLL_PTR and DB_ROW_ID, in order to avoid harmless
Valgrind warnings about uninitialized data. (The warnings were
harmless, because the fields would be initialized at a later stage.)
------------------------------------------------------------------------
r6105 | calvin | 2009-10-28 09:05:52 +1100 (Wed, 28 Oct 2009) | 6 lines
branches/zip: backport r3848 from 6.0 branch
----
branches/6.0: innobase_start_or_create_for_mysql(): Make the 10 MB
minimum tablespace limit independent of UNIV_PAGE_SIZE. (Bug #41490)
------------------------------------------------------------------------
r6107 | marko | 2009-10-29 01:10:34 +1100 (Thu, 29 Oct 2009) | 5 lines
branches/zip: buf_page_set_old(): Improve UNIV_LRU_DEBUG diagnostics
in order to catch the buf_pool->LRU_old corruption reported in Issue #381.
buf_LRU_old_init(): Set the property from the tail towards the front
of the buf_pool->LRU list, in order not to trip the debug check.
------------------------------------------------------------------------
r6108 | calvin | 2009-10-29 16:58:04 +1100 (Thu, 29 Oct 2009) | 5 lines
branches/zip: close file handle when building with UNIV_HOTBACKUP
The change does not affect regular InnoDB engine. Confirmed by
Marko.
------------------------------------------------------------------------
r6109 | jyang | 2009-10-29 19:37:32 +1100 (Thu, 29 Oct 2009) | 7 lines
branches/zip: In os_mem_alloc_large(), if we fail to attach
the shared memory, reset memory pointer ptr to NULL, and
allocate memory from conventional pool.
Bug #48237 Error handling in os_mem_alloc_large appears to be incorrect
rb://198 Approved by: Marko
------------------------------------------------------------------------
r6110 | marko | 2009-10-29 21:44:57 +1100 (Thu, 29 Oct 2009) | 2 lines
branches/zip: Makefile.am (INCLUDES): Merge a change from MySQL:
Use $(srcdir)/include instead of $(top_srcdir)/storage/innobase/include.
------------------------------------------------------------------------
r6111 | marko | 2009-10-29 22:04:11 +1100 (Thu, 29 Oct 2009) | 33 lines
branches/zip: Fix corruption of buf_pool->LRU_old and improve debug assertions.
This was reported as Issue #381.
buf_page_set_old(): Assert that blocks may only be set old if
buf_pool->LRU_old is initialized and buf_pool->LRU_old_len is nonzero.
Assert that buf_pool->LRU_old points to the block at the old/new boundary.
buf_LRU_old_adjust_len(): Invoke buf_page_set_old() after adjusting
buf_pool->LRU_old and buf_pool->LRU_old_len, in order not to violate
the added assertions.
buf_LRU_old_init(): Replace buf_page_set_old() with a direct
assignment to bpage->old, because these loops that initialize all the
blocks would temporarily violate the assertions about
buf_pool->LRU_old.
buf_LRU_remove_block(): When setting buf_pool->LRU_old = NULL, also
clear all bpage->old flags and set buf_pool->LRU_old_len = 0.
buf_LRU_add_block_to_end_low(), buf_LRU_add_block_low(): Move the
buf_page_set_old() call later in order not to violate the debug
assertions. If buf_pool->LRU_old is NULL, set old=FALSE.
buf_LRU_free_block(): Replace the UNIV_LRU_DEBUG assertion with a
dummy buf_page_set_old() call that performs more thorough checks.
buf_LRU_validate(): Do not tolerate garbage in buf_pool->LRU_old_len
even if buf_pool->LRU_old is NULL. Check that bpage->old is monotonic.
buf_relocate(): Make the UNIV_LRU_DEBUG checks stricter.
buf0buf.h: Revise the documentation of buf_page_t::old and
buf_pool_t::LRU_old_len.
------------------------------------------------------------------------
r6112 | calvin | 2009-10-30 01:21:15 +1100 (Fri, 30 Oct 2009) | 4 lines
branches/zip: consideration for icc compilers
Proposed by MySQL, and approved by Marko.
------------------------------------------------------------------------
r6113 | vasil | 2009-10-30 03:15:50 +1100 (Fri, 30 Oct 2009) | 93 lines
branches/zip: Merge r5912:6112 from branches/5.1:
(after this merge the innodb-autoinc test starts to fail, but
I commit anyway because it would be easier to investigate the
failure this way)
------------------------------------------------------------------------
r5952 | calvin | 2009-09-22 19:45:07 +0300 (Tue, 22 Sep 2009) | 7 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: fix bug#42383: Can't create table 'test.bug39438'
For the embedded server, MySQL may pass in a full path, which is
currently disallowed. The condition needs to be relaxed to
accept full paths in the embedded case.
Approved by: Heikki (on IM)
------------------------------------------------------------------------
r6032 | vasil | 2009-10-01 15:55:49 +0300 (Thu, 01 Oct 2009) | 8 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Fix Bug#38996 Race condition in ANALYZE TABLE
by serializing ANALYZE TABLE inside InnoDB.
Approved by: Heikki (rb://175)
------------------------------------------------------------------------
r6045 | jyang | 2009-10-08 02:27:08 +0300 (Thu, 08 Oct 2009) | 7 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
A /branches/5.1/mysql-test/innodb_bug47777.result
A /branches/5.1/mysql-test/innodb_bug47777.test
branches/5.1: Fix bug #47777. Treat the Geometry data same as
Binary BLOB in ha_innobase::store_key_val_for_row(), since the
Geometry data is stored as Binary BLOB in Innodb.
Review: rb://180 approved by Marko Makela.
------------------------------------------------------------------------
r6051 | sunny | 2009-10-12 07:05:00 +0300 (Mon, 12 Oct 2009) | 6 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Ignore negative values supplied by the user when calculating the
next value to store in dict_table_t. Setting autoincrement columns to negative
values is undefined behavior and this change should bring the behavior of
InnoDB closer to what users expect. Added several tests to check.
rb://162
------------------------------------------------------------------------
r6052 | sunny | 2009-10-12 07:09:56 +0300 (Mon, 12 Oct 2009) | 4 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Reset the statement level autoinc counter on ROLLBACK. Fix
the test results too.
rb://164
------------------------------------------------------------------------
r6053 | sunny | 2009-10-12 07:37:49 +0300 (Mon, 12 Oct 2009) | 6 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Copy the maximum AUTOINC value from the old table to the new
table when MySQL does a CREATE INDEX ON T. This is required because MySQL
does a table copy, rename and drops the old table.
Fix Bug#47125: auto_increment start value is ignored if an index is created and engine=innodb
rb://168
------------------------------------------------------------------------
r6076 | vasil | 2009-10-14 19:30:12 +0300 (Wed, 14 Oct 2009) | 4 lines
Changed paths:
M /branches/5.1/row/row0mysql.c
branches/5.1:
Fix typo.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6114 | vasil | 2009-10-30 03:43:51 +1100 (Fri, 30 Oct 2009) | 6 lines
branches/zip:
* Add ChangeLog entries for latest changes
* Obey alphabetical order in the list of the files
* White-space fixup
------------------------------------------------------------------------
r6121 | sunny | 2009-10-30 10:42:11 +1100 (Fri, 30 Oct 2009) | 7 lines
branches/zip: This test has been problematic for some time now. The underlying
bug is that the data dictionaries get out of sync. In the AUTOINC code we
try and apply salve to the symptoms. In the past MySQL made some unrelated
change and the dictionaries stopped getting out of sync and this test started
to fail. Now, it seems they have reverted that change and the test is
passing again. I suspect this is not the last time that this test will change.
------------------------------------------------------------------------
r6124 | jyang | 2009-10-30 19:02:31 +1100 (Fri, 30 Oct 2009) | 5 lines
branches/zip: Correct the bug number for -r6109 change
from # 48273 to #48237
------------------------------------------------------------------------
r6126 | vasil | 2009-10-30 19:36:07 +1100 (Fri, 30 Oct 2009) | 45 lines
branches/zip: Merge r6112:6125 from branches/5.1:
(skipping r6122 and r6123, Jimmy says these are already present and need
not be merged):
------------------------------------------------------------------------
r6122 | jyang | 2009-10-30 05:18:38 +0200 (Fri, 30 Oct 2009) | 7 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb_bug44369.result
M /branches/5.1/mysql-test/innodb_bug44369.test
M /branches/5.1/mysql-test/innodb_bug46000.result
M /branches/5.1/mysql-test/innodb_bug46000.test
branches/5.1: Change WARN_LEVEL_ERROR to WARN_LEVEL_WARN
for push_warning_printf() call in innodb.
Fix Bug#47233: Innodb calls push_warning(MYSQL_ERROR::WARN_LEVEL_ERROR)
rb://170 approved by Marko.
------------------------------------------------------------------------
r6123 | jyang | 2009-10-30 05:43:06 +0200 (Fri, 30 Oct 2009) | 8 lines
Changed paths:
M /branches/5.1/os/os0proc.c
branches/5.1: In os_mem_alloc_large(), if we fail to attach
the shared memory, reset memory pointer ptr to NULL, and
allocate memory from conventional pool. This is a port
from branches/zip.
Bug #48237 Error handling in os_mem_alloc_large appears to be incorrect
rb://198 Approved by: Marko
------------------------------------------------------------------------
r6125 | vasil | 2009-10-30 10:31:23 +0200 (Fri, 30 Oct 2009) | 4 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
White-space fixup.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6130 | marko | 2009-11-02 20:42:56 +1100 (Mon, 02 Nov 2009) | 9 lines
branches/zip: Free all resources at shutdown. Set pointers to NULL, so
that Valgrind will not complain about freed data structures that are
reachable via pointers. This addresses Bug #45992 and Bug #46656.
This patch is mostly based on changes copied from branches/embedded-1.0,
mainly c5432, c3439, c3134, c2994, c2978, but also some other code was
copied. Some added cleanup code is specific to MySQL/InnoDB.
rb://199 approved by Sunny Bains
------------------------------------------------------------------------
branches/innodb+: Merge revisions r5971:6130 from branches/zip.
------------------------------------------------------------------------
r5971 | marko | 2009-09-23 23:03:51 +1000 (Wed, 23 Sep 2009) | 2 lines
branches/zip: os_file_pwrite(): Make the code compile in InnoDB Hot Backup
when the pwrite system call is not available.
------------------------------------------------------------------------
r5972 | marko | 2009-09-24 05:44:52 +1000 (Thu, 24 Sep 2009) | 5 lines
branches/zip: fil_node_open_file(): In InnoDB Hot Backup,
determine the page size of single-file tablespaces before computing
the file node size. Otherwise, the space->size of compressed tablespaces
would be computed with UNIV_PAGE_SIZE instead of key_block_size.
This should fix Issue #313.
------------------------------------------------------------------------
r5973 | marko | 2009-09-24 05:53:21 +1000 (Thu, 24 Sep 2009) | 2 lines
branches/zip: recv_add_to_hash_table():
Simplify obfuscated pointer arithmetics.
------------------------------------------------------------------------
r5978 | marko | 2009-09-24 17:47:56 +1000 (Thu, 24 Sep 2009) | 1 line
branches/zip: Fix warnings and errors when UNIV_HOTBACKUP is defined.
------------------------------------------------------------------------
r5979 | marko | 2009-09-24 20:16:10 +1000 (Thu, 24 Sep 2009) | 4 lines
branches/zip: ha_innodb.cc: Define MYSQL_PLUGIN_IMPORT when necessary.
This preprocessor symbol has been recently introduced in MySQL 5.1.
The InnoDB Plugin should remain source compatible with MySQL 5.1.24
and later.
------------------------------------------------------------------------
r5988 | calvin | 2009-09-26 05:14:43 +1000 (Sat, 26 Sep 2009) | 8 lines
branches/zip: fix bug#47055 unconditional exit(1) on ERROR_WORKING_SET_QUOTA
1453 (0x5AD) for InnoDB backend
When error ERROR_WORKING_SET_QUOTA or ERROR_NO_SYSTEM_RESOURCES
occurs, yields for 100ms and retries the operation.
Approved by: Heikki (on IM)
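
The yield-and-retry behavior described above can be sketched in portable C.
This is only an illustration, not the actual patch: the result codes, the
retry_io() and flaky_op() names and the max_retries bound are all assumptions
made for this sketch (the commit message does not say whether the retries are
bounded), and nanosleep() stands in for whatever the Windows code uses.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical result codes for this sketch. IO_TRANSIENT stands in for
   errors such as ERROR_WORKING_SET_QUOTA or ERROR_NO_SYSTEM_RESOURCES. */
enum io_result { IO_OK, IO_TRANSIENT, IO_FATAL };

typedef enum io_result (*io_op_t)(void* arg);

/* Run op(arg). On a transient failure, sleep for 100 ms and try again,
   at most max_retries additional times (the bound is an assumption). */
static enum io_result
retry_io(io_op_t op, void* arg, int max_retries)
{
    struct timespec delay = { 0, 100L * 1000L * 1000L }; /* 100 ms */
    int             i;

    assert(op != NULL);

    for (i = 0;; i++) {
        enum io_result r = op(arg);

        if (r != IO_TRANSIENT || i >= max_retries) {
            return r;
        }

        nanosleep(&delay, NULL);
    }
}

/* Example operation: fails transiently *(int*) arg times, then succeeds. */
static enum io_result
flaky_op(void* arg)
{
    int* failures_left = arg;

    if (*failures_left > 0) {
        (*failures_left)--;
        return IO_TRANSIENT;
    }

    return IO_OK;
}
```

With a counter of 2 and max_retries of 5, retry_io(flaky_op, &counter, 5)
sleeps twice and then reports IO_OK.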
------------------------------------------------------------------------
r5992 | vasil | 2009-09-28 17:10:29 +1000 (Mon, 28 Sep 2009) | 4 lines
branches/zip:
Add ChangeLog entry for c5988.
------------------------------------------------------------------------
r5994 | marko | 2009-09-28 18:33:59 +1000 (Mon, 28 Sep 2009) | 17 lines
branches/zip: Try to prevent the reuse of tablespace identifiers after
InnoDB has crashed during table creation. Also, refuse to start if
files with duplicate tablespace identifiers are encountered.
fil_node_create(): Update fil_system->max_assigned_id. This should
prevent the reuse of a space->id when InnoDB does a full crash
recovery and invokes fil_load_single_table_tablespaces(). Normally,
fil_system->max_assigned_id is initialized from
SELECT MAX(ID) FROM SYS_TABLES.
fil_open_single_table_tablespace(): Return FALSE when
fil_space_create() fails.
fil_load_single_table_tablespace(): Exit if fil_space_create() fails
and innodb_force_recovery=0.
rb://173 approved by Heikki Tuuri. This addresses Issue #335.
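
The idea of keeping max_assigned_id up to date while rejecting duplicate
identifiers can be sketched as follows. The space_registry structure, the
function names and the fixed capacity are hypothetical and only serve this
illustration; the real logic lives in fil_node_create() and related functions.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SPACES 64 /* arbitrary capacity for this sketch */

struct space_registry {
    unsigned long ids[MAX_SPACES];
    size_t        n;
    unsigned long max_assigned_id;
};

/* Register a tablespace id found on disk.
   Returns 0 on success, -1 if the id is a duplicate or the table is full. */
static int
registry_add(struct space_registry* reg, unsigned long id)
{
    size_t i;

    assert(reg != NULL);

    for (i = 0; i < reg->n; i++) {
        if (reg->ids[i] == id) {
            return -1; /* duplicate tablespace identifier */
        }
    }

    if (reg->n == MAX_SPACES) {
        return -1;
    }

    reg->ids[reg->n++] = id;

    /* Remember the largest id ever seen, so that it is never handed
       out again, even if the dictionary does not know about it. */
    if (id > reg->max_assigned_id) {
        reg->max_assigned_id = id;
    }

    return 0;
}

/* Hand out a fresh id that is larger than every id seen so far. */
static unsigned long
registry_new_id(struct space_registry* reg)
{
    assert(reg != NULL);

    return ++reg->max_assigned_id;
}
```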
------------------------------------------------------------------------
r5995 | marko | 2009-09-28 18:52:25 +1000 (Mon, 28 Sep 2009) | 17 lines
branches/zip: Do not write to PAGE_INDEX_ID after page creation,
not even when restoring an uncompressed page after a compression failure.
btr_page_reorganize_low(): On compression failure, do not restore
those page header fields that should not be affected by the
reorganization. Instead, compare the fields.
page_zip_decompress(): Add the parameter ibool all, for copying all
page header fields. Pass the parameter all=TRUE on block read
completion, redo log application, and page_zip_validate(); pass
all=FALSE in all other cases.
page_zip_reorganize(): Do not restore the uncompressed page on
failure. It will be restored (to pre-modification state) by the
caller anyway.
rb://167, Issue #346
------------------------------------------------------------------------
r5996 | marko | 2009-09-28 22:46:02 +1000 (Mon, 28 Sep 2009) | 4 lines
branches/zip: Address Issue #350 in comments.
lock_rec_queue_validate(), lock_rec_queue_validate(): Note that
this debug code may violate the latching order and cause deadlocks.
------------------------------------------------------------------------
r5997 | marko | 2009-09-28 23:03:58 +1000 (Mon, 28 Sep 2009) | 12 lines
branches/zip: Remove an assertion failure when the InnoDB data dictionary
is inconsistent with the MySQL .frm file.
ha_innobase::index_read(): When the index cannot be found,
return an error.
ha_innobase::change_active_index(): When prebuilt->index == NULL,
set also prebuilt->index_usable = FALSE. This is not needed for
correctness, because prebuilt->index_usable is only checked by
row_search_for_mysql(), which requires prebuilt->index != NULL.
This addresses Issue #349. Approved by Heikki Tuuri over IM.
------------------------------------------------------------------------
r6005 | vasil | 2009-09-29 18:09:52 +1000 (Tue, 29 Sep 2009) | 4 lines
branches/zip:
ChangeLog: wrap around 78th column, not earlier.
------------------------------------------------------------------------
r6006 | vasil | 2009-09-29 20:15:25 +1000 (Tue, 29 Sep 2009) | 4 lines
branches/zip:
Add ChangeLog entry for the release of 1.0.4.
------------------------------------------------------------------------
r6007 | vasil | 2009-09-29 23:19:59 +1000 (Tue, 29 Sep 2009) | 6 lines
branches/zip:
Fix the year, should be 2009.
Pointed by: Calvin
------------------------------------------------------------------------
r6026 | marko | 2009-09-30 17:18:24 +1000 (Wed, 30 Sep 2009) | 1 line
branches/zip: Add some debug assertions for checking FSEG_MAGIC_N.
------------------------------------------------------------------------
r6028 | marko | 2009-09-30 23:55:23 +1000 (Wed, 30 Sep 2009) | 3 lines
branches/zip: recv_no_log_write: New debug flag for tracking down
Mantis Issue #347. No modifications should be made to the database
while recv_apply_hashed_log_recs() is about to complete.
------------------------------------------------------------------------
r6029 | calvin | 2009-10-01 06:32:02 +1000 (Thu, 01 Oct 2009) | 4 lines
branches/zip: non-functional changes
Fix typo.
------------------------------------------------------------------------
r6031 | marko | 2009-10-01 21:24:33 +1000 (Thu, 01 Oct 2009) | 49 lines
branches/zip: Clean up after a crash during DROP INDEX.
When InnoDB crashes while dropping an index, ensure that
the index will be completely dropped during crash recovery.
row_merge_drop_index(): Before dropping an index, rename the index to
start with TEMP_INDEX_PREFIX_STR and commit the change, so that
row_merge_drop_temp_indexes() will drop the index after crash
recovery if the server crashes while dropping the index.
fseg_inode_try_get(): New function, forked from fseg_inode_get().
Return NULL if the file segment index node is free.
fseg_inode_get(): Assert that the file segment index node is not free.
fseg_free_step(): If the file segment index node is already free,
print a diagnostic message and return TRUE.
fsp_free_seg_inode(): Write a nonzero number to FSEG_MAGIC_N, so that
allocated-and-freed file segment index nodes can be better
distinguished from uninitialized ones.
This is rb://174, addressing Issue #348.
Tested by restarting mysqld upon the completion of the added
log_write_up_to() invocation below, during DROP INDEX. The index was
dropped after crash recovery, and re-issuing the DROP INDEX did not
crash the server.
Index: btr/btr0btr.c
===================================================================
--- btr/btr0btr.c (revision 6026)
+++ btr/btr0btr.c (working copy)
@@ -42,6 +42,7 @@ Created 6/2/1994 Heikki Tuuri
#include "ibuf0ibuf.h"
#include "trx0trx.h"
+#include "log0log.h"
/*
Latching strategy of the InnoDB B-tree
--------------------------------------
@@ -873,6 +874,8 @@ leaf_loop:
goto leaf_loop;
}
+
+ log_write_up_to(mtr.end_lsn, LOG_WAIT_ALL_GROUPS, TRUE);
top_loop:
mtr_start(&mtr);
------------------------------------------------------------------------
r6033 | calvin | 2009-10-02 06:19:46 +1000 (Fri, 02 Oct 2009) | 4 lines
branches/zip: fix a typo in error message
Reported as bug#47763.
------------------------------------------------------------------------
r6043 | inaam | 2009-10-06 01:45:35 +1100 (Tue, 06 Oct 2009) | 12 lines
branches/zip rb://176
Do not invalidate buffer pool while an LRU batch is active. Added
code to buf_pool_invalidate() to wait for the running batches to finish.
This patch also resets the state of buf_pool struct at invalidation. This
addresses the concern where buf_pool->freed_page_clock becomes non-zero
because we read in a system tablespace page for file format info at
startup.
Approved by: Marko
------------------------------------------------------------------------
r6044 | pekka | 2009-10-07 01:44:54 +1100 (Wed, 07 Oct 2009) | 5 lines
branches/zip:
Add os_file_is_same() function for Hot Backup (inside ifdef UNIV_HOTBACKUP).
This is part of the fix for Issue #186.
Note! The Windows implementation is incomplete.
------------------------------------------------------------------------
r6046 | pekka | 2009-10-08 20:24:56 +1100 (Thu, 08 Oct 2009) | 3 lines
branches/zip: Revert r6044 which added os_file_is_same() function
(issue#186). This functionality is moved to Hot Backup source tree.
------------------------------------------------------------------------
r6048 | vasil | 2009-10-09 16:42:55 +1100 (Fri, 09 Oct 2009) | 16 lines
branches/zip:
When scanning a directory, readdir() is called and then stat(). If a
file is deleted between the two calls, stat() fails and the whole
procedure fails. Change this behavior to continue with the next entry
if stat() fails because the file does not exist. This is a transparent
change, as it makes it look as if the file had been deleted before the
readdir() call.
This change is needed in order to fix
https://svn.innodb.com/mantis/view.php?id=174
in which we need to abort if os_file_readdir_next_file()
encounters "real" errors.
Approved by: Marko, Pekka (rb://177)
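
A minimal sketch of the "skip entries that vanished" behavior, assuming a
POSIX system. count_existing_entries() is a hypothetical helper written for
this illustration and is not os_file_readdir_next_file(); it also does not
distinguish a readdir() error from the end of the directory.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Count the entries of dirname that can still be stat()ed.
   Returns -1 on a "real" error, otherwise the number of entries. */
static int
count_existing_entries(const char* dirname)
{
    DIR*           dir;
    struct dirent* ent;
    int            n = 0;

    assert(dirname != NULL);

    dir = opendir(dirname);

    if (dir == NULL) {
        return -1;
    }

    while ((ent = readdir(dir)) != NULL) {
        char        path[4096];
        struct stat st;

        if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, "..")) {
            continue;
        }

        snprintf(path, sizeof path, "%s/%s", dirname, ent->d_name);

        if (stat(path, &st) != 0) {
            if (errno == ENOENT) {
                /* Deleted between readdir() and stat(): behave as
                   if the entry had never been listed. */
                continue;
            }

            closedir(dir);
            return -1; /* any other failure is a real error */
        }

        n++;
    }

    closedir(dir);
    return n;
}
```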
------------------------------------------------------------------------
r6049 | vasil | 2009-10-10 03:05:26 +1100 (Sat, 10 Oct 2009) | 7 lines
branches/zip:
Fix compilation warning in Hot Backup:
innodb/fil/fil0fil.c: In function 'fil_load_single_table_tablespace':
innodb/fil/fil0fil.c:3253: warning: format '%lld' expects type 'long long int', but argument 6 has type 'ib_int64_t'
------------------------------------------------------------------------
r6064 | calvin | 2009-10-14 02:23:35 +1100 (Wed, 14 Oct 2009) | 4 lines
branches/zip: non-functional changes
Changes from MySQL to fix build issue.
------------------------------------------------------------------------
r6065 | inaam | 2009-10-14 04:43:13 +1100 (Wed, 14 Oct 2009) | 7 lines
branches/zip rb://182
Call fsync() on datafiles after a batch of pages is written to disk
even when skip_innodb_doublewrite is set.
Approved by: Heikki
------------------------------------------------------------------------
r6080 | sunny | 2009-10-15 09:29:01 +1100 (Thu, 15 Oct 2009) | 3 lines
branches/zip: Change page_mem_alloc_free() to inline.
Fix Bug #47058 - Failure to compile innodb_plugin on solaris 10u7 + spro cc/CC 5.10
------------------------------------------------------------------------
r6084 | vasil | 2009-10-15 16:21:17 +1100 (Thu, 15 Oct 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r6080.
------------------------------------------------------------------------
r6095 | vasil | 2009-10-20 00:04:59 +1100 (Tue, 20 Oct 2009) | 7 lines
branches/zip:
Fix Bug#47808 innodb_information_schema.test fails when run under valgrind
by using the wait_until_rows_count macro that loops until the number of
rows becomes 14 instead of sleep 0.1, which is obviously very fragile.
------------------------------------------------------------------------
r6096 | vasil | 2009-10-20 00:06:09 +1100 (Tue, 20 Oct 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r6095.
------------------------------------------------------------------------
r6099 | jyang | 2009-10-22 13:58:39 +1100 (Thu, 22 Oct 2009) | 7 lines
branches/zip: Port bug #46000 related changes from 5.1 to zip
branch. Due to the different code path for creating an index in the zip
branch compared to 5.1, the index reserved name check function
is extended to be used in ha_innobase::add_index().
rb://190 Approved by: Marko
------------------------------------------------------------------------
r6100 | jyang | 2009-10-22 14:51:07 +1100 (Thu, 22 Oct 2009) | 6 lines
branches/zip: As a request from mysql, WARN_LEVEL_ERROR cannot
be used for push_warning_* call any more. Switch to
WARN_LEVEL_WARN. Bug #47233.
rb://172 approved by Sunny Bains and Marko.
------------------------------------------------------------------------
r6101 | jyang | 2009-10-23 19:45:50 +1100 (Fri, 23 Oct 2009) | 7 lines
branches/zip: Update test result with the WARN_LEVEL_ERROR
to WARN_LEVEL_WARN change. This is the same result as
submitted in the rb://172 review, which was approved by Sunny Bains
and Marko.
------------------------------------------------------------------------
r6102 | marko | 2009-10-26 18:32:23 +1100 (Mon, 26 Oct 2009) | 1 line
branches/zip: row_prebuilt_struct::prebuilts: Unused field, remove.
------------------------------------------------------------------------
r6103 | marko | 2009-10-27 00:46:18 +1100 (Tue, 27 Oct 2009) | 4 lines
branches/zip: row_ins_alloc_sys_fields(): Zero out the system columns
DB_TRX_ID, DB_ROLL_PTR and DB_ROW_ID, in order to avoid harmless
Valgrind warnings about uninitialized data. (The warnings were
harmless, because the fields would be initialized at a later stage.)
------------------------------------------------------------------------
r6105 | calvin | 2009-10-28 09:05:52 +1100 (Wed, 28 Oct 2009) | 6 lines
branches/zip: backport r3848 from 6.0 branch
----
branches/6.0: innobase_start_or_create_for_mysql(): Make the 10 MB
minimum tablespace limit independent of UNIV_PAGE_SIZE. (Bug #41490)
------------------------------------------------------------------------
r6107 | marko | 2009-10-29 01:10:34 +1100 (Thu, 29 Oct 2009) | 5 lines
branches/zip: buf_page_set_old(): Improve UNIV_LRU_DEBUG diagnostics
in order to catch the buf_pool->LRU_old corruption reported in Issue #381.
buf_LRU_old_init(): Set the property from the tail towards the front
of the buf_pool->LRU list, in order not to trip the debug check.
------------------------------------------------------------------------
r6108 | calvin | 2009-10-29 16:58:04 +1100 (Thu, 29 Oct 2009) | 5 lines
branches/zip: close file handle when building with UNIV_HOTBACKUP
The change does not affect regular InnoDB engine. Confirmed by
Marko.
------------------------------------------------------------------------
r6109 | jyang | 2009-10-29 19:37:32 +1100 (Thu, 29 Oct 2009) | 7 lines
branches/zip: In os_mem_alloc_large(), if we fail to attach
the shared memory, reset memory pointer ptr to NULL, and
allocate memory from conventional pool.
Bug #48237 Error handling in os_mem_alloc_large appears to be incorrect
rb://198 Approved by: Marko
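
The error handling described above can be illustrated with a small sketch.
It assumes an attach function that, like shmat(), reports failure by
returning (void*) -1 rather than NULL. The attach_fn_t type and both function
names are hypothetical and exist only for this example; this is not the code
of os_mem_alloc_large().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical attach function: returns (void*) -1 on failure, as shmat() does. */
typedef void* (*attach_fn_t)(size_t n);

/* Try the large allocation first. If attaching fails, reset the pointer
   to NULL and fall back to the conventional allocator. */
static void*
alloc_with_fallback(attach_fn_t attach, size_t n, int* used_large)
{
    void* ptr;

    assert(attach != NULL);
    assert(used_large != NULL);

    ptr = attach(n);

    if (ptr == (void*) -1) {
        ptr = NULL; /* do not let the failure value escape as a pointer */
    }

    *used_large = (ptr != NULL);

    if (ptr == NULL) {
        ptr = malloc(n); /* conventional pool */
    }

    return ptr;
}

/* Example attach function that always fails. */
static void*
failing_attach(size_t n)
{
    (void) n;
    return (void*) -1;
}
```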
------------------------------------------------------------------------
r6110 | marko | 2009-10-29 21:44:57 +1100 (Thu, 29 Oct 2009) | 2 lines
branches/zip: Makefile.am (INCLUDES): Merge a change from MySQL:
Use $(srcdir)/include instead of $(top_srcdir)/storage/innobase/include.
------------------------------------------------------------------------
r6111 | marko | 2009-10-29 22:04:11 +1100 (Thu, 29 Oct 2009) | 33 lines
branches/zip: Fix corruption of buf_pool->LRU_old and improve debug assertions.
This was reported as Issue #381.
buf_page_set_old(): Assert that blocks may only be set old if
buf_pool->LRU_old is initialized and buf_pool->LRU_old_len is nonzero.
Assert that buf_pool->LRU_old points to the block at the old/new boundary.
buf_LRU_old_adjust_len(): Invoke buf_page_set_old() after adjusting
buf_pool->LRU_old and buf_pool->LRU_old_len, in order not to violate
the added assertions.
buf_LRU_old_init(): Replace buf_page_set_old() with a direct
assignment to bpage->old, because these loops that initialize all the
blocks would temporarily violate the assertions about
buf_pool->LRU_old.
buf_LRU_remove_block(): When setting buf_pool->LRU_old = NULL, also
clear all bpage->old flags and set buf_pool->LRU_old_len = 0.
buf_LRU_add_block_to_end_low(), buf_LRU_add_block_low(): Move the
buf_page_set_old() call later in order not to violate the debug
assertions. If buf_pool->LRU_old is NULL, set old=FALSE.
buf_LRU_free_block(): Replace the UNIV_LRU_DEBUG assertion with a
dummy buf_page_set_old() call that performs more thorough checks.
buf_LRU_validate(): Do not tolerate garbage in buf_pool->LRU_old_len
even if buf_pool->LRU_old is NULL. Check that bpage->old is monotonic.
buf_relocate(): Make the UNIV_LRU_DEBUG checks stricter.
buf0buf.h: Revise the documentation of buf_page_t::old and
buf_pool_t::LRU_old_len.
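
The invariants listed above can be sketched on a simplified singly linked
list. This is an illustration only: struct page and both functions are
hypothetical, and the sketch assumes that the LRU_old pointer designates the
first old block when walking from the front of the list.

```c
#include <assert.h>
#include <stddef.h>

struct page {
    int          old;  /* nonzero if the block is in the old sublist */
    struct page* next; /* towards the tail of the LRU list */
};

/* Return 1 if the list satisfies the invariants, else 0:
   - the old flag is monotonic (once set, it stays set until the tail),
   - lru_old points to the first old block, or is NULL if there is none,
   - lru_old_len equals the number of old blocks, even when lru_old is NULL. */
static int
lru_old_is_consistent(const struct page* first,
                      const struct page* lru_old,
                      size_t             lru_old_len)
{
    const struct page* p;
    size_t             old_count = 0;
    int                seen_old = 0;

    for (p = first; p != NULL; p = p->next) {
        if (p->old) {
            if (!seen_old) {
                if (p != lru_old) {
                    return 0; /* boundary pointer is wrong */
                }
                seen_old = 1;
            }
            old_count++;
        } else if (seen_old) {
            return 0; /* a new block after an old one */
        }
    }

    if (!seen_old && lru_old != NULL) {
        return 0;
    }

    return old_count == lru_old_len;
}

/* Debug-style wrapper, in the spirit of a UNIV_LRU_DEBUG check. */
static void
lru_validate(const struct page* first,
             const struct page* lru_old,
             size_t             lru_old_len)
{
    assert(lru_old_is_consistent(first, lru_old, lru_old_len));
}
```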
------------------------------------------------------------------------
r6112 | calvin | 2009-10-30 01:21:15 +1100 (Fri, 30 Oct 2009) | 4 lines
branches/zip: consideration for icc compilers
Proposed by MySQL, and approved by Marko.
------------------------------------------------------------------------
r6113 | vasil | 2009-10-30 03:15:50 +1100 (Fri, 30 Oct 2009) | 93 lines
branches/zip: Merge r5912:6112 from branches/5.1:
(after this merge the innodb-autoinc test starts to fail, but
I commit anyway because it would be easier to investigate the
failure this way)
------------------------------------------------------------------------
r5952 | calvin | 2009-09-22 19:45:07 +0300 (Tue, 22 Sep 2009) | 7 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: fix bug#42383: Can't create table 'test.bug39438'
For embedded server, MySQL may pass in full path, which is
currently disallowed. It is needed to relax the condition by
accepting full paths in the embedded case.
Approved by: Heikki (on IM)
------------------------------------------------------------------------
r6032 | vasil | 2009-10-01 15:55:49 +0300 (Thu, 01 Oct 2009) | 8 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Fix Bug#38996 Race condition in ANALYZE TABLE
by serializing ANALYZE TABLE inside InnoDB.
Approved by: Heikki (rb://175)
------------------------------------------------------------------------
r6045 | jyang | 2009-10-08 02:27:08 +0300 (Thu, 08 Oct 2009) | 7 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
A /branches/5.1/mysql-test/innodb_bug47777.result
A /branches/5.1/mysql-test/innodb_bug47777.test
branches/5.1: Fix bug #47777. Treat the Geometry data same as
Binary BLOB in ha_innobase::store_key_val_for_row(), since the
Geometry data is stored as Binary BLOB in Innodb.
Review: rb://180 approved by Marko Makela.
------------------------------------------------------------------------
r6051 | sunny | 2009-10-12 07:05:00 +0300 (Mon, 12 Oct 2009) | 6 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Ignore negative values supplied by the user when calculating the
next value to store in dict_table_t. Setting autoincrement columns to negative
values is undefined behavior and this change should bring the behavior of
InnoDB closer to what users expect. Added several tests to check.
rb://162
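
A rough sketch of ignoring negative user-supplied values when computing the
next counter value. The function name, the signature and the fixed increment
of 1 are assumptions made for this illustration; the actual change is in
ha_innodb.cc and handles more cases.

```c
#include <assert.h>
#include <stdint.h>

/* Return the next autoincrement value to remember for the table, given
   the current next value and a value that the user stored in the column.
   Assumes an increment of 1 and that the counter starts at 1. */
static uint64_t
autoinc_update(uint64_t current_next, int64_t user_value)
{
    assert(current_next >= 1);

    if (user_value < 0) {
        /* Negative values do not influence the counter. */
        return current_next;
    }

    if ((uint64_t) user_value >= current_next) {
        return (uint64_t) user_value + 1;
    }

    return current_next;
}
```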
------------------------------------------------------------------------
r6052 | sunny | 2009-10-12 07:09:56 +0300 (Mon, 12 Oct 2009) | 4 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Reset the statement level autoinc counter on ROLLBACK. Fix
the test results too.
rb://164
------------------------------------------------------------------------
r6053 | sunny | 2009-10-12 07:37:49 +0300 (Mon, 12 Oct 2009) | 6 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Copy the maximum AUTOINC value from the old table to the new
table when MySQL does a CREATE INDEX ON T. This is required because MySQL
does a table copy, rename and drops the old table.
Fix Bug#47125: auto_increment start value is ignored if an index is created and engine=innodb
rb://168
------------------------------------------------------------------------
r6076 | vasil | 2009-10-14 19:30:12 +0300 (Wed, 14 Oct 2009) | 4 lines
Changed paths:
M /branches/5.1/row/row0mysql.c
branches/5.1:
Fix typo.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6114 | vasil | 2009-10-30 03:43:51 +1100 (Fri, 30 Oct 2009) | 6 lines
branches/zip:
* Add ChangeLog entries for latest changes
* Obey alphabetical order in the list of the files
* White-space fixup
------------------------------------------------------------------------
r6121 | sunny | 2009-10-30 10:42:11 +1100 (Fri, 30 Oct 2009) | 7 lines
branches/zip: This test has been problematic for some time now. The underlying
bug is that the data dictionaries get out of sync. In the AUTOINC code we
try and apply salve to the symptoms. In the past MySQL made some unrelated
change and the dictionaries stopped getting out of sync and this test started
to fail. Now, it seems they have reverted that change and the test is
passing again. I suspect this is not the last time that this test will change.
------------------------------------------------------------------------
r6124 | jyang | 2009-10-30 19:02:31 +1100 (Fri, 30 Oct 2009) | 5 lines
branches/zip: Correct the bug number for -r6109 change
from # 48273 to #48237
------------------------------------------------------------------------
r6126 | vasil | 2009-10-30 19:36:07 +1100 (Fri, 30 Oct 2009) | 45 lines
branches/zip: Merge r6112:6125 from branches/5.1:
(skipping r6122 and r6123, Jimmy says these are already present and need
not be merged):
------------------------------------------------------------------------
r6122 | jyang | 2009-10-30 05:18:38 +0200 (Fri, 30 Oct 2009) | 7 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb_bug44369.result
M /branches/5.1/mysql-test/innodb_bug44369.test
M /branches/5.1/mysql-test/innodb_bug46000.result
M /branches/5.1/mysql-test/innodb_bug46000.test
branches/5.1: Change WARN_LEVEL_ERROR to WARN_LEVEL_WARN
for push_warning_printf() call in innodb.
Fix Bug#47233: Innodb calls push_warning(MYSQL_ERROR::WARN_LEVEL_ERROR)
rb://170 approved by Marko.
------------------------------------------------------------------------
r6123 | jyang | 2009-10-30 05:43:06 +0200 (Fri, 30 Oct 2009) | 8 lines
Changed paths:
M /branches/5.1/os/os0proc.c
branches/5.1: In os_mem_alloc_large(), if we fail to attach
the shared memory, reset memory pointer ptr to NULL, and
allocate memory from conventional pool. This is a port
from branches/zip.
Bug #48237 Error handling in os_mem_alloc_large appears to be incorrect
rb://198 Approved by: Marko
------------------------------------------------------------------------
r6125 | vasil | 2009-10-30 10:31:23 +0200 (Fri, 30 Oct 2009) | 4 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
White-space fixup.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6130 | marko | 2009-11-02 20:42:56 +1100 (Mon, 02 Nov 2009) | 9 lines
branches/zip: Free all resources at shutdown. Set pointers to NULL, so
that Valgrind will not complain about freed data structures that are
reachable via pointers. This addresses Bug #45992 and Bug #46656.
This patch is mostly based on changes copied from branches/embedded-1.0,
mainly c5432, c3439, c3134, c2994, c2978, but also some other code was
copied. Some added cleanup code is specific to MySQL/InnoDB.
rb://199 approved by Sunny Bains
------------------------------------------------------------------------
branches/innodb+: Merge revisions 5144:5524 from branches/zip
------------------------------------------------------------------------
r5147 | marko | 2009-05-27 06:55:14 -0400 (Wed, 27 May 2009) | 1 line
branches/zip: ibuf0ibuf.c: Improve a comment.
------------------------------------------------------------------------
r5149 | marko | 2009-05-27 07:46:42 -0400 (Wed, 27 May 2009) | 34 lines
branches/zip: Merge revisions 4994:5148 from branches/5.1:
------------------------------------------------------------------------
r5126 | vasil | 2009-05-26 16:57:12 +0300 (Tue, 26 May 2009) | 9 lines
branches/5.1:
Preparation for the fix of
Bug#45097 Hang during recovery, redo logs for doublewrite buffer pages
Non-functional change: move FSP_* macros from fsp0fsp.h to a new file
fsp0types.h. This is needed in order to be able to use FSP_EXTENT_SIZE
in mtr0log.ic.
------------------------------------------------------------------------
r5127 | vasil | 2009-05-26 17:05:43 +0300 (Tue, 26 May 2009) | 9 lines
branches/5.1:
Preparation for the fix of
Bug#45097 Hang during recovery, redo logs for doublewrite buffer pages
Do not include unnecessary headers mtr0log.h and fut0lst.h in trx0sys.h
and include fsp0fsp.h just before it is needed. This is needed in order
to be able to use TRX_SYS_SPACE in mtr0log.ic.
------------------------------------------------------------------------
r5128 | vasil | 2009-05-26 17:26:37 +0300 (Tue, 26 May 2009) | 7 lines
branches/5.1:
Fix Bug#45097 Hang during recovery, redo logs for doublewrite buffer pages
Do not write redo log for the pages in the doublewrite buffer. Also, do not
make a dummy change to the page because this is not needed.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5169 | marko | 2009-05-28 03:21:55 -0400 (Thu, 28 May 2009) | 1 line
branches/zip: mtr0mtr.h: Add Doxygen comments for the redo log entry types.
------------------------------------------------------------------------
r5176 | marko | 2009-05-28 07:14:02 -0400 (Thu, 28 May 2009) | 1 line
branches/zip: Correct a debug assertion that was added in r5125.
------------------------------------------------------------------------
r5201 | marko | 2009-06-01 06:35:25 -0400 (Mon, 01 Jun 2009) | 2 lines
branches/zip: Clean up some comments.
Make the rec parameter of mlog_open_and_write_index() const.
------------------------------------------------------------------------
r5234 | marko | 2009-06-03 08:26:41 -0400 (Wed, 03 Jun 2009) | 44 lines
branches/zip: Merge revisions 5148:5233 from branches/5.1:
------------------------------------------------------------------------
r5150 | vasil | 2009-05-27 18:56:03 +0300 (Wed, 27 May 2009) | 4 lines
branches/5.1:
Whitespace fixup.
------------------------------------------------------------------------
r5191 | vasil | 2009-05-30 17:46:05 +0300 (Sat, 30 May 2009) | 19 lines
branches/5.1:
Merge a change from MySQL (this fixes the failing innodb_mysql test):
------------------------------------------------------------
revno: 1810.3894.10
committer: Sergey Glukhov <Sergey.Glukhov@sun.com>
branch nick: mysql-5.0-bugteam
timestamp: Tue 2009-05-19 11:32:21 +0500
message:
Bug#39793 Foreign keys not constructed when column has a '#' in a comment or default value
The internal InnoDB FK parser does not recognize '\'' as a quotation symbol.
The suggested fix is to add a '\'' symbol check to the quotation condition
(dict_strip_comments() function).
modified:
innobase/dict/dict0dict.c
mysql-test/r/innodb_mysql.result
mysql-test/t/innodb_mysql.test
------------------------------------------------------------------------
r5233 | marko | 2009-06-03 15:12:44 +0300 (Wed, 03 Jun 2009) | 11 lines
branches/5.1: Merge the test case from r5232 from branches/5.0:
------------------------------------------------------------------------
r5232 | marko | 2009-06-03 14:31:04 +0300 (Wed, 03 Jun 2009) | 21 lines
branches/5.0: Merge r3590 from branches/5.1 in order to fix Bug #40565
(Update Query Results in "1 Row Affected" But Should Be "Zero Rows").
Also, add a test case for Bug #40565.
rb://128 approved by Heikki Tuuri
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
r5250 | marko | 2009-06-04 02:58:23 -0400 (Thu, 04 Jun 2009) | 1 line
branches/zip: Add Doxygen comments to the rest of buf0*.
------------------------------------------------------------------------
r5251 | marko | 2009-06-04 02:59:51 -0400 (Thu, 04 Jun 2009) | 1 line
branches/zip: Replace <= in a function comment.
------------------------------------------------------------------------
r5253 | marko | 2009-06-04 06:37:35 -0400 (Thu, 04 Jun 2009) | 1 line
branches/zip: Add missing Doxygen comments for page0zip.
------------------------------------------------------------------------
r5261 | vasil | 2009-06-05 11:13:31 -0400 (Fri, 05 Jun 2009) | 15 lines
branches/zip:
Fix Mantis Issue#244 fix bug in linear read ahead (no check on access pattern)
The changes are:
1) Take into account access pattern when deciding whether or not to do linear
read ahead.
2) Expose a knob innodb_read_ahead_factor = [0-64] default (8), dynamic,
global to control linear read ahead behavior
3) Disable random read ahead. Keep the code for now.
Submitted by: Inaam (rb://122)
Approved by: Heikki (rb://122)
------------------------------------------------------------------------
r5262 | vasil | 2009-06-05 12:04:25 -0400 (Fri, 05 Jun 2009) | 22 lines
branches/zip:
Enable functionality to have multiple background io helper threads.
This patch is based on Percona contributions.
More details about this patch will be written at:
https://svn.innodb.com/innobase/MultipleBackgroundThreads
The patch essentially does the following:
expose following knobs:
innodb_read_io_threads = [1 - 64] default 1
innodb_write_io_threads = [1 - 64] default 1
deprecate innodb_file_io_threads (this parameter was relevant only on Windows)
Internally it allows multiple segments for read and write IO request arrays
where one thread works on one segment.
Submitted by: Inaam (rb://124)
Approved by: Heikki (rb://124)
------------------------------------------------------------------------
r5263 | vasil | 2009-06-05 12:19:37 -0400 (Fri, 05 Jun 2009) | 4 lines
branches/zip:
Whitespace cleanup.
------------------------------------------------------------------------
r5264 | vasil | 2009-06-05 12:26:58 -0400 (Fri, 05 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5261.
------------------------------------------------------------------------
r5265 | vasil | 2009-06-05 12:34:11 -0400 (Fri, 05 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5262.
------------------------------------------------------------------------
r5268 | inaam | 2009-06-08 12:18:21 -0400 (Mon, 08 Jun 2009) | 7 lines
branches/zip
Non functional change:
Added legal notices acknowledging Percona's contribution to the multiple
IO helper threads patch, i.e. r5262
------------------------------------------------------------------------
r5283 | inaam | 2009-06-09 13:46:29 -0400 (Tue, 09 Jun 2009) | 9 lines
branches/zip
rb://130
Enable Group Commit functionality that was broken in 5.0 when
distributed transactions were introduced.
Reviewed by: Heikki
------------------------------------------------------------------------
r5319 | marko | 2009-06-11 04:40:33 -0400 (Thu, 11 Jun 2009) | 3 lines
branches/zip: Declare os_thread_id_t as unsigned long,
because ulint is wrong on Win64.
Pointed out by Vladislav Vaintroub <wlad@sun.com>.
------------------------------------------------------------------------
r5320 | inaam | 2009-06-11 09:15:41 -0400 (Thu, 11 Jun 2009) | 14 lines
branches/zip rb://131
This patch changes the following defaults:
max_dirty_pages_pct: default from 90 to 75. max allowed from 100 to 99
additional_mem_pool_size: default from 1 to 8 MB
buffer_pool_size: default from 8 to 128 MB
log_buffer_size: default from 1 to 8 MB
read_io_threads/write_io_threads: default from 1 to 4
The log file sizes are untouched because of upgrade issues
Reviewed by: Heikki
------------------------------------------------------------------------
r5330 | marko | 2009-06-16 04:08:59 -0400 (Tue, 16 Jun 2009) | 2 lines
branches/zip: buf_page_get_gen(): Reduce mutex holding time by adjusting
buf_pool->n_pend_unzip while only holding buf_pool_mutex.
------------------------------------------------------------------------
r5331 | marko | 2009-06-16 05:00:48 -0400 (Tue, 16 Jun 2009) | 2 lines
branches/zip: buf_page_get_zip(): Eliminate a buf_page_get_mutex() call.
The function must switch on the block state anyway.
------------------------------------------------------------------------
r5332 | vasil | 2009-06-16 05:03:27 -0400 (Tue, 16 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entries for r5283 and r5320.
------------------------------------------------------------------------
r5333 | marko | 2009-06-16 05:27:46 -0400 (Tue, 16 Jun 2009) | 1 line
branches/zip: buf_page_io_query(): Remove unused function.
------------------------------------------------------------------------
r5335 | marko | 2009-06-16 09:23:10 -0400 (Tue, 16 Jun 2009) | 2 lines
branches/zip: innodb.test: Adjust the tolerance of
innodb_buffer_pool_pages_total for r5320.
------------------------------------------------------------------------
r5342 | marko | 2009-06-17 06:15:32 -0400 (Wed, 17 Jun 2009) | 60 lines
branches/zip: Merge revisions 5233:5341 from branches/5.1:
------------------------------------------------------------------------
r5233 | marko | 2009-06-03 15:12:44 +0300 (Wed, 03 Jun 2009) | 11 lines
branches/5.1: Merge the test case from r5232 from branches/5.0:
------------------------------------------------------------------------
r5232 | marko | 2009-06-03 14:31:04 +0300 (Wed, 03 Jun 2009) | 21 lines
branches/5.0: Merge r3590 from branches/5.1 in order to fix Bug #40565
(Update Query Results in "1 Row Affected" But Should Be "Zero Rows").
Also, add a test case for Bug #40565.
rb://128 approved by Heikki Tuuri
------------------------------------------------------------------------
------------------------------------------------------------------------
r5243 | sunny | 2009-06-04 03:17:14 +0300 (Thu, 04 Jun 2009) | 14 lines
branches/5.1: When the InnoDB and MySQL data dictionaries go out of sync, before
the bug fix we would assert on missing autoinc columns. With this fix we allow
MySQL to open the table but set the next autoinc value for the column to the
MAX value. This effectively disables the next value generation. INSERTs will
fail with a generic AUTOINC failure. However, the user should be able to
read/dump the table, set the column values explicitly, use ALTER TABLE to
set the next autoinc value and/or sync the two data dictionaries to resume
normal operations.
Fix Bug#44030 Error: (1500) Couldn't read the MAX(ID) autoinc value from the
index (PRIMARY)
rb://118
------------------------------------------------------------------------
r5252 | sunny | 2009-06-04 10:16:24 +0300 (Thu, 04 Jun 2009) | 2 lines
branches/5.1: The version of the result file checked in was broken in r5243.
------------------------------------------------------------------------
r5259 | vasil | 2009-06-05 10:29:16 +0300 (Fri, 05 Jun 2009) | 7 lines
branches/5.1:
Remove the word "Error" from the printout because the mysqltest suite
interprets it as an error and thus the innodb-autoinc test fails.
Approved by: Sunny (via IM)
------------------------------------------------------------------------
r5339 | marko | 2009-06-17 11:01:37 +0300 (Wed, 17 Jun 2009) | 2 lines
branches/5.1: Add missing #include "mtr0log.h" so that the code compiles
with -DUNIV_MUST_NOT_INLINE.
(null merge; this had already been committed in branches/zip)
------------------------------------------------------------------------
r5340 | marko | 2009-06-17 12:11:49 +0300 (Wed, 17 Jun 2009) | 4 lines
branches/5.1: row_unlock_for_mysql(): When the clustered index is unknown,
refuse to unlock the record.
(Bug #45357, caused by the fix of Bug #39320).
rb://132 approved by Sunny Bains.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5343 | vasil | 2009-06-17 08:56:12 -0400 (Wed, 17 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5342.
------------------------------------------------------------------------
r5344 | marko | 2009-06-17 09:03:45 -0400 (Wed, 17 Jun 2009) | 1 line
branches/zip: row_merge_read_rec(): Fix a UNIV_DEBUG bug (Bug #45426)
------------------------------------------------------------------------
r5391 | marko | 2009-06-22 05:31:35 -0400 (Mon, 22 Jun 2009) | 2 lines
branches/zip: buf_page_get_zip(): Fix a bogus warning about
block_mutex being possibly uninitialized.
------------------------------------------------------------------------
r5392 | marko | 2009-06-22 07:58:20 -0400 (Mon, 22 Jun 2009) | 4 lines
branches/zip: ha_innobase::check_if_incompatible_data(): When
ROW_FORMAT=DEFAULT, do not compare to get_row_type().
Without this change, fast index creation will be disabled
in recent versions of MySQL 5.1.
------------------------------------------------------------------------
r5393 | pekka | 2009-06-22 09:27:55 -0400 (Mon, 22 Jun 2009) | 4 lines
branches/zip: Minor changes for Hot Backup to build correctly. (The
code bracketed between #ifdef UNIV_HOTBACKUP and #endif /* UNIV_HOTBACKUP */).
This change should not affect !UNIV_HOTBACKUP build.
------------------------------------------------------------------------
r5394 | pekka | 2009-06-22 09:46:34 -0400 (Mon, 22 Jun 2009) | 4 lines
branches/zip: Add functions for checking the format of tablespaces
for Hot Backup build (UNIV_HOTBACKUP defined).
This change should not affect !UNIV_HOTBACKUP build.
------------------------------------------------------------------------
r5397 | calvin | 2009-06-23 16:59:42 -0400 (Tue, 23 Jun 2009) | 7 lines
branches/zip: change the header file path.
Change the header file path from ../storage/innobase/include/
to ../include/. In the planned 5.1 + plugin release, the source
directory of the plugin will not be in storage/innobase.
Approved by: Heikki (IM)
------------------------------------------------------------------------
r5407 | calvin | 2009-06-24 09:51:08 -0400 (Wed, 24 Jun 2009) | 4 lines
branches/zip: remove relative path of header files.
Suggested by Marko.
------------------------------------------------------------------------
r5412 | marko | 2009-06-25 06:27:08 -0400 (Thu, 25 Jun 2009) | 1 line
branches/zip: Replace a DBUG_ASSERT with ut_a to track down Issue #290.
------------------------------------------------------------------------
r5415 | marko | 2009-06-25 06:45:57 -0400 (Thu, 25 Jun 2009) | 3 lines
branches/zip: dict_index_find_cols(): Print diagnostic on name mismatch.
This addresses Bug #44571 but does not fix it.
rb://135 approved by Sunny Bains.
------------------------------------------------------------------------
r5417 | marko | 2009-06-25 08:20:56 -0400 (Thu, 25 Jun 2009) | 1 line
branches/zip: ha_innodb.cc: Move the misplaced Doxygen @file comment.
------------------------------------------------------------------------
r5418 | marko | 2009-06-25 08:55:52 -0400 (Thu, 25 Jun 2009) | 5 lines
branches/zip: Fix a race condition caused by
SET GLOBAL innodb_commit_concurrency=DEFAULT. (Bug #45749)
When innodb_commit_concurrency is initially set nonzero,
DEFAULT would change it back to 0, triggering Bug #42101.
rb://139 approved by Heikki Tuuri.
------------------------------------------------------------------------
r5423 | calvin | 2009-06-26 16:52:52 -0400 (Fri, 26 Jun 2009) | 2 lines
branches/zip: Fix typos.
------------------------------------------------------------------------
r5425 | marko | 2009-06-29 04:52:30 -0400 (Mon, 29 Jun 2009) | 4 lines
branches/zip: ha_innobase::add_index(), ha_innobase::final_drop_index():
Start prebuilt->trx before locking the table. This should fix Issue #293
and could fix Issue #229.
Approved by Sunny (over IM).
------------------------------------------------------------------------
r5426 | marko | 2009-06-29 05:24:27 -0400 (Mon, 29 Jun 2009) | 3 lines
branches/zip: buf_page_get_gen(): Fix a race condition when reading
buf_fix_count. This could explain Issue #156.
Tested by Michael.
------------------------------------------------------------------------
r5427 | marko | 2009-06-29 05:54:53 -0400 (Mon, 29 Jun 2009) | 5 lines
branches/zip: lock_print_info_all_transactions(), buf_read_recv_pages():
Tolerate missing tablespaces (zip_size==ULINT_UNDEFINED).
buf_page_get_gen(): Add ut_ad(ut_is_2pow(zip_size)).
Issue #289, rb://136 approved by Sunny Bains
------------------------------------------------------------------------
r5428 | marko | 2009-06-29 07:06:29 -0400 (Mon, 29 Jun 2009) | 2 lines
branches/zip: row_sel_store_mysql_rec(): Add missing pointer cast.
Do not do arithmetic on void pointers.
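
A minimal sketch of the kind of cast this refers to. The helper name is
hypothetical and is not taken from row_sel_store_mysql_rec(); it only shows
that a void pointer is cast to a byte pointer before an offset is added,
because ISO C does not define arithmetic on void pointers (GCC accepts it
only as an extension).

```c
#include <stddef.h>

/* Hypothetical helper, not from the InnoDB source: return the address
 * that lies `offset` bytes past `base`. The cast to a byte pointer makes
 * the addition well defined in ISO C. */
static const unsigned char *
byte_at_offset(const void *base, size_t offset)
{
    return (const unsigned char *) base + offset;
}
```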
------------------------------------------------------------------------
r5429 | marko | 2009-06-29 09:49:54 -0400 (Mon, 29 Jun 2009) | 13 lines
branches/zip: Do not crash on SET GLOBAL innodb_file_format=DEFAULT
or SET GLOBAL innodb_file_format_check=DEFAULT.
innodb_file_format.test: New test for innodb_file_format and
innodb_file_format_check.
innodb_file_format_name_validate(): Store the string in *save.
innodb_file_format_name_update(): Check the string again.
innodb_file_format_check_validate(): Store the string in *save.
innodb_file_format_check_update(): Check the string again.
Issue #282, rb://140 approved by Heikki Tuuri
------------------------------------------------------------------------
r5430 | marko | 2009-06-29 09:58:07 -0400 (Mon, 29 Jun 2009) | 2 lines
branches/zip: lock_rec_validate_page(): Add another assertion
to track down Issue #289.
------------------------------------------------------------------------
r5431 | marko | 2009-06-29 09:58:40 -0400 (Mon, 29 Jun 2009) | 1 line
branches/zip: Revert an accidentally made change in r5430 to univ.i.
------------------------------------------------------------------------
r5437 | marko | 2009-06-30 05:10:01 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: ibuf_dummy_index_free(): Beautify the comment.
------------------------------------------------------------------------
r5438 | marko | 2009-06-30 05:10:32 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: fseg_free(): Remove this unused function.
------------------------------------------------------------------------
r5439 | marko | 2009-06-30 05:15:22 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: fseg_validate(): Enclose in #ifdef UNIV_DEBUG.
This function is unused, but it could turn out to be a useful debugging aid.
------------------------------------------------------------------------
r5441 | marko | 2009-06-30 06:30:14 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: ha_delete(): Remove this unused function that was
very similar to ha_search_and_delete_if_found().
------------------------------------------------------------------------
r5442 | marko | 2009-06-30 06:45:41 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: lock_is_on_table(), lock_table_unlock(): Unused, remove.
------------------------------------------------------------------------
r5443 | marko | 2009-06-30 07:03:00 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: os_event_create_auto(): Unused, remove.
------------------------------------------------------------------------
r5444 | marko | 2009-06-30 07:19:49 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: que_graph_try_free(): Unused, remove.
------------------------------------------------------------------------
r5445 | marko | 2009-06-30 07:28:11 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: row_build_row_ref_from_row(): Unused, remove.
------------------------------------------------------------------------
r5446 | marko | 2009-06-30 07:35:45 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: srv_que_round_robin(), srv_que_task_enqueue(): Unused, remove.
------------------------------------------------------------------------
r5447 | marko | 2009-06-30 07:37:58 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: srv_que_task_queue_check(): Unused, remove.
------------------------------------------------------------------------
r5448 | marko | 2009-06-30 07:56:36 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: mem_heap_cat(): Unused, remove.
------------------------------------------------------------------------
r5449 | marko | 2009-06-30 08:00:50 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: innobase_start_or_create_for_mysql():
Invoke os_get_os_version() at most once.
------------------------------------------------------------------------
r5450 | marko | 2009-06-30 08:02:20 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: os_file_close_no_error_handling(): Unused, remove.
------------------------------------------------------------------------
r5451 | marko | 2009-06-30 08:09:49 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: page_set_max_trx_id(): Make the code compile
with UNIV_HOTBACKUP.
------------------------------------------------------------------------
r5452 | marko | 2009-06-30 08:10:26 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: os_file_close_no_error_handling(): Restore,
as this function is used within InnoDB Hot Backup.
------------------------------------------------------------------------
r5453 | marko | 2009-06-30 08:14:01 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: os_process_set_priority_boost(): Unused, remove.
------------------------------------------------------------------------
r5454 | marko | 2009-06-30 08:42:52 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: Replace a non-ASCII character
(ISO 8859-1 encoded U+00AD SOFT HYPHEN) with a cheap ASCII substitute.
------------------------------------------------------------------------
r5456 | inaam | 2009-06-30 14:21:09 -0400 (Tue, 30 Jun 2009) | 4 lines
branches/zip
Non functional change. s/Percona/Percona Inc./
------------------------------------------------------------------------
r5470 | vasil | 2009-07-02 09:12:36 -0400 (Thu, 02 Jul 2009) | 16 lines
branches/zip:
Use PAUSE instruction inside spinloop if it is available.
The patch was originally developed by Mikael Ronstrom <mikael@mysql.com>
and can be found here:
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2768
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2771
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2772
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2774
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2777
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2799
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2800
Approved by: Heikki (rb://137)
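
A rough sketch of a spin loop that issues PAUSE on each round, assuming GCC
or Clang on x86. The names SPIN_PAUSE and spin_wait are hypothetical and do
not come from the patch; on other compilers or architectures the macro
expands to nothing.

```c
#include <stdatomic.h>

/* Hypothetical macro name. __builtin_ia32_pause() is the GCC/Clang
 * builtin that emits the x86 PAUSE instruction. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
# define SPIN_PAUSE() __builtin_ia32_pause()
#else
# define SPIN_PAUSE() ((void) 0)
#endif

/* Spin for at most max_rounds rounds while *lock_word is nonzero.
 * Returns the number of rounds actually spun. */
static unsigned long
spin_wait(atomic_int *lock_word, unsigned long max_rounds)
{
    unsigned long i = 0;

    while (i < max_rounds && atomic_load(lock_word) != 0) {
        SPIN_PAUSE();
        i++;
    }

    return i;
}
```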
------------------------------------------------------------------------
r5481 | vasil | 2009-07-06 13:16:32 -0400 (Mon, 06 Jul 2009) | 4 lines
branches/zip:
Remove unnecessary quotes and simplify plug.in.
------------------------------------------------------------------------
r5482 | calvin | 2009-07-06 18:36:35 -0400 (Mon, 06 Jul 2009) | 5 lines
branches/zip: add COPYING files for Percona and Sun Micro.
1.0.4 contains patches based on contributions from Percona
and Sun Microsystems.
------------------------------------------------------------------------
r5483 | calvin | 2009-07-07 05:36:43 -0400 (Tue, 07 Jul 2009) | 3 lines
branches/zip: add IB_HAVE_PAUSE_INSTRUCTION to CMake.
Windows will support PAUSE instruction by default.
------------------------------------------------------------------------
r5484 | inaam | 2009-07-07 18:57:14 -0400 (Tue, 07 Jul 2009) | 13 lines
branches/zip rb://126
Based on contribution from Google Inc.
This patch introduces a new parameter innodb_io_capacity to control the
rate at which the master thread performs various tasks. The default value
is 200 and higher values imply more aggressive flushing and ibuf merges
from within the master thread.
This patch also changes the ibuf merge from synchronous to asynchronous.
Another minor change is not to force the master thread to wait for a
log flush to complete every second.
Approved by: Heikki
------------------------------------------------------------------------
r5485 | inaam | 2009-07-07 19:00:49 -0400 (Tue, 07 Jul 2009) | 18 lines
branches/zip rb://138
The current implementation is to try to flush the neighbors of every
page that we flush. This patch makes the following distinction:
1) If the flush is from flush_list AND
2) If the flush is intended to move the oldest_modification LSN ahead
(this happens when a user thread sees little space in the log file and
attempts to flush pages from the buffer pool so that a checkpoint can
be made)
THEN
Do not try to flush the neighbors. Just focus on flushing dirty pages at
the end of flush_list
Approved by: Heikki
------------------------------------------------------------------------
r5486 | inaam | 2009-07-08 12:11:40 -0400 (Wed, 08 Jul 2009) | 29 lines
branches/zip rb://133
This patch introduces a heuristics-based flushing rate for dirty pages to
avoid IO bursts at checkpoint.
1) log_capacity / log_generated per second gives us number of seconds
in which ALL dirty pages need to be flushed. Based on this rough
assumption we can say that
n_dirty_pages / (log_capacity / log_generation_rate) = desired_flush_rate
2) We use weighted averages (hard coded to 20 seconds) of
log_generation_rate to avoid resonance.
3) From the desired_flush_rate we subtract the number of pages that have
been flushed due to LRU flushing. That gives us pages that we should
flush as part of flush_list cleanup. And that is the number (capped by
maximum io_capacity) that we try to flush from the master thread.
Knobs:
======
innodb_adaptive_flushing: boolean, global, dynamic, default TRUE.
Since this heuristic is very experimental and has the potential to
dramatically change the IO pattern I think it is a good idea to leave a
knob to turn it off.
Approved by: Heikki
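
The arithmetic in points 1) and 3) can be sketched as follows. This is an
illustration only: the function and parameter names are hypothetical, the
20-second weighted averaging from point 2) is assumed to have been applied
to log_rate already, and returning 0 for a zero log capacity is an
assumption of this sketch, not behavior stated in the patch.

```c
#include <stdint.h>

/* Hypothetical sketch. Units:
 *   n_dirty_pages   dirty pages in the buffer pool
 *   log_capacity    redo bytes that may be written before a checkpoint
 *   log_rate        averaged redo bytes generated per second
 *   lru_flush_rate  pages per second already flushed by LRU flushing
 *   io_capacity     upper bound on pages flushed per second */
static uint64_t
flush_list_pages_per_sec(uint64_t n_dirty_pages, uint64_t log_capacity,
                         uint64_t log_rate, uint64_t lru_flush_rate,
                         uint64_t io_capacity)
{
    uint64_t desired;

    if (log_capacity == 0) {
        return 0; /* assumption of this sketch: no target without capacity */
    }

    /* n_dirty_pages / (log_capacity / log_rate), multiplied first so
     * that integer division does not discard the fraction. Very large
     * inputs could overflow the product; the sketch ignores that. */
    desired = n_dirty_pages * log_rate / log_capacity;

    /* Subtract what LRU flushing has already done. */
    desired = desired > lru_flush_rate ? desired - lru_flush_rate : 0;

    /* Cap by the maximum io_capacity. */
    return desired < io_capacity ? desired : io_capacity;
}
```

For example, 10000 dirty pages, 100000000 bytes of log capacity and 1000000
bytes of redo per second leave 100 seconds to flush everything, so the
desired rate is 100 pages per second before the LRU adjustment.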
------------------------------------------------------------------------
r5487 | calvin | 2009-07-08 12:42:28 -0400 (Wed, 08 Jul 2009) | 7 lines
branches/zip: fix PAUSE instruction patch on Windows
The original PAUSE instruction patch (r5470) does not
compile on Windows. Also, there is an elegant way of
doing it on Windows - YieldProcessor().
Approved by: Heikki (on IM)
------------------------------------------------------------------------
r5489 | vasil | 2009-07-10 05:02:22 -0400 (Fri, 10 Jul 2009) | 9 lines
branches/zip:
Change the defaults for
innodb_sync_spin_loops: 20 -> 30
innodb_spin_wait_delay: 5 -> 6
This change was proposed by Sun/MySQL based on their performance testing,
see https://svn.innodb.com/innobase/Release_tasks_for_InnoDB_Plugin_V1.0.4
------------------------------------------------------------------------
r5490 | vasil | 2009-07-10 05:04:20 -0400 (Fri, 10 Jul 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5489.
------------------------------------------------------------------------
r5491 | calvin | 2009-07-10 12:19:17 -0400 (Fri, 10 Jul 2009) | 6 lines
branches/zip: add copyright info to files related to PAUSE
instruction patch, contributed by Sun Microsystems.
------------------------------------------------------------------------
r5492 | calvin | 2009-07-10 17:47:34 -0400 (Fri, 10 Jul 2009) | 5 lines
branches/zip: add ChangeLog entries for r5484-r5486.
------------------------------------------------------------------------
r5494 | vasil | 2009-07-13 03:37:35 -0400 (Mon, 13 Jul 2009) | 6 lines
branches/zip:
Restore the original value of innodb_sync_spin_loops at the end. Previously
the test assumed that setting it to 20 would do this, but now the default is
30 and MTR's internal check failed.
------------------------------------------------------------------------
r5495 | inaam | 2009-07-13 11:48:45 -0400 (Mon, 13 Jul 2009) | 5 lines
branches/zip rb://138 (REVERT)
Revert the flush neighbors patch as it shows regression in
the benchmarks run by Michael.
------------------------------------------------------------------------
r5496 | inaam | 2009-07-13 14:04:57 -0400 (Mon, 13 Jul 2009) | 4 lines
branches/zip
Fixed warnings on Windows where ulint != ib_uint64_t
------------------------------------------------------------------------
r5497 | calvin | 2009-07-13 15:01:00 -0400 (Mon, 13 Jul 2009) | 9 lines
branches/zip: fix run-time symbols clash on Solaris.
This patch is from Sergey Vojtovich of Sun Microsystems,
to fix run-time symbols clash on Solaris with older C++
compiler:
- when choosing a way to hide symbols, base the decision on the
compiler, not the operating system.
- Sun Studio supports the __hidden declaration specifier for this
purpose.
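
A sketch of compiler-based selection, assuming the usual predefined macros
(__SUNPRO_C and __SUNPRO_CC for Sun Studio, __GNUC__ for GCC). The macro
name HIDDEN_SYMBOL and the function internal_helper are hypothetical and
are not taken from the patch.

```c
/* Hypothetical macro name. The mechanism is chosen by testing the
 * compiler, not the operating system. */
#if defined(__SUNPRO_C) || defined(__SUNPRO_CC)
# define HIDDEN_SYMBOL __hidden
#elif defined(__GNUC__) && (__GNUC__ >= 4)
# define HIDDEN_SYMBOL __attribute__((visibility("hidden")))
#else
# define HIDDEN_SYMBOL
#endif

/* Declared hidden: usable inside the library, not exported from it. */
HIDDEN_SYMBOL int internal_helper(int x);

int
internal_helper(int x)
{
    return x + 1;
}
```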
------------------------------------------------------------------------
r5498 | vasil | 2009-07-14 03:16:18 -0400 (Tue, 14 Jul 2009) | 92 lines
branches/zip: Merge r5341:5497 from branches/5.1, skipping:
c5419 because it is merge from branches/zip into branches/5.1
c5466 because the source code has been adjusted to match the MySQL
behavior and the innodb-autoinc test does not fail in branches/zip;
if c5466 is merged, then innodb-autoinc starts failing. Sunny suggested
not to merge c5466.
and resolving conflicts in c5410, c5440, c5488:
------------------------------------------------------------------------
r5410 | marko | 2009-06-24 22:26:34 +0300 (Wed, 24 Jun 2009) | 2 lines
Changed paths:
M /branches/5.1/include/trx0sys.ic
M /branches/5.1/trx/trx0purge.c
M /branches/5.1/trx/trx0sys.c
M /branches/5.1/trx/trx0undo.c
branches/5.1: Add missing #include "mtr0log.h" to avoid warnings
when compiling with -DUNIV_MUST_NOT_INLINE.
------------------------------------------------------------------------
r5419 | marko | 2009-06-25 16:11:57 +0300 (Thu, 25 Jun 2009) | 18 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb_bug42101-nonzero.result
M /branches/5.1/mysql-test/innodb_bug42101-nonzero.test
M /branches/5.1/mysql-test/innodb_bug42101.result
M /branches/5.1/mysql-test/innodb_bug42101.test
branches/5.1: Merge r5418 from branches/zip:
------------------------------------------------------------------------
r5418 | marko | 2009-06-25 15:55:52 +0300 (Thu, 25 Jun 2009) | 5 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/mysql-test/innodb_bug42101-nonzero.result
M /branches/zip/mysql-test/innodb_bug42101-nonzero.test
M /branches/zip/mysql-test/innodb_bug42101.result
M /branches/zip/mysql-test/innodb_bug42101.test
branches/zip: Fix a race condition caused by
SET GLOBAL innodb_commit_concurrency=DEFAULT. (Bug #45749)
When innodb_commit_concurrency is initially set nonzero,
DEFAULT would change it back to 0, triggering Bug #42101.
rb://139 approved by Heikki Tuuri.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5440 | vasil | 2009-06-30 13:04:29 +0300 (Tue, 30 Jun 2009) | 8 lines
Changed paths:
M /branches/5.1/fil/fil0fil.c
branches/5.1:
Fix Bug#45814 URL reference in InnoDB server errors needs adjusting to match documentation
by changing the URL from
http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html to
http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting-datadict.html
------------------------------------------------------------------------
r5466 | vasil | 2009-07-02 10:46:45 +0300 (Thu, 02 Jul 2009) | 6 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Adjust the failing innodb-autoinc test to conform to the latest behavior
of the MySQL code. The idea and the comment in innodb-autoinc.test come
from Sunny.
------------------------------------------------------------------------
r5488 | vasil | 2009-07-09 19:16:44 +0300 (Thu, 09 Jul 2009) | 13 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
A /branches/5.1/mysql-test/innodb_bug21704.result
A /branches/5.1/mysql-test/innodb_bug21704.test
branches/5.1:
Fix Bug#21704 Renaming column does not update FK definition
by checking whether a column that participates in a FK definition is being
renamed and denying the ALTER in this case.
The patch was originally developed by Davi Arnaut <Davi.Arnaut@Sun.COM>:
http://lists.mysql.com/commits/77714
and was later adjusted to conform to InnoDB coding style by me (Vasil),
I also added some more comments and moved the bug specific mysql-test to
a separate file to make it more manageable and flexible.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5499 | calvin | 2009-07-14 12:55:10 -0400 (Tue, 14 Jul 2009) | 3 lines
branches/zip: add a missing file in Makefile.am
This change was suggested by MySQL.
------------------------------------------------------------------------
r5500 | calvin | 2009-07-14 13:03:26 -0400 (Tue, 14 Jul 2009) | 3 lines
branches/zip: minor change
Remove an extra "with".
------------------------------------------------------------------------
r5501 | vasil | 2009-07-14 13:58:15 -0400 (Tue, 14 Jul 2009) | 5 lines
branches/zip:
Add @ZLIB_INCLUDES@ so that the InnoDB Plugin picks up the same zlib.h
header file that is eventually used by mysqld.
------------------------------------------------------------------------
r5502 | vasil | 2009-07-14 13:59:59 -0400 (Tue, 14 Jul 2009) | 4 lines
branches/zip:
Add include/ut0auxconf.h to noinst_HEADERS
------------------------------------------------------------------------
r5503 | vasil | 2009-07-14 14:16:11 -0400 (Tue, 14 Jul 2009) | 8 lines
branches/zip:
Non-functional change:
put files in noinst_HEADERS and libinnobase_a_SOURCES one per line and sort
alphabetically, so it is easier to find if a file is there or not and
also diffs show exactly the added or removed file instead of surrounding
lines too.
------------------------------------------------------------------------
r5504 | calvin | 2009-07-15 04:58:44 -0400 (Wed, 15 Jul 2009) | 6 lines
branches/zip: fix compile errors on Win64
Both srv_read_ahead_factor and srv_io_capacity should
be defined as ulong.
Approved by: Sunny
------------------------------------------------------------------------
r5508 | calvin | 2009-07-16 09:40:47 -0400 (Thu, 16 Jul 2009) | 16 lines
branches/zip: Support inlining of functions and prefetch with
Sun Studio
Those changes are contributed by Sun/MySQL. Two sets of changes
in this patch when Sun Studio is used:
- Explicit inlining of functions
- Prefetch Support
This patch has been tested by Sunny with the plugin statically
built in. Since we've never built the plugin as a dynamically
loaded module on Solaris, it is a separate task to change
plug.in.
rb://142
Approved by: Heikki
------------------------------------------------------------------------
r5509 | calvin | 2009-07-16 09:45:28 -0400 (Thu, 16 Jul 2009) | 2 lines
branches/zip: add ChangeLog entry for r5508.
------------------------------------------------------------------------
r5512 | sunny | 2009-07-19 19:52:48 -0400 (Sun, 19 Jul 2009) | 2 lines
branches/zip: Remove unused extern ref to timed_mutexes.
------------------------------------------------------------------------
r5513 | sunny | 2009-07-19 19:58:43 -0400 (Sun, 19 Jul 2009) | 2 lines
branches/zip: Undo r5512
------------------------------------------------------------------------
r5514 | sunny | 2009-07-19 20:08:49 -0400 (Sun, 19 Jul 2009) | 2 lines
branches/zip: Only use my_bool when UNIV_HOTBACKUP is not defined.
------------------------------------------------------------------------
r5515 | sunny | 2009-07-20 03:29:14 -0400 (Mon, 20 Jul 2009) | 2 lines
branches/zip: The dict_table_t::autoinc_mutex field is not used in HotBackup.
------------------------------------------------------------------------
r5516 | sunny | 2009-07-20 03:46:05 -0400 (Mon, 20 Jul 2009) | 4 lines
branches/zip: Make this file usable from within HotBackup. A new file has
been introduced called hb_univ.i. This file should have all the HotBackup
specific configuration.
------------------------------------------------------------------------
r5517 | sunny | 2009-07-20 03:55:11 -0400 (Mon, 20 Jul 2009) | 2 lines
Add /* UNIV_HOTBACK */
------------------------------------------------------------------------
r5519 | vasil | 2009-07-20 04:45:18 -0400 (Mon, 20 Jul 2009) | 31 lines
branches/zip: Merge r5497:5518 from branches/5.1:
------------------------------------------------------------------------
r5518 | vasil | 2009-07-20 11:29:47 +0300 (Mon, 20 Jul 2009) | 22 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2874.2.1
committer: Anurag Shekhar <anurag.shekhar@sun.com>
branch nick: mysql-5.1-bugteam-windows-warning
timestamp: Wed 2009-05-13 15:41:24 +0530
message:
Bug #39802 On Windows, 32-bit time_t should be enforced
This patch fixes compilation warning, "conversion from 'time_t' to 'ulong',
possible loss of data".
The fix is to typecast time_t to ulong before assigning it to ulong.
Backported this from 6.0-bugteam tree.
modified:
storage/archive/ha_archive.cc
storage/federated/ha_federated.cc
storage/innobase/handler/ha_innodb.cc
storage/myisam/ha_myisam.cc
------------------------------------------------------------------------
------------------------------------------------------------------------
r5520 | vasil | 2009-07-20 04:51:47 -0400 (Mon, 20 Jul 2009) | 4 lines
branches/zip:
Add ChangeLog entries for r5498 and r5519.
------------------------------------------------------------------------
r5524 | inaam | 2009-07-20 12:23:15 -0400 (Mon, 20 Jul 2009) | 9 lines
branches/zip
Change the read ahead parameter name to innodb_read_ahead_threshold.
Change the meaning of this parameter to signify the number of pages
that must be sequentially accessed for InnoDB to trigger a readahead
request.
Suggested by: Ken
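
The new meaning of the parameter can be illustrated with a rough C sketch.
The function name and the per-page access stamps are assumptions made for
this example only; they are not taken from the InnoDB source. The sketch
counts the pages whose access stamp is later than that of the previously
counted page and compares the count against the threshold.

```c
#include <stddef.h>

/* Hypothetical helper. access_time[i] is an assumed, monotonically
increasing access stamp for page i of an extent; 0 means the page has
not been accessed. Returns 1 if at least `threshold` pages were
accessed in ascending page order, 0 otherwise. */
static int
example_should_read_ahead(
	const unsigned long*	access_time,
	size_t			n_pages,
	unsigned long		threshold)
{
	unsigned long	n_sequential = 0;
	unsigned long	prev = 0;
	size_t		i;

	for (i = 0; i < n_pages; i++) {
		/* Count the page only if it was accessed after the
		previously counted page. */
		if (access_time[i] != 0 && access_time[i] > prev) {
			n_sequential++;
			prev = access_time[i];
		}
	}

	return(n_sequential >= threshold);
}
```

With stamps 1, 2, 3, 4 and a threshold of 4 the sketch requests read-ahead;
with stamps 4, 3, 2, 1 and a threshold of 2 it does not.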
------------------------------------------------------------------------
branches/innodb+: Merge revisions 5144:5524 from branches/zip
------------------------------------------------------------------------
r5147 | marko | 2009-05-27 06:55:14 -0400 (Wed, 27 May 2009) | 1 line
branches/zip: ibuf0ibuf.c: Improve a comment.
------------------------------------------------------------------------
r5149 | marko | 2009-05-27 07:46:42 -0400 (Wed, 27 May 2009) | 34 lines
branches/zip: Merge revisions 4994:5148 from branches/5.1:
------------------------------------------------------------------------
r5126 | vasil | 2009-05-26 16:57:12 +0300 (Tue, 26 May 2009) | 9 lines
branches/5.1:
Preparation for the fix of
Bug#45097 Hang during recovery, redo logs for doublewrite buffer pages
Non-functional change: move FSP_* macros from fsp0fsp.h to a new file
fsp0types.h. This is needed in order to be able to use FSP_EXTENT_SIZE
in mtr0log.ic.
------------------------------------------------------------------------
r5127 | vasil | 2009-05-26 17:05:43 +0300 (Tue, 26 May 2009) | 9 lines
branches/5.1:
Preparation for the fix of
Bug#45097 Hang during recovery, redo logs for doublewrite buffer pages
Do not include unnecessary headers mtr0log.h and fut0lst.h in trx0sys.h
and include fsp0fsp.h just before it is needed. This is needed in order
to be able to use TRX_SYS_SPACE in mtr0log.ic.
------------------------------------------------------------------------
r5128 | vasil | 2009-05-26 17:26:37 +0300 (Tue, 26 May 2009) | 7 lines
branches/5.1:
Fix Bug#45097 Hang during recovery, redo logs for doublewrite buffer pages
Do not write redo log for the pages in the doublewrite buffer. Also, do not
make a dummy change to the page because this is not needed.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5169 | marko | 2009-05-28 03:21:55 -0400 (Thu, 28 May 2009) | 1 line
branches/zip: mtr0mtr.h: Add Doxygen comments for the redo log entry types.
------------------------------------------------------------------------
r5176 | marko | 2009-05-28 07:14:02 -0400 (Thu, 28 May 2009) | 1 line
branches/zip: Correct a debug assertion that was added in r5125.
------------------------------------------------------------------------
r5201 | marko | 2009-06-01 06:35:25 -0400 (Mon, 01 Jun 2009) | 2 lines
branches/zip: Clean up some comments.
Make the rec parameter of mlog_open_and_write_index() const.
------------------------------------------------------------------------
r5234 | marko | 2009-06-03 08:26:41 -0400 (Wed, 03 Jun 2009) | 44 lines
branches/zip: Merge revisions 5148:5233 from branches/5.1:
------------------------------------------------------------------------
r5150 | vasil | 2009-05-27 18:56:03 +0300 (Wed, 27 May 2009) | 4 lines
branches/5.1:
Whitespace fixup.
------------------------------------------------------------------------
r5191 | vasil | 2009-05-30 17:46:05 +0300 (Sat, 30 May 2009) | 19 lines
branches/5.1:
Merge a change from MySQL (this fixes the failing innodb_mysql test):
------------------------------------------------------------
revno: 1810.3894.10
committer: Sergey Glukhov <Sergey.Glukhov@sun.com>
branch nick: mysql-5.0-bugteam
timestamp: Tue 2009-05-19 11:32:21 +0500
message:
Bug#39793 Foreign keys not constructed when column has a '#' in a comment or default value
Internal InnoDB FK parser does not recognize '\'' as a quotation symbol.
Suggested fix is to add '\'' symbol check for quotation condition
(dict_strip_comments() function).
modified:
innobase/dict/dict0dict.c
mysql-test/r/innodb_mysql.result
mysql-test/t/innodb_mysql.test
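
A minimal sketch of the idea behind the fix, assuming a simplified comment
stripper: a '#' starts a comment only when it is outside a quoted string,
and '\'' counts as a quote character. The function below is hypothetical
and is not the real dict_strip_comments(); it ignores escape sequences and
other comment styles.

```c
/* Hypothetical helper. Copies `in` to `out`, dropping '#' comments that
run to the end of the line. A '#' inside a string quoted with ', " or `
is kept. `out` must have room for strlen(in) + 1 bytes. */
static void
example_strip_hash_comments(const char* in, char* out)
{
	char	quote = '\0';	/* active quote character, or '\0' */

	while (*in != '\0') {
		if (quote != '\0') {
			if (*in == quote) {
				quote = '\0';	/* closing quote */
			}
			*out++ = *in++;
		} else if (*in == '\'' || *in == '"' || *in == '`') {
			quote = *in;		/* opening quote */
			*out++ = *in++;
		} else if (*in == '#') {
			/* Skip the comment, but keep the newline. */
			while (*in != '\0' && *in != '\n') {
				in++;
			}
		} else {
			*out++ = *in++;
		}
	}

	*out = '\0';
}
```

Without the '\'' check, the '#' in a default value such as DEFAULT '#'
would be treated as the start of a comment and the rest of the line,
including any FOREIGN KEY clause, would be lost.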
------------------------------------------------------------------------
r5233 | marko | 2009-06-03 15:12:44 +0300 (Wed, 03 Jun 2009) | 11 lines
branches/5.1: Merge the test case from r5232 from branches/5.0:
------------------------------------------------------------------------
r5232 | marko | 2009-06-03 14:31:04 +0300 (Wed, 03 Jun 2009) | 21 lines
branches/5.0: Merge r3590 from branches/5.1 in order to fix Bug #40565
(Update Query Results in "1 Row Affected" But Should Be "Zero Rows").
Also, add a test case for Bug #40565.
rb://128 approved by Heikki Tuuri
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
r5250 | marko | 2009-06-04 02:58:23 -0400 (Thu, 04 Jun 2009) | 1 line
branches/zip: Add Doxygen comments to the rest of buf0*.
------------------------------------------------------------------------
r5251 | marko | 2009-06-04 02:59:51 -0400 (Thu, 04 Jun 2009) | 1 line
branches/zip: Replace <= in a function comment.
------------------------------------------------------------------------
r5253 | marko | 2009-06-04 06:37:35 -0400 (Thu, 04 Jun 2009) | 1 line
branches/zip: Add missing Doxygen comments for page0zip.
------------------------------------------------------------------------
r5261 | vasil | 2009-06-05 11:13:31 -0400 (Fri, 05 Jun 2009) | 15 lines
branches/zip:
Fix Mantis Issue#244 fix bug in linear read ahead (no check on access pattern)
The changes are:
1) Take into account access pattern when deciding whether or not to do linear
read ahead.
2) Expose a knob innodb_read_ahead_factor = [0-64] default (8), dynamic,
global to control linear read ahead behavior
3) Disable random read ahead. Keep the code for now.
Submitted by: Inaam (rb://122)
Approved by: Heikki (rb://122)
------------------------------------------------------------------------
r5262 | vasil | 2009-06-05 12:04:25 -0400 (Fri, 05 Jun 2009) | 22 lines
branches/zip:
Enable functionality to have multiple background io helper threads.
This patch is based on Percona contributions.
More details about this patch will be written at:
https://svn.innodb.com/innobase/MultipleBackgroundThreads
The patch essentially does the following:
expose following knobs:
innodb_read_io_threads = [1 - 64] default 1
innodb_write_io_threads = [1 - 64] default 1
deprecate innodb_file_io_threads (this parameter was relevant only on Windows)
Internally it allows multiple segments for read and write IO request arrays
where one thread works on one segment.
Submitted by: Inaam (rb://124)
Approved by: Heikki (rb://124)
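
The segment layout can be pictured with the following C sketch. It assumes
that an array of n_slots request slots is split into n_segments equal
parts and that helper thread i serves only segment i. The function names
and the equal-split assumption are illustrative and are not taken from the
patch.

```c
/* Hypothetical helper: returns the segment (and thus the helper thread)
that owns a given slot. Assumes n_slots is a nonzero multiple of
n_segments and slot < n_slots. */
static unsigned long
example_slot_to_segment(
	unsigned long	slot,
	unsigned long	n_slots,
	unsigned long	n_segments)
{
	unsigned long	slots_per_segment = n_slots / n_segments;

	return(slot / slots_per_segment);
}

/* Hypothetical helper: returns the first slot of a segment. The segment
covers slots [first, first + n_slots / n_segments). */
static unsigned long
example_segment_first_slot(
	unsigned long	segment,
	unsigned long	n_slots,
	unsigned long	n_segments)
{
	return(segment * (n_slots / n_segments));
}
```

For example, with 1024 slots and 4 segments, slots 0 to 255 belong to
segment 0 and slot 256 is the first slot of segment 1.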
------------------------------------------------------------------------
r5263 | vasil | 2009-06-05 12:19:37 -0400 (Fri, 05 Jun 2009) | 4 lines
branches/zip:
Whitespace cleanup.
------------------------------------------------------------------------
r5264 | vasil | 2009-06-05 12:26:58 -0400 (Fri, 05 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5261.
------------------------------------------------------------------------
r5265 | vasil | 2009-06-05 12:34:11 -0400 (Fri, 05 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5262.
------------------------------------------------------------------------
r5268 | inaam | 2009-06-08 12:18:21 -0400 (Mon, 08 Jun 2009) | 7 lines
branches/zip
Non-functional change:
Added legal notices acknowledging the Percona contribution to the multiple
IO helper threads patch i.e.: r5262
------------------------------------------------------------------------
r5283 | inaam | 2009-06-09 13:46:29 -0400 (Tue, 09 Jun 2009) | 9 lines
branches/zip
rb://130
Enable Group Commit functionality that was broken in 5.0 when
distributed transactions were introduced.
Reviewed by: Heikki
------------------------------------------------------------------------
r5319 | marko | 2009-06-11 04:40:33 -0400 (Thu, 11 Jun 2009) | 3 lines
branches/zip: Declare os_thread_id_t as unsigned long,
because ulint is wrong on Win64.
Pointed out by Vladislav Vaintroub <wlad@sun.com>.
------------------------------------------------------------------------
r5320 | inaam | 2009-06-11 09:15:41 -0400 (Thu, 11 Jun 2009) | 14 lines
branches/zip rb://131
This patch changes the following defaults:
max_dirty_pages_pct: default from 90 to 75. max allowed from 100 to 99
additional_mem_pool_size: default from 1 to 8 MB
buffer_pool_size: default from 8 to 128 MB
log_buffer_size: default from 1 to 8 MB
read_io_threads/write_io_threads: default from 1 to 4
The log file sizes are untouched because of upgrade issues
Reviewed by: Heikki
------------------------------------------------------------------------
r5330 | marko | 2009-06-16 04:08:59 -0400 (Tue, 16 Jun 2009) | 2 lines
branches/zip: buf_page_get_gen(): Reduce mutex holding time by adjusting
buf_pool->n_pend_unzip while only holding buf_pool_mutex.
------------------------------------------------------------------------
r5331 | marko | 2009-06-16 05:00:48 -0400 (Tue, 16 Jun 2009) | 2 lines
branches/zip: buf_page_get_zip(): Eliminate a buf_page_get_mutex() call.
The function must switch on the block state anyway.
------------------------------------------------------------------------
r5332 | vasil | 2009-06-16 05:03:27 -0400 (Tue, 16 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entries for r5283 and r5320.
------------------------------------------------------------------------
r5333 | marko | 2009-06-16 05:27:46 -0400 (Tue, 16 Jun 2009) | 1 line
branches/zip: buf_page_io_query(): Remove unused function.
------------------------------------------------------------------------
r5335 | marko | 2009-06-16 09:23:10 -0400 (Tue, 16 Jun 2009) | 2 lines
branches/zip: innodb.test: Adjust the tolerance of
innodb_buffer_pool_pages_total for r5320.
------------------------------------------------------------------------
r5342 | marko | 2009-06-17 06:15:32 -0400 (Wed, 17 Jun 2009) | 60 lines
branches/zip: Merge revisions 5233:5341 from branches/5.1:
------------------------------------------------------------------------
r5233 | marko | 2009-06-03 15:12:44 +0300 (Wed, 03 Jun 2009) | 11 lines
branches/5.1: Merge the test case from r5232 from branches/5.0:
------------------------------------------------------------------------
r5232 | marko | 2009-06-03 14:31:04 +0300 (Wed, 03 Jun 2009) | 21 lines
branches/5.0: Merge r3590 from branches/5.1 in order to fix Bug #40565
(Update Query Results in "1 Row Affected" But Should Be "Zero Rows").
Also, add a test case for Bug #40565.
rb://128 approved by Heikki Tuuri
------------------------------------------------------------------------
------------------------------------------------------------------------
r5243 | sunny | 2009-06-04 03:17:14 +0300 (Thu, 04 Jun 2009) | 14 lines
branches/5.1: When the InnoDB and MySQL data dictionaries go out of sync, before
the bug fix we would assert on missing autoinc columns. With this fix we allow
MySQL to open the table but set the next autoinc value for the column to the
MAX value. This effectively disables the next value generation. INSERTs will
fail with a generic AUTOINC failure. However, the user should be able to
read/dump the table, set the column values explicitly, use ALTER TABLE to
set the next autoinc value and/or sync the two data dictionaries to resume
normal operations.
Fix Bug#44030 Error: (1500) Couldn't read the MAX(ID) autoinc value from the
index (PRIMARY)
rb://118
------------------------------------------------------------------------
r5252 | sunny | 2009-06-04 10:16:24 +0300 (Thu, 04 Jun 2009) | 2 lines
branches/5.1: The version of the result file checked in was broken in r5243.
------------------------------------------------------------------------
r5259 | vasil | 2009-06-05 10:29:16 +0300 (Fri, 05 Jun 2009) | 7 lines
branches/5.1:
Remove the word "Error" from the printout because the mysqltest suite
interprets it as an error and thus the innodb-autoinc test fails.
Approved by: Sunny (via IM)
------------------------------------------------------------------------
r5339 | marko | 2009-06-17 11:01:37 +0300 (Wed, 17 Jun 2009) | 2 lines
branches/5.1: Add missing #include "mtr0log.h" so that the code compiles
with -DUNIV_MUST_NOT_INLINE.
(null merge; this had already been committed in branches/zip)
------------------------------------------------------------------------
r5340 | marko | 2009-06-17 12:11:49 +0300 (Wed, 17 Jun 2009) | 4 lines
branches/5.1: row_unlock_for_mysql(): When the clustered index is unknown,
refuse to unlock the record.
(Bug #45357, caused by the fix of Bug #39320).
rb://132 approved by Sunny Bains.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5343 | vasil | 2009-06-17 08:56:12 -0400 (Wed, 17 Jun 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5342.
------------------------------------------------------------------------
r5344 | marko | 2009-06-17 09:03:45 -0400 (Wed, 17 Jun 2009) | 1 line
branches/zip: row_merge_read_rec(): Fix a UNIV_DEBUG bug (Bug #45426)
------------------------------------------------------------------------
r5391 | marko | 2009-06-22 05:31:35 -0400 (Mon, 22 Jun 2009) | 2 lines
branches/zip: buf_page_get_zip(): Fix a bogus warning about
block_mutex being possibly uninitialized.
------------------------------------------------------------------------
r5392 | marko | 2009-06-22 07:58:20 -0400 (Mon, 22 Jun 2009) | 4 lines
branches/zip: ha_innobase::check_if_incompatible_data(): When
ROW_FORMAT=DEFAULT, do not compare to get_row_type().
Without this change, fast index creation will be disabled
in recent versions of MySQL 5.1.
------------------------------------------------------------------------
r5393 | pekka | 2009-06-22 09:27:55 -0400 (Mon, 22 Jun 2009) | 4 lines
branches/zip: Minor changes for Hot Backup to build correctly. (The
code bracketed between #ifdef UNIV_HOTBACKUP and #endif /* UNIV_HOTBACKUP */).
This change should not affect !UNIV_HOTBACKUP build.
------------------------------------------------------------------------
r5394 | pekka | 2009-06-22 09:46:34 -0400 (Mon, 22 Jun 2009) | 4 lines
branches/zip: Add functions for checking the format of tablespaces
for Hot Backup build (UNIV_HOTBACKUP defined).
This change should not affect !UNIV_HOTBACKUP build.
------------------------------------------------------------------------
r5397 | calvin | 2009-06-23 16:59:42 -0400 (Tue, 23 Jun 2009) | 7 lines
branches/zip: change the header file path.
Change the header file path from ../storage/innobase/include/
to ../include/. In the planned 5.1 + plugin release, the source
directory of the plugin will not be in storage/innobase.
Approved by: Heikki (IM)
------------------------------------------------------------------------
r5407 | calvin | 2009-06-24 09:51:08 -0400 (Wed, 24 Jun 2009) | 4 lines
branches/zip: remove relative path of header files.
Suggested by Marko.
------------------------------------------------------------------------
r5412 | marko | 2009-06-25 06:27:08 -0400 (Thu, 25 Jun 2009) | 1 line
branches/zip: Replace a DBUG_ASSERT with ut_a to track down Issue #290.
------------------------------------------------------------------------
r5415 | marko | 2009-06-25 06:45:57 -0400 (Thu, 25 Jun 2009) | 3 lines
branches/zip: dict_index_find_cols(): Print diagnostic on name mismatch.
This addresses Bug #44571 but does not fix it.
rb://135 approved by Sunny Bains.
------------------------------------------------------------------------
r5417 | marko | 2009-06-25 08:20:56 -0400 (Thu, 25 Jun 2009) | 1 line
branches/zip: ha_innodb.cc: Move the misplaced Doxygen @file comment.
------------------------------------------------------------------------
r5418 | marko | 2009-06-25 08:55:52 -0400 (Thu, 25 Jun 2009) | 5 lines
branches/zip: Fix a race condition caused by
SET GLOBAL innodb_commit_concurrency=DEFAULT. (Bug #45749)
When innodb_commit_concurrency is initially set nonzero,
DEFAULT would change it back to 0, triggering Bug #42101.
rb://139 approved by Heikki Tuuri.
------------------------------------------------------------------------
r5423 | calvin | 2009-06-26 16:52:52 -0400 (Fri, 26 Jun 2009) | 2 lines
branches/zip: Fix typos.
------------------------------------------------------------------------
r5425 | marko | 2009-06-29 04:52:30 -0400 (Mon, 29 Jun 2009) | 4 lines
branches/zip: ha_innobase::add_index(), ha_innobase::final_drop_index():
Start prebuilt->trx before locking the table. This should fix Issue #293
and could fix Issue #229.
Approved by Sunny (over IM).
------------------------------------------------------------------------
r5426 | marko | 2009-06-29 05:24:27 -0400 (Mon, 29 Jun 2009) | 3 lines
branches/zip: buf_page_get_gen(): Fix a race condition when reading
buf_fix_count. This could explain Issue #156.
Tested by Michael.
------------------------------------------------------------------------
r5427 | marko | 2009-06-29 05:54:53 -0400 (Mon, 29 Jun 2009) | 5 lines
branches/zip: lock_print_info_all_transactions(), buf_read_recv_pages():
Tolerate missing tablespaces (zip_size==ULINT_UNDEFINED).
buf_page_get_gen(): Add ut_ad(ut_is_2pow(zip_size)).
Issue #289, rb://136 approved by Sunny Bains
------------------------------------------------------------------------
r5428 | marko | 2009-06-29 07:06:29 -0400 (Mon, 29 Jun 2009) | 2 lines
branches/zip: row_sel_store_mysql_rec(): Add missing pointer cast.
Do not do arithmetic on void pointers.
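
As a small illustration of the kind of cast involved (the helper name is
hypothetical and not from the patch): arithmetic on void* is a compiler
extension, not standard C, so the pointer is cast to a byte pointer before
an offset is added.

```c
#include <stddef.h>

/* Hypothetical helper: returns base + offset bytes. The cast to
unsigned char* makes the pointer arithmetic valid in standard C. */
static void*
example_offset_ptr(void* base, size_t offset)
{
	return((unsigned char*) base + offset);
}
```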
------------------------------------------------------------------------
r5429 | marko | 2009-06-29 09:49:54 -0400 (Mon, 29 Jun 2009) | 13 lines
branches/zip: Do not crash on SET GLOBAL innodb_file_format=DEFAULT
or SET GLOBAL innodb_file_format_check=DEFAULT.
innodb_file_format.test: New test for innodb_file_format and
innodb_file_format_check.
innodb_file_format_name_validate(): Store the string in *save.
innodb_file_format_name_update(): Check the string again.
innodb_file_format_check_validate(): Store the string in *save.
innodb_file_format_check_update(): Check the string again.
Issue #282, rb://140 approved by Heikki Tuuri
------------------------------------------------------------------------
r5430 | marko | 2009-06-29 09:58:07 -0400 (Mon, 29 Jun 2009) | 2 lines
branches/zip: lock_rec_validate_page(): Add another assertion
to track down Issue #289.
------------------------------------------------------------------------
r5431 | marko | 2009-06-29 09:58:40 -0400 (Mon, 29 Jun 2009) | 1 line
branches/zip: Revert an accidentally made change in r5430 to univ.i.
------------------------------------------------------------------------
r5437 | marko | 2009-06-30 05:10:01 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: ibuf_dummy_index_free(): Beautify the comment.
------------------------------------------------------------------------
r5438 | marko | 2009-06-30 05:10:32 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: fseg_free(): Remove this unused function.
------------------------------------------------------------------------
r5439 | marko | 2009-06-30 05:15:22 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: fseg_validate(): Enclose in #ifdef UNIV_DEBUG.
This function is unused, but it could turn out to be a useful debugging aid.
------------------------------------------------------------------------
r5441 | marko | 2009-06-30 06:30:14 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: ha_delete(): Remove this unused function that was
very similar to ha_search_and_delete_if_found().
------------------------------------------------------------------------
r5442 | marko | 2009-06-30 06:45:41 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: lock_is_on_table(), lock_table_unlock(): Unused, remove.
------------------------------------------------------------------------
r5443 | marko | 2009-06-30 07:03:00 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: os_event_create_auto(): Unused, remove.
------------------------------------------------------------------------
r5444 | marko | 2009-06-30 07:19:49 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: que_graph_try_free(): Unused, remove.
------------------------------------------------------------------------
r5445 | marko | 2009-06-30 07:28:11 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: row_build_row_ref_from_row(): Unused, remove.
------------------------------------------------------------------------
r5446 | marko | 2009-06-30 07:35:45 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: srv_que_round_robin(), srv_que_task_enqueue(): Unused, remove.
------------------------------------------------------------------------
r5447 | marko | 2009-06-30 07:37:58 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: srv_que_task_queue_check(): Unused, remove.
------------------------------------------------------------------------
r5448 | marko | 2009-06-30 07:56:36 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: mem_heap_cat(): Unused, remove.
------------------------------------------------------------------------
r5449 | marko | 2009-06-30 08:00:50 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: innobase_start_or_create_for_mysql():
Invoke os_get_os_version() at most once.
------------------------------------------------------------------------
r5450 | marko | 2009-06-30 08:02:20 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: os_file_close_no_error_handling(): Unused, remove.
------------------------------------------------------------------------
r5451 | marko | 2009-06-30 08:09:49 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: page_set_max_trx_id(): Make the code compile
with UNIV_HOTBACKUP.
------------------------------------------------------------------------
r5452 | marko | 2009-06-30 08:10:26 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: os_file_close_no_error_handling(): Restore,
as this function is used within InnoDB Hot Backup.
------------------------------------------------------------------------
r5453 | marko | 2009-06-30 08:14:01 -0400 (Tue, 30 Jun 2009) | 1 line
branches/zip: os_process_set_priority_boost(): Unused, remove.
------------------------------------------------------------------------
r5454 | marko | 2009-06-30 08:42:52 -0400 (Tue, 30 Jun 2009) | 2 lines
branches/zip: Replace a non-ASCII character
(ISO 8859-1 encoded U+00AD SOFT HYPHEN) with a cheap ASCII substitute.
------------------------------------------------------------------------
r5456 | inaam | 2009-06-30 14:21:09 -0400 (Tue, 30 Jun 2009) | 4 lines
branches/zip
Non-functional change. s/Percona/Percona Inc./
------------------------------------------------------------------------
r5470 | vasil | 2009-07-02 09:12:36 -0400 (Thu, 02 Jul 2009) | 16 lines
branches/zip:
Use PAUSE instruction inside spinloop if it is available.
The patch was originally developed by Mikael Ronstrom <mikael@mysql.com>
and can be found here:
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2768
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2771
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2772
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2774
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2777
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2799
http://bazaar.launchpad.net/%7Emysql/mysql-server/mysql-5.4/revision/2800
Approved by: Heikki (rb://137)
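
A minimal sketch of a spin loop that uses PAUSE, assuming GCC-style inline
assembly on x86. The macro and function names are hypothetical; the actual
patch uses its own configure check and macros.

```c
/* Hypothetical macro: issue PAUSE on x86 with a GCC-compatible
compiler, otherwise do nothing. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
# define EXAMPLE_RELAX_CPU()	__asm__ __volatile__ ("pause")
#else
# define EXAMPLE_RELAX_CPU()	((void) 0)
#endif

/* Spins until *flag becomes nonzero or max_spins iterations have
passed. Returns 1 if the flag was seen set, 0 otherwise. */
static int
example_spin_wait(const volatile int* flag, unsigned long max_spins)
{
	unsigned long	i;

	for (i = 0; i < max_spins; i++) {
		if (*flag) {
			return(1);
		}
		/* Hint to the CPU that this is a spin-wait loop. */
		EXAMPLE_RELAX_CPU();
	}

	return(*flag != 0);
}
```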
------------------------------------------------------------------------
r5481 | vasil | 2009-07-06 13:16:32 -0400 (Mon, 06 Jul 2009) | 4 lines
branches/zip:
Remove unnecessary quotes and simplify plug.in.
------------------------------------------------------------------------
r5482 | calvin | 2009-07-06 18:36:35 -0400 (Mon, 06 Jul 2009) | 5 lines
branches/zip: add COPYING files for Percona and Sun Micro.
1.0.4 contains patches based on contributions from Percona
and Sun Microsystems.
------------------------------------------------------------------------
r5483 | calvin | 2009-07-07 05:36:43 -0400 (Tue, 07 Jul 2009) | 3 lines
branches/zip: add IB_HAVE_PAUSE_INSTRUCTION to CMake.
Windows will support PAUSE instruction by default.
------------------------------------------------------------------------
r5484 | inaam | 2009-07-07 18:57:14 -0400 (Tue, 07 Jul 2009) | 13 lines
branches/zip rb://126
Based on contribution from Google Inc.
This patch introduces a new parameter innodb_io_capacity to control the
rate at which the master thread performs various tasks. The default value
is 200 and higher values imply more aggressive flushing and ibuf merges
from within the master thread.
This patch also changes the ibuf merge from synchronous to asynchronous.
Another minor change is not to force the master thread to wait for a
log flush to complete every second.
Approved by: Heikki
------------------------------------------------------------------------
r5485 | inaam | 2009-07-07 19:00:49 -0400 (Tue, 07 Jul 2009) | 18 lines
branches/zip rb://138
The current implementation is to try to flush the neighbors of every
page that we flush. This patch makes the following distinction:
1) If the flush is from flush_list AND
2) If the flush is intended to move the oldest_modification LSN ahead
(this happens when a user thread sees little space in the log file and
attempts to flush pages from the buffer pool so that a checkpoint can
be made)
THEN
Do not try to flush the neighbors. Just focus on flushing dirty pages at
the end of flush_list
Approved by: Heikki
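
The rule above can be written as a single predicate. This is only a sketch
of the described condition; the enum and function names are hypothetical
and do not appear in the patch.

```c
/* Hypothetical flush types for this example. */
enum example_flush_type {
	EXAMPLE_FLUSH_LRU,
	EXAMPLE_FLUSH_LIST
};

/* Returns 1 if the neighbors of a page should also be flushed.
Neighbors are skipped only when the flush comes from flush_list AND is
intended to move the oldest_modification LSN ahead. */
static int
example_flush_neighbors(
	enum example_flush_type	type,
	int			advancing_oldest_lsn)
{
	return(!(type == EXAMPLE_FLUSH_LIST && advancing_oldest_lsn));
}
```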
------------------------------------------------------------------------
r5486 | inaam | 2009-07-08 12:11:40 -0400 (Wed, 08 Jul 2009) | 29 lines
branches/zip rb://133
This patch introduces a heuristics-based flushing rate for dirty pages to
avoid IO bursts at checkpoint.
1) log_capacity / log_generated per second gives us number of seconds
in which ALL dirty pages need to be flushed. Based on this rough
assumption we can say that
n_dirty_pages / (log_capacity / log_generation_rate) = desired_flush_rate
2) We use weighted averages (hard coded to 20 seconds) of
log_generation_rate to avoid resonance.
3) From the desired_flush_rate we subtract the number of pages that have
been flushed due to LRU flushing. That gives us pages that we should
flush as part of flush_list cleanup. And that is the number (capped by
maximum io_capacity) that we try to flush from the master thread.
Knobs:
======
innodb_adaptive_flushing: boolean, global, dynamic, default TRUE.
Since this heuristic is very experimental and has the potential to
dramatically change the IO pattern I think it is a good idea to leave a
knob to turn it off.
Approved by: Heikki
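
The arithmetic in points 1) and 3) can be sketched in C as follows. The
function and parameter names are hypothetical, the units (bytes and pages
per second) are assumptions, and the averaging of log_generation_rate from
point 2) is left out; the caller is assumed to pass an already averaged
rate.

```c
/* Hypothetical helper: returns the number of pages to flush from
flush_list per second.

desired_flush_rate = n_dirty_pages / (log_capacity / log_rate)

The pages already flushed by LRU flushing are subtracted, and the
result is capped by io_capacity. */
static unsigned long
example_flush_list_target(
	unsigned long	n_dirty_pages,
	unsigned long	log_capacity,	/* bytes */
	unsigned long	log_rate,	/* bytes per second, averaged */
	unsigned long	lru_flushed,	/* pages per second */
	unsigned long	io_capacity)	/* pages per second */
{
	unsigned long	desired;

	if (log_capacity == 0) {
		return(0);
	}

	/* Rearranged as n_dirty_pages * log_rate / log_capacity to stay
	in integer arithmetic; the wider type avoids overflow. */
	desired = (unsigned long)
		(((unsigned long long) n_dirty_pages * log_rate)
		 / log_capacity);

	if (desired <= lru_flushed) {
		return(0);
	}

	desired -= lru_flushed;

	return(desired > io_capacity ? io_capacity : desired);
}
```

For example, 1000 dirty pages, a log capacity of 100000 bytes and a log
rate of 10000 bytes per second give 10 seconds to flush everything, that
is, 100 pages per second. After subtracting 30 pages flushed by LRU the
target is 70 pages per second, provided io_capacity is at least 70.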
------------------------------------------------------------------------
r5487 | calvin | 2009-07-08 12:42:28 -0400 (Wed, 08 Jul 2009) | 7 lines
branches/zip: fix PAUSE instruction patch on Windows
The original PAUSE instruction patch (r5470) does not
compile on Windows. Also, there is an elegant way of
doing it on Windows - YieldProcessor().
Approved by: Heikki (on IM)
------------------------------------------------------------------------
r5489 | vasil | 2009-07-10 05:02:22 -0400 (Fri, 10 Jul 2009) | 9 lines
branches/zip:
Change the defaults for
innodb_sync_spin_loops: 20 -> 30
innodb_spin_wait_delay: 5 -> 6
This change was proposed by Sun/MySQL based on their performance testing,
see https://svn.innodb.com/innobase/Release_tasks_for_InnoDB_Plugin_V1.0.4
------------------------------------------------------------------------
r5490 | vasil | 2009-07-10 05:04:20 -0400 (Fri, 10 Jul 2009) | 4 lines
branches/zip:
Add ChangeLog entry for r5489.
------------------------------------------------------------------------
r5491 | calvin | 2009-07-10 12:19:17 -0400 (Fri, 10 Jul 2009) | 6 lines
branches/zip: add copyright info to files related to PAUSE
instruction patch, contributed by Sun Microsystems.
------------------------------------------------------------------------
r5492 | calvin | 2009-07-10 17:47:34 -0400 (Fri, 10 Jul 2009) | 5 lines
branches/zip: add ChangeLog entries for r5484-r5486.
------------------------------------------------------------------------
r5494 | vasil | 2009-07-13 03:37:35 -0400 (Mon, 13 Jul 2009) | 6 lines
branches/zip:
Restore the original value of innodb_sync_spin_loops at the end. Previously
the test assumed that setting it to 20 would do this, but now the default is
30 and MTR's internal check failed.
------------------------------------------------------------------------
r5495 | inaam | 2009-07-13 11:48:45 -0400 (Mon, 13 Jul 2009) | 5 lines
branches/zip rb://138 (REVERT)
Revert the flush neighbors patch as it shows regression in
the benchmarks run by Michael.
------------------------------------------------------------------------
r5496 | inaam | 2009-07-13 14:04:57 -0400 (Mon, 13 Jul 2009) | 4 lines
branches/zip
Fixed warnings on Windows where ulint != ib_uint64_t
------------------------------------------------------------------------
r5497 | calvin | 2009-07-13 15:01:00 -0400 (Mon, 13 Jul 2009) | 9 lines
branches/zip: fix run-time symbols clash on Solaris.
This patch is from Sergey Vojtovich of Sun Microsystems,
to fix run-time symbols clash on Solaris with older C++
compiler:
- when deciding how to hide symbols, base the decision on the
compiler, not the operating system.
- Sun Studio supports the __hidden declaration specifier for this
purpose.
------------------------------------------------------------------------
r5498 | vasil | 2009-07-14 03:16:18 -0400 (Tue, 14 Jul 2009) | 92 lines
branches/zip: Merge r5341:5497 from branches/5.1, skipping:
c5419 because it is a merge from branches/zip into branches/5.1
c5466 because the source code has been adjusted to match the MySQL
behavior and the innodb-autoinc test does not fail in branches/zip.
If c5466 is merged, innodb-autoinc starts failing, so Sunny suggested
not merging c5466.
and resolving conflicts in c5410, c5440, c5488:
------------------------------------------------------------------------
r5410 | marko | 2009-06-24 22:26:34 +0300 (Wed, 24 Jun 2009) | 2 lines
Changed paths:
M /branches/5.1/include/trx0sys.ic
M /branches/5.1/trx/trx0purge.c
M /branches/5.1/trx/trx0sys.c
M /branches/5.1/trx/trx0undo.c
branches/5.1: Add missing #include "mtr0log.h" to avoid warnings
when compiling with -DUNIV_MUST_NOT_INLINE.
------------------------------------------------------------------------
r5419 | marko | 2009-06-25 16:11:57 +0300 (Thu, 25 Jun 2009) | 18 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb_bug42101-nonzero.result
M /branches/5.1/mysql-test/innodb_bug42101-nonzero.test
M /branches/5.1/mysql-test/innodb_bug42101.result
M /branches/5.1/mysql-test/innodb_bug42101.test
branches/5.1: Merge r5418 from branches/zip:
------------------------------------------------------------------------
r5418 | marko | 2009-06-25 15:55:52 +0300 (Thu, 25 Jun 2009) | 5 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/mysql-test/innodb_bug42101-nonzero.result
M /branches/zip/mysql-test/innodb_bug42101-nonzero.test
M /branches/zip/mysql-test/innodb_bug42101.result
M /branches/zip/mysql-test/innodb_bug42101.test
branches/zip: Fix a race condition caused by
SET GLOBAL innodb_commit_concurrency=DEFAULT. (Bug #45749)
When innodb_commit_concurrency is initially set nonzero,
DEFAULT would change it back to 0, triggering Bug #42101.
rb://139 approved by Heikki Tuuri.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5440 | vasil | 2009-06-30 13:04:29 +0300 (Tue, 30 Jun 2009) | 8 lines
Changed paths:
M /branches/5.1/fil/fil0fil.c
branches/5.1:
Fix Bug#45814 URL reference in InnoDB server errors needs adjusting to match documentation
by changing the URL from
http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html to
http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting-datadict.html
------------------------------------------------------------------------
r5466 | vasil | 2009-07-02 10:46:45 +0300 (Thu, 02 Jul 2009) | 6 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Adjust the failing innodb-autoinc test to conform to the latest behavior
of the MySQL code. The idea and the comment in innodb-autoinc.test come
from Sunny.
------------------------------------------------------------------------
r5488 | vasil | 2009-07-09 19:16:44 +0300 (Thu, 09 Jul 2009) | 13 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
A /branches/5.1/mysql-test/innodb_bug21704.result
A /branches/5.1/mysql-test/innodb_bug21704.test
branches/5.1:
Fix Bug#21704 Renaming column does not update FK definition
by checking whether a column that participates in a FK definition is being
renamed and denying the ALTER in this case.
The patch was originally developed by Davi Arnaut <Davi.Arnaut@Sun.COM>:
http://lists.mysql.com/commits/77714
and was later adjusted to conform to InnoDB coding style by me (Vasil),
I also added some more comments and moved the bug specific mysql-test to
a separate file to make it more manageable and flexible.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5499 | calvin | 2009-07-14 12:55:10 -0400 (Tue, 14 Jul 2009) | 3 lines
branches/zip: add a missing file in Makefile.am
This change was suggested by MySQL.
------------------------------------------------------------------------
r5500 | calvin | 2009-07-14 13:03:26 -0400 (Tue, 14 Jul 2009) | 3 lines
branches/zip: minor change
Remove an extra "with".
------------------------------------------------------------------------
r5501 | vasil | 2009-07-14 13:58:15 -0400 (Tue, 14 Jul 2009) | 5 lines
branches/zip:
Add @ZLIB_INCLUDES@ so that the InnoDB Plugin picks up the same zlib.h
header file that is eventually used by mysqld.
------------------------------------------------------------------------
r5502 | vasil | 2009-07-14 13:59:59 -0400 (Tue, 14 Jul 2009) | 4 lines
branches/zip:
Add include/ut0auxconf.h to noinst_HEADERS
------------------------------------------------------------------------
r5503 | vasil | 2009-07-14 14:16:11 -0400 (Tue, 14 Jul 2009) | 8 lines
branches/zip:
Non-functional change:
put files in noinst_HEADERS and libinnobase_a_SOURCES one per line and sort
alphabetically, so it is easier to find if a file is there or not and
also diffs show exactly the added or removed file instead of surrounding
lines too.
------------------------------------------------------------------------
r5504 | calvin | 2009-07-15 04:58:44 -0400 (Wed, 15 Jul 2009) | 6 lines
branches/zip: fix compile errors on Win64
Both srv_read_ahead_factor and srv_io_capacity should
be defined as ulong.
Approved by: Sunny
------------------------------------------------------------------------
r5508 | calvin | 2009-07-16 09:40:47 -0400 (Thu, 16 Jul 2009) | 16 lines
branches/zip: Support inlining of functions and prefetch with
Sun Studio
Those changes are contributed by Sun/MySQL. Two sets of changes
in this patch when Sun Studio is used:
- Explicit inlining of functions
- Prefetch Support
This patch has been tested by Sunny with the plugin statically
built in. Since we've never built the plugin as a dynamically
loaded module on Solaris, it is a separate task to change
plug.in.
rb://142
Approved by: Heikki
------------------------------------------------------------------------
r5509 | calvin | 2009-07-16 09:45:28 -0400 (Thu, 16 Jul 2009) | 2 lines
branches/zip: add ChangeLog entry for r5508.
------------------------------------------------------------------------
r5512 | sunny | 2009-07-19 19:52:48 -0400 (Sun, 19 Jul 2009) | 2 lines
branches/zip: Remove unused extern ref to timed_mutexes.
------------------------------------------------------------------------
r5513 | sunny | 2009-07-19 19:58:43 -0400 (Sun, 19 Jul 2009) | 2 lines
branches/zip: Undo r5512
------------------------------------------------------------------------
r5514 | sunny | 2009-07-19 20:08:49 -0400 (Sun, 19 Jul 2009) | 2 lines
branches/zip: Only use my_bool when UNIV_HOTBACKUP is not defined.
------------------------------------------------------------------------
r5515 | sunny | 2009-07-20 03:29:14 -0400 (Mon, 20 Jul 2009) | 2 lines
branches/zip: The dict_table_t::autoinc_mutex field is not used in HotBackup.
------------------------------------------------------------------------
r5516 | sunny | 2009-07-20 03:46:05 -0400 (Mon, 20 Jul 2009) | 4 lines
branches/zip: Make this file usable from within HotBackup. A new file has
been introduced called hb_univ.i. This file should have all the HotBackup
specific configuration.
------------------------------------------------------------------------
r5517 | sunny | 2009-07-20 03:55:11 -0400 (Mon, 20 Jul 2009) | 2 lines
Add /* UNIV_HOTBACK */
------------------------------------------------------------------------
r5519 | vasil | 2009-07-20 04:45:18 -0400 (Mon, 20 Jul 2009) | 31 lines
branches/zip: Merge r5497:5518 from branches/5.1:
------------------------------------------------------------------------
r5518 | vasil | 2009-07-20 11:29:47 +0300 (Mon, 20 Jul 2009) | 22 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2874.2.1
committer: Anurag Shekhar <anurag.shekhar@sun.com>
branch nick: mysql-5.1-bugteam-windows-warning
timestamp: Wed 2009-05-13 15:41:24 +0530
message:
Bug #39802 On Windows, 32-bit time_t should be enforced
This patch fixes compilation warning, "conversion from 'time_t' to 'ulong',
possible loss of data".
The fix is to typecast time_t to ulong before assigning it to ulong.
Backported this from 6.0-bugteam tree.
modified:
storage/archive/ha_archive.cc
storage/federated/ha_federated.cc
storage/innobase/handler/ha_innodb.cc
storage/myisam/ha_myisam.cc
------------------------------------------------------------------------
------------------------------------------------------------------------
r5520 | vasil | 2009-07-20 04:51:47 -0400 (Mon, 20 Jul 2009) | 4 lines
branches/zip:
Add ChangeLog entries for r5498 and r5519.
------------------------------------------------------------------------
r5524 | inaam | 2009-07-20 12:23:15 -0400 (Mon, 20 Jul 2009) | 9 lines
branches/zip
Change the read ahead parameter name to innodb_read_ahead_threshold.
Change the meaning of this parameter to signify the number of pages
that must be sequentially accessed for InnoDB to trigger a readahead
request.
Suggested by: Ken
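
A simplified sketch of a threshold-based trigger of this kind. It is not the InnoDB implementation: the function name, the per-page access stamps, and the rule that a run breaks on an untouched or out-of-order page are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical check: access_stamp[i] is 0 if page i of an extent was never
   accessed, otherwise a counter value that grows with each access.
   Returns 1 once `threshold` consecutive pages have been accessed in
   ascending order, 0 otherwise. */
int read_ahead_wanted(const unsigned long *access_stamp, size_t n_pages,
                      unsigned long threshold)
{
    size_t i;
    unsigned long run = 0;  /* length of the current sequential run */
    unsigned long prev = 0; /* stamp of the previous page */

    for (i = 0; i < n_pages; i++) {
        unsigned long stamp = access_stamp[i];

        if (stamp == 0) {
            run = 0;        /* untouched page breaks the run */
        } else if (run > 0 && stamp > prev) {
            run++;          /* accessed after its predecessor */
        } else {
            run = 1;        /* start a new run at this page */
        }

        prev = stamp;

        if (run >= threshold) {
            return 1;
        }
    }

    return 0;
}
```

With this rule, a lower threshold makes read-ahead fire on shorter sequential runs, and a higher one makes it more conservative.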
------------------------------------------------------------------------
MDEV-25506 (3 of 3): Do not delete .ibd files before commit
This is a complete rewrite of DROP TABLE, also as part of other DDL,
such as ALTER TABLE, CREATE TABLE...SELECT, TRUNCATE TABLE.
The background DROP TABLE queue hack is removed.
If a transaction needs to drop and create a table by the same name
(like TRUNCATE TABLE does), it must first rename the table to an
internal #sql-ib name. No committed version of the data dictionary
will include any #sql-ib tables, because whenever a transaction
renames a table to a #sql-ib name, it will also drop that table.
Either the rename will be rolled back, or the drop will be committed.
Data files will be unlinked after the transaction has been committed
and a FILE_RENAME record has been durably written. The file will
actually be deleted when the detached file handle returned by
fil_delete_tablespace() will be closed, after the latches have been
released. It is possible that a purge of the delete of the SYS_INDEXES
record for the clustered index will execute fil_delete_tablespace()
concurrently with the DDL transaction. In that case, the thread that
arrives later will wait for the other thread to finish.
HTON_TRUNCATE_REQUIRES_EXCLUSIVE_USE: A new handler flag.
ha_innobase::truncate() now requires that all other references to
the table be released in advance. This was implemented by Monty.
ha_innobase::delete_table(): If CREATE TABLE..SELECT is detected,
we will "hijack" the current transaction, drop the table in
the current transaction and commit the current transaction.
This essentially fixes MDEV-21602. There is a FIXME comment about
making the check less failure-prone.
ha_innobase::truncate(), ha_innobase::delete_table():
Implement a fast path for temporary tables. We will no longer allow
temporary tables to use the adaptive hash index.
dict_table_t::mdl_name: The original table name for the purpose of
acquiring MDL in purge, to prevent a race condition between a
DDL transaction that is dropping a table, and purge processing
undo log records of DML that had executed before the DDL operation.
For #sql-backup- tables during ALTER TABLE...ALGORITHM=COPY, the
dict_table_t::mdl_name will differ from dict_table_t::name.
dict_table_t::parse_name(): Use mdl_name instead of name.
dict_table_rename_in_cache(): Update mdl_name.
For the internal FTS_ tables of FULLTEXT INDEX, purge would
acquire MDL on the FTS_ table name, but not on the main table,
and therefore it would be able to run concurrently with a
DDL transaction that is dropping the table. Previously, the
DROP TABLE queue hack prevented a race between purge and DDL.
For now, we introduce purge_sys.stop_FTS() to prevent purge from
opening any table, while a DDL transaction that may drop FTS_
tables is in progress. The function fts_lock_table(), which will
be invoked before the dictionary is locked, will wait for
purge to release any table handles.
trx_t::drop_table_statistics(): Drop statistics for the table.
This replaces dict_stats_drop_index(). We will drop or rename
persistent statistics atomically as part of DDL transactions.
On lock conflict for dropping statistics, we will fail instantly
with DB_LOCK_WAIT_TIMEOUT, because we will be holding the
exclusive data dictionary latch.
trx_t::commit_cleanup(): Separated from trx_t::commit_in_memory().
Relax an assertion around fts_commit() and allow DB_LOCK_WAIT_TIMEOUT
in addition to DB_DUPLICATE_KEY. The call to fts_commit() is
entirely misplaced here and may obviously break the consistency
of transactions that affect FULLTEXT INDEX. It needs to be fixed
separately.
dict_table_t::n_foreign_key_checks_running: Remove (MDEV-21175).
The counter was a work-around for missing meta-data locking (MDL)
on the SQL layer, and not really needed in MariaDB.
ER_TABLE_IN_FK_CHECK: Replaced with ER_UNUSED_28.
HA_ERR_TABLE_IN_FK_CHECK: Remove.
row_ins_check_foreign_constraints(): Do not acquire
dict_sys.latch either. The SQL-layer MDL will protect us.
This was reviewed by Thirunarayanan Balathandayuthapani
and tested by Matthias Leich.
branches/innodb+: Merged revisions 5525:5971 from branches/zip
------------------------------------------------------------------------
r5971 | marko | 2009-09-23 09:03:51 -0400 (Wed, 23 Sep 2009) | 2 lines
branches/zip: os_file_pwrite(): Make the code compile in InnoDB Hot Backup
when the pwrite system call is not available.
------------------------------------------------------------------------
r5956 | calvin | 2009-09-22 19:30:10 -0400 (Tue, 22 Sep 2009) | 4 lines
branches/zip: remove handler0vars.h from Makefile.am
Left over from r5950.
------------------------------------------------------------------------
r5951 | calvin | 2009-09-22 11:17:01 -0400 (Tue, 22 Sep 2009) | 4 lines
branches/zip: adjust CMake file to work with old versions of MySQL
Tested with MySQL 5.1.38 and 5.1.30.
------------------------------------------------------------------------
r5950 | calvin | 2009-09-22 02:42:46 -0400 (Tue, 22 Sep 2009) | 17 lines
branches/zip: adjust Windows loading method for 5.1.38
Starting at 5.1.38, MySQL server exports symbols needed
for dynamic plugin on Windows. There is no need for
Windows specific loading. Also, the CMake files are
simplified in 5.1.38.
When WITH_INNOBASE_STORAGE_ENGINE is specified during
configuration (win\configure.js), InnoDB is built as
a static library. Otherwise, a dynamic InnoDB will be
built (ha_innodb.dll).
CMakeLists.txt requires minor changes in order to work
with MySQL prior to 5.1.38. The changes will be in a
separate patch.
This patch addresses Mantis issue#286.
------------------------------------------------------------------------
r5945 | calvin | 2009-09-21 10:53:22 -0400 (Mon, 21 Sep 2009) | 4 lines
branches/zip: fix a typo in r5935
Should be innodb_open_files, spotted by Michael.
------------------------------------------------------------------------
r5940 | vasil | 2009-09-21 01:26:04 -0400 (Mon, 21 Sep 2009) | 4 lines
branches/zip:
Add ChangeLog entries for c5938.
------------------------------------------------------------------------
r5938 | calvin | 2009-09-19 03:14:25 -0400 (Sat, 19 Sep 2009) | 41 lines
branches/zip: Merge revisions 2584:2956 from branches/6.0,
except c2932.
Bug#37232 and bug#31183 were fixed in the 6.0 branch only.
They should be fixed in the plugin too, especially since MySQL 6.0
is discontinued at this point.
------------------------------------------------------------------------
r2604 | inaam | 2008-08-21 09:37:06 -0500 (Thu, 21 Aug 2008) | 8 lines
branches/6.0 bug#37232
Relax locking behaviour for REPLACE INTO t SELECT ... FROM t1.
Now SELECT on t1 is performed as a consistent read when the isolation
level is set to READ COMMITTED.
Reviewed by: Heikki
------------------------------------------------------------------------
r2605 | inaam | 2008-08-21 09:59:33 -0500 (Thu, 21 Aug 2008) | 7 lines
branches/6.0
Added a comment to clarify why distinct calls to read MySQL binary
log file name and log position do not entail any race condition.
Suggested by: Heikki
------------------------------------------------------------------------
r2956 | inaam | 2008-11-04 04:47:30 -0600 (Tue, 04 Nov 2008) | 11 lines
branches/6.0 bug#31183
If the system tablespace runs out of space because 'autoextend' is
not specified with innodb_data_file_path there was no error message
printed to the error log. The client would get 'table full' error.
This patch prints an appropriate error message to the error log.
rb://43
Approved by: Marko
------------------------------------------------------------------------
------------------------------------------------------------------------
r5935 | calvin | 2009-09-18 17:08:02 -0400 (Fri, 18 Sep 2009) | 6 lines
branches/zip: fix bug#44338; minor non-functional changes
Bug#44338 innodb has message about non-existing option
innodb_max_files_open. Change the option to innodb_open_files.
The fix was committed into 6.0 branch.
------------------------------------------------------------------------
r5934 | vasil | 2009-09-18 13:06:46 -0400 (Fri, 18 Sep 2009) | 4 lines
branches/zip:
Fix typo.
------------------------------------------------------------------------
r5924 | vasil | 2009-09-18 00:59:30 -0400 (Fri, 18 Sep 2009) | 4 lines
branches/zip:
White space and formatting cleanup in the ChangeLog
------------------------------------------------------------------------
r5922 | marko | 2009-09-17 02:32:08 -0400 (Thu, 17 Sep 2009) | 4 lines
branches/zip: innodb-zip.test: Make the test work with zlib 1.2.3.3.
Apparently, the definition of compressBound() has slightly changed.
This has been filed as Mantis Issue #345.
------------------------------------------------------------------------
r5920 | vasil | 2009-09-16 14:47:22 -0400 (Wed, 16 Sep 2009) | 4 lines
branches/zip:
Add ChangeLog entries for r5916.
------------------------------------------------------------------------
r5919 | vasil | 2009-09-16 14:37:13 -0400 (Wed, 16 Sep 2009) | 4 lines
branches/zip:
Whitespace cleanup in the ChangeLog.
------------------------------------------------------------------------
r5917 | marko | 2009-09-16 05:56:23 -0400 (Wed, 16 Sep 2009) | 1 line
branches/zip: innobase_get_cset_width(): Cache the value of current_thd.
------------------------------------------------------------------------
r5916 | marko | 2009-09-16 05:54:43 -0400 (Wed, 16 Sep 2009) | 128 lines
branches/zip: Merge revisions 5622:5912 from branches/5.1, except r5700
(changes to CMakeLists.txt)
------------------------------------------------------------------------
r5622 | vasil | 2009-08-03 15:27:00 +0300 (Mon, 03 Aug 2009) | 20 lines
Changed paths:
M /branches/5.1/Makefile.am
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2988
committer: Satya B <satya.bn@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Wed 2009-07-01 11:06:05 +0530
message:
Fix build failure after applying Innodb snapshot 5.1-ss5282
After applying Innodb snapshot 5.1-ss5282, build was broken
because of missing header file.
Adding the header file to Makefile.am after informing the
innodb developers.
modified:
storage/innobase/Makefile.am
------------------------------------------------------------------------
r5740 | jyang | 2009-09-03 06:33:47 +0300 (Thu, 03 Sep 2009) | 5 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/include/db0err.h
A /branches/5.1/mysql-test/innodb_bug46000.result
A /branches/5.1/mysql-test/innodb_bug46000.test
branches/5.1: Disallow creating index with the name of
"GEN_CLUST_INDEX" which is reserved for the default system
primary index. (Bug #46000) rb://149 approved by Sunny Bains.
------------------------------------------------------------------------
r5741 | jyang | 2009-09-03 07:16:01 +0300 (Thu, 03 Sep 2009) | 5 lines
Changed paths:
M /branches/5.1/dict/dict0dict.c
M /branches/5.1/handler/ha_innodb.cc
A /branches/5.1/mysql-test/innodb_bug44369.result
A /branches/5.1/mysql-test/innodb_bug44369.test
M /branches/5.1/row/row0mysql.c
branches/5.1: Block creating table with column name conflicting
with Innodb reserved key words. (Bug #44369) rb://151 approved
by Sunny Bains.
------------------------------------------------------------------------
r5757 | jyang | 2009-09-04 04:26:13 +0300 (Fri, 04 Sep 2009) | 3 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/include/db0err.h
D /branches/5.1/mysql-test/innodb_bug46000.result
D /branches/5.1/mysql-test/innodb_bug46000.test
branches/5.1: Revert change in 5740. Making the fix in a subsequent
check in.
------------------------------------------------------------------------
r5760 | jyang | 2009-09-04 07:07:34 +0300 (Fri, 04 Sep 2009) | 3 lines
Changed paths:
M /branches/5.1/dict/dict0dict.c
M /branches/5.1/handler/ha_innodb.cc
D /branches/5.1/mysql-test/innodb_bug44369.result
D /branches/5.1/mysql-test/innodb_bug44369.test
M /branches/5.1/row/row0mysql.c
branches/5.1: This is to revert change 5741. A return status for
create_table_def() needs to be fixed.
------------------------------------------------------------------------
r5797 | calvin | 2009-09-09 18:26:29 +0300 (Wed, 09 Sep 2009) | 3 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: merge change from 5.1.38
HA_ERR_TOO_MANY_CONCURRENT_TRXS is added in 5.1.38.
------------------------------------------------------------------------
r5799 | calvin | 2009-09-09 20:47:31 +0300 (Wed, 09 Sep 2009) | 10 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: fix bug#46256
Allow tables to be dropped even if the collation is not found,
but issue a warning.
Could not find an easy way to add mysql-test since it requires
changes to charsets and restarting the server. Tests were
executed manually.
Approved by: Heikki (on IM)
------------------------------------------------------------------------
r5805 | vasil | 2009-09-10 08:41:48 +0300 (Thu, 10 Sep 2009) | 7 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Fix a compilation warning caused by c5799:
handler/ha_innodb.cc: In function 'void innobase_get_cset_width(ulint, ulint*, ulint*)':
handler/ha_innodb.cc:830: warning: format '%d' expects type 'int', but argument 2 has type 'ulint'
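
One common way to silence a warning of this kind is to cast the value to unsigned long and print it with %lu. The sketch below assumes ulint is unsigned long (in InnoDB it is a platform-dependent typedef); the function name and the message text are hypothetical and are not taken from r5805.

```c
#include <stdio.h>

typedef unsigned long ulint; /* assumption: stand-in for InnoDB's ulint */

/* Hypothetical formatter: writes a diagnostic for a charset-collation id.
   The explicit cast keeps the argument type in step with the %lu format
   even if ulint is defined differently on some platform. */
int format_cset_msg(char *buf, size_t len, ulint cset)
{
    return snprintf(buf, len, "unknown collation #%lu",
                    (unsigned long) cset);
}
```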
------------------------------------------------------------------------
r5834 | jyang | 2009-09-11 00:43:05 +0300 (Fri, 11 Sep 2009) | 5 lines
Changed paths:
M /branches/5.1/dict/dict0dict.c
M /branches/5.1/handler/ha_innodb.cc
A /branches/5.1/mysql-test/innodb_bug44369.result
A /branches/5.1/mysql-test/innodb_bug44369.test
M /branches/5.1/row/row0mysql.c
branches/5.1: Block creating table with column name conflicting
with Innodb reserved key words. (Bug #44369) rb://151 approved
by Sunny Bains.
------------------------------------------------------------------------
r5895 | jyang | 2009-09-15 03:39:21 +0300 (Tue, 15 Sep 2009) | 5 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
A /branches/5.1/mysql-test/innodb_bug46000.result
A /branches/5.1/mysql-test/innodb_bug46000.test
branches/5.1: Disallow creating index with the name of
"GEN_CLUST_INDEX" which is reserved for the default system
primary index. (Bug #46000) rb://149 approved by Marko Makela.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5910 | marko | 2009-09-16 04:07:21 -0400 (Wed, 16 Sep 2009) | 9 lines
branches/zip: Introduce UNIV_LOG_LSN_DEBUG and MLOG_LSN for redo log
diagnostics. This was written in order to better track down
Issue #313 in InnoDB Hot Backup.
MLOG_LSN: A new redo log entry type, for recording the current log
sequence number (LSN). This will be checked in an assertion in
recv_parse_log_rec().
rb://161, discussed with Sunny and Vasil.
------------------------------------------------------------------------
r5899 | marko | 2009-09-15 07:26:01 -0400 (Tue, 15 Sep 2009) | 4 lines
branches/zip: ut0ut.h: Do not #include "os0sync.h" #ifdef UNIV_HOTBACKUP.
Since r5872, the InnoDB Hot Backup build was broken.
Fix it by not defining any thread synchronization primitives in ut0ut.h.
InnoDB Hot Backup is a single-threaded program.
------------------------------------------------------------------------
r5898 | marko | 2009-09-15 06:18:50 -0400 (Tue, 15 Sep 2009) | 2 lines
branches/zip: Add */.dirstamp to svn:ignore,
for https://svn.innodb.com/svn/hotbackup/branches/3.5
------------------------------------------------------------------------
r5897 | marko | 2009-09-15 04:29:00 -0400 (Tue, 15 Sep 2009) | 8 lines
branches/zip: Avoid bogus messages about latching order violations when
UNIV_SYNC_DEBUG is defined.
sync_thread_levels_g(): Add the parameter "warn". Do not print
anything unless it is set.
sync_thread_add_level(): Pass warn=TRUE to sync_thread_levels_g()
when the check is within an assertion; FALSE if it is not.
------------------------------------------------------------------------
r5893 | inaam | 2009-09-14 11:20:48 -0400 (Mon, 14 Sep 2009) | 10 lines
branches/zip rb://159
In case of pages that are not made young the counter is incremented
only when the page in question is 'old'. In case of pages that are
made young the counter is incremented in case of all pages. For an
apples-to-apples comparison this patch changes the 'young-making' counter to
consider only 'old' blocks.
Approved by: Marko
------------------------------------------------------------------------
r5889 | vasil | 2009-09-14 05:17:18 -0400 (Mon, 14 Sep 2009) | 5 lines
branches/zip:
Add missing return statement in the test program that could have
caused a warning.
------------------------------------------------------------------------
r5888 | vasil | 2009-09-14 04:38:45 -0400 (Mon, 14 Sep 2009) | 40 lines
branches/zip:
Back-merge c5880 and c5881 from branches/embedded-1.0:
------------------------------------------------------------------------
r5880 | vasil | 2009-09-12 17:28:44 +0300 (Sat, 12 Sep 2009) | 18 lines
Changed paths:
M /branches/embedded-1.0/configure.in
M /branches/embedded-1.0/include/os0sync.h
M /branches/embedded-1.0/srv/srv0start.c
branches/embedded-1.0:
Clean up and simplify the code that surrounds the atomic ops:
* Simplify the code that prints which atomics are used:
Instead of repeating the conditions under which each atomics variant is used,
use just one printf that prints a variable defined by the code which
chooses which atomics to use.
* In os0sync.h pick up each atomic variant only if it has been selected
by autoconf (based on IB_ATOMIC_MODE_* macros). Define the startup message
to be printed.
* In configure.in: check what user has chosen and if he has chosen
something that is not available, emit an error. If nothing has been chosen
explicitly by the user, auto select an option according to the described
logic in configure.in.
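
A rough sketch of the "one printf" idea: the code that selects an atomics variant also defines the text to print, so the printing code needs no conditions of its own. ATOMICS_STARTUP_MSG and both function names are hypothetical, and the selection conditions are simplified assumptions rather than the IB_ATOMIC_MODE_* logic in os0sync.h.

```c
#include <stdio.h>

/* The branch that picks the atomics variant also defines the message. */
#if defined(HAVE_IB_SOLARIS_ATOMICS)
# define ATOMICS_STARTUP_MSG "Solaris libc atomic functions"
#elif defined(__GNUC__)
# define ATOMICS_STARTUP_MSG "GCC atomic builtins"
#elif defined(_WIN32)
# define ATOMICS_STARTUP_MSG "Windows interlocked functions"
#else
# define ATOMICS_STARTUP_MSG "no atomics (mutex-based fallback)"
#endif

/* Returns the text chosen above. */
const char *atomics_startup_msg(void)
{
    return ATOMICS_STARTUP_MSG;
}

/* A single print statement, with no repeated #if conditions. */
void print_atomics_choice(FILE *out)
{
    fprintf(out, "atomics in use: %s\n", atomics_startup_msg());
}
```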
------------------------------------------------------------------------
r5881 | vasil | 2009-09-12 20:08:27 +0300 (Sat, 12 Sep 2009) | 4 lines
Changed paths:
M /branches/embedded-1.0/configure.in
branches/embedded-1.0:
Fix syntax error in test program.
------------------------------------------------------------------------
------------------------------------------------------------------------
r5875 | vasil | 2009-09-12 08:11:25 -0400 (Sat, 12 Sep 2009) | 4 lines
branches/zip:
Remove unnecessary macro.
------------------------------------------------------------------------
r5872 | vasil | 2009-09-12 05:35:17 -0400 (Sat, 12 Sep 2009) | 5 lines
branches/zip:
Explicitly include os0sync.h to the places where HAVE_ATOMIC_BUILTINS and
INNODB_RW_LOCKS_USE_ATOMICS are used to avoid potential problems.
------------------------------------------------------------------------
r5871 | vasil | 2009-09-12 05:25:44 -0400 (Sat, 12 Sep 2009) | 6 lines
branches/zip:
Rename HAVE_SOLARIS_ATOMICS to HAVE_IB_SOLARIS_ATOMICS and
IB_HAVE_PAUSE_INSTRUCTION to HAVE_IB_PAUSE_INSTRUCTION so they
all follow the same HAVE_IB_* convention.
------------------------------------------------------------------------
r5870 | vasil | 2009-09-12 05:13:44 -0400 (Sat, 12 Sep 2009) | 7 lines
branches/zip:
Define HAVE_ATOMIC_BUILTINS and INNODB_RW_LOCKS_USE_ATOMICS in os0sync.h
instead of in univ.i. The code expects os_*() macros to be present if
HAVE_ATOMIC_BUILTINS and INNODB_RW_LOCKS_USE_ATOMICS are defined. So define
them next to defining the os_*() macros.
------------------------------------------------------------------------
r5869 | vasil | 2009-09-12 04:33:11 -0400 (Sat, 12 Sep 2009) | 15 lines
branches/zip:
Include ut0auxconf.h only if none of the macros it would define is defined.
The check when to include this header was outdated from the time when there
was only one macro involved.
Move the atomics checks that are in univ.i outside of
#if windows ... #else ... #endif
This simplifies the code and removes some duplicates like defining
HAVE_ATOMIC_BUILTINS if HAVE_WINDOWS_ATOMICS is defined in both branches.
Do not define the same macro HAVE_ATOMIC_PTHREAD_T for different events.
Instead define HAVE_IB_ATOMIC_PTHREAD_T_GCC and
HAVE_IB_ATOMIC_PTHREAD_T_SOLARIS.
------------------------------------------------------------------------
r5868 | vasil | 2009-09-12 04:01:17 -0400 (Sat, 12 Sep 2009) | 6 lines
branches/zip:
Move the check whether to include ut0auxconf.h before everything else, because
we are now also checking for GCC atomics. Previously we relied on MySQL to
define this macro.
------------------------------------------------------------------------
r5867 | vasil | 2009-09-12 03:43:45 -0400 (Sat, 12 Sep 2009) | 4 lines
branches/zip:
Update comment to reflect reality.
------------------------------------------------------------------------
r5866 | vasil | 2009-09-12 03:30:08 -0400 (Sat, 12 Sep 2009) | 5 lines
branches/zip:
Add the check for GCC atomics to ut0auxconf* (copied from plug.in) because
we no longer rely on MySQL's HAVE_GCC_ATOMIC_BUILTINS.
------------------------------------------------------------------------
r5865 | vasil | 2009-09-12 03:26:03 -0400 (Sat, 12 Sep 2009) | 10 lines
branches/zip:
Simplify the compile time checks by splitting them into 5 independent checks:
* Whether GCC atomics are available
* Whether pthread_t can be used by GCC atomics
* Whether Solaris libc atomics are available
* Whether pthread_t can be used by Solaris libc atomics
* Checking the size of pthread_t
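
A sketch of what two of these probes might look like as a small test program, assuming a GCC-compatible compiler. The function names are hypothetical; __sync_bool_compare_and_swap() is the GCC builtin, and the Solaris variants are not shown. If pthread_t cannot be passed to the builtin, the probe fails to compile, which is what a configure check would detect.

```c
#include <pthread.h>
#include <string.h>

/* Probe: can pthread_t be passed to the GCC atomic builtins?
   memset() is used because an integer cannot portably be assigned
   to a pthread_t. Returns 1 if the compare-and-swap succeeded. */
int pthread_t_cas_check(void)
{
    pthread_t x1;
    pthread_t x2;
    pthread_t x3;

    memset(&x1, 0, sizeof(x1));
    memset(&x2, 0, sizeof(x2));
    memset(&x3, 0, sizeof(x3));

    /* x1 equals x2 here, so the swap is expected to take place. */
    return __sync_bool_compare_and_swap(&x1, x2, x3) ? 1 : 0;
}

/* Probe: report the size of pthread_t in bytes. */
size_t pthread_t_size(void)
{
    return sizeof(pthread_t);
}
```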
------------------------------------------------------------------------
r5864 | vasil | 2009-09-12 03:22:55 -0400 (Sat, 12 Sep 2009) | 4 lines
branches/zip:
Include string.h which is needed for memset().
------------------------------------------------------------------------
r5863 | vasil | 2009-09-12 03:07:08 -0400 (Sat, 12 Sep 2009) | 10 lines
branches/zip:
Check that pthread_t can indeed be passed to Solaris atomic functions, instead
of assuming that it can be passed if 0 can be assigned to it. It could be that:
* 0 can be assigned, but pthread_t cannot be passed and
* 0 cannot be assigned but pthread_t can be passed
It is better to check what we are actually interested in than to check
something else and make assumptions.
------------------------------------------------------------------------
r5858 | vasil | 2009-09-11 13:46:47 -0400 (Fri, 11 Sep 2009) | 4 lines
branches/zip:
Fix the indentation of the closing bracket.
------------------------------------------------------------------------
r5826 | marko | 2009-09-10 07:29:46 -0400 (Thu, 10 Sep 2009) | 12 lines
branches/zip: Roll back recovered dictionary transactions before
dropping incomplete indexes (Issue #337).
trx_rollback_or_clean_recovered(ibool all): New function, split from
trx_rollback_or_clean_all_recovered(). all==FALSE will only roll back
dictionary transactions.
recv_recovery_from_checkpoint_finish(): Call
trx_rollback_or_clean_recovered(FALSE) before
row_merge_drop_temp_indexes().
rb://158 approved by Sunny Bains
------------------------------------------------------------------------
r5825 | marko | 2009-09-10 06:47:09 -0400 (Thu, 10 Sep 2009) | 20 lines
branches/zip: Reduce mutex contention that was introduced when
addressing Bug #45015 (Issue #316), in r5703.
buf_page_set_accessed_make_young(): New auxiliary function, called by
buf_page_get_zip(), buf_page_get_gen(),
buf_page_optimistic_get_func(). Call ut_time_ms() outside of
buf_pool_mutex. Use cached access_time.
buf_page_set_accessed(): Add the parameter time_ms, so that
ut_time_ms() need not be called while holding buf_pool_mutex.
buf_page_optimistic_get_func(), buf_page_get_known_nowait(): Read
buf_page_t::access_time without holding buf_pool_mutex. This should be
OK, because the field is only used for heuristic purposes.
buf_page_peek_if_too_old(): If buf_pool->freed_page_clock == 0, return
FALSE, so that we will not waste time moving blocks in the LRU list in
the warm-up phase or when the workload fits in the buffer pool.
rb://156 approved by Sunny Bains
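The pattern described above (take the timestamp first, then enter the
critical section) can be sketched as follows. All names are hypothetical
stand-ins, not the InnoDB definitions, and a plain pthread mutex is assumed
in place of buf_pool_mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-ins for buf_page_t and buf_pool_mutex. */
typedef struct {
    uint64_t access_time;   /* 0 = never accessed */
} demo_page_t;

static pthread_mutex_t demo_pool_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Current time in milliseconds; called without holding the mutex. */
static uint64_t
demo_time_ms(void)
{
    struct timespec ts;

    timespec_get(&ts, TIME_UTC);
    return((uint64_t) ts.tv_sec * 1000 + (uint64_t) ts.tv_nsec / 1000000);
}

/* Record the first access. The caller supplies time_ms, so the critical
   section contains only a compare and a store. */
void
demo_page_set_accessed(demo_page_t* page, uint64_t time_ms)
{
    pthread_mutex_lock(&demo_pool_mutex);
    if (page->access_time == 0) {
        page->access_time = time_ms;
    }
    pthread_mutex_unlock(&demo_pool_mutex);
}

/* Unlocked read of access_time: acceptable only because the field is
   used as a heuristic, as the log entry above argues. */
void
demo_page_touch(demo_page_t* page)
{
    if (page->access_time == 0) {
        demo_page_set_accessed(page, demo_time_ms());
    }
}
```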
------------------------------------------------------------------------
r5822 | marko | 2009-09-10 06:10:20 -0400 (Thu, 10 Sep 2009) | 1 line
branches/zip: buf_page_release(): De-stutter the function comment.
------------------------------------------------------------------------
r5804 | marko | 2009-09-10 01:29:31 -0400 (Thu, 10 Sep 2009) | 1 line
branches/zip: trx_cleanup_at_db_startup(): Fix a typo in comment.
------------------------------------------------------------------------
r5798 | calvin | 2009-09-09 11:28:10 -0400 (Wed, 09 Sep 2009) | 5 lines
branches/zip:
HA_ERR_TOO_MANY_CONCURRENT_TRXS is added in 5.1.38.
But the plugin should still work with previous versions
of MySQL.
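One way to stay compatible with older headers is a preprocessor guard, as
in this sketch. The fallback value and the function name are hypothetical;
the log does not say which error code is used when the macro is missing.

```c
#include <assert.h>

/* Hypothetical fallback code, used only for this illustration. */
#define DEMO_FALLBACK_ERROR (-1)

/* Return the handler error for "too many concurrent transactions", or a
   fallback when the MySQL headers in use do not define it yet. */
int
demo_too_many_trx_error(void)
{
#ifdef HA_ERR_TOO_MANY_CONCURRENT_TRXS
    return(HA_ERR_TOO_MANY_CONCURRENT_TRXS);
#else
    return(DEMO_FALLBACK_ERROR);
#endif
}
```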
------------------------------------------------------------------------
r5792 | vasil | 2009-09-09 09:35:58 -0400 (Wed, 09 Sep 2009) | 32 lines
branches/zip:
Fix a bug in manipulating the variable innodb_old_blocks_pct:
for any value assigned it got that value -1, except for 75. When
assigned 75, it got 75.
mysql> set global innodb_old_blocks_pct=15;
Query OK, 0 rows affected (0.00 sec)
mysql> show variables like 'innodb_old_blocks_pct';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| innodb_old_blocks_pct | 14    |
+-----------------------+-------+
1 row in set (0.00 sec)
mysql> set global innodb_old_blocks_pct=75;
Query OK, 0 rows affected (0.00 sec)
mysql> show variables like 'innodb_old_blocks_pct';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| innodb_old_blocks_pct | 75    |
+-----------------------+-------+
After the fix it gets exactly what was assigned.
Approved by: Marko (via IM)
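The symptom above is what a truncating round trip through a fixed-point
ratio would produce. The sketch below assumes a denominator of 1024, which
this log entry does not state, and uses hypothetical function names; it
illustrates the arithmetic and is not the actual patch.

```c
#include <assert.h>

/* Assumed fixed-point denominator; not stated in the log above. */
#define DEMO_RATIO_DIV 1024

/* Percentage to internal ratio (integer division truncates). */
unsigned
demo_pct_to_ratio(unsigned pct)
{
    return(pct * DEMO_RATIO_DIV / 100);
}

/* Truncating conversion back: loses 1 whenever pct * 1024 is not a
   multiple of 100. */
unsigned
demo_ratio_to_pct_truncating(unsigned ratio)
{
    return(ratio * 100 / DEMO_RATIO_DIV);
}

/* Rounding conversion back: adds half the divisor before dividing. */
unsigned
demo_ratio_to_pct_rounding(unsigned ratio)
{
    return((ratio * 100 + DEMO_RATIO_DIV / 2) / DEMO_RATIO_DIV);
}
```

Under these assumptions 15 is stored as 153 and reads back as 14 with
truncation but as 15 with rounding, while 75 is stored as 768 and reads
back as 75 either way.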
------------------------------------------------------------------------
r5783 | marko | 2009-09-09 03:25:00 -0400 (Wed, 09 Sep 2009) | 1 line
branches/zip: buf_page_is_accessed(): Correct the function comment.
------------------------------------------------------------------------
r5782 | marko | 2009-09-09 03:00:59 -0400 (Wed, 09 Sep 2009) | 2 lines
branches/zip: buf_page_peek_if_too_old(): Silence a compiler warning
that was introduced in r5779 on 32-bit systems.
------------------------------------------------------------------------
r5780 | marko | 2009-09-09 02:50:50 -0400 (Wed, 09 Sep 2009) | 1 line
branches/zip: ut_time_ms(): Return ulint, not uint.
------------------------------------------------------------------------
r5779 | marko | 2009-09-09 02:17:19 -0400 (Wed, 09 Sep 2009) | 2 lines
branches/zip: buf_page_peek_if_too_old(): Make the bitmasking work when
buf_pool->freed_page_clock is wider than 32 bits.
------------------------------------------------------------------------
r5777 | marko | 2009-09-08 11:50:25 -0400 (Tue, 08 Sep 2009) | 2 lines
branches/zip: Remove BUF_LRU_INITIAL_RATIO, which should have been removed
together with buf_LRU_get_recent_limit().
------------------------------------------------------------------------
r5775 | calvin | 2009-09-07 17:15:05 -0400 (Mon, 07 Sep 2009) | 13 lines
branches/zip: Build InnoDB on Windows with UNIV_HOTBACKUP
The changes are non-functional changes for normal InnoDB,
but needed for building the Hot Backup on Windows (with
UNIV_HOTBACKUP defined).
- Define os_aio_use_native_aio for HB.
- Do not acquire seek mutexes for backup since HB is single threaded.
- Do not use srv_flush_log_at_trx_commit for HB build
rb://155
Approved by: Marko
------------------------------------------------------------------------
r5752 | marko | 2009-09-03 10:55:51 -0400 (Thu, 03 Sep 2009) | 10 lines
branches/zip: recv_recover_page_func(): Write the log sequence number
to the compressed page, if there is one. Previously, the function only
wrote the LSN to the uncompressed page.
It is not clear why recv_recover_page_func() is updating FIL_PAGE_LSN
in the buffer pool. The log sequence number will be stamped on the
page when it is flushed to disk, in buf_flush_init_for_writing().
I noticed this inconsistency when analyzing Issue #313, but this patch
does not fix it. That is no surprise, since FIL_PAGE_LSN should only
matter on disk files, not in the buffer pool.
------------------------------------------------------------------------
r5751 | marko | 2009-09-03 10:36:15 -0400 (Thu, 03 Sep 2009) | 7 lines
branches/zip: row_merge(): Remove a bogus debug assertion
that was triggered when creating an index on an empty table.
row_merge_sort(): Add debug assertions and comments that justify
the loop termination condition.
The bogus assertion ut_ad(ihalf > 0) was reported by Michael.
------------------------------------------------------------------------
r5748 | marko | 2009-09-03 07:05:44 -0400 (Thu, 03 Sep 2009) | 1 line
branches/zip: MLOG_MULTI_REC_END: Correct the comment.
------------------------------------------------------------------------
r5747 | marko | 2009-09-03 06:46:38 -0400 (Thu, 03 Sep 2009) | 2 lines
branches/zip: recv_scan_log_recs(): Replace while with do...while,
because the termination condition will always hold on the first iteration.
------------------------------------------------------------------------
r5746 | marko | 2009-09-03 04:55:36 -0400 (Thu, 03 Sep 2009) | 2 lines
branches/zip: log_reserve_and_write_fast(): Do not cache the log_sys pointer
in a local variable.
------------------------------------------------------------------------
r5745 | marko | 2009-09-03 04:38:22 -0400 (Thu, 03 Sep 2009) | 2 lines
branches/zip: log_check_log_recs(): Enclose in #ifdef UNIV_LOG_DEBUG.
Add const qualifiers.
------------------------------------------------------------------------
r5744 | marko | 2009-09-03 04:28:35 -0400 (Thu, 03 Sep 2009) | 1 line
branches/zip: ut_align(): Make ptr const, like in ut_align_down().
------------------------------------------------------------------------
r5743 | marko | 2009-09-03 02:36:12 -0400 (Thu, 03 Sep 2009) | 3 lines
branches/zip: log_reserve_and_write_fast(): Remove the redundant
output parameter "success".
Success is also indicated by a nonzero return value.
------------------------------------------------------------------------
r5736 | marko | 2009-09-02 03:53:19 -0400 (Wed, 02 Sep 2009) | 1 line
branches/zip: Enclose some timestamp functions in #ifndef UNIV_HOTBACKUP.
------------------------------------------------------------------------
r5735 | marko | 2009-09-02 03:43:09 -0400 (Wed, 02 Sep 2009) | 2 lines
branches/zip: univ.i: Do not undefine PACKAGE or VERSION.
InnoDB source code does not refer to these macros.
------------------------------------------------------------------------
r5734 | sunny | 2009-09-02 03:08:45 -0400 (Wed, 02 Sep 2009) | 2 lines
branches/zip: Update ChangeLog with r5733 changes.
------------------------------------------------------------------------
r5733 | sunny | 2009-09-02 03:05:15 -0400 (Wed, 02 Sep 2009) | 6 lines
branches/zip: Fix a regression introduced by the fix for bug#26316. We check
whether a transaction holds any AUTOINC locks before we acquire the kernel
mutex and release those locks.
Fix for rb://153. Approved by Marko.
------------------------------------------------------------------------
r5716 | vasil | 2009-08-31 03:47:49 -0400 (Mon, 31 Aug 2009) | 9 lines
branches/zip:
Fix Bug#46718 InnoDB plugin incompatible with gcc 4.1 (at least: on PPC): "Undefined symbol"
by implementing our own check in plug.in instead of using the result from
the check from MySQL because it is insufficient.
Approved by: Marko (rb://154)
------------------------------------------------------------------------
r5714 | marko | 2009-08-31 02:10:10 -0400 (Mon, 31 Aug 2009) | 5 lines
branches/zip: buf_chunk_not_freed(): Do not acquire block->mutex unless
block->page.state == BUF_BLOCK_FILE_PAGE. Check that block->page.state
makes sense.
Approved by Sunny Bains over the IM.
------------------------------------------------------------------------
r5709 | inaam | 2009-08-28 02:22:46 -0400 (Fri, 28 Aug 2009) | 5 lines
branches/zip rb://152
Disable display of deprecated parameter innodb_file_io_threads in
'show variables'.
------------------------------------------------------------------------
r5708 | inaam | 2009-08-27 18:43:32 -0400 (Thu, 27 Aug 2009) | 4 lines
branches/zip
Remove redundant TRUE : FALSE from the return statement
------------------------------------------------------------------------
r5707 | inaam | 2009-08-27 12:20:35 -0400 (Thu, 27 Aug 2009) | 6 lines
branches/zip
Remove unused macros as we erased the random readahead code in r5703.
Also fixed some comments.
------------------------------------------------------------------------
r5706 | inaam | 2009-08-27 12:00:27 -0400 (Thu, 27 Aug 2009) | 20 lines
branches/zip rb://147
Done away with the following two status variables:
innodb_buffer_pool_read_ahead_rnd
innodb_buffer_pool_read_ahead_seq
Introduced two new status variables:
innodb_buffer_pool_read_ahead = number of pages read as part of
readahead since server startup
innodb_buffer_pool_read_ahead_evicted = number of pages that are read
in as readahead but were evicted before ever being accessed since
server startup i.e.: a measure of how badly our readahead is
performing
SHOW INNODB STATUS will show two extra numbers in buffer pool section:
pages read ahead/sec and pages evicted without access/sec
Approved by: Marko
------------------------------------------------------------------------
r5705 | marko | 2009-08-27 07:56:24 -0400 (Thu, 27 Aug 2009) | 11 lines
branches/zip: dict_index_find_cols(): On column name lookup failure,
return DB_CORRUPTION (HA_ERR_CRASHED) instead of abnormally
terminating the server. Also, disable the previously added diagnostic
output to the error log, because mysql-test-run does not like extra
output in the error log. (Bug #44571)
dict_index_add_to_cache(): Handle errors from dict_index_find_cols().
mysql-test/innodb_bug44571.test: A test case for triggering the bug.
rb://135 approved by Sunny Bains.
------------------------------------------------------------------------
r5704 | marko | 2009-08-27 04:31:17 -0400 (Thu, 27 Aug 2009) | 32 lines
branches/zip: Fix a critical bug in fast index creation that could
corrupt the created indexes.
row_merge(): Make "half" an in/out parameter. Determine the offset of
half the output file. Copy the last blocks record-by-record instead of
block-by-block, so that the records can be counted. Check that the
input and output have matching n_rec.
row_merge_sort(): Do not assume that two blocks of size N are merged
into a block of size 2*N. The output block can be shorter than the
input if the last page of each input block is almost empty. Use an
accurate termination condition, based on the "half" computed by
row_merge().
row_merge_read(), row_merge_write(), row_merge_blocks(): Add debug output.
merge_file_t, row_merge_file_create(): Add n_rec, the number of records
in the merge file.
row_merge_read_clustered_index(): Update n_rec.
row_merge_blocks(): Update and check n_rec.
row_merge_blocks_copy(): New function, for copying the last blocks in
row_merge(). Update and check n_rec.
This bug was discovered with a user-supplied test case that creates an
index where the initial temporary file is 249 one-megabyte blocks and
the merged files become smaller. In the test, possible merge record
sizes are 10, 18, and 26 bytes.
rb://150 approved by Sunny Bains. This addresses Issue #320.
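The record-count check can be illustrated with an in-memory toy. The
sketch merges two sorted runs record by record and verifies that the
output holds exactly as many records as the inputs. The names are
hypothetical, and the real code works on file blocks, where such a check
guards against mistakes at block boundaries.

```c
#include <assert.h>
#include <stddef.h>

/* Merge two sorted runs into out[], counting records as they are
   written. Returns the number of records written. */
size_t
demo_merge_runs(const int* a, size_t n_a, const int* b, size_t n_b,
                int* out)
{
    size_t i = 0;
    size_t j = 0;
    size_t n_out = 0;

    while (i < n_a && j < n_b) {
        out[n_out++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    }

    /* Copy whatever remains of either run record by record, so that
       the count stays accurate. */
    while (i < n_a) {
        out[n_out++] = a[i++];
    }
    while (j < n_b) {
        out[n_out++] = b[j++];
    }

    /* The consistency check: nothing lost, nothing duplicated. */
    assert(n_out == n_a + n_b);
    return(n_out);
}
```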
------------------------------------------------------------------------
r5703 | marko | 2009-08-27 03:25:00 -0400 (Thu, 27 Aug 2009) | 41 lines
branches/zip: Replace the constant 3/8 ratio that controls the LRU_old
size with the settable global variable innodb_old_blocks_pct. The
minimum and maximum values are 5 and 95 per cent, respectively. The
default is 100*3/8, in line with the old behavior.
ut_time_ms(): New utility function, to return the current time in
milliseconds. TODO: Is there a more efficient timestamp function, such
as rdtsc divided by a power of two?
buf_LRU_old_threshold_ms: New variable, corresponding to
innodb_old_blocks_time. The value 0 is the default behaviour: no
timeout before making blocks 'new'.
bpage->accessed, bpage->LRU_position, buf_pool->ulint_clock: Remove.
bpage->access_time: New field, replacing bpage->accessed. Protected by
buf_pool_mutex instead of bpage->mutex. Updated when a page is created
or accessed the first time in the buffer pool.
buf_LRU_old_ratio, innobase_old_blocks_pct: New variables,
corresponding to innodb_old_blocks_pct
buf_LRU_old_ratio_update(), innobase_old_blocks_pct_update(): Update
functions for buf_LRU_old_ratio, innobase_old_blocks_pct.
buf_page_peek_if_too_old(): Compare ut_time_ms() to bpage->access_time
if buf_LRU_old_threshold_ms && bpage->old. Else observe
buf_LRU_old_ratio and bpage->freed_page_clock.
buf_pool_t: Add n_pages_made_young, n_pages_not_made_young,
n_pages_made_young_old, n_pages_not_made_young, for statistics.
buf_print(): Display buf_pool->n_pages_made_young,
buf_pool->n_pages_not_made_young. This function is only for crash
diagnostics.
buf_print_io(): Display buf_pool->LRU_old_len and quantities derived
from buf_pool->n_pages_made_young, buf_pool->n_pages_not_made_young.
This function is invoked by SHOW ENGINE INNODB STATUS.
rb://129 approved by Heikki Tuuri. This addresses Bug #45015.
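A sketch of the time-based branch only, under stated assumptions. The log
says that the current time is compared to the access time; that a block
becomes eligible once it has waited at least the threshold is an
assumption here, chosen to match "no timeout before making blocks 'new'".
The ratio-based branch is omitted and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Decide whether the timeout allows a block in the old sublist to be
   moved to the new sublist. threshold_ms == 0 disables the timeout. */
int
demo_old_block_may_be_made_young(uint64_t access_time_ms, uint64_t now_ms,
                                 uint64_t threshold_ms)
{
    if (threshold_ms == 0) {
        return(1);  /* default: no timeout */
    }
    if (access_time_ms == 0) {
        return(0);  /* never accessed before: the wait starts now */
    }
    return(now_ms - access_time_ms >= threshold_ms);
}
```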
------------------------------------------------------------------------
r5702 | marko | 2009-08-27 03:03:15 -0400 (Thu, 27 Aug 2009) | 1 line
branches/zip: Document also the files affected by r5698 in the ChangeLog.
------------------------------------------------------------------------
r5701 | marko | 2009-08-27 03:01:42 -0400 (Thu, 27 Aug 2009) | 1 line
branches/zip: Document r5698 in the ChangeLog.
------------------------------------------------------------------------
r5698 | inaam | 2009-08-26 10:34:35 -0400 (Wed, 26 Aug 2009) | 13 lines
branches/zip bug#42885 rb://148
The call to put IO threads to sleep was most probably meant for Windows
only as the comment in buf0rea.c suggests. However it was enabled on
all platforms. This patch restricts the sleep call to windows. This
approach of not putting threads to sleep makes even more sense because
now we have multiple threads working in the background and it probably
is not a good idea to put all of them to sleep because a user thread
wants to post a batch for readahead.
Approved by: Marko
------------------------------------------------------------------------
r5697 | vasil | 2009-08-26 09:44:40 -0400 (Wed, 26 Aug 2009) | 4 lines
branches/zip:
Fix typo.
------------------------------------------------------------------------
r5696 | vasil | 2009-08-26 09:15:59 -0400 (Wed, 26 Aug 2009) | 14 lines
branches/zip:
Merge a change from MySQL:
http://lists.mysql.com/commits/80832
2968 Jonathan Perkin 2009-08-14
Build fixes for Windows, AIX, HP/UX and Sun Studio11, from Timothy Smith.
modified:
CMakeLists.txt
cmd-line-utils/readline/util.c
storage/innodb_plugin/handler/i_s.cc
storage/innodb_plugin/include/univ.i
------------------------------------------------------------------------
r5695 | marko | 2009-08-26 09:14:59 -0400 (Wed, 26 Aug 2009) | 1 line
branches/zip: UNIV_DEBUG_LOCK_VALIDATE: Move the definition to univ.i.
------------------------------------------------------------------------
r5694 | marko | 2009-08-26 07:25:26 -0400 (Wed, 26 Aug 2009) | 2 lines
branches/zip: buf_page_t: Clarify that bpage->list may contain garbage.
This comment was provoked by Inaam.
------------------------------------------------------------------------
r5687 | vasil | 2009-08-20 05:20:22 -0400 (Thu, 20 Aug 2009) | 8 lines
branches/zip:
ChangeLog:
Follow the convention from the rest of the ChangeLog: for bugfixes from
bugs.mysql.com, only the bug number and title go in the ChangeLog. A
detailed explanation of the problem and how it was fixed is present in
the bugs database.
------------------------------------------------------------------------
r5686 | vasil | 2009-08-20 05:15:05 -0400 (Thu, 20 Aug 2009) | 4 lines
branches/zip:
White-space fixup.
------------------------------------------------------------------------
r5685 | sunny | 2009-08-20 04:18:29 -0400 (Thu, 20 Aug 2009) | 2 lines
branches/zip: Update the ChangeLog with r5684 change.
------------------------------------------------------------------------
r5684 | sunny | 2009-08-20 04:05:30 -0400 (Thu, 20 Aug 2009) | 10 lines
branches/zip: Fix bug# 46650: Innodb assertion autoinc_lock == lock in lock_table_remove_low on INSERT SELECT
We only store the autoinc locks that are granted in the transaction's autoinc
lock vector. A transaction, that has been rolled back due to a deadlock because
of an AUTOINC lock attempt, will not have added that lock to the vector. We
need to check for that when we remove that lock.
rb://145
Approved by Marko.
------------------------------------------------------------------------
r5681 | sunny | 2009-08-14 02:16:24 -0400 (Fri, 14 Aug 2009) | 3 lines
branches/zip: When building HotBackup, srv_use_sys_malloc is #ifdef'ed out.
We move access to this variable within a !UNIV_HOTBACKUP block.
------------------------------------------------------------------------
r5671 | marko | 2009-08-13 04:46:33 -0400 (Thu, 13 Aug 2009) | 5 lines
branches/zip: ha_innobase::add_index(): Fix Bug #46557:
after a successful operation, read innodb_table->flags from
the newly created table object, not from the old one that was just freed.
Approved by Sunny.
------------------------------------------------------------------------
r5670 | marko | 2009-08-12 09:16:37 -0400 (Wed, 12 Aug 2009) | 2 lines
branches/zip: trx_undo_rec_copy(): Add const qualifier to undo_rec.
This is a non-functional change.
------------------------------------------------------------------------
r5663 | marko | 2009-08-11 07:42:37 -0400 (Tue, 11 Aug 2009) | 2 lines
branches/zip: trx_general_rollback_for_mysql(): Remove the redundant
parameter partial. If savept==NULL, partial==FALSE.
------------------------------------------------------------------------
r5662 | marko | 2009-08-11 05:54:16 -0400 (Tue, 11 Aug 2009) | 1 line
branches/zip: Bump the version number to 1.0.5 after releasing 1.0.4.
------------------------------------------------------------------------
r5642 | calvin | 2009-08-06 19:04:03 -0400 (Thu, 06 Aug 2009) | 2 lines
branches/zip: remove duplicate "the" in comments.
------------------------------------------------------------------------
r5639 | marko | 2009-08-06 06:39:34 -0400 (Thu, 06 Aug 2009) | 3 lines
branches/zip: mem_heap_block_free(): If innodb_use_sys_malloc is set,
do not tell Valgrind that the memory is free, to avoid
a bogus warning in Valgrind's built-in free() hook.
------------------------------------------------------------------------
r5636 | marko | 2009-08-05 08:27:30 -0400 (Wed, 05 Aug 2009) | 2 lines
branches/zip: lock_rec_validate_page(): Add the parameter zip_size.
This should help track down Mantis Issue #289.
------------------------------------------------------------------------
r5635 | marko | 2009-08-05 07:06:55 -0400 (Wed, 05 Aug 2009) | 2 lines
branches/zip: Replace <number> with NUMBER in some comments,
to avoid problems with Doxygen XML output.
------------------------------------------------------------------------
r5629 | marko | 2009-08-04 07:42:44 -0400 (Tue, 04 Aug 2009) | 1 line
branches/zip: mysql-test: Pass MTR's internal checks.
------------------------------------------------------------------------
r5626 | vasil | 2009-08-04 01:53:31 -0400 (Tue, 04 Aug 2009) | 4 lines
branches/zip:
Revert the dummy change from r5625.
------------------------------------------------------------------------
r5625 | vasil | 2009-08-04 01:52:48 -0400 (Tue, 04 Aug 2009) | 32 lines
branches/zip: Merge 5518:5622 from branches/5.1, resolving conflict in r5622
(after resolving the conflict Makefile.am was not changed so I have made
a dummy change so I can commit and thus record that branches/5.1 has been
merged in branches/zip up to 5622):
------------------------------------------------------------------------
r5622 | vasil | 2009-08-03 15:27:00 +0300 (Mon, 03 Aug 2009) | 20 lines
Changed paths:
M /branches/5.1/Makefile.am
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2988
committer: Satya B <satya.bn@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Wed 2009-07-01 11:06:05 +0530
message:
Fix build failure after applying Innodb snapshot 5.1-ss5282
After applying Innodb snapshot 5.1-ss5282, build was broken
because of missing header file.
Adding the header file to Makefile.am after informing the
innodb developers.
modified:
storage/innobase/Makefile.am
------------------------------------------------------------------------
------------------------------------------------------------------------
r5614 | vasil | 2009-07-31 11:09:07 -0400 (Fri, 31 Jul 2009) | 6 lines
branches/zip:
Add fsp0types.h to the list of noinst_HEADERS
Suggested by: Sergey Vojtovich <svoj@sun.com>
------------------------------------------------------------------------
r5539 | vasil | 2009-07-21 06:28:27 -0400 (Tue, 21 Jul 2009) | 4 lines
branches/zip:
Add a test program to check whether the PAUSE instruction is available.
------------------------------------------------------------------------
r5537 | vasil | 2009-07-21 05:31:26 -0400 (Tue, 21 Jul 2009) | 5 lines
branches/zip:
Fixups in ChangeLog: sort filenames alphabetically and wrap to 78 chars per
line.
------------------------------------------------------------------------
r5527 | sunny | 2009-07-20 17:56:30 -0400 (Mon, 20 Jul 2009) | 2 lines
branches/zip: For HotBackup builds we don't want to hide the symbols.
------------------------------------------------------------------------
r5525 | calvin | 2009-07-20 13:14:30 -0400 (Mon, 20 Jul 2009) | 2 lines
branches/zip: add ChangeLog entry for r5524.
------------------------------------------------------------------------
16 years ago  MDEV-11254: innodb-use-trim has no effect in 10.2
Problem was that implementation merged from 10.1 was incompatible
with InnoDB 5.7.
buf0buf.cc: Add functions that return whether a hole should be
punched and how big it should be.
buf0flu.cc: Add the written page to IORequest.
fil0fil.cc: Remove an unneeded status call. When a tablespace
is created, test whether the file system supports sparse files
and punch hole. Add a call to get the file system
block size. The file node in use is added to IORequest. Add
functions to check whether punch hole is supported and to set
punch hole.
ha_innodb.cc: Remove unneeded status variables (trim512-32768)
and trim_op_saved. Deprecate innodb_use_trim and
set it ON by default. Add function to set innodb-use-trim
dynamically.
dberr.h: Add error code DB_IO_NO_PUNCH_HOLE
if punch hole operation fails.
fil0fil.h: Add punch_hole variable to fil_space_t and
block size to fil_node_t.
os0api.h: Header for the helper functions in buf0buf.cc and
fil0fil.cc that are needed by os0file.h
os0file.h: Remove unneeded m_block_size from IORequest
and add bpage to IORequest to know the actual size of
the block, and m_fil_node to know the file system block size
of the tablespace and whether it supports punch hole.
os0file.cc: Add function punch_hole() to IORequest
to perform the punch hole operation,
get the file system block size, and determine
whether the file system supports sparse files (for punch hole).
page0size.h: remove implicit copy disable and
use this implicit copy to implement copy_from()
function.
buf0dblwr.cc, buf0flu.cc, buf0rea.cc, fil0fil.cc, fil0fil.h,
os0file.h, os0file.cc, log0log.cc, log0recv.cc:
Remove unneeded write_size parameter from fil_io
calls.
srv0mon.h, srv0srv.h, srv0mon.cc: Remove unneeded
trim512-trim32768 status variables. Removed
these from monitor tests.
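The commit message does not spell out how the hole size is computed. One
plausible reading, sketched here with a hypothetical function name: round
the written length up to the file system block size, and release what is
left of the page slot.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes that could be released from one page slot after writing
   data_len bytes of (compressed) data. Returns 0 when nothing can be
   punched. block_size must be a power of two. */
size_t
demo_punch_hole_size(size_t page_size, size_t data_len, size_t block_size)
{
    size_t written;

    assert(block_size != 0 && (block_size & (block_size - 1)) == 0);

    /* Round the written length up to a whole file system block. */
    written = (data_len + block_size - 1) & ~(block_size - 1);

    return(written < page_size ? page_size - written : 0);
}
```

For a 16384-byte page holding 5000 bytes on a file system with 4096-byte
blocks, this gives a hole of 8192 bytes.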
9 years ago  MDEV-14425 Improve the redo log for concurrency
The InnoDB redo log used to be formatted in blocks of 512 bytes.
The log blocks were encrypted and the checksum was calculated while
holding log_sys.mutex, creating a serious scalability bottleneck.
We remove the fixed-size redo log block structure altogether and
essentially turn every mini-transaction into a log block of its own.
This allows encryption and checksum calculations to be performed
on local mtr_t::m_log buffers, before acquiring log_sys.mutex.
The mutex only protects a memcpy() of the data to the shared
log_sys.buf, as well as the padding of the log, in case the
to-be-written part of the log would not end in a block boundary of
the underlying storage. For now, the "padding" consists of writing
a single NUL byte, to allow recovery and mariadb-backup to detect
the end of the circular log faster.
Like the previous implementation, we will overwrite the last log block
over and over again, until it has been completely filled. It would be
possible to write only up to the last completed block (if no more
recent write was requested), or to write dummy FILE_CHECKPOINT records
to fill the incomplete block, by invoking the currently disabled
function log_pad(). This would require adjustments to some logic around
log checkpoints, page flushing, and shutdown.
An upgrade after a crash of any previous version is not supported.
Logically empty log files from a previous version will be upgraded.
An attempt to start up InnoDB without a valid ib_logfile0 will be
refused. Previously, the redo log used to be created automatically
if it was missing. Only with innodb_force_recovery=6, it is
possible to start InnoDB in read-only mode even if the log file
does not exist. This allows the contents of a possibly corrupted
database to be dumped.
Because a prepared backup from an earlier version of mariadb-backup
will create a 0-sized log file, we will allow an upgrade from such
log files, provided that the FIL_PAGE_FILE_FLUSH_LSN in the system
tablespace looks valid.
The 512-byte log checkpoint blocks at 0x200 and 0x600 will be replaced
with 64-byte log checkpoint blocks at 0x1000 and 0x2000.
The start of log records will move from 0x800 to 0x3000. This allows us
to use 4096-byte aligned blocks for all I/O in a future revision.
We extend the MDEV-12353 redo log record format as follows.
(1) Empty mini-transactions or extra NUL bytes will not be allowed.
(2) The end-of-minitransaction marker (a NUL byte) will be replaced
with a 1-bit sequence number, which will be toggled each time when the
circular log file wraps back to the beginning.
(3) After the sequence bit, a CRC-32C checksum of all data
(excluding the sequence bit) will be written.
(4) If the log is encrypted, 8 bytes will be written before
the checksum and included in it. This is part of the
initialization vector (IV) of encrypted log data.
(5) File names, page numbers, and checkpoint information will not be
encrypted. Only the payload bytes of page-level log will be encrypted.
The tablespace ID and page number will form part of the IV.
(6) For padding, arbitrary-length FILE_CHECKPOINT records may be written,
with all-zero payload, and with the normal end marker and checksum.
The minimum size is 7 bytes, or 7+8 with innodb_encrypt_log=ON.
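Items (2) and (3) above can be sketched as follows for the unencrypted
case. The CRC-32C routine is the standard bitwise algorithm. The framing
function, its name, the one-byte marker, and the big-endian checksum order
are assumptions made for illustration and are not taken from the
implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
uint32_t
demo_crc32c(const unsigned char* data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFU;
    size_t   i;
    int      bit;

    for (i = 0; i < len; i++) {
        crc ^= data[i];
        for (bit = 0; bit < 8; bit++) {
            crc = (crc >> 1) ^ (0x82F63B78U & (0U - (crc & 1U)));
        }
    }
    return(crc ^ 0xFFFFFFFFU);
}

/* Copy the payload to out[], then append a 1-byte marker carrying the
   sequence bit and a 4-byte checksum of the payload. The byte layout is
   an assumption. Returns the frame length; out[] must hold len + 5
   bytes. */
size_t
demo_frame_mtr(unsigned char* out, const unsigned char* payload,
               size_t len, unsigned seq_bit)
{
    /* The checksum covers the payload only, not the sequence bit. */
    uint32_t crc = demo_crc32c(payload, len);

    assert(len > 0);    /* rule (1): no empty mini-transactions */

    memcpy(out, payload, len);
    out[len]     = (unsigned char) (seq_bit & 1U);
    out[len + 1] = (unsigned char) (crc >> 24);
    out[len + 2] = (unsigned char) (crc >> 16);
    out[len + 3] = (unsigned char) (crc >> 8);
    out[len + 4] = (unsigned char) crc;
    return(len + 5);
}
```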
In mariadb-backup and in Galera snapshot transfer (SST) scripts, we will
no longer remove ib_logfile0 or create an empty ib_logfile0. Server startup
will require a valid log file. When resizing the log, we will create
a logically empty ib_logfile101 at the current LSN and use an atomic rename
to replace ib_logfile0 with it. See the test innodb.log_file_size.
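The replace-by-rename step can be sketched with plain POSIX calls. The
function name is hypothetical and the sketch writes an arbitrary buffer
rather than a real log header; it only shows the ordering of write, sync,
and rename.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write len bytes to tmp_path, make them durable, then rename tmp_path
   over final_path. Returns 0 on success, -1 on failure. */
int
demo_replace_file(const char* tmp_path, const char* final_path,
                  const void* data, size_t len)
{
    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0) {
        return(-1);
    }
    /* A short write is treated as a failure in this sketch. */
    if (write(fd, data, len) != (ssize_t) len || fsync(fd) != 0) {
        close(fd);
        unlink(tmp_path);
        return(-1);
    }
    if (close(fd) != 0) {
        unlink(tmp_path);
        return(-1);
    }
    /* On POSIX systems rename() replaces final_path atomically: a
       reader sees either the old file or the new one. */
    if (rename(tmp_path, final_path) != 0) {
        unlink(tmp_path);
        return(-1);
    }
    return(0);
}
```

A fully durable variant would also sync the containing directory after
the rename; that step is left out here.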
Because there is no mandatory padding in the log file, we are able
to create a dummy log file as of an arbitrary log sequence number.
See the test mariabackup.huge_lsn.
The parameter innodb_log_write_ahead_size and the
INFORMATION_SCHEMA.INNODB_METRICS counter log_padded will be removed.
The minimum value of innodb_log_buffer_size will be increased to 2MiB
(because log_sys.buf will replace recv_sys.buf) and the increment
adjusted to 4096 bytes (the maximum log block size).
The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed:
os_log_fsyncs
os_log_pending_fsyncs
log_pending_log_flushes
log_pending_checkpoint_writes
The following status variables will be removed:
Innodb_os_log_fsyncs (this is included in Innodb_data_fsyncs)
Innodb_os_log_pending_fsyncs (this was limited to at most 1 by design)
log_sys.get_block_size(): Return the physical block size of the log file.
This is only implemented on Linux and Microsoft Windows for now, and for
the power-of-2 block sizes between 64 and 4096 bytes (the minimum and
maximum size of a checkpoint block). If the block size is anything else,
the traditional 512-byte size will be used via normal file system
buffering.
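The selection rule in the paragraph above, as a small sketch. The function
name is hypothetical; how the physical block size is obtained from the
operating system is not shown.

```c
#include <assert.h>

/* Return the block size to use for log I/O, given the physical block
   size reported by the storage. Power-of-two sizes from 64 to 4096
   bytes are used as is; anything else falls back to 512. */
unsigned
demo_log_block_size(unsigned physical)
{
    int is_pow2 = physical != 0 && (physical & (physical - 1)) == 0;

    if (is_pow2 && physical >= 64 && physical <= 4096) {
        return(physical);
    }
    return(512);
}
```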
If the file system buffers can be bypassed, a message like the following
will be issued:
InnoDB: File system buffers for log disabled (block size=512 bytes)
InnoDB: File system buffers for log disabled (block size=4096 bytes)
This has been tested on Linux and Microsoft Windows with both sizes.
On Linux, only enable O_DIRECT on the log for innodb_flush_method=O_DSYNC.
Tests in 3 different environments where the log is stored in a device
with a physical block size of 512 bytes are yielding better throughput
without O_DIRECT. This could be due to the fact that in the event the
last log block is being overwritten (if multiple transactions would
become durable at the same time, and each of them will write a small
number of bytes to the last log block), it should be faster to re-copy
data from log_sys.buf or log_sys.flush_buf to the kernel buffer,
to be finally written at fdatasync() time.
The parameter innodb_flush_method=O_DSYNC will imply O_DIRECT for
data files. This option will enable O_DIRECT on the log file on Linux.
It may be unsafe to use when the storage device does not support
FUA (Force Unit Access) mode.
When the server is compiled WITH_PMEM=ON, we will use memory-mapped
I/O for the log file if the log resides on a "mount -o dax" device.
We will identify PMEM in a start-up message:
InnoDB: log sequence number 0 (memory-mapped); transaction id 3
On Linux, we will also invoke mmap() on any ib_logfile0 that resides
in /dev/shm, effectively treating the log file as persistent memory.
This should speed up "./mtr --mem" and increase the test coverage of
PMEM on non-PMEM hardware. It also allows users to estimate how much
the performance would be improved by installing persistent memory.
On other tmpfs file systems such as /run, we will not use mmap().
mariadb-backup: Eliminated several variables. We will refer
directly to recv_sys and log_sys.
backup_wait_for_lsn(): Detect non-progress of
xtrabackup_copy_logfile(). In this new log format with
arbitrary-sized blocks, we can only detect log file overrun
indirectly, by observing that the scanned log sequence number
is not advancing.
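A minimal sketch of such indirect detection, assuming a caller that
polls the scanned LSN periodically. The class lsn_progress and its
poll limit are hypothetical and are not taken from mariadb-backup.

```cpp
#include <cstdint>

// Hypothetical stall detector; the names are illustrative only.
class lsn_progress
{
  uint64_t last_lsn;
  unsigned stalled_polls= 0;
  const unsigned max_stalled_polls;
public:
  lsn_progress(uint64_t start_lsn, unsigned limit)
    : last_lsn(start_lsn), max_stalled_polls(limit) {}

  // Record one observation of the scanned LSN.
  // Returns false once the LSN has failed to advance for
  // max_stalled_polls consecutive observations.
  bool poll(uint64_t scanned_lsn)
  {
    if (scanned_lsn > last_lsn)
    {
      last_lsn= scanned_lsn;
      stalled_polls= 0;
      return true;
    }
    return ++stalled_polls < max_stalled_polls;
  }
};
```

A caller would treat a false return value as a sign that the log was
possibly overrun and give up waiting.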
xtrabackup_copy_logfile(): On PMEM, do not modify the sequence bit,
because we are not allowed to modify the server's log file, and our
memory mapping is read-only.
trx_flush_log_if_needed_low(): Do not use the callback on pmem.
Using neither flush_lock nor write_lock around PMEM writes seems
to yield the best performance. The pmem_persist() calls may
still be somewhat slower than the pwrite() and fdatasync() based
interface (PMEM mounted without -o dax).
recv_sys_t::buf: Remove. We will use log_sys.buf for parsing.
recv_sys_t::MTR_SIZE_MAX: Replaces RECV_SCAN_SIZE.
recv_sys_t::file_checkpoint: Renamed from mlog_checkpoint_lsn.
recv_sys_t, log_sys_t: Removed many data members.
recv_sys.lsn: Renamed from recv_sys.recovered_lsn.
recv_sys.offset: Renamed from recv_sys.recovered_offset.
log_sys.buf_size: Replaces srv_log_buffer_size.
recv_buf: A smart pointer that wraps log_sys.buf[recv_sys.offset]
when the buffer is being allocated from the memory heap.
recv_ring: A smart pointer that wraps a circular log_sys.buf[] that is
backed by ib_logfile0. The pointer will wrap from recv_sys.len
(log_sys.file_size) to log_sys.START_OFFSET. For the record that
wraps around, we may copy file name or record payload data to
the auxiliary buffer decrypt_buf in order to have a contiguous
block of memory. The maximum size of a record is less than
innodb_page_size bytes.
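The wrap-around and the copy to a contiguous auxiliary buffer could look
roughly like this. The struct log_ring is a hypothetical simplification:
"start" plays the role of log_sys.START_OFFSET and "end" the role of
log_sys.file_size; it is not the actual recv_ring.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Hypothetical view of a circular log: bytes [start, end) of base form
// the ring, and bytes [0, start) are a header that is never wrapped into.
struct log_ring
{
  const unsigned char *base;
  size_t start;
  size_t end;

  // Advance an offset by n bytes, wrapping from end back to start.
  size_t advance(size_t offset, size_t n) const
  {
    const size_t pos= offset + n;
    return pos < end ? pos : start + (pos - end) % (end - start);
  }

  // Copy len bytes that may wrap around into a contiguous buffer.
  // Requires start <= offset < end and len <= end - start.
  void copy(size_t offset, size_t len, unsigned char *dest) const
  {
    const size_t first= std::min(len, end - offset);
    std::memcpy(dest, base + offset, first);
    std::memcpy(dest + first, base + start, len - first);
  }
};
```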
recv_sys_t::parse(): Take the smart pointer as a template parameter.
Do not temporarily add a trailing NUL byte to FILE_ records, because
we are not supposed to modify the memory-mapped log file. (It is
attached in read-write mode already during recovery.)
recv_sys_t::parse_mtr(): Wrapper for recv_sys_t::parse().
recv_sys_t::parse_pmem(): Like parse_mtr(), but if PREMATURE_EOF would be
returned on PMEM, use recv_ring to wrap around the buffer to the start.
mtr_t::finish_write(), log_close(): Do not enforce log_sys.max_buf_free
on PMEM, because it has no meaning on the mmap-based log.
log_sys.write_to_buf: Count writes to log_sys.buf. Replaces
srv_stats.log_write_requests and export_vars.innodb_log_write_requests.
Protected by log_sys.mutex. Updated consistently in log_close().
Previously, mtr_t::commit() conditionally updated the count,
which was inconsistent.
log_sys.write_to_log: Count swaps of log_sys.buf and log_sys.flush_buf,
for writing to log_sys.log (the ib_logfile0). Replaces
srv_stats.log_writes and export_vars.innodb_log_writes.
Protected by log_sys.mutex.
log_sys.waits: Count waits in append_prepare(). Replaces
srv_stats.log_waits and export_vars.innodb_log_waits.
recv_recover_page(): Do not unnecessarily acquire
log_sys.flush_order_mutex. We are inserting the blocks in arbitrary
order anyway, to be adjusted in recv_sys.apply(true).
We will change the definition of flush_lock and write_lock to
avoid potential false sharing. Depending on sizeof(log_sys) and
CPU_LEVEL1_DCACHE_LINESIZE, the flush_lock and write_lock could
share a cache line with each other or with the last data members
of log_sys.
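One common way to keep two hot objects off the same cache line is
alignas, sketched below. This is not necessarily how the patch defines
flush_lock and write_lock; lock_stub, log_locks and the 64-byte
CACHE_LINE are assumptions made for the illustration.

```cpp
#include <cstddef>

// Assumed cache line size; the real code refers to
// CPU_LEVEL1_DCACHE_LINESIZE.
constexpr size_t CACHE_LINE= 64;

// Hypothetical stand-in for a group commit lock.
struct lock_stub { unsigned long state; };

// Each lock starts on its own cache line, so that a thread updating one
// lock does not invalidate the line that holds the other.
struct log_locks
{
  alignas(CACHE_LINE) lock_stub flush_lock;
  alignas(CACHE_LINE) lock_stub write_lock;
};

static_assert(alignof(log_locks) == CACHE_LINE, "line-aligned struct");
static_assert(sizeof(log_locks) == 2 * CACHE_LINE, "one line per lock");
```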
Thanks to Matthias Leich for providing https://rr-project.org traces
for various failures during the development, and to
Thirunarayanan Balathandayuthapani for his help in debugging
some of the recovery code. And thanks to the developers of the
rr debugger for a tool without which extensive changes to InnoDB
would be very challenging to get right.
Thanks to Vladislav Vaintroub for useful feedback and
to him, Axel Schwenke and Krunal Bauskar for testing the performance.
branches/innodb+: Merge revisions 6130:6364 from branches/zip:
------------------------------------------------------------------------
r6130 | marko | 2009-11-02 11:42:56 +0200 (Mon, 02 Nov 2009) | 9 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/btr/btr0sea.c
M /branches/zip/buf/buf0buf.c
M /branches/zip/dict/dict0dict.c
M /branches/zip/fil/fil0fil.c
M /branches/zip/ibuf/ibuf0ibuf.c
M /branches/zip/include/btr0sea.h
M /branches/zip/include/dict0dict.h
M /branches/zip/include/fil0fil.h
M /branches/zip/include/ibuf0ibuf.h
M /branches/zip/include/lock0lock.h
M /branches/zip/include/log0log.h
M /branches/zip/include/log0recv.h
M /branches/zip/include/mem0mem.h
M /branches/zip/include/mem0pool.h
M /branches/zip/include/os0file.h
M /branches/zip/include/pars0pars.h
M /branches/zip/include/srv0srv.h
M /branches/zip/include/thr0loc.h
M /branches/zip/include/trx0i_s.h
M /branches/zip/include/trx0purge.h
M /branches/zip/include/trx0rseg.h
M /branches/zip/include/trx0sys.h
M /branches/zip/include/trx0undo.h
M /branches/zip/include/usr0sess.h
M /branches/zip/lock/lock0lock.c
M /branches/zip/log/log0log.c
M /branches/zip/log/log0recv.c
M /branches/zip/mem/mem0dbg.c
M /branches/zip/mem/mem0pool.c
M /branches/zip/os/os0file.c
M /branches/zip/os/os0sync.c
M /branches/zip/os/os0thread.c
M /branches/zip/pars/lexyy.c
M /branches/zip/pars/pars0lex.l
M /branches/zip/que/que0que.c
M /branches/zip/srv/srv0srv.c
M /branches/zip/srv/srv0start.c
M /branches/zip/sync/sync0arr.c
M /branches/zip/sync/sync0sync.c
M /branches/zip/thr/thr0loc.c
M /branches/zip/trx/trx0i_s.c
M /branches/zip/trx/trx0purge.c
M /branches/zip/trx/trx0rseg.c
M /branches/zip/trx/trx0sys.c
M /branches/zip/trx/trx0undo.c
M /branches/zip/usr/usr0sess.c
M /branches/zip/ut/ut0mem.c
branches/zip: Free all resources at shutdown. Set pointers to NULL, so
that Valgrind will not complain about freed data structures that are
reachable via pointers. This addresses Bug #45992 and Bug #46656.
This patch is mostly based on changes copied from branches/embedded-1.0,
mainly c5432, c3439, c3134, c2994, c2978, but also some other code was
copied. Some added cleanup code is specific to MySQL/InnoDB.
rb://199 approved by Sunny Bains
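The free-and-reset pattern mentioned above, as a minimal sketch.
demo_sys_t, demo_sys, demo_sys_create() and demo_sys_close() are
hypothetical names and do not correspond to any InnoDB subsystem.

```cpp
#include <cstdlib>

// Hypothetical subsystem object; the names are illustrative only.
struct demo_sys_t { int n_items; };
static demo_sys_t *demo_sys= nullptr;

void demo_sys_create()
{
  demo_sys= static_cast<demo_sys_t*>(std::calloc(1, sizeof *demo_sys));
}

// Free the subsystem at shutdown and clear the global pointer, so that
// no pointer to freed memory remains reachable afterwards.
void demo_sys_close()
{
  std::free(demo_sys);   // free(nullptr) is a no-op
  demo_sys= nullptr;
}
```

Resetting the pointer also makes a repeated close call harmless.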
------------------------------------------------------------------------
r6134 | marko | 2009-11-04 09:57:29 +0200 (Wed, 04 Nov 2009) | 5 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/ha_innodb.cc
branches/zip: innobase_convert_identifier(): Convert table names with
explain_filename() to address Bug #32430: 'show innodb status'
causes errors Invalid (old?) table or database name in logs.
rb://134 approved by Sunny Bains
------------------------------------------------------------------------
r6137 | marko | 2009-11-04 15:24:28 +0200 (Wed, 04 Nov 2009) | 1 line
Changed paths:
M /branches/zip/dict/dict0dict.c
branches/zip: dict_index_too_big_for_undo(): Correct a typo.
------------------------------------------------------------------------
r6153 | vasil | 2009-11-10 15:33:22 +0200 (Tue, 10 Nov 2009) | 145 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: Merge r6125:6152 from branches/5.1:
(everything except the last white-space change was skipped as it is already
in branches/zip)
------------------------------------------------------------------------
r6127 | vasil | 2009-10-30 11:18:25 +0200 (Fri, 30 Oct 2009) | 18 lines
Changed paths:
M /branches/5.1/Makefile.am
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Backport c6121 from branches/zip:
------------------------------------------------------------------------
r6121 | sunny | 2009-10-30 01:42:11 +0200 (Fri, 30 Oct 2009) | 7 lines
Changed paths:
M /branches/zip/mysql-test/innodb-autoinc.result
branches/zip: This test has been problematic for some time now. The underlying
bug is that the data dictionaries get out of sync. In the AUTOINC code we
try and apply salve to the symptoms. In the past MySQL made some unrelated
change and the dictionaries stopped getting out of sync and this test started
to fail. Now, it seems they have reverted that change and the test is
passing again. I suspect this is not the last time that this test will change.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6129 | vasil | 2009-10-30 17:14:22 +0200 (Fri, 30 Oct 2009) | 4 lines
Changed paths:
M /branches/5.1/Makefile.am
branches/5.1:
Revert a change to Makefile.am that sneaked in unnoticed with c6127.
------------------------------------------------------------------------
r6136 | marko | 2009-11-04 12:28:10 +0200 (Wed, 04 Nov 2009) | 15 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/include/ha_prototypes.h
M /branches/5.1/ut/ut0ut.c
branches/5.1: Port r6134 from branches/zip:
------------------------------------------------------------------------
r6134 | marko | 2009-11-04 07:57:29 +0000 (Wed, 04 Nov 2009) | 5 lines
branches/zip: innobase_convert_identifier(): Convert table names with
explain_filename() to address Bug #32430: 'show innodb status'
causes errors Invalid (old?) table or database name in logs.
rb://134 approved by Sunny Bains
------------------------------------------------------------------------
innobase_print_identifier(): Replace with innobase_convert_name().
innobase_convert_identifier(): New function, called by innobase_convert_name().
------------------------------------------------------------------------
r6149 | vasil | 2009-11-09 11:15:01 +0200 (Mon, 09 Nov 2009) | 5 lines
Changed paths:
M /branches/5.1/CMakeLists.txt
branches/5.1:
Followup to r5700: Adjust the changes so they are the same as in the BZR
repository.
------------------------------------------------------------------------
r6150 | vasil | 2009-11-09 11:43:31 +0200 (Mon, 09 Nov 2009) | 58 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a part of r2911.5.5 from MySQL:
(the other part of this was merged in c5700)
------------------------------------------------------------
revno: 2911.5.5
committer: Vladislav Vaintroub <vvaintroub@mysql.com>
branch nick: 5.1-innodb_plugin
timestamp: Wed 2009-06-10 10:59:49 +0200
message:
Backport WL#3653 to 5.1 to enable bundled innodb plugin.
Remove custom DLL loader code from innodb plugin code, use
symbols exported from mysqld.
removed:
storage/innodb_plugin/handler/handler0vars.h
storage/innodb_plugin/handler/win_delay_loader.cc
added:
storage/mysql_storage_engine.cmake
win/create_def_file.js
modified:
CMakeLists.txt
include/m_ctype.h
include/my_global.h
include/my_sys.h
include/mysql/plugin.h
libmysqld/CMakeLists.txt
mysql-test/mysql-test-run.pl
mysql-test/t/plugin.test
mysql-test/t/plugin_load-master.opt
mysys/charset.c
sql/CMakeLists.txt
sql/handler.h
sql/mysql_priv.h
sql/mysqld.cc
sql/sql_class.cc
sql/sql_class.h
sql/sql_list.h
sql/sql_profile.h
storage/Makefile.am
storage/archive/CMakeLists.txt
storage/blackhole/CMakeLists.txt
storage/csv/CMakeLists.txt
storage/example/CMakeLists.txt
storage/federated/CMakeLists.txt
storage/heap/CMakeLists.txt
storage/innobase/CMakeLists.txt
storage/innobase/handler/ha_innodb.cc
storage/innodb_plugin/CMakeLists.txt
storage/innodb_plugin/handler/ha_innodb.cc
storage/innodb_plugin/handler/handler0alter.cc
storage/innodb_plugin/handler/i_s.cc
storage/innodb_plugin/plug.in
storage/myisam/CMakeLists.txt
storage/myisammrg/CMakeLists.txt
win/Makefile.am
win/configure.js
------------------------------------------------------------------------
r6152 | vasil | 2009-11-10 15:30:20 +0200 (Tue, 10 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
White space fixup.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6157 | jyang | 2009-11-11 14:27:09 +0200 (Wed, 11 Nov 2009) | 10 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
A /branches/zip/mysql-test/innodb_bug47167.result
A /branches/zip/mysql-test/innodb_bug47167.test
M /branches/zip/mysql-test/innodb_file_format.result
branches/zip: Fix an issue where a local variable defined
in innodb_file_format_check_validate() was being referenced
from another function, innodb_file_format_check_update().
In addition, fix "set global innodb_file_format_check =
DEFAULT" call.
Bug #47167: "set global innodb_file_format_check" cannot
set value by User-Defined Variable."
rb://169 approved by Sunny Bains and Marko.
------------------------------------------------------------------------
r6159 | vasil | 2009-11-11 15:13:01 +0200 (Wed, 11 Nov 2009) | 37 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/handler/ha_innodb.h
branches/zip:
Merge a change from MySQL:
(this has been reviewed by Calvin and Marko, and Calvin says Luis has
incorporated Marko's suggestions)
------------------------------------------------------------
revno: 3092.5.1
committer: Luis Soares <luis.soares@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Thu 2009-09-24 15:52:52 +0100
message:
BUG#42829: binlogging enabled for all schemas regardless of
binlog-do-db / binlog-ignore-db
InnoDB will return an error if statement based replication is used
along with transaction isolation level READ-COMMITTED (or weaker),
even if the statement in question is filtered out according to the
binlog-do-db rules set. In this case, an error should not be printed.
This patch addresses this issue by extending the existing check in
external_lock to take into account the filter rules before deciding to
print an error. Furthermore, it also changes decide_logging_format to
take into consideration whether the statement is filtered out from
binlog before decision is made.
added:
mysql-test/suite/binlog/r/binlog_stm_do_db.result
mysql-test/suite/binlog/t/binlog_stm_do_db-master.opt
mysql-test/suite/binlog/t/binlog_stm_do_db.test
modified:
sql/sql_base.cc
sql/sql_class.cc
storage/innobase/handler/ha_innodb.cc
storage/innobase/handler/ha_innodb.h
storage/innodb_plugin/handler/ha_innodb.cc
storage/innodb_plugin/handler/ha_innodb.h
------------------------------------------------------------------------
r6160 | vasil | 2009-11-11 15:33:49 +0200 (Wed, 11 Nov 2009) | 72 lines
Changed paths:
M /branches/zip/include/os0file.h
M /branches/zip/os/os0file.c
branches/zip: Merge r6152:6159 from branches/5.1:
(r6158 was skipped as an equivalent change has already been merged from MySQL)
------------------------------------------------------------------------
r6154 | calvin | 2009-11-11 02:51:17 +0200 (Wed, 11 Nov 2009) | 17 lines
Changed paths:
M /branches/5.1/include/os0file.h
M /branches/5.1/os/os0file.c
branches/5.1: fix bug#3139: Mysql crashes: 'windows error 995'
after several selects on a large DB
Under stress, Windows AIO may fail with error code
ERROR_OPERATION_ABORTED. InnoDB does not handle the error and
crashes instead. The cause of the error is unknown, but it is likely
due to faulty hardware or a faulty driver.
This patch introduces a new error code OS_FILE_OPERATION_ABORTED,
which maps to Windows ERROR_OPERATION_ABORTED (995). When the error
is detected during AIO, InnoDB will issue a synchronous retry
(read/write).
This patch has been extensively tested by MySQL support.
Approved by: Marko
rb://196
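The mapping and the synchronous retry can be sketched in a
platform-independent way as follows. Only the value 995 and the name
OS_FILE_OPERATION_ABORTED come from the message above; the enum values,
map_os_error() and handle_aio_failure() are hypothetical.

```cpp
#include <functional>

// 995 is the Windows value of ERROR_OPERATION_ABORTED, as stated above.
constexpr unsigned long WIN_ERROR_OPERATION_ABORTED= 995;

// Illustrative subset of error codes; the numeric values are assumptions.
enum os_file_error
{
  OS_FILE_ERROR_UNKNOWN= 0,
  OS_FILE_OPERATION_ABORTED= 1
};

inline os_file_error map_os_error(unsigned long err)
{
  return err == WIN_ERROR_OPERATION_ABORTED
    ? OS_FILE_OPERATION_ABORTED
    : OS_FILE_ERROR_UNKNOWN;
}

// Hypothetical handler for a failed asynchronous request. sync_retry
// stands in for a synchronous read or write of the same request.
// Returns true if the request was completed by the retry.
inline bool handle_aio_failure(unsigned long err,
                               const std::function<bool()> &sync_retry)
{
  if (map_os_error(err) == OS_FILE_OPERATION_ABORTED)
    return sync_retry();
  return false;   // any other error is left to the caller
}
```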
------------------------------------------------------------------------
r6158 | vasil | 2009-11-11 14:52:14 +0200 (Wed, 11 Nov 2009) | 37 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/handler/ha_innodb.h
branches/5.1:
Merge a change from MySQL:
(this has been reviewed by Calvin and Marko, and Calvin says Luis has
incorporated Marko's suggestions)
------------------------------------------------------------
revno: 3092.5.1
committer: Luis Soares <luis.soares@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Thu 2009-09-24 15:52:52 +0100
message:
BUG#42829: binlogging enabled for all schemas regardless of
binlog-do-db / binlog-ignore-db
InnoDB will return an error if statement based replication is used
along with transaction isolation level READ-COMMITTED (or weaker),
even if the statement in question is filtered out according to the
binlog-do-db rules set. In this case, an error should not be printed.
This patch addresses this issue by extending the existing check in
external_lock to take into account the filter rules before deciding to
print an error. Furthermore, it also changes decide_logging_format to
take into consideration whether the statement is filtered out from
binlog before decision is made.
added:
mysql-test/suite/binlog/r/binlog_stm_do_db.result
mysql-test/suite/binlog/t/binlog_stm_do_db-master.opt
mysql-test/suite/binlog/t/binlog_stm_do_db.test
modified:
sql/sql_base.cc
sql/sql_class.cc
storage/innobase/handler/ha_innodb.cc
storage/innobase/handler/ha_innodb.h
storage/innodb_plugin/handler/ha_innodb.cc
storage/innodb_plugin/handler/ha_innodb.h
------------------------------------------------------------------------
------------------------------------------------------------------------
r6161 | vasil | 2009-11-11 15:36:16 +0200 (Wed, 11 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add changelog entry for r6160.
------------------------------------------------------------------------
r6162 | vasil | 2009-11-11 16:00:12 +0200 (Wed, 11 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog for r6157.
------------------------------------------------------------------------
r6163 | calvin | 2009-11-11 17:53:20 +0200 (Wed, 11 Nov 2009) | 8 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/handler/ha_innodb.h
branches/zip: Exclude thd_binlog_filter_ok() when building
with an older version of MySQL.
thd_binlog_filter_ok() was introduced in MySQL 5.1.41, but the
plugin can be built with MySQL versions prior to 5.1.41.
Approved by Heikki (on IM).
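A version guard of this kind might be written as below. The wrapper
binlog_filter_allows() and its fallback return value are assumptions,
and the fallback definition of MYSQL_VERSION_ID exists only so that the
sketch compiles without the MySQL headers.

```cpp
// MYSQL_VERSION_ID normally comes from the MySQL headers; this fallback
// is an assumption that keeps the sketch standalone.
#ifndef MYSQL_VERSION_ID
# define MYSQL_VERSION_ID 50137
#endif

// Hypothetical wrapper: consult the binlog filter only when the server
// headers are new enough to declare thd_binlog_filter_ok().
inline bool binlog_filter_allows(void *thd)
{
#if MYSQL_VERSION_ID >= 50141   /* 5.1.41 or later */
  return thd_binlog_filter_ok(thd) != 0;
#else
  (void) thd;
  return true;   // assumed behaviour: no filter check on older servers
#endif
}
```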
------------------------------------------------------------------------
r6169 | calvin | 2009-11-12 14:40:43 +0200 (Thu, 12 Nov 2009) | 6 lines
Changed paths:
A /branches/zip/mysql-test/innodb_bug46676.result
A /branches/zip/mysql-test/innodb_bug46676.test
branches/zip: add test case for bug#46676
This crash is reproducible with InnoDB plugin 1.0.4 + MySQL 5.1.37.
But no longer reproducible after MySQL 5.1.38 (with plugin 1.0.5).
Add test case to catch future regression.
------------------------------------------------------------------------
r6170 | marko | 2009-11-12 15:49:08 +0200 (Thu, 12 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/db0err.h
M /branches/zip/row/row0merge.c
M /branches/zip/row/row0mysql.c
branches/zip: Allow CREATE INDEX to be interrupted. (Issue #354)
rb://183 approved by Heikki Tuuri
------------------------------------------------------------------------
r6175 | vasil | 2009-11-16 20:07:39 +0200 (Mon, 16 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Wrap line at 78th char in the ChangeLog
------------------------------------------------------------------------
r6177 | calvin | 2009-11-16 20:20:38 +0200 (Mon, 16 Nov 2009) | 2 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip: add an entry to ChangeLog for r6065
------------------------------------------------------------------------
r6179 | marko | 2009-11-17 10:19:34 +0200 (Tue, 17 Nov 2009) | 2 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: ha_innobase::change_active_index(): When the history is
missing, report it to the client, not to the error log.
------------------------------------------------------------------------
r6181 | vasil | 2009-11-17 12:21:41 +0200 (Tue, 17 Nov 2009) | 33 lines
Changed paths:
M /branches/zip/mysql-test/innodb-index.test
branches/zip:
At the end of innodb-index.test: restore the environment as it was before
the test was started to silence this warning:
MTR's internal check of the test case 'main.innodb-index' failed.
This means that the test case does not preserve the state that existed
before the test case was executed. Most likely the test case did not
do a proper clean-up.
This is the diff of the states of the servers before and after the
test case was executed:
mysqltest: Logging to '/tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.log'.
mysqltest: Results saved in '/tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.result'.
mysqltest: Connecting to server localhost:13000 (socket /tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/mysqld.1.sock) as 'root', connection 'default', attempt 0 ...
mysqltest: ... Connected.
mysqltest: Start processing test commands from './include/check-testcase.test' ...
mysqltest: ... Done processing test commands.
--- /tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.result 2009-11-17 13:10:40.000000000 +0300
+++ /tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.reject 2009-11-17 13:10:54.000000000 +0300
@@ -84,7 +84,7 @@
INNODB_DOUBLEWRITE ON
INNODB_FAST_SHUTDOWN 1
INNODB_FILE_FORMAT Antelope
-INNODB_FILE_FORMAT_CHECK Antelope
+INNODB_FILE_FORMAT_CHECK Barracuda
INNODB_FILE_PER_TABLE OFF
INNODB_FLUSH_LOG_AT_TRX_COMMIT 1
INNODB_FLUSH_METHOD
mysqltest: Result content mismatch
not ok
------------------------------------------------------------------------
r6182 | marko | 2009-11-17 13:49:15 +0200 (Tue, 17 Nov 2009) | 1 line
Changed paths:
M /branches/zip/mysql-test/innodb-consistent-master.opt
M /branches/zip/mysql-test/innodb-consistent.result
M /branches/zip/mysql-test/innodb-consistent.test
M /branches/zip/mysql-test/innodb-use-sys-malloc-master.opt
M /branches/zip/mysql-test/innodb-use-sys-malloc.result
M /branches/zip/mysql-test/innodb-use-sys-malloc.test
M /branches/zip/mysql-test/innodb_bug21704.result
M /branches/zip/mysql-test/innodb_bug21704.test
M /branches/zip/mysql-test/innodb_bug40360.test
M /branches/zip/mysql-test/innodb_bug40565.result
M /branches/zip/mysql-test/innodb_bug40565.test
M /branches/zip/mysql-test/innodb_bug41904.result
M /branches/zip/mysql-test/innodb_bug41904.test
M /branches/zip/mysql-test/innodb_bug42101-nonzero-master.opt
M /branches/zip/mysql-test/innodb_bug42101-nonzero.result
M /branches/zip/mysql-test/innodb_bug42101-nonzero.test
M /branches/zip/mysql-test/innodb_bug42101.result
M /branches/zip/mysql-test/innodb_bug42101.test
M /branches/zip/mysql-test/innodb_bug44032.result
M /branches/zip/mysql-test/innodb_bug44032.test
M /branches/zip/mysql-test/innodb_bug44369.result
M /branches/zip/mysql-test/innodb_bug44369.test
M /branches/zip/mysql-test/innodb_bug44571.result
M /branches/zip/mysql-test/innodb_bug44571.test
M /branches/zip/mysql-test/innodb_bug45357.test
M /branches/zip/mysql-test/innodb_bug46000.result
M /branches/zip/mysql-test/innodb_bug46000.test
M /branches/zip/mysql-test/innodb_bug46676.result
M /branches/zip/mysql-test/innodb_bug46676.test
M /branches/zip/mysql-test/innodb_bug47167.result
M /branches/zip/mysql-test/innodb_bug47167.test
M /branches/zip/mysql-test/innodb_bug47777.result
M /branches/zip/mysql-test/innodb_bug47777.test
M /branches/zip/mysql-test/innodb_file_format.result
M /branches/zip/mysql-test/innodb_file_format.test
branches/zip: Set svn:eol-style on mysql-test files.
------------------------------------------------------------------------
r6183 | marko | 2009-11-17 13:51:16 +0200 (Tue, 17 Nov 2009) | 1 line
Changed paths:
M /branches/zip/mysql-test/innodb-consistent-master.opt
M /branches/zip/mysql-test/innodb-master.opt
M /branches/zip/mysql-test/innodb-semi-consistent-master.opt
M /branches/zip/mysql-test/innodb-use-sys-malloc-master.opt
M /branches/zip/mysql-test/innodb_bug42101-nonzero-master.opt
branches/zip: Prepend loose_ to plugin-only mysql-test options.
------------------------------------------------------------------------
r6184 | marko | 2009-11-17 13:52:01 +0200 (Tue, 17 Nov 2009) | 1 line
Changed paths:
M /branches/zip/mysql-test/innodb-index.result
M /branches/zip/mysql-test/innodb-index.test
branches/zip: innodb-index.test: Restore innodb_file_format_check.
------------------------------------------------------------------------
r6185 | marko | 2009-11-17 16:44:20 +0200 (Tue, 17 Nov 2009) | 16 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/mysql-test/innodb.result
M /branches/zip/mysql-test/innodb.test
M /branches/zip/mysql-test/innodb_bug44369.result
M /branches/zip/mysql-test/innodb_bug44369.test
D /branches/zip/mysql-test/patches/innodb-index.diff
M /branches/zip/row/row0mysql.c
branches/zip: Report duplicate table names
to the client connection, not to the error log. This change will allow
innodb-index.test to be re-enabled. It was previously disabled, because
mysql-test-run does not like output in the error log.
row_create_table_for_mysql(): Do not output anything to the error log
when reporting DB_DUPLICATE_KEY. Let the caller report the error.
Add a TODO comment that the dict_table_t object is apparently not freed
when an error occurs.
create_table_def(): Convert InnoDB table names to the character set
of the client connection for reporting. Use my_error(ER_WRONG_COLUMN_NAME)
for reporting reserved column names. Report my_error(ER_TABLE_EXISTS_ERROR)
when row_create_table_for_mysql() returns DB_DUPLICATE_KEY.
rb://206
------------------------------------------------------------------------
r6186 | vasil | 2009-11-17 16:48:14 +0200 (Tue, 17 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog entry for r6185.
------------------------------------------------------------------------
r6189 | marko | 2009-11-18 11:36:18 +0200 (Wed, 18 Nov 2009) | 5 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/handler0alter.cc
branches/zip: ha_innobase::add_index(): When creating the primary key
and the table is being locked by another transaction,
do not attempt to drop the table. (Bug #48782)
Approved by Sunny Bains over IM
------------------------------------------------------------------------
r6194 | vasil | 2009-11-19 09:24:45 +0200 (Thu, 19 Nov 2009) | 5 lines
Changed paths:
M /branches/zip/include/univ.i
branches/zip:
Increment version number from 1.0.5 to 1.0.6 since 1.0.5 was just released
by MySQL and we will soon release 1.0.6.
------------------------------------------------------------------------
r6197 | calvin | 2009-11-19 09:32:55 +0200 (Thu, 19 Nov 2009) | 6 lines
Changed paths:
M /branches/zip/CMakeLists.txt
branches/zip: merge the fix of bug#48317 (CMake file)
Due to MySQL changes to the CMake, it is no longer able
to build InnoDB plugin as a static library on Windows.
The fix is proposed by Vlad of MySQL.
------------------------------------------------------------------------
r6198 | vasil | 2009-11-19 09:44:31 +0200 (Thu, 19 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog entry for r6197.
------------------------------------------------------------------------
r6199 | vasil | 2009-11-19 12:10:12 +0200 (Thu, 19 Nov 2009) | 31 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/btr/btr0btr.c
M /branches/zip/data/data0type.c
branches/zip: Merge r6159:6198 from branches/5.1:
------------------------------------------------------------------------
r6187 | jyang | 2009-11-18 05:27:30 +0200 (Wed, 18 Nov 2009) | 9 lines
Changed paths:
M /branches/5.1/btr/btr0btr.c
branches/5.1: Fix bug #48469 "when innodb tablespace is
configured too small, crash and corruption!". Function
btr_create() did not check the return status of fseg_create(),
and continued the index creation even when there was insufficient
space.
rb://205 Approved by Marko
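The shape of such a fix, as a sketch with stand-in functions. FIL_NULL is
the InnoDB "no page" value; block_stub, alloc_segment_stub() and
create_root_stub() are hypothetical and only model an allocator that can
fail for lack of space.

```cpp
#include <cstdint>

constexpr uint32_t FIL_NULL= 0xFFFFFFFF;   // "no page" marker

// Hypothetical stand-in for an allocated page.
struct block_stub { uint32_t page_no; };

// Stand-in for a segment allocator: returns nullptr when out of space.
inline block_stub *alloc_segment_stub(bool space_available,
                                      block_stub *storage)
{
  return space_available ? storage : nullptr;
}

// Stand-in for index root creation: check the allocation result and
// report failure to the caller instead of continuing.
inline uint32_t create_root_stub(bool space_available)
{
  block_stub storage{42};
  block_stub *block= alloc_segment_stub(space_available, &storage);
  if (block == nullptr)
    return FIL_NULL;
  return block->page_no;
}
```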
------------------------------------------------------------------------
r6188 | jyang | 2009-11-18 07:14:23 +0200 (Wed, 18 Nov 2009) | 8 lines
Changed paths:
M /branches/5.1/data/data0type.c
branches/5.1: Fix bug #48526 "Data type for float and
double is incorrectly reported in InnoDB table monitor".
Certain datatypes are not printed correctly in
dtype_print().
rb://204 Approved by Marko.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6201 | marko | 2009-11-19 14:09:11 +0200 (Thu, 19 Nov 2009) | 2 lines
Changed paths:
M /branches/zip/handler/handler0alter.cc
branches/zip: ha_innobase::add_index(): Clarify the comment
on orphaned tables when creating a primary key.
------------------------------------------------------------------------
r6202 | jyang | 2009-11-19 15:01:00 +0200 (Thu, 19 Nov 2009) | 8 lines
Changed paths:
M /branches/zip/btr/btr0btr.c
branches/zip: Function fseg_free() is no longer defined
in branches/zip. To port fix for bug #48469 to zip,
we can use btr_free_root() which frees the page,
and also does not require a mini-transaction.
Approved by Marko.
------------------------------------------------------------------------
r6207 | vasil | 2009-11-20 10:19:14 +0200 (Fri, 20 Nov 2009) | 54 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: Merge r6198:6206 from branches/5.1:
(r6203 was skipped as it is already in branches/zip)
------------------------------------------------------------------------
r6200 | vasil | 2009-11-19 12:14:23 +0200 (Thu, 19 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/btr/btr0btr.c
branches/5.1:
White space fixup - indent under the opening (
------------------------------------------------------------------------
r6203 | jyang | 2009-11-19 15:12:22 +0200 (Thu, 19 Nov 2009) | 8 lines
Changed paths:
M /branches/5.1/btr/btr0btr.c
branches/5.1: Use btr_free_root() instead of fseg_free() for
the fix of bug #48469, because fseg_free() is not defined
in the zip branch. And we could save one mini-transaction started
by fseg_free().
Approved by Marko.
------------------------------------------------------------------------
r6205 | jyang | 2009-11-20 07:55:48 +0200 (Fri, 20 Nov 2009) | 11 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Add a special case to handle the Duplicated Key error
and return DB_ERROR instead. This is to avoid a possible SIGSEGV
by mysql error handling re-entering the storage layer for dup key
info without proper table handle.
This is to prevent a server crash when the error situation in bug
#45961 "DDL on partitioned innodb tables leaves data dictionary
in an inconsistent state" happens.
rb://157 approved by Sunny Bains.
------------------------------------------------------------------------
r6206 | jyang | 2009-11-20 09:38:43 +0200 (Fri, 20 Nov 2009) | 5 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Fix a minor code formatting issue for
the parenthesis placement of the if condition in
rename_table().
------------------------------------------------------------------------
------------------------------------------------------------------------
r6208 | vasil | 2009-11-20 10:49:24 +0200 (Fri, 20 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog entry for c6207.
------------------------------------------------------------------------
r6210 | vasil | 2009-11-20 23:39:48 +0200 (Fri, 20 Nov 2009) | 3 lines
Changed paths:
M /branches/zip/trx/trx0i_s.c
branches/zip:
Whitespace fixup.
------------------------------------------------------------------------
r6248 | marko | 2009-11-30 12:19:50 +0200 (Mon, 30 Nov 2009) | 1 line
Changed paths:
M /branches/zip/ChangeLog
branches/zip: ChangeLog: Document r4922 that was forgotten.
------------------------------------------------------------------------
r6252 | marko | 2009-11-30 12:50:11 +0200 (Mon, 30 Nov 2009) | 23 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/dict/dict0boot.c
M /branches/zip/dict/dict0crea.c
M /branches/zip/dict/dict0load.c
M /branches/zip/dict/dict0mem.c
M /branches/zip/fil/fil0fil.c
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/dict0mem.h
M /branches/zip/row/row0mysql.c
branches/zip: Suppress errors about non-found temporary tables.
Write the is_temp flag to SYS_TABLES.MIX_LEN.
dict_table_t::flags: Add a flag for is_temporary, DICT_TF2_TEMPORARY.
Unlike other flags, this will not be written to the tablespace flags
or SYS_TABLES.TYPE, but only to SYS_TABLES.MIX_LEN.
dict_build_table_def_step(): Only pass DICT_TF_BITS to tablespaces.
dict_check_tablespaces_and_store_max_id(), dict_load_table():
Suppress errors about temporary tables not being found.
dict_create_sys_tables_tuple(): Write the DICT_TF2_TEMPORARY flag
to SYS_TABLES.MIX_LEN.
fil_space_create(), fil_create_new_single_table_tablespace(): Add assertions
about space->flags.
row_drop_table_for_mysql(): Do not complain about non-found temporary tables.
rb://160 approved by Heikki Tuuri. This addresses the second part of
Bug #41609 Crash recovery does not work for InnoDB temporary tables.
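A minimal sketch of the flag split described above, assuming that the low
bits of the table flags are shared with the tablespace and that the
temporary flag lives above them. The names and bit positions here are
illustrative assumptions, not the actual InnoDB definitions.

```c
#include <assert.h>

/* Illustrative layout only: names and bit counts are assumptions. */
#define TF_BITS        6u                       /* low bits shared with the tablespace */
#define TF_MASK        ((1u << TF_BITS) - 1u)   /* mask selecting those low bits */
#define TF2_SHIFT      TF_BITS                  /* first bit of the "second" flag group */
#define TF2_TEMPORARY  (1u << TF2_SHIFT)        /* kept in memory and in MIX_LEN only */

/* Flags handed to the tablespace: only the low TF_BITS bits survive. */
static unsigned tablespace_flags(unsigned table_flags)
{
    unsigned result = table_flags & TF_MASK;

    assert((result & TF2_TEMPORARY) == 0);
    return result;
}

/* Value that would be stored in the MIX_LEN column in this sketch. */
static unsigned mix_len_flags(unsigned table_flags)
{
    return table_flags >> TF2_SHIFT;
}

/* Nonzero when the in-memory flags mark the table as temporary. */
static int is_temporary(unsigned table_flags)
{
    return (table_flags & TF2_TEMPORARY) != 0;
}
```

With this layout, setting TF2_TEMPORARY on a table never changes what
tablespace_flags() returns, which is the property the commit relies on.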
------------------------------------------------------------------------
r6263 | vasil | 2009-12-01 14:49:05 +0200 (Tue, 01 Dec 2009) | 4 lines
Changed paths:
M /branches/zip/include/univ.i
branches/zip: Increment version number from 1.0.6 to 1.0.7
1.0.6 has been released
------------------------------------------------------------------------
r6264 | vasil | 2009-12-01 16:19:44 +0200 (Tue, 01 Dec 2009) | 1 line
Changed paths:
M /branches/zip/ChangeLog
branches/zip: Add ChangeLog entry for the release of 1.0.6.
------------------------------------------------------------------------
r6269 | marko | 2009-12-02 11:35:22 +0200 (Wed, 02 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/srv/srv0start.c
branches/zip: innobase_start_or_create_for_mysql(): UNIV_IBUF_DEBUG
should not break crash recovery, but UNIV_IBUF_COUNT_DEBUG will.
------------------------------------------------------------------------
r6270 | marko | 2009-12-02 11:36:47 +0200 (Wed, 02 Dec 2009) | 1 line
Changed paths:
M /branches/zip/srv/srv0start.c
branches/zip: innobase_start_or_create_for_mysql(): Log the zlib version.
------------------------------------------------------------------------
r6271 | marko | 2009-12-02 11:43:49 +0200 (Wed, 02 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/Makefile.am
M /branches/zip/include/univ.i
M /branches/zip/plug.in
branches/zip: ChangeLog: Document that since r6270, the zlib version number
will be displayed at start-up.
------------------------------------------------------------------------
r6272 | marko | 2009-12-02 11:46:05 +0200 (Wed, 02 Dec 2009) | 1 line
Changed paths:
M /branches/zip/Makefile.am
M /branches/zip/include/univ.i
M /branches/zip/plug.in
branches/zip: Revert changes that were accidentally committed in r6271.
------------------------------------------------------------------------
r6274 | marko | 2009-12-03 14:47:12 +0200 (Thu, 03 Dec 2009) | 6 lines
Changed paths:
M /branches/zip/dict/dict0dict.c
branches/zip: dict_table_check_for_dup_indexes(): Assert that the
data dictionary mutex is being held while table->indexes is accessed.
This is already the case.
Currently, only dict_table_get_next_index() and dict_table_get_first_index()
are being invoked without holding dict_sys->mutex.
------------------------------------------------------------------------
r6275 | pekka | 2009-12-03 18:32:47 +0200 (Thu, 03 Dec 2009) | 10 lines
Changed paths:
M /branches/zip/include/log0recv.h
M /branches/zip/include/trx0sys.h
M /branches/zip/log/log0recv.c
M /branches/zip/trx/trx0sys.c
branches/zip: Minor changes which allow build with UNIV_HOTBACKUP
defined to succeed:
include/trx0sys.h: Allow Hot Backup build to see some
TRX_SYS_DOUBLEWRITE_... macros.
trx/trx0sys.c: Exclude trx_sys_close() function from Hot Backup build.
log/log0recv.[ch]: Exclude recv_sys_var_init() function from Hot Backup build.
This change should not affect !UNIV_HOTBACKUP build.
------------------------------------------------------------------------
r6277 | marko | 2009-12-08 11:13:36 +0200 (Tue, 08 Dec 2009) | 1 line
Changed paths:
M /branches/zip/fsp/fsp0fsp.c
branches/zip: fsp0fsp.c: Add some missing in/out and const qualifiers.
------------------------------------------------------------------------
r6285 | marko | 2009-12-09 09:24:50 +0200 (Wed, 09 Dec 2009) | 13 lines
Changed paths:
M /branches/zip/row/row0sel.c
branches/zip: row_sel_fetch_columns(): Remove redundant code that was
accidentally added in r1591, which introduced dfield_t::ext in order
to make the merge sort of fast index creation support externally
stored columns.
Initially, I tried to allocate the bit for dfield_t::ext from
dfield_t::len by making the length 31 bits and mapping UNIV_SQL_NULL
to something that would fit in it. Then I decided that it would be
too risky. The redundant check was part of the mapping. The
condition may have been dfield_is_null() initially.
This redundant code was noticed by Sergey Petrunya on the MySQL
internals list.
------------------------------------------------------------------------
r6288 | marko | 2009-12-09 09:51:00 +0200 (Wed, 09 Dec 2009) | 15 lines
Changed paths:
M /branches/zip/row/row0upd.c
branches/zip: row_upd_copy_columns(): Remove redundant code that was
accidentally added in r1591, which introduced dfield_t::ext in order
to make the merge sort of fast index creation support externally
stored columns.
Initially, I tried to allocate the bit for dfield_t::ext from
dfield_t::len by making the length 31 bits and mapping UNIV_SQL_NULL
to something that would fit in it. Then I decided that it would be
too risky. The redundant check was part of the mapping. The
condition may have been dfield_is_null() initially.
This is similar to the redundant code in row_sel_fetch_columns() that
was noticed by Sergey Petrunya on the MySQL internals list and removed
in r6285. As far as I can tell, there are no redundant UNIV_SQL_NULL
assignments remaining after this change.
------------------------------------------------------------------------
r6305 | marko | 2009-12-14 13:03:57 +0200 (Mon, 14 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/row/row0umod.c
branches/zip: row_undo_mod_del_unmark_sec_and_undo_update(): Add a missing
const qualifier.
------------------------------------------------------------------------
r6309 | marko | 2009-12-15 14:05:50 +0200 (Tue, 15 Dec 2009) | 3 lines
Changed paths:
M /branches/zip/lock/lock0lock.c
branches/zip: lock_rec_insert_check_and_lock(): Avoid casting away constness.
Use page_rec_get_next_const() instead. This silences a gcc 4.2.4 warning.
Reported by Sunny Bains.
------------------------------------------------------------------------
r6312 | marko | 2009-12-16 10:10:36 +0200 (Wed, 16 Dec 2009) | 6 lines
Changed paths:
M /branches/zip/fil/fil0fil.c
branches/zip: fil_close(): Add #ifndef UNIV_HOTBACKUP around a debug
assertion on mutex.magic_n. InnoDB Hot Backup is a single-threaded
program and does not contain mutexes. This change allows InnoDB Hot
Backup to be compiled with UNIV_DEBUG.
Suggested by Michael Izioumtchenko.
------------------------------------------------------------------------
r6321 | marko | 2009-12-16 16:16:33 +0200 (Wed, 16 Dec 2009) | 4 lines
Changed paths:
M /branches/zip/row/row0merge.c
branches/zip: row_merge_drop_temp_indexes(): Revert a hack to
transaction isolation level that was made unnecessary by r5826 (Issue #337).
When this function is called, any active data dictionary transaction
should have been rolled back.
------------------------------------------------------------------------
r6345 | marko | 2009-12-21 10:46:14 +0200 (Mon, 21 Dec 2009) | 7 lines
Changed paths:
M /branches/zip/log/log0recv.c
branches/zip: recv_scan_log_recs(): Non-functional change: Replace a
debug assertion ut_ad(len > 0) with ut_ad(len >= OS_FILE_LOG_BLOCK_SIZE).
This change is only for readability, for Issue #428. Another
assertion on len being an integer multiple of OS_FILE_LOG_BLOCK_SIZE
already ensured together with the old ut_ad(len > 0) that actually len
must be at least OS_FILE_LOG_BLOCK_SIZE.
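The equivalence claimed above can be checked with a small standalone
sketch. The block size of 512 and all names below are assumptions made for
illustration; this is not the code of recv_scan_log_recs().

```c
#include <assert.h>
#include <stddef.h>

/* Assumed block size for this sketch; the name is local to the example. */
#define LOG_BLOCK_SIZE 512

/* Old pair of checks: len > 0 and len is a multiple of the block size. */
static int old_checks_pass(size_t len)
{
    return len > 0 && len % LOG_BLOCK_SIZE == 0;
}

/* New pair of checks: len >= block size and len is a multiple of it. */
static int new_checks_pass(size_t len)
{
    return len >= LOG_BLOCK_SIZE && len % LOG_BLOCK_SIZE == 0;
}

/* Compare both predicates for every length from 0 to max_len inclusive. */
static int checks_agree_up_to(size_t max_len)
{
    size_t len;

    for (len = 0; len <= max_len; len++) {
        if (old_checks_pass(len) != new_checks_pass(len)) {
            return 0;
        }
    }
    return 1;
}

/* The rewritten debug assertions, in the order the commit describes. */
static void assert_scan_len(size_t len)
{
    assert(len >= LOG_BLOCK_SIZE);
    assert(len % LOG_BLOCK_SIZE == 0);
}
```

Because the smallest positive multiple of the block size is the block size
itself, the two predicates accept exactly the same lengths.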
------------------------------------------------------------------------
r6346 | marko | 2009-12-21 12:03:25 +0200 (Mon, 21 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/log/log0recv.c
branches/zip: recv_recovery_from_checkpoint_finish():
Revert a change that was accidentally committed in r6345.
------------------------------------------------------------------------
r6348 | marko | 2009-12-22 11:04:34 +0200 (Tue, 22 Dec 2009) | 37 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/ha_prototypes.h
M /branches/zip/include/trx0trx.h
M /branches/zip/lock/lock0lock.c
M /branches/zip/trx/trx0i_s.c
M /branches/zip/trx/trx0trx.c
branches/zip: Merge a change from MySQL:
------------------------------------------------------------
revno: 3236
committer: Satya B <satya.bn@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Tue 2009-12-01 17:48:57 +0530
message:
merge to mysql-5.1-bugteam
------------------------------------------------------------
revno: 3234.1.1
committer: Gleb Shchepa <gshchepa@mysql.com>
branch nick: mysql-5.1-bugteam
timestamp: Tue 2009-12-01 14:38:40 +0400
message:
Bug #38883 (reopened): thd_security_context is not thread safe, crashes?
manual merge 5.0-->5.1, updating InnoDB plugin.
------------------------------------------------------------
revno: 1810.3968.13
committer: Gleb Shchepa <gshchepa@mysql.com>
branch nick: mysql-5.0-bugteam
timestamp: Tue 2009-12-01 14:24:44 +0400
message:
Bug #38883 (reopened): thd_security_context is not thread safe, crashes?
The bug 38816 changed the lock that protects THD::query from
LOCK_thread_count to LOCK_thd_data, but didn't update the associated
InnoDB functions.
1. The innobase_mysql_prepare_print_arbitrary_thd and the
innobase_mysql_end_print_arbitrary_thd InnoDB functions have been
removed, since now we have a per-thread mutex: now we don't need to wrap
several inter-thread access tries to THD::query with a single global
LOCK_thread_count lock, so we can simplify the code.
2. The innobase_mysql_print_thd function has been modified to lock
LOCK_thd_data directly.
------------------------------------------------------------------------
r6351 | marko | 2009-12-22 11:11:18 +0200 (Tue, 22 Dec 2009) | 1 line
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: Remove an obsolete declaration of LOCK_thread_count.
------------------------------------------------------------------------
r6352 | marko | 2009-12-22 12:33:01 +0200 (Tue, 22 Dec 2009) | 104 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/lock0lock.h
M /branches/zip/include/srv0srv.h
M /branches/zip/lock/lock0lock.c
M /branches/zip/log/log0log.c
M /branches/zip/mysql-test/innodb-autoinc.result
M /branches/zip/mysql-test/innodb-autoinc.test
M /branches/zip/row/row0sel.c
M /branches/zip/srv/srv0srv.c
M /branches/zip/srv/srv0start.c
branches/zip: Merge revisions 6206:6350 from branches/5.1,
except r6347, r6349, r6350 which were committed separately
to both branches, and r6310, which was backported from zip to 5.1.
------------------------------------------------------------------------
r6206 | jyang | 2009-11-20 09:38:43 +0200 (Fri, 20 Nov 2009) | 3 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Non-functional change, fix formatting.
------------------------------------------------------------------------
r6230 | sunny | 2009-11-24 23:52:43 +0200 (Tue, 24 Nov 2009) | 3 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
branches/5.1: Fix autoinc failing test results.
(this should be skipped when merging 5.1 into zip)
------------------------------------------------------------------------
r6231 | sunny | 2009-11-25 10:26:27 +0200 (Wed, 25 Nov 2009) | 7 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
M /branches/5.1/row/row0sel.c
branches/5.1: Fix BUG#49032 - auto_increment field does not initialize to last value in InnoDB Storage Engine.
We use the appropriate function to read the column value for non-integer
autoinc column types, namely float and double.
rb://208. Approved by Marko.
------------------------------------------------------------------------
r6232 | sunny | 2009-11-25 10:27:39 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix white space errors.
------------------------------------------------------------------------
r6233 | sunny | 2009-11-25 10:28:35 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/include/mach0data.h
M /branches/5.1/include/mach0data.ic
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix tests and make read float/double arg const.
------------------------------------------------------------------------
r6234 | sunny | 2009-11-25 10:29:03 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix whitespace issues.
------------------------------------------------------------------------
r6235 | sunny | 2009-11-26 01:14:42 +0200 (Thu, 26 Nov 2009) | 9 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Fix Bug#47720 - REPLACE INTO Autoincrement column with negative values.
This bug is similar to the negative autoinc filter patch from earlier,
with the additional handling of filtering out the negative column values
set explicitly by the user.
rb://184
Approved by Heikki.
------------------------------------------------------------------------
r6242 | vasil | 2009-11-27 22:07:12 +0200 (Fri, 27 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/export.sh
branches/5.1:
Minor changes to support plugin snapshots.
------------------------------------------------------------------------
r6306 | calvin | 2009-12-14 15:12:46 +0200 (Mon, 14 Dec 2009) | 5 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: fix bug#49267: innodb-autoinc.test fails on windows
because of different case mode
There is no change to the InnoDB code, only to fix test case by
changing "T1" to "t1".
------------------------------------------------------------------------
r6324 | jyang | 2009-12-17 06:54:24 +0200 (Thu, 17 Dec 2009) | 8 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/include/lock0lock.h
M /branches/5.1/include/srv0srv.h
M /branches/5.1/lock/lock0lock.c
M /branches/5.1/log/log0log.c
M /branches/5.1/srv/srv0srv.c
M /branches/5.1/srv/srv0start.c
branches/5.1: Fix bug #47814 - Diagnostics are frequently not
printed after a long lock wait in InnoDB. Separate out the
lock wait timeout check thread from monitor information
printing thread.
rb://200 Approved by Marko.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6364 | marko | 2009-12-26 21:06:31 +0200 (Sat, 26 Dec 2009) | 4 lines
Changed paths:
M /branches/zip/ibuf/ibuf0ibuf.c
branches/zip: ibuf_bitmap_get_map_page():
Define a wrapper macro that passes __FILE__, __LINE__ of the caller
to buf_page_get_gen().
This will ease the diagnosis of the likes of Issue #135.
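The call-site wrapper technique can be sketched as follows. The function
and macro names are hypothetical stand-ins, not the actual
ibuf_bitmap_get_map_page() or buf_page_get_gen() signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of where a page fetch was requested from. */
typedef struct {
    const char *file;     /* source file of the caller */
    unsigned    line;     /* source line of the caller */
    unsigned    page_no;  /* page that was requested */
} fetch_record_t;

static fetch_record_t last_fetch;

/* The underlying function takes the call site as explicit arguments. */
static const fetch_record_t *
bitmap_page_get_func(unsigned page_no, const char *file, unsigned line)
{
    assert(file != NULL);

    last_fetch.file = file;
    last_fetch.line = line;
    last_fetch.page_no = page_no;
    return &last_fetch;
}

/* The wrapper macro expands at the call site, so __FILE__ and __LINE__
   identify the caller rather than the place where the helper is defined. */
#define bitmap_page_get(page_no) \
    bitmap_page_get_func((page_no), __FILE__, __LINE__)
```

Callers write bitmap_page_get(n) as before, and any diagnostic printed by
the underlying function can name the exact line that requested the page.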
------------------------------------------------------------------------
branches/innodb+: Merge revisions 6130:6364 from branches/zip:
------------------------------------------------------------------------
r6130 | marko | 2009-11-02 11:42:56 +0200 (Mon, 02 Nov 2009) | 9 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/btr/btr0sea.c
M /branches/zip/buf/buf0buf.c
M /branches/zip/dict/dict0dict.c
M /branches/zip/fil/fil0fil.c
M /branches/zip/ibuf/ibuf0ibuf.c
M /branches/zip/include/btr0sea.h
M /branches/zip/include/dict0dict.h
M /branches/zip/include/fil0fil.h
M /branches/zip/include/ibuf0ibuf.h
M /branches/zip/include/lock0lock.h
M /branches/zip/include/log0log.h
M /branches/zip/include/log0recv.h
M /branches/zip/include/mem0mem.h
M /branches/zip/include/mem0pool.h
M /branches/zip/include/os0file.h
M /branches/zip/include/pars0pars.h
M /branches/zip/include/srv0srv.h
M /branches/zip/include/thr0loc.h
M /branches/zip/include/trx0i_s.h
M /branches/zip/include/trx0purge.h
M /branches/zip/include/trx0rseg.h
M /branches/zip/include/trx0sys.h
M /branches/zip/include/trx0undo.h
M /branches/zip/include/usr0sess.h
M /branches/zip/lock/lock0lock.c
M /branches/zip/log/log0log.c
M /branches/zip/log/log0recv.c
M /branches/zip/mem/mem0dbg.c
M /branches/zip/mem/mem0pool.c
M /branches/zip/os/os0file.c
M /branches/zip/os/os0sync.c
M /branches/zip/os/os0thread.c
M /branches/zip/pars/lexyy.c
M /branches/zip/pars/pars0lex.l
M /branches/zip/que/que0que.c
M /branches/zip/srv/srv0srv.c
M /branches/zip/srv/srv0start.c
M /branches/zip/sync/sync0arr.c
M /branches/zip/sync/sync0sync.c
M /branches/zip/thr/thr0loc.c
M /branches/zip/trx/trx0i_s.c
M /branches/zip/trx/trx0purge.c
M /branches/zip/trx/trx0rseg.c
M /branches/zip/trx/trx0sys.c
M /branches/zip/trx/trx0undo.c
M /branches/zip/usr/usr0sess.c
M /branches/zip/ut/ut0mem.c
branches/zip: Free all resources at shutdown. Set pointers to NULL, so
that Valgrind will not complain about freed data structures that are
reachable via pointers. This addresses Bug #45992 and Bug #46656.
This patch is mostly based on changes copied from branches/embedded-1.0,
mainly c5432, c3439, c3134, c2994, c2978, but also some other code was
copied. Some added cleanup code is specific to MySQL/InnoDB.
rb://199 approved by Sunny Bains
------------------------------------------------------------------------
r6134 | marko | 2009-11-04 09:57:29 +0200 (Wed, 04 Nov 2009) | 5 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/ha_innodb.cc
branches/zip: innobase_convert_identifier(): Convert table names with
explain_filename() to address Bug #32430: 'show innodb status'
causes errors Invalid (old?) table or database name in logs.
rb://134 approved by Sunny Bains
------------------------------------------------------------------------
r6137 | marko | 2009-11-04 15:24:28 +0200 (Wed, 04 Nov 2009) | 1 line
Changed paths:
M /branches/zip/dict/dict0dict.c
branches/zip: dict_index_too_big_for_undo(): Correct a typo.
------------------------------------------------------------------------
r6153 | vasil | 2009-11-10 15:33:22 +0200 (Tue, 10 Nov 2009) | 145 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: Merge r6125:6152 from branches/5.1:
(everything except the last white-space change was skipped as it is already
in branches/zip)
------------------------------------------------------------------------
r6127 | vasil | 2009-10-30 11:18:25 +0200 (Fri, 30 Oct 2009) | 18 lines
Changed paths:
M /branches/5.1/Makefile.am
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Backport c6121 from branches/zip:
------------------------------------------------------------------------
r6121 | sunny | 2009-10-30 01:42:11 +0200 (Fri, 30 Oct 2009) | 7 lines
Changed paths:
M /branches/zip/mysql-test/innodb-autoinc.result
branches/zip: This test has been problematic for some time now. The underlying
bug is that the data dictionaries get out of sync. In the AUTOINC code we
try and apply salve to the symptoms. In the past MySQL made some unrelated
change and the dictionaries stopped getting out of sync and this test started
to fail. Now, it seems they have reverted that change and the test is
passing again. I suspect this is not the last time that this test will change.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6129 | vasil | 2009-10-30 17:14:22 +0200 (Fri, 30 Oct 2009) | 4 lines
Changed paths:
M /branches/5.1/Makefile.am
branches/5.1:
Revert a change to Makefile.am that sneaked unnoticed in c6127.
------------------------------------------------------------------------
r6136 | marko | 2009-11-04 12:28:10 +0200 (Wed, 04 Nov 2009) | 15 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/include/ha_prototypes.h
M /branches/5.1/ut/ut0ut.c
branches/5.1: Port r6134 from branches/zip:
------------------------------------------------------------------------
r6134 | marko | 2009-11-04 07:57:29 +0000 (Wed, 04 Nov 2009) | 5 lines
branches/zip: innobase_convert_identifier(): Convert table names with
explain_filename() to address Bug #32430: 'show innodb status'
causes errors Invalid (old?) table or database name in logs.
rb://134 approved by Sunny Bains
------------------------------------------------------------------------
innobase_print_identifier(): Replace with innobase_convert_name().
innobase_convert_identifier(): New function, called by innobase_convert_name().
------------------------------------------------------------------------
r6149 | vasil | 2009-11-09 11:15:01 +0200 (Mon, 09 Nov 2009) | 5 lines
Changed paths:
M /branches/5.1/CMakeLists.txt
branches/5.1:
Followup to r5700: Adjust the changes so they are the same as in the BZR
repository.
------------------------------------------------------------------------
r6150 | vasil | 2009-11-09 11:43:31 +0200 (Mon, 09 Nov 2009) | 58 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a part of r2911.5.5 from MySQL:
(the other part of this was merged in c5700)
------------------------------------------------------------
revno: 2911.5.5
committer: Vladislav Vaintroub <vvaintroub@mysql.com>
branch nick: 5.1-innodb_plugin
timestamp: Wed 2009-06-10 10:59:49 +0200
message:
Backport WL#3653 to 5.1 to enable bundled innodb plugin.
Remove custom DLL loader code from innodb plugin code, use
symbols exported from mysqld.
removed:
storage/innodb_plugin/handler/handler0vars.h
storage/innodb_plugin/handler/win_delay_loader.cc
added:
storage/mysql_storage_engine.cmake
win/create_def_file.js
modified:
CMakeLists.txt
include/m_ctype.h
include/my_global.h
include/my_sys.h
include/mysql/plugin.h
libmysqld/CMakeLists.txt
mysql-test/mysql-test-run.pl
mysql-test/t/plugin.test
mysql-test/t/plugin_load-master.opt
mysys/charset.c
sql/CMakeLists.txt
sql/handler.h
sql/mysql_priv.h
sql/mysqld.cc
sql/sql_class.cc
sql/sql_class.h
sql/sql_list.h
sql/sql_profile.h
storage/Makefile.am
storage/archive/CMakeLists.txt
storage/blackhole/CMakeLists.txt
storage/csv/CMakeLists.txt
storage/example/CMakeLists.txt
storage/federated/CMakeLists.txt
storage/heap/CMakeLists.txt
storage/innobase/CMakeLists.txt
storage/innobase/handler/ha_innodb.cc
storage/innodb_plugin/CMakeLists.txt
storage/innodb_plugin/handler/ha_innodb.cc
storage/innodb_plugin/handler/handler0alter.cc
storage/innodb_plugin/handler/i_s.cc
storage/innodb_plugin/plug.in
storage/myisam/CMakeLists.txt
storage/myisammrg/CMakeLists.txt
win/Makefile.am
win/configure.js
------------------------------------------------------------------------
r6152 | vasil | 2009-11-10 15:30:20 +0200 (Tue, 10 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
White space fixup.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6157 | jyang | 2009-11-11 14:27:09 +0200 (Wed, 11 Nov 2009) | 10 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
A /branches/zip/mysql-test/innodb_bug47167.result
A /branches/zip/mysql-test/innodb_bug47167.test
M /branches/zip/mysql-test/innodb_file_format.result
branches/zip: Fix an issue that a local variable defined
in innodb_file_format_check_validate() is being referenced
across function in innodb_file_format_check_update().
In addition, fix "set global innodb_file_format_check =
DEFAULT" call.
Bug #47167: "set global innodb_file_format_check" cannot
set value by User-Defined Variable.
rb://169 approved by Sunny Bains and Marko.
------------------------------------------------------------------------
r6159 | vasil | 2009-11-11 15:13:01 +0200 (Wed, 11 Nov 2009) | 37 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/handler/ha_innodb.h
branches/zip:
Merge a change from MySQL:
(this has been reviewed by Calvin and Marko, and Calvin says Luis has
incorporated Marko's suggestions)
------------------------------------------------------------
revno: 3092.5.1
committer: Luis Soares <luis.soares@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Thu 2009-09-24 15:52:52 +0100
message:
BUG#42829: binlogging enabled for all schemas regardless of
binlog-do-db / binlog-ignore-db
InnoDB will return an error if statement based replication is used
along with transaction isolation level READ-COMMITTED (or weaker),
even if the statement in question is filtered out according to the
binlog-do-db rules set. In this case, an error should not be printed.
This patch addresses this issue by extending the existing check in
external_lock to take into account the filter rules before deciding to
print an error. Furthermore, it also changes decide_logging_format to
take into consideration whether the statement is filtered out from
binlog before decision is made.
added:
mysql-test/suite/binlog/r/binlog_stm_do_db.result
mysql-test/suite/binlog/t/binlog_stm_do_db-master.opt
mysql-test/suite/binlog/t/binlog_stm_do_db.test
modified:
sql/sql_base.cc
sql/sql_class.cc
storage/innobase/handler/ha_innodb.cc
storage/innobase/handler/ha_innodb.h
storage/innodb_plugin/handler/ha_innodb.cc
storage/innodb_plugin/handler/ha_innodb.h
------------------------------------------------------------------------
r6160 | vasil | 2009-11-11 15:33:49 +0200 (Wed, 11 Nov 2009) | 72 lines
Changed paths:
M /branches/zip/include/os0file.h
M /branches/zip/os/os0file.c
branches/zip: Merge r6152:6159 from branches/5.1:
(r6158 was skipped as an equivalent change has already been merged from MySQL)
------------------------------------------------------------------------
r6154 | calvin | 2009-11-11 02:51:17 +0200 (Wed, 11 Nov 2009) | 17 lines
Changed paths:
M /branches/5.1/include/os0file.h
M /branches/5.1/os/os0file.c
branches/5.1: fix bug#3139: Mysql crashes: 'windows error 995'
after several selects on a large DB
Under stress, Windows AIO may fail with error code
ERROR_OPERATION_ABORTED. InnoDB does not handle the error and instead
crashes. The cause of the error is unknown, but likely due to
faulty hardware or driver.
This patch introduces a new error code OS_FILE_OPERATION_ABORTED,
which maps to Windows ERROR_OPERATION_ABORTED (995). When the error
is detected during AIO, the InnoDB will issue a synchronous retry
(read/write).
This patch has been extensively tested by MySQL support.
Approved by: Marko
rb://196
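The retry policy described above can be sketched as follows. Only the
value 995 comes from the commit message; every identifier below, including
the stub I/O routines, is a hypothetical stand-in rather than the os0file.c
interface.

```c
#include <assert.h>
#include <stddef.h>

/* 995 is the Windows ERROR_OPERATION_ABORTED value named in the commit. */
enum { WIN_OPERATION_ABORTED = 995 };

typedef enum { IO_OK, IO_OPERATION_ABORTED, IO_OTHER_ERROR } io_status_t;

/* An I/O routine returns 0 on success or an OS error number on failure. */
typedef int (*io_fn)(void *ctx);

/* Translate a raw OS error number into an internal status code. */
static io_status_t map_os_error(int os_err)
{
    if (os_err == 0) {
        return IO_OK;
    }
    return os_err == WIN_OPERATION_ABORTED ? IO_OPERATION_ABORTED
                                           : IO_OTHER_ERROR;
}

/* Try the asynchronous path first; if it was aborted, retry the same
   request once through the synchronous path. Other errors are returned
   to the caller unchanged. */
static io_status_t io_with_sync_retry(io_fn async_io, io_fn sync_io, void *ctx)
{
    io_status_t status;

    assert(async_io != NULL && sync_io != NULL);

    status = map_os_error(async_io(ctx));
    if (status == IO_OPERATION_ABORTED) {
        status = map_os_error(sync_io(ctx));
    }
    return status;
}

/* Stub routines for demonstration: each one counts its calls in *ctx. */
static int stub_aborted(void *ctx) { ++*(int *) ctx; return WIN_OPERATION_ABORTED; }
static int stub_ok(void *ctx)      { ++*(int *) ctx; return 0; }
static int stub_failed(void *ctx)  { ++*(int *) ctx; return 5; }
```

Keeping the error mapping separate from the retry decision means the
retry logic never has to compare against platform-specific numbers.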
------------------------------------------------------------------------
r6158 | vasil | 2009-11-11 14:52:14 +0200 (Wed, 11 Nov 2009) | 37 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/handler/ha_innodb.h
branches/5.1:
Merge a change from MySQL:
(this has been reviewed by Calvin and Marko, and Calvin says Luis has
incorporated Marko's suggestions)
------------------------------------------------------------
revno: 3092.5.1
committer: Luis Soares <luis.soares@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Thu 2009-09-24 15:52:52 +0100
message:
BUG#42829: binlogging enabled for all schemas regardless of
binlog-do-db / binlog-ignore-db
InnoDB will return an error if statement based replication is used
along with transaction isolation level READ-COMMITTED (or weaker),
even if the statement in question is filtered out according to the
binlog-do-db rules set. In this case, an error should not be printed.
This patch addresses this issue by extending the existing check in
external_lock to take into account the filter rules before deciding to
print an error. Furthermore, it also changes decide_logging_format to
take into consideration whether the statement is filtered out from
binlog before decision is made.
added:
mysql-test/suite/binlog/r/binlog_stm_do_db.result
mysql-test/suite/binlog/t/binlog_stm_do_db-master.opt
mysql-test/suite/binlog/t/binlog_stm_do_db.test
modified:
sql/sql_base.cc
sql/sql_class.cc
storage/innobase/handler/ha_innodb.cc
storage/innobase/handler/ha_innodb.h
storage/innodb_plugin/handler/ha_innodb.cc
storage/innodb_plugin/handler/ha_innodb.h
------------------------------------------------------------------------
------------------------------------------------------------------------
r6161 | vasil | 2009-11-11 15:36:16 +0200 (Wed, 11 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add changelog entry for r6160.
------------------------------------------------------------------------
r6162 | vasil | 2009-11-11 16:00:12 +0200 (Wed, 11 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog for r6157.
------------------------------------------------------------------------
r6163 | calvin | 2009-11-11 17:53:20 +0200 (Wed, 11 Nov 2009) | 8 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/handler/ha_innodb.h
branches/zip: Exclude thd_binlog_filter_ok() when building
with an older version of MySQL.
thd_binlog_filter_ok() was introduced in MySQL 5.1.41, but the
plugin can be built with MySQL versions prior to 5.1.41.
Approved by Heikki (on IM).
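A version guard of this kind might look like the sketch below. It assumes
the usual MYSQL_VERSION_ID encoding, in which 5.1.41 is 50141; the helper
macro and function names are hypothetical, and the real filter call is
replaced by a plain argument.

```c
#include <assert.h>

/* MYSQL_VERSION_ID normally comes from the server headers; default it
   here so that the sketch compiles on its own. */
#ifndef MYSQL_VERSION_ID
# define MYSQL_VERSION_ID 50137
#endif

/* Hypothetical feature macro: set when the filter check is available. */
#if MYSQL_VERSION_ID >= 50141
# define HAVE_BINLOG_FILTER_CHECK 1
#else
# define HAVE_BINLOG_FILTER_CHECK 0
#endif

/* filter_ok_result stands in for the value the server-side filter check
   would return (1 = statement is binlogged, 0 = filtered out). */
static int statement_is_binlogged(int filter_ok_result)
{
    assert(filter_ok_result == 0 || filter_ok_result == 1);
#if HAVE_BINLOG_FILTER_CHECK
    return filter_ok_result;
#else
    /* Older server: no filter information, so assume it is binlogged. */
    (void) filter_ok_result;
    return 1;
#endif
}
```

On a server older than 5.1.41 the function falls back to the previous
behaviour, so the plugin still builds and runs against it.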
------------------------------------------------------------------------
r6169 | calvin | 2009-11-12 14:40:43 +0200 (Thu, 12 Nov 2009) | 6 lines
Changed paths:
A /branches/zip/mysql-test/innodb_bug46676.result
A /branches/zip/mysql-test/innodb_bug46676.test
branches/zip: add test case for bug#46676
This crash is reproducible with InnoDB plugin 1.0.4 + MySQL 5.1.37.
But no longer reproducible after MySQL 5.1.38 (with plugin 1.0.5).
Add test case to catch future regression.
------------------------------------------------------------------------
r6170 | marko | 2009-11-12 15:49:08 +0200 (Thu, 12 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/db0err.h
M /branches/zip/row/row0merge.c
M /branches/zip/row/row0mysql.c
branches/zip: Allow CREATE INDEX to be interrupted. (Issue #354)
rb://183 approved by Heikki Tuuri
------------------------------------------------------------------------
r6175 | vasil | 2009-11-16 20:07:39 +0200 (Mon, 16 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Wrap line at 78th char in the ChangeLog
------------------------------------------------------------------------
r6177 | calvin | 2009-11-16 20:20:38 +0200 (Mon, 16 Nov 2009) | 2 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip: add an entry to ChangeLog for r6065
------------------------------------------------------------------------
r6179 | marko | 2009-11-17 10:19:34 +0200 (Tue, 17 Nov 2009) | 2 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: ha_innobase::change_active_index(): When the history is
missing, report it to the client, not to the error log.
------------------------------------------------------------------------
r6181 | vasil | 2009-11-17 12:21:41 +0200 (Tue, 17 Nov 2009) | 33 lines
Changed paths:
M /branches/zip/mysql-test/innodb-index.test
branches/zip:
At the end of innodb-index.test: restore the environment as it was before
the test was started to silence this warning:
MTR's internal check of the test case 'main.innodb-index' failed.
This means that the test case does not preserve the state that existed
before the test case was executed. Most likely the test case did not
do a proper clean-up.
This is the diff of the states of the servers before and after the
test case was executed:
mysqltest: Logging to '/tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.log'.
mysqltest: Results saved in '/tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.result'.
mysqltest: Connecting to server localhost:13000 (socket /tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/mysqld.1.sock) as 'root', connection 'default', attempt 0 ...
mysqltest: ... Connected.
mysqltest: Start processing test commands from './include/check-testcase.test' ...
mysqltest: ... Done processing test commands.
--- /tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.result 2009-11-17 13:10:40.000000000 +0300
+++ /tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.reject 2009-11-17 13:10:54.000000000 +0300
@@ -84,7 +84,7 @@
INNODB_DOUBLEWRITE ON
INNODB_FAST_SHUTDOWN 1
INNODB_FILE_FORMAT Antelope
-INNODB_FILE_FORMAT_CHECK Antelope
+INNODB_FILE_FORMAT_CHECK Barracuda
INNODB_FILE_PER_TABLE OFF
INNODB_FLUSH_LOG_AT_TRX_COMMIT 1
INNODB_FLUSH_METHOD
mysqltest: Result content mismatch
not ok
------------------------------------------------------------------------
r6182 | marko | 2009-11-17 13:49:15 +0200 (Tue, 17 Nov 2009) | 1 line
Changed paths:
M /branches/zip/mysql-test/innodb-consistent-master.opt
M /branches/zip/mysql-test/innodb-consistent.result
M /branches/zip/mysql-test/innodb-consistent.test
M /branches/zip/mysql-test/innodb-use-sys-malloc-master.opt
M /branches/zip/mysql-test/innodb-use-sys-malloc.result
M /branches/zip/mysql-test/innodb-use-sys-malloc.test
M /branches/zip/mysql-test/innodb_bug21704.result
M /branches/zip/mysql-test/innodb_bug21704.test
M /branches/zip/mysql-test/innodb_bug40360.test
M /branches/zip/mysql-test/innodb_bug40565.result
M /branches/zip/mysql-test/innodb_bug40565.test
M /branches/zip/mysql-test/innodb_bug41904.result
M /branches/zip/mysql-test/innodb_bug41904.test
M /branches/zip/mysql-test/innodb_bug42101-nonzero-master.opt
M /branches/zip/mysql-test/innodb_bug42101-nonzero.result
M /branches/zip/mysql-test/innodb_bug42101-nonzero.test
M /branches/zip/mysql-test/innodb_bug42101.result
M /branches/zip/mysql-test/innodb_bug42101.test
M /branches/zip/mysql-test/innodb_bug44032.result
M /branches/zip/mysql-test/innodb_bug44032.test
M /branches/zip/mysql-test/innodb_bug44369.result
M /branches/zip/mysql-test/innodb_bug44369.test
M /branches/zip/mysql-test/innodb_bug44571.result
M /branches/zip/mysql-test/innodb_bug44571.test
M /branches/zip/mysql-test/innodb_bug45357.test
M /branches/zip/mysql-test/innodb_bug46000.result
M /branches/zip/mysql-test/innodb_bug46000.test
M /branches/zip/mysql-test/innodb_bug46676.result
M /branches/zip/mysql-test/innodb_bug46676.test
M /branches/zip/mysql-test/innodb_bug47167.result
M /branches/zip/mysql-test/innodb_bug47167.test
M /branches/zip/mysql-test/innodb_bug47777.result
M /branches/zip/mysql-test/innodb_bug47777.test
M /branches/zip/mysql-test/innodb_file_format.result
M /branches/zip/mysql-test/innodb_file_format.test
branches/zip: Set svn:eol-style on mysql-test files.
------------------------------------------------------------------------
r6183 | marko | 2009-11-17 13:51:16 +0200 (Tue, 17 Nov 2009) | 1 line
Changed paths:
M /branches/zip/mysql-test/innodb-consistent-master.opt
M /branches/zip/mysql-test/innodb-master.opt
M /branches/zip/mysql-test/innodb-semi-consistent-master.opt
M /branches/zip/mysql-test/innodb-use-sys-malloc-master.opt
M /branches/zip/mysql-test/innodb_bug42101-nonzero-master.opt
branches/zip: Prepend loose_ to plugin-only mysql-test options.
------------------------------------------------------------------------
r6184 | marko | 2009-11-17 13:52:01 +0200 (Tue, 17 Nov 2009) | 1 line
Changed paths:
M /branches/zip/mysql-test/innodb-index.result
M /branches/zip/mysql-test/innodb-index.test
branches/zip: innodb-index.test: Restore innodb_file_format_check.
------------------------------------------------------------------------
r6185 | marko | 2009-11-17 16:44:20 +0200 (Tue, 17 Nov 2009) | 16 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/mysql-test/innodb.result
M /branches/zip/mysql-test/innodb.test
M /branches/zip/mysql-test/innodb_bug44369.result
M /branches/zip/mysql-test/innodb_bug44369.test
D /branches/zip/mysql-test/patches/innodb-index.diff
M /branches/zip/row/row0mysql.c
branches/zip: Report duplicate table names
to the client connection, not to the error log. This change will allow
innodb-index.test to be re-enabled. It was previously disabled, because
mysql-test-run does not like output in the error log.
row_create_table_for_mysql(): Do not output anything to the error log
when reporting DB_DUPLICATE_KEY. Let the caller report the error.
Add a TODO comment that the dict_table_t object is apparently not freed
when an error occurs.
create_table_def(): Convert InnoDB table names to the character set
of the client connection for reporting. Use my_error(ER_WRONG_COLUMN_NAME)
for reporting reserved column names. Report my_error(ER_TABLE_EXISTS_ERROR)
when row_create_table_for_mysql() returns DB_DUPLICATE_KEY.
rb://206
------------------------------------------------------------------------
r6186 | vasil | 2009-11-17 16:48:14 +0200 (Tue, 17 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog entry for r6185.
------------------------------------------------------------------------
r6189 | marko | 2009-11-18 11:36:18 +0200 (Wed, 18 Nov 2009) | 5 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/handler0alter.cc
branches/zip: ha_innobase::add_index(): When creating the primary key
and the table is being locked by another transaction,
do not attempt to drop the table. (Bug #48782)
Approved by Sunny Bains over IM
------------------------------------------------------------------------
r6194 | vasil | 2009-11-19 09:24:45 +0200 (Thu, 19 Nov 2009) | 5 lines
Changed paths:
M /branches/zip/include/univ.i
branches/zip:
Increment version number from 1.0.5 to 1.0.6 since 1.0.5 was just released
by MySQL and we will soon release 1.0.6.
------------------------------------------------------------------------
r6197 | calvin | 2009-11-19 09:32:55 +0200 (Thu, 19 Nov 2009) | 6 lines
Changed paths:
M /branches/zip/CMakeLists.txt
branches/zip: merge the fix of bug#48317 (CMake file)
Due to MySQL changes to the CMake files, it is no longer possible
to build the InnoDB plugin as a static library on Windows.
The fix is proposed by Vlad of MySQL.
------------------------------------------------------------------------
r6198 | vasil | 2009-11-19 09:44:31 +0200 (Thu, 19 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog entry for r6197.
------------------------------------------------------------------------
r6199 | vasil | 2009-11-19 12:10:12 +0200 (Thu, 19 Nov 2009) | 31 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/btr/btr0btr.c
M /branches/zip/data/data0type.c
branches/zip: Merge r6159:6198 from branches/5.1:
------------------------------------------------------------------------
r6187 | jyang | 2009-11-18 05:27:30 +0200 (Wed, 18 Nov 2009) | 9 lines
Changed paths:
M /branches/5.1/btr/btr0btr.c
branches/5.1: Fix bug #48469 "when innodb tablespace is
configured too small, crash and corruption!". Function
btr_create() did not check the return status of fseg_create(),
and continued the index creation even when there was not sufficient
space.
rb://205 Approved by Marko
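The control flow of this fix can be sketched roughly as follows. This is only
an illustration under stated assumptions: seg_create_stub(),
index_create_sketch() and NO_PAGE are hypothetical names invented for the
sketch, and the real btr_create() and fseg_create() take different arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinel meaning "no page could be allocated". */
#define NO_PAGE	((unsigned long) -1)

/* Hypothetical stand-in for a segment allocator: returns NULL when
the tablespace has fewer free pages than requested. */
static void*
seg_create_stub(unsigned long free_pages, unsigned long wanted)
{
	static char	block;

	return(free_pages >= wanted ? (void*) &block : NULL);
}

/* Sketch of the corrected flow: stop as soon as the segment cannot
be created, instead of continuing with a NULL block. */
static unsigned long
index_create_sketch(unsigned long free_pages)
{
	void*	block = seg_create_stub(free_pages, 1);

	if (block == NULL) {
		return(NO_PAGE);	/* out of space: report failure */
	}

	return(0);	/* root page number, fixed at 0 for the sketch */
}
```

A caller would compare the result against NO_PAGE before using it as a page
number.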
------------------------------------------------------------------------
r6188 | jyang | 2009-11-18 07:14:23 +0200 (Wed, 18 Nov 2009) | 8 lines
Changed paths:
M /branches/5.1/data/data0type.c
branches/5.1: Fix bug #48526 "Data type for float and
double is incorrectly reported in InnoDB table monitor".
Certain datatypes are not printed correctly in
dtype_print().
rb://204 Approved by Marko.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6201 | marko | 2009-11-19 14:09:11 +0200 (Thu, 19 Nov 2009) | 2 lines
Changed paths:
M /branches/zip/handler/handler0alter.cc
branches/zip: ha_innobase::add_index(): Clarify the comment
on orphaned tables when creating a primary key.
------------------------------------------------------------------------
r6202 | jyang | 2009-11-19 15:01:00 +0200 (Thu, 19 Nov 2009) | 8 lines
Changed paths:
M /branches/zip/btr/btr0btr.c
branches/zip: Function fseg_free() is no longer defined
in branches/zip. To port fix for bug #48469 to zip,
we can use btr_free_root() which frees the page,
and also does not require mini-transaction.
Approved by Marko.
------------------------------------------------------------------------
r6207 | vasil | 2009-11-20 10:19:14 +0200 (Fri, 20 Nov 2009) | 54 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: Merge r6198:6206 from branches/5.1:
(r6203 was skipped as it is already in branches/zip)
------------------------------------------------------------------------
r6200 | vasil | 2009-11-19 12:14:23 +0200 (Thu, 19 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/btr/btr0btr.c
branches/5.1:
White space fixup - indent under the opening (
------------------------------------------------------------------------
r6203 | jyang | 2009-11-19 15:12:22 +0200 (Thu, 19 Nov 2009) | 8 lines
Changed paths:
M /branches/5.1/btr/btr0btr.c
branches/5.1: Use btr_free_root() instead of fseg_free() for
the fix of bug #48469, because fseg_free() is not defined
in the zip branch. And we could save one mini-transaction started
by fseg_free().
Approved by Marko.
------------------------------------------------------------------------
r6205 | jyang | 2009-11-20 07:55:48 +0200 (Fri, 20 Nov 2009) | 11 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Add a special case to handle the Duplicated Key error
and return DB_ERROR instead. This is to avoid a possible SIGSEGV
by mysql error handling re-entering the storage layer for dup key
info without proper table handle.
This is to prevent a server crash when the error situation in bug
#45961 "DDL on partitioned innodb tables leaves data dictionary
in an inconsistent state" happens.
rb://157 approved by Sunny Bains.
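A minimal sketch of such a special case, assuming a simplified error
enumeration. The enum values and ddl_error_for_caller() are hypothetical and
only illustrate the mapping; they are not the identifiers used in
ha_innodb.cc.

```c
#include <assert.h>

/* Hypothetical error codes for the sketch. */
enum err_sketch {
	ERR_SUCCESS_SKETCH = 0,
	ERR_GENERIC_SKETCH,
	ERR_DUPLICATE_KEY_SKETCH,
	ERR_OUT_OF_SPACE_SKETCH
};

/* Map the duplicate key error to a generic error, so that the caller
does not come back asking for duplicate key details when no proper
table handle is available. All other codes pass through unchanged. */
static enum err_sketch
ddl_error_for_caller(enum err_sketch err)
{
	if (err == ERR_DUPLICATE_KEY_SKETCH) {
		return(ERR_GENERIC_SKETCH);
	}

	return(err);
}
```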
------------------------------------------------------------------------
r6206 | jyang | 2009-11-20 09:38:43 +0200 (Fri, 20 Nov 2009) | 5 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Fix a minor code formatting issue for
the parenthesis placement of the if condition in
rename_table().
------------------------------------------------------------------------
------------------------------------------------------------------------
r6208 | vasil | 2009-11-20 10:49:24 +0200 (Fri, 20 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog entry for c6207.
------------------------------------------------------------------------
r6210 | vasil | 2009-11-20 23:39:48 +0200 (Fri, 20 Nov 2009) | 3 lines
Changed paths:
M /branches/zip/trx/trx0i_s.c
branches/zip:
Whitespace fixup.
------------------------------------------------------------------------
r6248 | marko | 2009-11-30 12:19:50 +0200 (Mon, 30 Nov 2009) | 1 line
Changed paths:
M /branches/zip/ChangeLog
branches/zip: ChangeLog: Document r4922 that was forgotten.
------------------------------------------------------------------------
r6252 | marko | 2009-11-30 12:50:11 +0200 (Mon, 30 Nov 2009) | 23 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/dict/dict0boot.c
M /branches/zip/dict/dict0crea.c
M /branches/zip/dict/dict0load.c
M /branches/zip/dict/dict0mem.c
M /branches/zip/fil/fil0fil.c
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/dict0mem.h
M /branches/zip/row/row0mysql.c
branches/zip: Suppress errors about non-found temporary tables.
Write the is_temp flag to SYS_TABLES.MIX_LEN.
dict_table_t::flags: Add a flag for is_temporary, DICT_TF2_TEMPORARY.
Unlike other flags, this will not be written to the tablespace flags
or SYS_TABLES.TYPE, but only to SYS_TABLES.MIX_LEN.
dict_build_table_def_step(): Only pass DICT_TF_BITS to tablespaces.
dict_check_tablespaces_and_store_max_id(), dict_load_table():
Suppress errors about temporary tables not being found.
dict_create_sys_tables_tuple(): Write the DICT_TF2_TEMPORARY flag
to SYS_TABLES.MIX_LEN.
fil_space_create(), fil_create_new_single_table_tablespace(): Add assertions
about space->flags.
row_drop_table_for_mysql(): Do not complain about non-found temporary tables.
rb://160 approved by Heikki Tuuri. This addresses the second part of
Bug #41609 Crash recovery does not work for InnoDB temporary tables.
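The flag split described above might look roughly like this. The bit count,
the shift and all names ending in _SKETCH are assumptions made for the
illustration, not the values defined in dict0mem.h.

```c
#include <assert.h>

/* Assumed number of low flag bits that are shared with the tablespace. */
#define TF_BITS_SKETCH		6
/* Assumed position where the additional, table-only flags start. */
#define TF2_SHIFT_SKETCH	TF_BITS_SKETCH
/* Assumed value of the "temporary table" flag above the shift. */
#define TF2_TEMPORARY_SKETCH	1UL

/* Only the low bits are passed on to the tablespace. */
static unsigned long
tablespace_flags(unsigned long flags)
{
	return(flags & ~(~0UL << TF_BITS_SKETCH));
}

/* The additional flags are what would be stored in MIX_LEN. */
static unsigned long
mix_len_flags(unsigned long flags)
{
	return(flags >> TF2_SHIFT_SKETCH);
}

/* Test the temporary flag in the in-memory flags word. */
static int
table_is_temporary(unsigned long flags)
{
	return((mix_len_flags(flags) & TF2_TEMPORARY_SKETCH) != 0);
}
```

With these assumed values, a table whose low flags are 5 and which is
temporary carries 5 | (1 << 6) in memory, while its tablespace still sees 5.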
------------------------------------------------------------------------
r6263 | vasil | 2009-12-01 14:49:05 +0200 (Tue, 01 Dec 2009) | 4 lines
Changed paths:
M /branches/zip/include/univ.i
branches/zip: Increment version number from 1.0.6 to 1.0.7
1.0.6 has been released
------------------------------------------------------------------------
r6264 | vasil | 2009-12-01 16:19:44 +0200 (Tue, 01 Dec 2009) | 1 line
Changed paths:
M /branches/zip/ChangeLog
branches/zip: Add ChangeLog entry for the release of 1.0.6.
------------------------------------------------------------------------
r6269 | marko | 2009-12-02 11:35:22 +0200 (Wed, 02 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/srv/srv0start.c
branches/zip: innobase_start_or_create_for_mysql(): UNIV_IBUF_DEBUG
should not break crash recovery, but UNIV_IBUF_COUNT_DEBUG will.
------------------------------------------------------------------------
r6270 | marko | 2009-12-02 11:36:47 +0200 (Wed, 02 Dec 2009) | 1 line
Changed paths:
M /branches/zip/srv/srv0start.c
branches/zip: innobase_start_or_create_for_mysql(): Log the zlib version.
------------------------------------------------------------------------
r6271 | marko | 2009-12-02 11:43:49 +0200 (Wed, 02 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/Makefile.am
M /branches/zip/include/univ.i
M /branches/zip/plug.in
branches/zip: ChangeLog: Document that since r6270, the zlib version number
will be displayed at start-up.
------------------------------------------------------------------------
r6272 | marko | 2009-12-02 11:46:05 +0200 (Wed, 02 Dec 2009) | 1 line
Changed paths:
M /branches/zip/Makefile.am
M /branches/zip/include/univ.i
M /branches/zip/plug.in
branches/zip: Revert changes that were accidentally committed in r6271.
------------------------------------------------------------------------
r6274 | marko | 2009-12-03 14:47:12 +0200 (Thu, 03 Dec 2009) | 6 lines
Changed paths:
M /branches/zip/dict/dict0dict.c
branches/zip: dict_table_check_for_dup_indexes(): Assert that the
data dictionary mutex is being held while table->indexes is accessed.
This is already the case.
Currently, only dict_table_get_next_index() and dict_table_get_first_index()
are being invoked without holding dict_sys->mutex.
------------------------------------------------------------------------
r6275 | pekka | 2009-12-03 18:32:47 +0200 (Thu, 03 Dec 2009) | 10 lines
Changed paths:
M /branches/zip/include/log0recv.h
M /branches/zip/include/trx0sys.h
M /branches/zip/log/log0recv.c
M /branches/zip/trx/trx0sys.c
branches/zip: Minor changes which allow build with UNIV_HOTBACKUP
defined to succeed:
include/trx0sys.h: Allow Hot Backup build to see some
TRX_SYS_DOUBLEWRITE_... macros.
trx/trx0sys.c: Exclude trx_sys_close() function from Hot Backup build.
log/log0recv.[ch]: Exclude recv_sys_var_init() function from Hot Backup build.
This change should not affect !UNIV_HOTBACKUP build.
------------------------------------------------------------------------
r6277 | marko | 2009-12-08 11:13:36 +0200 (Tue, 08 Dec 2009) | 1 line
Changed paths:
M /branches/zip/fsp/fsp0fsp.c
branches/zip: fsp0fsp.c: Add some missing in/out and const qualifiers.
------------------------------------------------------------------------
r6285 | marko | 2009-12-09 09:24:50 +0200 (Wed, 09 Dec 2009) | 13 lines
Changed paths:
M /branches/zip/row/row0sel.c
branches/zip: row_sel_fetch_columns(): Remove redundant code that was
accidentally added in r1591, which introduced dfield_t::ext in order
to make the merge sort of fast index creation support externally
stored columns.
Initially, I tried to allocate the bit for dfield_t::ext from
dfield_t::len by making the length 31 bits and mapping UNIV_SQL_NULL
to something that would fit in it. Then I decided that it would be
too risky. The redundant check was part of the mapping. The
condition may have been dfield_is_null() initially.
This redundant code was noticed by Sergey Petrunya on the MySQL
internals list.
------------------------------------------------------------------------
r6288 | marko | 2009-12-09 09:51:00 +0200 (Wed, 09 Dec 2009) | 15 lines
Changed paths:
M /branches/zip/row/row0upd.c
branches/zip: row_upd_copy_columns(): Remove redundant code that was
accidentally added in r1591, which introduced dfield_t::ext in order
to make the merge sort of fast index creation support externally
stored columns.
Initially, I tried to allocate the bit for dfield_t::ext from
dfield_t::len by making the length 31 bits and mapping UNIV_SQL_NULL
to something that would fit in it. Then I decided that it would be
too risky. The redundant check was part of the mapping. The
condition may have been dfield_is_null() initially.
This is similar to the redundant code in row_sel_fetch_columns() that
was noticed by Sergey Petrunya on the MySQL internals list and removed
in r6285. As far as I can tell, there are no redundant UNIV_SQL_NULL
assignments remaining after this change.
------------------------------------------------------------------------
r6305 | marko | 2009-12-14 13:03:57 +0200 (Mon, 14 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/row/row0umod.c
branches/zip: row_undo_mod_del_unmark_sec_and_undo_update(): Add a missing
const qualifier.
------------------------------------------------------------------------
r6309 | marko | 2009-12-15 14:05:50 +0200 (Tue, 15 Dec 2009) | 3 lines
Changed paths:
M /branches/zip/lock/lock0lock.c
branches/zip: lock_rec_insert_check_and_lock(): Avoid casting away constness.
Use page_rec_get_next_const() instead. This silences a gcc 4.2.4 warning.
Reported by Sunny Bains.
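The idea of a const accessor can be sketched like this. The record type, the
sample records and both functions are hypothetical; the sketch only shows how
a const variant removes the need for a cast, not how
page_rec_get_next_const() is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical singly linked record for the sketch. */
struct rec_sketch {
	int			value;
	struct rec_sketch*	next;
};

/* Sample records: head_sketch -> tail_sketch -> NULL. */
static struct rec_sketch	tail_sketch = { 2, NULL };
static struct rec_sketch	head_sketch = { 1, &tail_sketch };

/* Const variant of the accessor: takes and returns pointers to const,
so a read-only caller never has to cast constness away. */
static const struct rec_sketch*
rec_get_next_const(const struct rec_sketch* rec)
{
	return(rec->next);
}

/* Read-only caller: works entirely on const pointers. */
static int
next_value_or(const struct rec_sketch* rec, int fallback)
{
	const struct rec_sketch*	next = rec_get_next_const(rec);

	return(next != NULL ? next->value : fallback);
}
```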
------------------------------------------------------------------------
r6312 | marko | 2009-12-16 10:10:36 +0200 (Wed, 16 Dec 2009) | 6 lines
Changed paths:
M /branches/zip/fil/fil0fil.c
branches/zip: fil_close(): Add #ifndef UNIV_HOTBACKUP around a debug
assertion on mutex.magic_n. InnoDB Hot Backup is a single-threaded
program and does not contain mutexes. This change allows InnoDB Hot
Backup to be compiled with UNIV_DEBUG.
Suggested by Michael Izioumtchenko.
------------------------------------------------------------------------
r6321 | marko | 2009-12-16 16:16:33 +0200 (Wed, 16 Dec 2009) | 4 lines
Changed paths:
M /branches/zip/row/row0merge.c
branches/zip: row_merge_drop_temp_indexes(): Revert a hack to
transaction isolation level that was made unnecessary by r5826 (Issue #337).
When this function is called, any active data dictionary transaction
should have been rolled back.
------------------------------------------------------------------------
r6345 | marko | 2009-12-21 10:46:14 +0200 (Mon, 21 Dec 2009) | 7 lines
Changed paths:
M /branches/zip/log/log0recv.c
branches/zip: recv_scan_log_recs(): Non-functional change: Replace a
debug assertion ut_ad(len > 0) with ut_ad(len >= OS_FILE_LOG_BLOCK_SIZE).
This change is only for readability, for Issue #428. Another
assertion on len being an integer multiple of OS_FILE_LOG_BLOCK_SIZE
already ensured together with the old ut_ad(len > 0) that actually len
must be at least OS_FILE_LOG_BLOCK_SIZE.
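The equivalence claimed above can be checked with a small sketch. The block
size of 512 bytes is an assumption of the sketch, and old_checks(),
new_checks() and checks_agree() are hypothetical helpers, not InnoDB
functions.

```c
#include <assert.h>

/* Assumed log block size for the sketch. */
#define LOG_BLOCK_SIZE_SKETCH	512UL

/* Old pair of assertions: len > 0 and len is a whole number of blocks. */
static int
old_checks(unsigned long len)
{
	return(len > 0 && len % LOG_BLOCK_SIZE_SKETCH == 0);
}

/* New pair: len is at least one block and a whole number of blocks. */
static int
new_checks(unsigned long len)
{
	return(len >= LOG_BLOCK_SIZE_SKETCH
	       && len % LOG_BLOCK_SIZE_SKETCH == 0);
}

/* Return 1 if both pairs accept exactly the same lengths below limit. */
static int
checks_agree(unsigned long limit)
{
	unsigned long	len;

	for (len = 0; len < limit; len++) {
		if (old_checks(len) != new_checks(len)) {
			return(0);
		}
	}

	return(1);
}
```

Any positive multiple of the block size is at least one block, so the two
pairs never disagree.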
------------------------------------------------------------------------
r6346 | marko | 2009-12-21 12:03:25 +0200 (Mon, 21 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/log/log0recv.c
branches/zip: recv_recovery_from_checkpoint_finish():
Revert a change that was accidentally committed in r6345.
------------------------------------------------------------------------
r6348 | marko | 2009-12-22 11:04:34 +0200 (Tue, 22 Dec 2009) | 37 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/ha_prototypes.h
M /branches/zip/include/trx0trx.h
M /branches/zip/lock/lock0lock.c
M /branches/zip/trx/trx0i_s.c
M /branches/zip/trx/trx0trx.c
branches/zip: Merge a change from MySQL:
------------------------------------------------------------
revno: 3236
committer: Satya B <satya.bn@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Tue 2009-12-01 17:48:57 +0530
message:
merge to mysql-5.1-bugteam
------------------------------------------------------------
revno: 3234.1.1
committer: Gleb Shchepa <gshchepa@mysql.com>
branch nick: mysql-5.1-bugteam
timestamp: Tue 2009-12-01 14:38:40 +0400
message:
Bug #38883 (reopened): thd_security_context is not thread safe, crashes?
manual merge 5.0-->5.1, updating InnoDB plugin.
------------------------------------------------------------
revno: 1810.3968.13
committer: Gleb Shchepa <gshchepa@mysql.com>
branch nick: mysql-5.0-bugteam
timestamp: Tue 2009-12-01 14:24:44 +0400
message:
Bug #38883 (reopened): thd_security_context is not thread safe, crashes?
The bug 38816 changed the lock that protects THD::query from
LOCK_thread_count to LOCK_thd_data, but didn't update the associated
InnoDB functions.
1. The innobase_mysql_prepare_print_arbitrary_thd and the
innobase_mysql_end_print_arbitrary_thd InnoDB functions have been
removed, since now we have a per-thread mutex: now we don't need to wrap
several inter-thread access tries to THD::query with a single global
LOCK_thread_count lock, so we can simplify the code.
2. The innobase_mysql_print_thd function has been modified to lock
LOCK_thd_data directly.
------------------------------------------------------------------------
r6351 | marko | 2009-12-22 11:11:18 +0200 (Tue, 22 Dec 2009) | 1 line
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: Remove an obsolete declaration of LOCK_thread_count.
------------------------------------------------------------------------
r6352 | marko | 2009-12-22 12:33:01 +0200 (Tue, 22 Dec 2009) | 104 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/lock0lock.h
M /branches/zip/include/srv0srv.h
M /branches/zip/lock/lock0lock.c
M /branches/zip/log/log0log.c
M /branches/zip/mysql-test/innodb-autoinc.result
M /branches/zip/mysql-test/innodb-autoinc.test
M /branches/zip/row/row0sel.c
M /branches/zip/srv/srv0srv.c
M /branches/zip/srv/srv0start.c
branches/zip: Merge revisions 6206:6350 from branches/5.1,
except r6347, r6349, r6350 which were committed separately
to both branches, and r6310, which was backported from zip to 5.1.
------------------------------------------------------------------------
r6206 | jyang | 2009-11-20 09:38:43 +0200 (Fri, 20 Nov 2009) | 3 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Non-functional change, fix formatting.
------------------------------------------------------------------------
r6230 | sunny | 2009-11-24 23:52:43 +0200 (Tue, 24 Nov 2009) | 3 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
branches/5.1: Fix autoinc failing test results.
(this should be skipped when merging 5.1 into zip)
------------------------------------------------------------------------
r6231 | sunny | 2009-11-25 10:26:27 +0200 (Wed, 25 Nov 2009) | 7 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
M /branches/5.1/row/row0sel.c
branches/5.1: Fix BUG#49032 - auto_increment field does not initialize to last value in InnoDB Storage Engine.
We use the appropriate function to read the column value for non-integer
autoinc column types, namely float and double.
rb://208. Approved by Marko.
------------------------------------------------------------------------
r6232 | sunny | 2009-11-25 10:27:39 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix white space errors.
------------------------------------------------------------------------
r6233 | sunny | 2009-11-25 10:28:35 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/include/mach0data.h
M /branches/5.1/include/mach0data.ic
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix tests and make read float/double arg const.
------------------------------------------------------------------------
r6234 | sunny | 2009-11-25 10:29:03 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix whitespace issues.
------------------------------------------------------------------------
r6235 | sunny | 2009-11-26 01:14:42 +0200 (Thu, 26 Nov 2009) | 9 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Fix Bug#47720 - REPLACE INTO Autoincrement column with negative values.
This bug is similar to the negative autoinc filter patch from earlier,
with the additional handling of filtering out the negative column values
set explicitly by the user.
rb://184
Approved by Heikki.
------------------------------------------------------------------------
r6242 | vasil | 2009-11-27 22:07:12 +0200 (Fri, 27 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/export.sh
branches/5.1:
Minor changes to support plugin snapshots.
------------------------------------------------------------------------
r6306 | calvin | 2009-12-14 15:12:46 +0200 (Mon, 14 Dec 2009) | 5 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: fix bug#49267: innodb-autoinc.test fails on windows
because of different case mode
There is no change to the InnoDB code, only to fix test case by
changing "T1" to "t1".
------------------------------------------------------------------------
r6324 | jyang | 2009-12-17 06:54:24 +0200 (Thu, 17 Dec 2009) | 8 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/include/lock0lock.h
M /branches/5.1/include/srv0srv.h
M /branches/5.1/lock/lock0lock.c
M /branches/5.1/log/log0log.c
M /branches/5.1/srv/srv0srv.c
M /branches/5.1/srv/srv0start.c
branches/5.1: Fix bug #47814 - Diagnostics are frequently not
printed after a long lock wait in InnoDB. Separate out the
lock wait timeout check thread from monitor information
printing thread.
rb://200 Approved by Marko.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6364 | marko | 2009-12-26 21:06:31 +0200 (Sat, 26 Dec 2009) | 4 lines
Changed paths:
M /branches/zip/ibuf/ibuf0ibuf.c
branches/zip: ibuf_bitmap_get_map_page():
Define a wrapper macro that passes __FILE__, __LINE__ of the caller
to buf_page_get_gen().
This will ease the diagnosis of the likes of Issue #135.
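A wrapper macro of this kind can be sketched as follows. get_page_low(),
get_page() and the two static variables are hypothetical; the real
buf_page_get_gen() takes more parameters than shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Call site recorded by the most recent page request. */
static const char*	last_file = NULL;
static unsigned long	last_line = 0;

/* Hypothetical low-level getter that receives the caller's file name
and line number for diagnostics. */
static unsigned long
get_page_low(unsigned long page_no, const char* file, unsigned long line)
{
	last_file = file;
	last_line = line;

	return(page_no);
}

/* Wrapper macro: it expands at the call site, so __FILE__ and __LINE__
name the caller rather than the helper function. */
#define get_page(page_no)	get_page_low(page_no, __FILE__, __LINE__)
```

Had the wrapper been a function instead of a macro, every request would
report the wrapper's own file and line.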
------------------------------------------------------------------------
branches/innodb+: Merge revisions 6130:6364 from branches/zip:
------------------------------------------------------------------------
r6130 | marko | 2009-11-02 11:42:56 +0200 (Mon, 02 Nov 2009) | 9 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/btr/btr0sea.c
M /branches/zip/buf/buf0buf.c
M /branches/zip/dict/dict0dict.c
M /branches/zip/fil/fil0fil.c
M /branches/zip/ibuf/ibuf0ibuf.c
M /branches/zip/include/btr0sea.h
M /branches/zip/include/dict0dict.h
M /branches/zip/include/fil0fil.h
M /branches/zip/include/ibuf0ibuf.h
M /branches/zip/include/lock0lock.h
M /branches/zip/include/log0log.h
M /branches/zip/include/log0recv.h
M /branches/zip/include/mem0mem.h
M /branches/zip/include/mem0pool.h
M /branches/zip/include/os0file.h
M /branches/zip/include/pars0pars.h
M /branches/zip/include/srv0srv.h
M /branches/zip/include/thr0loc.h
M /branches/zip/include/trx0i_s.h
M /branches/zip/include/trx0purge.h
M /branches/zip/include/trx0rseg.h
M /branches/zip/include/trx0sys.h
M /branches/zip/include/trx0undo.h
M /branches/zip/include/usr0sess.h
M /branches/zip/lock/lock0lock.c
M /branches/zip/log/log0log.c
M /branches/zip/log/log0recv.c
M /branches/zip/mem/mem0dbg.c
M /branches/zip/mem/mem0pool.c
M /branches/zip/os/os0file.c
M /branches/zip/os/os0sync.c
M /branches/zip/os/os0thread.c
M /branches/zip/pars/lexyy.c
M /branches/zip/pars/pars0lex.l
M /branches/zip/que/que0que.c
M /branches/zip/srv/srv0srv.c
M /branches/zip/srv/srv0start.c
M /branches/zip/sync/sync0arr.c
M /branches/zip/sync/sync0sync.c
M /branches/zip/thr/thr0loc.c
M /branches/zip/trx/trx0i_s.c
M /branches/zip/trx/trx0purge.c
M /branches/zip/trx/trx0rseg.c
M /branches/zip/trx/trx0sys.c
M /branches/zip/trx/trx0undo.c
M /branches/zip/usr/usr0sess.c
M /branches/zip/ut/ut0mem.c
branches/zip: Free all resources at shutdown. Set pointers to NULL, so
that Valgrind will not complain about freed data structures that are
reachable via pointers. This addresses Bug #45992 and Bug #46656.
This patch is mostly based on changes copied from branches/embedded-1.0,
mainly c5432, c3439, c3134, c2994, c2978, but also some other code was
copied. Some added cleanup code is specific to MySQL/InnoDB.
rb://199 approved by Sunny Bains
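The shutdown pattern described above, in a minimal sketch. sys_sketch,
sys_init_sketch() and sys_close_sketch() are hypothetical names; the actual
cleanup functions in the patch are spread over many modules.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical global subsystem state, allocated at startup. */
static int*	sys_sketch = NULL;

/* Allocate the subsystem state. Returns 1 on success, 0 on failure. */
static int
sys_init_sketch(void)
{
	sys_sketch = malloc(sizeof *sys_sketch);

	if (sys_sketch == NULL) {
		return(0);
	}

	*sys_sketch = 1;

	return(1);
}

/* Free the state at shutdown and reset the pointer, so that no freed
memory stays reachable through a global pointer. */
static void
sys_close_sketch(void)
{
	free(sys_sketch);
	sys_sketch = NULL;
}
```

Because free(NULL) is a no-op, calling sys_close_sketch() a second time is
harmless.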
------------------------------------------------------------------------
r6134 | marko | 2009-11-04 09:57:29 +0200 (Wed, 04 Nov 2009) | 5 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/ha_innodb.cc
branches/zip: innobase_convert_identifier(): Convert table names with
explain_filename() to address Bug #32430: 'show innodb status'
causes errors Invalid (old?) table or database name in logs.
rb://134 approved by Sunny Bains
------------------------------------------------------------------------
r6137 | marko | 2009-11-04 15:24:28 +0200 (Wed, 04 Nov 2009) | 1 line
Changed paths:
M /branches/zip/dict/dict0dict.c
branches/zip: dict_index_too_big_for_undo(): Correct a typo.
------------------------------------------------------------------------
r6153 | vasil | 2009-11-10 15:33:22 +0200 (Tue, 10 Nov 2009) | 145 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: Merge r6125:6152 from branches/5.1:
(everything except the last white-space change was skipped as it is already
in branches/zip)
------------------------------------------------------------------------
r6127 | vasil | 2009-10-30 11:18:25 +0200 (Fri, 30 Oct 2009) | 18 lines
Changed paths:
M /branches/5.1/Makefile.am
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Backport c6121 from branches/zip:
------------------------------------------------------------------------
r6121 | sunny | 2009-10-30 01:42:11 +0200 (Fri, 30 Oct 2009) | 7 lines
Changed paths:
M /branches/zip/mysql-test/innodb-autoinc.result
branches/zip: This test has been problematic for some time now. The underlying
bug is that the data dictionaries get out of sync. In the AUTOINC code we
try and apply salve to the symptoms. In the past MySQL made some unrelated
change and the dictionaries stopped getting out of sync and this test started
to fail. Now, it seems they have reverted that change and the test is
passing again. I suspect this is not the last time that this test will change.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6129 | vasil | 2009-10-30 17:14:22 +0200 (Fri, 30 Oct 2009) | 4 lines
Changed paths:
M /branches/5.1/Makefile.am
branches/5.1:
Revert a change to Makefile.am that sneaked unnoticed in c6127.
------------------------------------------------------------------------
r6136 | marko | 2009-11-04 12:28:10 +0200 (Wed, 04 Nov 2009) | 15 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/include/ha_prototypes.h
M /branches/5.1/ut/ut0ut.c
branches/5.1: Port r6134 from branches/zip:
------------------------------------------------------------------------
r6134 | marko | 2009-11-04 07:57:29 +0000 (Wed, 04 Nov 2009) | 5 lines
branches/zip: innobase_convert_identifier(): Convert table names with
explain_filename() to address Bug #32430: 'show innodb status'
causes errors Invalid (old?) table or database name in logs.
rb://134 approved by Sunny Bains
------------------------------------------------------------------------
innobase_print_identifier(): Replace with innobase_convert_name().
innobase_convert_identifier(): New function, called by innobase_convert_name().
------------------------------------------------------------------------
r6149 | vasil | 2009-11-09 11:15:01 +0200 (Mon, 09 Nov 2009) | 5 lines
Changed paths:
M /branches/5.1/CMakeLists.txt
branches/5.1:
Followup to r5700: Adjust the changes so they are the same as in the BZR
repository.
------------------------------------------------------------------------
r6150 | vasil | 2009-11-09 11:43:31 +0200 (Mon, 09 Nov 2009) | 58 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a part of r2911.5.5 from MySQL:
(the other part of this was merged in c5700)
------------------------------------------------------------
revno: 2911.5.5
committer: Vladislav Vaintroub <vvaintroub@mysql.com>
branch nick: 5.1-innodb_plugin
timestamp: Wed 2009-06-10 10:59:49 +0200
message:
Backport WL#3653 to 5.1 to enable bundled innodb plugin.
Remove custom DLL loader code from innodb plugin code, use
symbols exported from mysqld.
removed:
storage/innodb_plugin/handler/handler0vars.h
storage/innodb_plugin/handler/win_delay_loader.cc
added:
storage/mysql_storage_engine.cmake
win/create_def_file.js
modified:
CMakeLists.txt
include/m_ctype.h
include/my_global.h
include/my_sys.h
include/mysql/plugin.h
libmysqld/CMakeLists.txt
mysql-test/mysql-test-run.pl
mysql-test/t/plugin.test
mysql-test/t/plugin_load-master.opt
mysys/charset.c
sql/CMakeLists.txt
sql/handler.h
sql/mysql_priv.h
sql/mysqld.cc
sql/sql_class.cc
sql/sql_class.h
sql/sql_list.h
sql/sql_profile.h
storage/Makefile.am
storage/archive/CMakeLists.txt
storage/blackhole/CMakeLists.txt
storage/csv/CMakeLists.txt
storage/example/CMakeLists.txt
storage/federated/CMakeLists.txt
storage/heap/CMakeLists.txt
storage/innobase/CMakeLists.txt
storage/innobase/handler/ha_innodb.cc
storage/innodb_plugin/CMakeLists.txt
storage/innodb_plugin/handler/ha_innodb.cc
storage/innodb_plugin/handler/handler0alter.cc
storage/innodb_plugin/handler/i_s.cc
storage/innodb_plugin/plug.in
storage/myisam/CMakeLists.txt
storage/myisammrg/CMakeLists.txt
win/Makefile.am
win/configure.js
------------------------------------------------------------------------
r6152 | vasil | 2009-11-10 15:30:20 +0200 (Tue, 10 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
White space fixup.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6157 | jyang | 2009-11-11 14:27:09 +0200 (Wed, 11 Nov 2009) | 10 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
A /branches/zip/mysql-test/innodb_bug47167.result
A /branches/zip/mysql-test/innodb_bug47167.test
M /branches/zip/mysql-test/innodb_file_format.result
branches/zip: Fix an issue where a local variable defined
in innodb_file_format_check_validate() is referenced
across functions, in innodb_file_format_check_update().
In addition, fix the "set global innodb_file_format_check =
DEFAULT" call.
Bug #47167: "set global innodb_file_format_check" cannot
set value by User-Defined Variable."
rb://169 approved by Sunny Bains and Marko.
------------------------------------------------------------------------
r6159 | vasil | 2009-11-11 15:13:01 +0200 (Wed, 11 Nov 2009) | 37 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/handler/ha_innodb.h
branches/zip:
Merge a change from MySQL:
(this has been reviewed by Calvin and Marko, and Calvin says Luis has
incorporated Marko's suggestions)
------------------------------------------------------------
revno: 3092.5.1
committer: Luis Soares <luis.soares@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Thu 2009-09-24 15:52:52 +0100
message:
BUG#42829: binlogging enabled for all schemas regardless of
binlog-do-db / binlog-ignore-db
InnoDB will return an error if statement based replication is used
along with transaction isolation level READ-COMMITTED (or weaker),
even if the statement in question is filtered out according to the
binlog-do-db rules set. In this case, an error should not be printed.
This patch addresses this issue by extending the existing check in
external_lock to take into account the filter rules before deciding to
print an error. Furthermore, it also changes decide_logging_format to
take into consideration whether the statement is filtered out from
binlog before decision is made.
added:
mysql-test/suite/binlog/r/binlog_stm_do_db.result
mysql-test/suite/binlog/t/binlog_stm_do_db-master.opt
mysql-test/suite/binlog/t/binlog_stm_do_db.test
modified:
sql/sql_base.cc
sql/sql_class.cc
storage/innobase/handler/ha_innodb.cc
storage/innobase/handler/ha_innodb.h
storage/innodb_plugin/handler/ha_innodb.cc
storage/innodb_plugin/handler/ha_innodb.h
------------------------------------------------------------------------
r6160 | vasil | 2009-11-11 15:33:49 +0200 (Wed, 11 Nov 2009) | 72 lines
Changed paths:
M /branches/zip/include/os0file.h
M /branches/zip/os/os0file.c
branches/zip: Merge r6152:6159 from branches/5.1:
(r6158 was skipped as an equivalent change has already been merged from MySQL)
------------------------------------------------------------------------
r6154 | calvin | 2009-11-11 02:51:17 +0200 (Wed, 11 Nov 2009) | 17 lines
Changed paths:
M /branches/5.1/include/os0file.h
M /branches/5.1/os/os0file.c
branches/5.1: fix bug#3139: Mysql crashes: 'windows error 995'
after several selects on a large DB
Under stress, Windows AIO may fail with error code
ERROR_OPERATION_ABORTED. InnoDB does not handle the error and
crashes instead. The cause of the error is unknown, but is likely
faulty hardware or a faulty driver.
This patch introduces a new error code OS_FILE_OPERATION_ABORTED,
which maps to Windows ERROR_OPERATION_ABORTED (995). When the error
is detected during AIO, InnoDB will issue a synchronous retry
(read/write).
This patch has been extensively tested by MySQL support.
Approved by: Marko
rb://196
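The retry described in r6154 can be sketched roughly as follows. This is a
portable illustration, not the actual os0file.c code: the io_status values,
the io_fn callback type and io_with_sync_retry() are hypothetical names, and
only the value 995 is taken from the message above.

```c
#include <assert.h>

/* Hypothetical status codes for this sketch. 995 mirrors the Windows
ERROR_OPERATION_ABORTED value quoted in the log message. */
enum io_status {
    IO_OK = 0,
    IO_OTHER_ERROR = 1,
    IO_OPERATION_ABORTED = 995
};

/* An I/O request is modelled as a callback that returns a status. */
typedef enum io_status (*io_fn)(void *ctx);

/* Run the asynchronous request. If it reports "operation aborted",
fall back to a single synchronous attempt. Any other status is
passed through unchanged. */
static enum io_status
io_with_sync_retry(io_fn async_io, io_fn sync_io, void *ctx)
{
    enum io_status st = async_io(ctx);

    if (st == IO_OPERATION_ABORTED) {
        st = sync_io(ctx);
    }

    return st;
}

/* Stubs that only count how often they were called. */
static enum io_status
stub_aborted(void *ctx)
{
    ++*(int *) ctx;
    return IO_OPERATION_ABORTED;
}

static enum io_status
stub_ok(void *ctx)
{
    ++*(int *) ctx;
    return IO_OK;
}

static enum io_status
stub_fail(void *ctx)
{
    ++*(int *) ctx;
    return IO_OTHER_ERROR;
}
```

In this model only the aborted status triggers the fallback; other failures
are still reported to the caller.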
------------------------------------------------------------------------
r6158 | vasil | 2009-11-11 14:52:14 +0200 (Wed, 11 Nov 2009) | 37 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/handler/ha_innodb.h
branches/5.1:
Merge a change from MySQL:
(this has been reviewed by Calvin and Marko, and Calvin says Luis has
incorporated Marko's suggestions)
------------------------------------------------------------
revno: 3092.5.1
committer: Luis Soares <luis.soares@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Thu 2009-09-24 15:52:52 +0100
message:
BUG#42829: binlogging enabled for all schemas regardless of
binlog-do-db / binlog-ignore-db
InnoDB will return an error if statement based replication is used
along with transaction isolation level READ-COMMITTED (or weaker),
even if the statement in question is filtered out according to the
binlog-do-db rules set. In this case, an error should not be printed.
This patch addresses this issue by extending the existing check in
external_lock to take into account the filter rules before deciding to
print an error. Furthermore, it also changes decide_logging_format to
take into consideration whether the statement is filtered out from
binlog before decision is made.
added:
mysql-test/suite/binlog/r/binlog_stm_do_db.result
mysql-test/suite/binlog/t/binlog_stm_do_db-master.opt
mysql-test/suite/binlog/t/binlog_stm_do_db.test
modified:
sql/sql_base.cc
sql/sql_class.cc
storage/innobase/handler/ha_innodb.cc
storage/innobase/handler/ha_innodb.h
storage/innodb_plugin/handler/ha_innodb.cc
storage/innodb_plugin/handler/ha_innodb.h
------------------------------------------------------------------------
------------------------------------------------------------------------
r6161 | vasil | 2009-11-11 15:36:16 +0200 (Wed, 11 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add changelog entry for r6160.
------------------------------------------------------------------------
r6162 | vasil | 2009-11-11 16:00:12 +0200 (Wed, 11 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog for r6157.
------------------------------------------------------------------------
r6163 | calvin | 2009-11-11 17:53:20 +0200 (Wed, 11 Nov 2009) | 8 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/handler/ha_innodb.h
branches/zip: Exclude thd_binlog_filter_ok() when building
with older versions of MySQL.
thd_binlog_filter_ok() was introduced in MySQL 5.1.41, but the
plugin can also be built with MySQL versions prior to 5.1.41.
Approved by Heikki (on IM).
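A version guard of the kind r6163 describes might look like the sketch below.
It is an assumption-laden illustration, not the committed code:
filter_ok_stub() and have_binlog_filter_check() are hypothetical, and the
fallback MYSQL_VERSION_ID exists only so that the fragment compiles without
the MySQL headers.

```c
#include <assert.h>

/* MYSQL_VERSION_ID normally comes from the MySQL headers. The fallback
value below is an assumption made for this standalone sketch. */
#ifndef MYSQL_VERSION_ID
# define MYSQL_VERSION_ID 50137
#endif

#if MYSQL_VERSION_ID >= 50141
/* Hypothetical stand-in for the server-side filter check; nonzero
means the statement passes the binlog filter rules. */
static int
filter_ok_stub(void)
{
    return 1;
}
#endif

/* Returns 1 when the filter-aware check is compiled in, 0 when the
code was built against a server older than 5.1.41. */
static int
have_binlog_filter_check(void)
{
#if MYSQL_VERSION_ID >= 50141
    return filter_ok_stub();
#else
    return 0;
#endif
}
```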
------------------------------------------------------------------------
r6169 | calvin | 2009-11-12 14:40:43 +0200 (Thu, 12 Nov 2009) | 6 lines
Changed paths:
A /branches/zip/mysql-test/innodb_bug46676.result
A /branches/zip/mysql-test/innodb_bug46676.test
branches/zip: add test case for bug#46676
This crash is reproducible with InnoDB plugin 1.0.4 + MySQL 5.1.37.
But no longer reproducible after MySQL 5.1.38 (with plugin 1.0.5).
Add test case to catch future regression.
------------------------------------------------------------------------
r6170 | marko | 2009-11-12 15:49:08 +0200 (Thu, 12 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/db0err.h
M /branches/zip/row/row0merge.c
M /branches/zip/row/row0mysql.c
branches/zip: Allow CREATE INDEX to be interrupted. (Issue #354)
rb://183 approved by Heikki Tuuri
------------------------------------------------------------------------
r6175 | vasil | 2009-11-16 20:07:39 +0200 (Mon, 16 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Wrap line at 78th char in the ChangeLog
------------------------------------------------------------------------
r6177 | calvin | 2009-11-16 20:20:38 +0200 (Mon, 16 Nov 2009) | 2 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip: add an entry to ChangeLog for r6065
------------------------------------------------------------------------
r6179 | marko | 2009-11-17 10:19:34 +0200 (Tue, 17 Nov 2009) | 2 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: ha_innobase::change_active_index(): When the history is
missing, report it to the client, not to the error log.
------------------------------------------------------------------------
r6181 | vasil | 2009-11-17 12:21:41 +0200 (Tue, 17 Nov 2009) | 33 lines
Changed paths:
M /branches/zip/mysql-test/innodb-index.test
branches/zip:
At the end of innodb-index.test: restore the environment as it was before
the test was started to silence this warning:
MTR's internal check of the test case 'main.innodb-index' failed.
This means that the test case does not preserve the state that existed
before the test case was executed. Most likely the test case did not
do a proper clean-up.
This is the diff of the states of the servers before and after the
test case was executed:
mysqltest: Logging to '/tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.log'.
mysqltest: Results saved in '/tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.result'.
mysqltest: Connecting to server localhost:13000 (socket /tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/mysqld.1.sock) as 'root', connection 'default', attempt 0 ...
mysqltest: ... Connected.
mysqltest: Start processing test commands from './include/check-testcase.test' ...
mysqltest: ... Done processing test commands.
--- /tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.result 2009-11-17 13:10:40.000000000 +0300
+++ /tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.reject 2009-11-17 13:10:54.000000000 +0300
@@ -84,7 +84,7 @@
INNODB_DOUBLEWRITE ON
INNODB_FAST_SHUTDOWN 1
INNODB_FILE_FORMAT Antelope
-INNODB_FILE_FORMAT_CHECK Antelope
+INNODB_FILE_FORMAT_CHECK Barracuda
INNODB_FILE_PER_TABLE OFF
INNODB_FLUSH_LOG_AT_TRX_COMMIT 1
INNODB_FLUSH_METHOD
mysqltest: Result content mismatch
not ok
------------------------------------------------------------------------
r6182 | marko | 2009-11-17 13:49:15 +0200 (Tue, 17 Nov 2009) | 1 line
Changed paths:
M /branches/zip/mysql-test/innodb-consistent-master.opt
M /branches/zip/mysql-test/innodb-consistent.result
M /branches/zip/mysql-test/innodb-consistent.test
M /branches/zip/mysql-test/innodb-use-sys-malloc-master.opt
M /branches/zip/mysql-test/innodb-use-sys-malloc.result
M /branches/zip/mysql-test/innodb-use-sys-malloc.test
M /branches/zip/mysql-test/innodb_bug21704.result
M /branches/zip/mysql-test/innodb_bug21704.test
M /branches/zip/mysql-test/innodb_bug40360.test
M /branches/zip/mysql-test/innodb_bug40565.result
M /branches/zip/mysql-test/innodb_bug40565.test
M /branches/zip/mysql-test/innodb_bug41904.result
M /branches/zip/mysql-test/innodb_bug41904.test
M /branches/zip/mysql-test/innodb_bug42101-nonzero-master.opt
M /branches/zip/mysql-test/innodb_bug42101-nonzero.result
M /branches/zip/mysql-test/innodb_bug42101-nonzero.test
M /branches/zip/mysql-test/innodb_bug42101.result
M /branches/zip/mysql-test/innodb_bug42101.test
M /branches/zip/mysql-test/innodb_bug44032.result
M /branches/zip/mysql-test/innodb_bug44032.test
M /branches/zip/mysql-test/innodb_bug44369.result
M /branches/zip/mysql-test/innodb_bug44369.test
M /branches/zip/mysql-test/innodb_bug44571.result
M /branches/zip/mysql-test/innodb_bug44571.test
M /branches/zip/mysql-test/innodb_bug45357.test
M /branches/zip/mysql-test/innodb_bug46000.result
M /branches/zip/mysql-test/innodb_bug46000.test
M /branches/zip/mysql-test/innodb_bug46676.result
M /branches/zip/mysql-test/innodb_bug46676.test
M /branches/zip/mysql-test/innodb_bug47167.result
M /branches/zip/mysql-test/innodb_bug47167.test
M /branches/zip/mysql-test/innodb_bug47777.result
M /branches/zip/mysql-test/innodb_bug47777.test
M /branches/zip/mysql-test/innodb_file_format.result
M /branches/zip/mysql-test/innodb_file_format.test
branches/zip: Set svn:eol-style on mysql-test files.
------------------------------------------------------------------------
r6183 | marko | 2009-11-17 13:51:16 +0200 (Tue, 17 Nov 2009) | 1 line
Changed paths:
M /branches/zip/mysql-test/innodb-consistent-master.opt
M /branches/zip/mysql-test/innodb-master.opt
M /branches/zip/mysql-test/innodb-semi-consistent-master.opt
M /branches/zip/mysql-test/innodb-use-sys-malloc-master.opt
M /branches/zip/mysql-test/innodb_bug42101-nonzero-master.opt
branches/zip: Prepend loose_ to plugin-only mysql-test options.
------------------------------------------------------------------------
r6184 | marko | 2009-11-17 13:52:01 +0200 (Tue, 17 Nov 2009) | 1 line
Changed paths:
M /branches/zip/mysql-test/innodb-index.result
M /branches/zip/mysql-test/innodb-index.test
branches/zip: innodb-index.test: Restore innodb_file_format_check.
------------------------------------------------------------------------
r6185 | marko | 2009-11-17 16:44:20 +0200 (Tue, 17 Nov 2009) | 16 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/mysql-test/innodb.result
M /branches/zip/mysql-test/innodb.test
M /branches/zip/mysql-test/innodb_bug44369.result
M /branches/zip/mysql-test/innodb_bug44369.test
D /branches/zip/mysql-test/patches/innodb-index.diff
M /branches/zip/row/row0mysql.c
branches/zip: Report duplicate table names
to the client connection, not to the error log. This change will allow
innodb-index.test to be re-enabled. It was previously disabled, because
mysql-test-run does not like output in the error log.
row_create_table_for_mysql(): Do not output anything to the error log
when reporting DB_DUPLICATE_KEY. Let the caller report the error.
Add a TODO comment that the dict_table_t object is apparently not freed
when an error occurs.
create_table_def(): Convert InnoDB table names to the character set
of the client connection for reporting. Use my_error(ER_WRONG_COLUMN_NAME)
for reporting reserved column names. Report my_error(ER_TABLE_EXISTS_ERROR)
when row_create_table_for_mysql() returns DB_DUPLICATE_KEY.
rb://206
------------------------------------------------------------------------
r6186 | vasil | 2009-11-17 16:48:14 +0200 (Tue, 17 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog entry for r6185.
------------------------------------------------------------------------
r6189 | marko | 2009-11-18 11:36:18 +0200 (Wed, 18 Nov 2009) | 5 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/handler0alter.cc
branches/zip: ha_innobase::add_index(): When creating the primary key
and the table is being locked by another transaction,
do not attempt to drop the table. (Bug #48782)
Approved by Sunny Bains over IM
------------------------------------------------------------------------
r6194 | vasil | 2009-11-19 09:24:45 +0200 (Thu, 19 Nov 2009) | 5 lines
Changed paths:
M /branches/zip/include/univ.i
branches/zip:
Increment version number from 1.0.5 to 1.0.6 since 1.0.5 was just released
by MySQL and we will soon release 1.0.6.
------------------------------------------------------------------------
r6197 | calvin | 2009-11-19 09:32:55 +0200 (Thu, 19 Nov 2009) | 6 lines
Changed paths:
M /branches/zip/CMakeLists.txt
branches/zip: merge the fix of bug#48317 (CMake file)
Due to MySQL changes to the CMake files, it is no longer possible
to build the InnoDB plugin as a static library on Windows.
The fix is proposed by Vlad of MySQL.
------------------------------------------------------------------------
r6198 | vasil | 2009-11-19 09:44:31 +0200 (Thu, 19 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog entry for r6197.
------------------------------------------------------------------------
r6199 | vasil | 2009-11-19 12:10:12 +0200 (Thu, 19 Nov 2009) | 31 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/btr/btr0btr.c
M /branches/zip/data/data0type.c
branches/zip: Merge r6159:6198 from branches/5.1:
------------------------------------------------------------------------
r6187 | jyang | 2009-11-18 05:27:30 +0200 (Wed, 18 Nov 2009) | 9 lines
Changed paths:
M /branches/5.1/btr/btr0btr.c
branches/5.1: Fix bug #48469 "when innodb tablespace is
configured too small, crash and corruption!". Function
btr_create() did not check the return status of fseg_create(),
and continued the index creation even when there was not sufficient
space.
rb://205 Approved by Marko
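The missing check can be illustrated with a small model. Everything here is
hypothetical (SKETCH_PAGE_NULL, seg_create_fn, sketch_index_create() and the
stubs); it only shows the pattern of testing the allocator's result before
continuing, not the real btr_create() or fseg_create() signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "no page" marker returned when index creation fails. */
#define SKETCH_PAGE_NULL 0xFFFFFFFFUL

/* Hypothetical segment allocator: returns NULL when the tablespace has
no room left, otherwise a pointer to the allocated page number. */
typedef const unsigned long *(*seg_create_fn)(void);

static unsigned long
sketch_index_create(seg_create_fn seg_create)
{
    const unsigned long *root = seg_create();

    if (root == NULL) {
        /* Out of space: report failure instead of building an
        index on a page that was never allocated. */
        return SKETCH_PAGE_NULL;
    }

    return *root;
}

/* Stub allocators used to exercise both outcomes. */
static const unsigned long *
seg_full(void)
{
    return NULL;
}

static const unsigned long *
seg_ok(void)
{
    static const unsigned long page_no = 42;
    return &page_no;
}
```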
------------------------------------------------------------------------
r6188 | jyang | 2009-11-18 07:14:23 +0200 (Wed, 18 Nov 2009) | 8 lines
Changed paths:
M /branches/5.1/data/data0type.c
branches/5.1: Fix bug #48526 "Data type for float and
double is incorrectly reported in InnoDB table monitor".
Certain datatypes are not printed correctly in
dtype_print().
rb://204 Approved by Marko.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6201 | marko | 2009-11-19 14:09:11 +0200 (Thu, 19 Nov 2009) | 2 lines
Changed paths:
M /branches/zip/handler/handler0alter.cc
branches/zip: ha_innobase::add_index(): Clarify the comment
on orphaned tables when creating a primary key.
------------------------------------------------------------------------
r6202 | jyang | 2009-11-19 15:01:00 +0200 (Thu, 19 Nov 2009) | 8 lines
Changed paths:
M /branches/zip/btr/btr0btr.c
branches/zip: Function fseg_free() is no longer defined
in branches/zip. To port the fix for bug #48469 to zip,
we can use btr_free_root(), which frees the page
and also does not require a mini-transaction.
Approved by Marko.
------------------------------------------------------------------------
r6207 | vasil | 2009-11-20 10:19:14 +0200 (Fri, 20 Nov 2009) | 54 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: Merge r6198:6206 from branches/5.1:
(r6203 was skipped as it is already in branches/zip)
------------------------------------------------------------------------
r6200 | vasil | 2009-11-19 12:14:23 +0200 (Thu, 19 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/btr/btr0btr.c
branches/5.1:
White space fixup - indent under the opening (
------------------------------------------------------------------------
r6203 | jyang | 2009-11-19 15:12:22 +0200 (Thu, 19 Nov 2009) | 8 lines
Changed paths:
M /branches/5.1/btr/btr0btr.c
branches/5.1: Use btr_free_root() instead of fseg_free() for
the fix of bug #48469, because fseg_free() is not defined
in the zip branch. And we could save one mini-transaction started
by fseg_free().
Approved by Marko.
------------------------------------------------------------------------
r6205 | jyang | 2009-11-20 07:55:48 +0200 (Fri, 20 Nov 2009) | 11 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Add a special case to handle the Duplicate Key error
and return DB_ERROR instead. This is to avoid a possible SIGSEGV
caused by MySQL error handling re-entering the storage layer for dup key
info without a proper table handle.
This is to prevent a server crash when the error situation in bug
#45961 "DDL on partitioned innodb tables leaves data dictionary
in an inconsistent state" happens.
rb://157 approved by Sunny Bains.
------------------------------------------------------------------------
r6206 | jyang | 2009-11-20 09:38:43 +0200 (Fri, 20 Nov 2009) | 5 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Fix a minor code formatting issue in
the parenthesis placement of the if condition in
rename_table().
------------------------------------------------------------------------
------------------------------------------------------------------------
r6208 | vasil | 2009-11-20 10:49:24 +0200 (Fri, 20 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog entry for c6207.
------------------------------------------------------------------------
r6210 | vasil | 2009-11-20 23:39:48 +0200 (Fri, 20 Nov 2009) | 3 lines
Changed paths:
M /branches/zip/trx/trx0i_s.c
branches/zip:
Whitespace fixup.
------------------------------------------------------------------------
r6248 | marko | 2009-11-30 12:19:50 +0200 (Mon, 30 Nov 2009) | 1 line
Changed paths:
M /branches/zip/ChangeLog
branches/zip: ChangeLog: Document r4922 that was forgotten.
------------------------------------------------------------------------
r6252 | marko | 2009-11-30 12:50:11 +0200 (Mon, 30 Nov 2009) | 23 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/dict/dict0boot.c
M /branches/zip/dict/dict0crea.c
M /branches/zip/dict/dict0load.c
M /branches/zip/dict/dict0mem.c
M /branches/zip/fil/fil0fil.c
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/dict0mem.h
M /branches/zip/row/row0mysql.c
branches/zip: Suppress errors about non-found temporary tables.
Write the is_temp flag to SYS_TABLES.MIX_LEN.
dict_table_t::flags: Add a flag for is_temporary, DICT_TF2_TEMPORARY.
Unlike other flags, this will not be written to the tablespace flags
or SYS_TABLES.TYPE, but only to SYS_TABLES.MIX_LEN.
dict_build_table_def_step(): Only pass DICT_TF_BITS to tablespaces.
dict_check_tablespaces_and_store_max_id(), dict_load_table():
Suppress errors about temporary tables not being found.
dict_create_sys_tables_tuple(): Write the DICT_TF2_TEMPORARY flag
to SYS_TABLES.MIX_LEN.
fil_space_create(), fil_create_new_single_table_tablespace(): Add assertions
about space->flags.
row_drop_table_for_mysql(): Do not complain about non-found temporary tables.
rb://160 approved by Heikki Tuuri. This addresses the second part of
Bug #41609 Crash recovery does not work for InnoDB temporary tables.
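The flag split described in r6252 can be modelled as below. The bit width and
all SKETCH_ names are assumptions chosen for illustration; the real values of
DICT_TF_BITS and DICT_TF2_TEMPORARY are defined in dict0mem.h and are not
quoted in this log.

```c
#include <assert.h>

/* Hypothetical layout: the low SKETCH_TF_BITS bits are the flags that
go to the tablespace and SYS_TABLES.TYPE. Bits above them are the
"TF2" flags, which are written only to SYS_TABLES.MIX_LEN. */
#define SKETCH_TF_BITS       6
#define SKETCH_TF2_SHIFT     SKETCH_TF_BITS
#define SKETCH_TF2_TEMPORARY 1UL

/* Flags that would be passed on to the tablespace. */
static unsigned long
sketch_tablespace_flags(unsigned long flags)
{
    return flags & ((1UL << SKETCH_TF_BITS) - 1);
}

/* Flags that would be stored in MIX_LEN. */
static unsigned long
sketch_mix_len_flags(unsigned long flags)
{
    return flags >> SKETCH_TF2_SHIFT;
}
```

With this layout the temporary bit never reaches the tablespace flags, which
matches the intent stated for dict_build_table_def_step().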
------------------------------------------------------------------------
r6263 | vasil | 2009-12-01 14:49:05 +0200 (Tue, 01 Dec 2009) | 4 lines
Changed paths:
M /branches/zip/include/univ.i
branches/zip: Increment version number from 1.0.6 to 1.0.7
1.0.6 has been released
------------------------------------------------------------------------
r6264 | vasil | 2009-12-01 16:19:44 +0200 (Tue, 01 Dec 2009) | 1 line
Changed paths:
M /branches/zip/ChangeLog
branches/zip: Add ChangeLog entry for the release of 1.0.6.
------------------------------------------------------------------------
r6269 | marko | 2009-12-02 11:35:22 +0200 (Wed, 02 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/srv/srv0start.c
branches/zip: innobase_start_or_create_for_mysql(): UNIV_IBUF_DEBUG
should not break crash recovery, but UNIV_IBUF_COUNT_DEBUG will.
------------------------------------------------------------------------
r6270 | marko | 2009-12-02 11:36:47 +0200 (Wed, 02 Dec 2009) | 1 line
Changed paths:
M /branches/zip/srv/srv0start.c
branches/zip: innobase_start_or_create_for_mysql(): Log the zlib version.
------------------------------------------------------------------------
r6271 | marko | 2009-12-02 11:43:49 +0200 (Wed, 02 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/Makefile.am
M /branches/zip/include/univ.i
M /branches/zip/plug.in
branches/zip: ChangeLog: Document that since r6270, the zlib version number
will be displayed at start-up.
------------------------------------------------------------------------
r6272 | marko | 2009-12-02 11:46:05 +0200 (Wed, 02 Dec 2009) | 1 line
Changed paths:
M /branches/zip/Makefile.am
M /branches/zip/include/univ.i
M /branches/zip/plug.in
branches/zip: Revert changes that were accidentally committed in r6271.
------------------------------------------------------------------------
r6274 | marko | 2009-12-03 14:47:12 +0200 (Thu, 03 Dec 2009) | 6 lines
Changed paths:
M /branches/zip/dict/dict0dict.c
branches/zip: dict_table_check_for_dup_indexes(): Assert that the
data dictionary mutex is being held while table->indexes is accessed.
This is already the case.
Currently, only dict_table_get_next_index() and dict_table_get_first_index()
are being invoked without holding dict_sys->mutex.
------------------------------------------------------------------------
r6275 | pekka | 2009-12-03 18:32:47 +0200 (Thu, 03 Dec 2009) | 10 lines
Changed paths:
M /branches/zip/include/log0recv.h
M /branches/zip/include/trx0sys.h
M /branches/zip/log/log0recv.c
M /branches/zip/trx/trx0sys.c
branches/zip: Minor changes which allow build with UNIV_HOTBACKUP
defined to succeed:
include/trx0sys.h: Allow Hot Backup build to see some
TRX_SYS_DOUBLEWRITE_... macros.
trx/trx0sys.c: Exclude trx_sys_close() function from Hot Backup build.
log/log0recv.[ch]: Exclude recv_sys_var_init() function from Hot Backup build.
This change should not affect !UNIV_HOTBACKUP build.
------------------------------------------------------------------------
r6277 | marko | 2009-12-08 11:13:36 +0200 (Tue, 08 Dec 2009) | 1 line
Changed paths:
M /branches/zip/fsp/fsp0fsp.c
branches/zip: fsp0fsp.c: Add some missing in/out and const qualifiers.
------------------------------------------------------------------------
r6285 | marko | 2009-12-09 09:24:50 +0200 (Wed, 09 Dec 2009) | 13 lines
Changed paths:
M /branches/zip/row/row0sel.c
branches/zip: row_sel_fetch_columns(): Remove redundant code that was
accidentally added in r1591, which introduced dfield_t::ext in order
to make the merge sort of fast index creation support externally
stored columns.
Initially, I tried to allocate the bit for dfield_t::ext from
dfield_t::len by making the length 31 bits and mapping UNIV_SQL_NULL
to something that would fit in it. Then I decided that it would be
too risky. The redundant check was part of the mapping. The
condition may have been dfield_is_null() initially.
This redundant code was noticed by Sergey Petrunya on the MySQL
internals list.
------------------------------------------------------------------------
r6288 | marko | 2009-12-09 09:51:00 +0200 (Wed, 09 Dec 2009) | 15 lines
Changed paths:
M /branches/zip/row/row0upd.c
branches/zip: row_upd_copy_columns(): Remove redundant code that was
accidentally added in r1591, which introduced dfield_t::ext in order
to make the merge sort of fast index creation support externally
stored columns.
Initially, I tried to allocate the bit for dfield_t::ext from
dfield_t::len by making the length 31 bits and mapping UNIV_SQL_NULL
to something that would fit in it. Then I decided that it would be
too risky. The redundant check was part of the mapping. The
condition may have been dfield_is_null() initially.
This is similar to the redundant code in row_sel_fetch_columns() that
was noticed by Sergey Petrunya on the MySQL internals list and removed
in r6285. As far as I can tell, there are no redundant UNIV_SQL_NULL
assignments remaining after this change.
------------------------------------------------------------------------
r6305 | marko | 2009-12-14 13:03:57 +0200 (Mon, 14 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/row/row0umod.c
branches/zip: row_undo_mod_del_unmark_sec_and_undo_update(): Add a missing
const qualifier.
------------------------------------------------------------------------
r6309 | marko | 2009-12-15 14:05:50 +0200 (Tue, 15 Dec 2009) | 3 lines
Changed paths:
M /branches/zip/lock/lock0lock.c
branches/zip: lock_rec_insert_check_and_lock(): Avoid casting away constness.
Use page_rec_get_next_const() instead. This silences a gcc 4.2.4 warning.
Reported by Sunny Bains.
------------------------------------------------------------------------
r6312 | marko | 2009-12-16 10:10:36 +0200 (Wed, 16 Dec 2009) | 6 lines
Changed paths:
M /branches/zip/fil/fil0fil.c
branches/zip: fil_close(): Add #ifndef UNIV_HOTBACKUP around a debug
assertion on mutex.magic_n. InnoDB Hot Backup is a single-threaded
program and does not contain mutexes. This change allows InnoDB Hot
Backup to be compiled with UNIV_DEBUG.
Suggested by Michael Izioumtchenko.
------------------------------------------------------------------------
r6321 | marko | 2009-12-16 16:16:33 +0200 (Wed, 16 Dec 2009) | 4 lines
Changed paths:
M /branches/zip/row/row0merge.c
branches/zip: row_merge_drop_temp_indexes(): Revert a hack to
transaction isolation level that was made unnecessary by r5826 (Issue #337).
When this function is called, any active data dictionary transaction
should have been rolled back.
------------------------------------------------------------------------
r6345 | marko | 2009-12-21 10:46:14 +0200 (Mon, 21 Dec 2009) | 7 lines
Changed paths:
M /branches/zip/log/log0recv.c
branches/zip: recv_scan_log_recs(): Non-functional change: Replace a
debug assertion ut_ad(len > 0) with ut_ad(len >= OS_FILE_LOG_BLOCK_SIZE).
This change is only for readability, for Issue #428. Another
assertion on len being an integer multiple of OS_FILE_LOG_BLOCK_SIZE
already ensured together with the old ut_ad(len > 0) that actually len
must be at least OS_FILE_LOG_BLOCK_SIZE.
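The reasoning in r6345 can be checked mechanically with a brute-force sketch.
old_checks_imply_new() is a hypothetical helper written for this note; it is
not part of log0recv.c, and the block size is passed in as a parameter rather
than taken from the InnoDB headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true if, for every len in [0, max_len], the old pair of
checks (len > 0 and len a multiple of block_size) implies the new
check len >= block_size. block_size must be nonzero. */
static bool
old_checks_imply_new(unsigned long block_size, unsigned long max_len)
{
    unsigned long len;

    for (len = 0; len <= max_len; len++) {
        bool old_ok = len > 0 && len % block_size == 0;
        bool new_ok = len >= block_size;

        if (old_ok && !new_ok) {
            return false;
        }
    }

    return true;
}
```

The smallest positive multiple of the block size is the block size itself,
so the implication holds for any nonzero block size.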
------------------------------------------------------------------------
r6346 | marko | 2009-12-21 12:03:25 +0200 (Mon, 21 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/log/log0recv.c
branches/zip: recv_recovery_from_checkpoint_finish():
Revert a change that was accidentally committed in r6345.
------------------------------------------------------------------------
r6348 | marko | 2009-12-22 11:04:34 +0200 (Tue, 22 Dec 2009) | 37 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/ha_prototypes.h
M /branches/zip/include/trx0trx.h
M /branches/zip/lock/lock0lock.c
M /branches/zip/trx/trx0i_s.c
M /branches/zip/trx/trx0trx.c
branches/zip: Merge a change from MySQL:
------------------------------------------------------------
revno: 3236
committer: Satya B <satya.bn@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Tue 2009-12-01 17:48:57 +0530
message:
merge to mysql-5.1-bugteam
------------------------------------------------------------
revno: 3234.1.1
committer: Gleb Shchepa <gshchepa@mysql.com>
branch nick: mysql-5.1-bugteam
timestamp: Tue 2009-12-01 14:38:40 +0400
message:
Bug #38883 (reopened): thd_security_context is not thread safe, crashes?
manual merge 5.0-->5.1, updating InnoDB plugin.
------------------------------------------------------------
revno: 1810.3968.13
committer: Gleb Shchepa <gshchepa@mysql.com>
branch nick: mysql-5.0-bugteam
timestamp: Tue 2009-12-01 14:24:44 +0400
message:
Bug #38883 (reopened): thd_security_context is not thread safe, crashes?
The bug 38816 changed the lock that protects THD::query from
LOCK_thread_count to LOCK_thd_data, but didn't update the associated
InnoDB functions.
1. The innobase_mysql_prepare_print_arbitrary_thd and the
innobase_mysql_end_print_arbitrary_thd InnoDB functions have been
removed, since now we have a per-thread mutex: now we don't need to wrap
several inter-thread access tries to THD::query with a single global
LOCK_thread_count lock, so we can simplify the code.
2. The innobase_mysql_print_thd function has been modified to lock
LOCK_thd_data directly.
------------------------------------------------------------------------
r6351 | marko | 2009-12-22 11:11:18 +0200 (Tue, 22 Dec 2009) | 1 line
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: Remove an obsolete declaration of LOCK_thread_count.
------------------------------------------------------------------------
r6352 | marko | 2009-12-22 12:33:01 +0200 (Tue, 22 Dec 2009) | 104 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/lock0lock.h
M /branches/zip/include/srv0srv.h
M /branches/zip/lock/lock0lock.c
M /branches/zip/log/log0log.c
M /branches/zip/mysql-test/innodb-autoinc.result
M /branches/zip/mysql-test/innodb-autoinc.test
M /branches/zip/row/row0sel.c
M /branches/zip/srv/srv0srv.c
M /branches/zip/srv/srv0start.c
branches/zip: Merge revisions 6206:6350 from branches/5.1,
except r6347, r6349, r6350 which were committed separately
to both branches, and r6310, which was backported from zip to 5.1.
------------------------------------------------------------------------
r6206 | jyang | 2009-11-20 09:38:43 +0200 (Fri, 20 Nov 2009) | 3 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Non-functional change, fix formatting.
------------------------------------------------------------------------
r6230 | sunny | 2009-11-24 23:52:43 +0200 (Tue, 24 Nov 2009) | 3 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
branches/5.1: Fix autoinc failing test results.
(this should be skipped when merging 5.1 into zip)
------------------------------------------------------------------------
r6231 | sunny | 2009-11-25 10:26:27 +0200 (Wed, 25 Nov 2009) | 7 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
M /branches/5.1/row/row0sel.c
branches/5.1: Fix BUG#49032 - auto_increment field does not initialize to last value in InnoDB Storage Engine.
We use the appropriate function to read the column value for non-integer
autoinc column types, namely float and double.
rb://208. Approved by Marko.
------------------------------------------------------------------------
r6232 | sunny | 2009-11-25 10:27:39 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix white space errors.
------------------------------------------------------------------------
r6233 | sunny | 2009-11-25 10:28:35 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/include/mach0data.h
M /branches/5.1/include/mach0data.ic
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix tests and make read float/double arg const.
------------------------------------------------------------------------
r6234 | sunny | 2009-11-25 10:29:03 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix whitespace issues.
------------------------------------------------------------------------
r6235 | sunny | 2009-11-26 01:14:42 +0200 (Thu, 26 Nov 2009) | 9 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Fix Bug#47720 - REPLACE INTO Autoincrement column with negative values.
This bug is similar to the negative autoinc filter patch from earlier,
with the additional handling of filtering out the negative column values
set explicitly by the user.
rb://184
Approved by Heikki.
------------------------------------------------------------------------
r6242 | vasil | 2009-11-27 22:07:12 +0200 (Fri, 27 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/export.sh
branches/5.1:
Minor changes to support plugin snapshots.
------------------------------------------------------------------------
r6306 | calvin | 2009-12-14 15:12:46 +0200 (Mon, 14 Dec 2009) | 5 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: fix bug#49267: innodb-autoinc.test fails on windows
because of different case mode
There is no change to the InnoDB code, only to fix test case by
changing "T1" to "t1".
------------------------------------------------------------------------
r6324 | jyang | 2009-12-17 06:54:24 +0200 (Thu, 17 Dec 2009) | 8 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/include/lock0lock.h
M /branches/5.1/include/srv0srv.h
M /branches/5.1/lock/lock0lock.c
M /branches/5.1/log/log0log.c
M /branches/5.1/srv/srv0srv.c
M /branches/5.1/srv/srv0start.c
branches/5.1: Fix bug #47814 - Diagnostics are frequently not
printed after a long lock wait in InnoDB. Separate out the
lock wait timeout check thread from monitor information
printing thread.
rb://200 Approved by Marko.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6364 | marko | 2009-12-26 21:06:31 +0200 (Sat, 26 Dec 2009) | 4 lines
Changed paths:
M /branches/zip/ibuf/ibuf0ibuf.c
branches/zip: ibuf_bitmap_get_map_page():
Define a wrapper macro that passes __FILE__, __LINE__ of the caller
to buf_page_get_gen().
This will ease the diagnosis of the likes of Issue #135.
------------------------------------------------------------------------
branches/innodb+: Merge revisions 6130:6364 from branches/zip:
------------------------------------------------------------------------
r6130 | marko | 2009-11-02 11:42:56 +0200 (Mon, 02 Nov 2009) | 9 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/btr/btr0sea.c
M /branches/zip/buf/buf0buf.c
M /branches/zip/dict/dict0dict.c
M /branches/zip/fil/fil0fil.c
M /branches/zip/ibuf/ibuf0ibuf.c
M /branches/zip/include/btr0sea.h
M /branches/zip/include/dict0dict.h
M /branches/zip/include/fil0fil.h
M /branches/zip/include/ibuf0ibuf.h
M /branches/zip/include/lock0lock.h
M /branches/zip/include/log0log.h
M /branches/zip/include/log0recv.h
M /branches/zip/include/mem0mem.h
M /branches/zip/include/mem0pool.h
M /branches/zip/include/os0file.h
M /branches/zip/include/pars0pars.h
M /branches/zip/include/srv0srv.h
M /branches/zip/include/thr0loc.h
M /branches/zip/include/trx0i_s.h
M /branches/zip/include/trx0purge.h
M /branches/zip/include/trx0rseg.h
M /branches/zip/include/trx0sys.h
M /branches/zip/include/trx0undo.h
M /branches/zip/include/usr0sess.h
M /branches/zip/lock/lock0lock.c
M /branches/zip/log/log0log.c
M /branches/zip/log/log0recv.c
M /branches/zip/mem/mem0dbg.c
M /branches/zip/mem/mem0pool.c
M /branches/zip/os/os0file.c
M /branches/zip/os/os0sync.c
M /branches/zip/os/os0thread.c
M /branches/zip/pars/lexyy.c
M /branches/zip/pars/pars0lex.l
M /branches/zip/que/que0que.c
M /branches/zip/srv/srv0srv.c
M /branches/zip/srv/srv0start.c
M /branches/zip/sync/sync0arr.c
M /branches/zip/sync/sync0sync.c
M /branches/zip/thr/thr0loc.c
M /branches/zip/trx/trx0i_s.c
M /branches/zip/trx/trx0purge.c
M /branches/zip/trx/trx0rseg.c
M /branches/zip/trx/trx0sys.c
M /branches/zip/trx/trx0undo.c
M /branches/zip/usr/usr0sess.c
M /branches/zip/ut/ut0mem.c
branches/zip: Free all resources at shutdown. Set pointers to NULL, so
that Valgrind will not complain about freed data structures that are
reachable via pointers. This addresses Bug #45992 and Bug #46656.
This patch is mostly based on changes copied from branches/embedded-1.0,
mainly c5432, c3439, c3134, c2994, c2978, but also some other code was
copied. Some added cleanup code is specific to MySQL/InnoDB.
rb://199 approved by Sunny Bains
------------------------------------------------------------------------
r6134 | marko | 2009-11-04 09:57:29 +0200 (Wed, 04 Nov 2009) | 5 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/ha_innodb.cc
branches/zip: innobase_convert_identifier(): Convert table names with
explain_filename() to address Bug #32430: 'show innodb status'
causes errors Invalid (old?) table or database name in logs.
rb://134 approved by Sunny Bains
------------------------------------------------------------------------
r6137 | marko | 2009-11-04 15:24:28 +0200 (Wed, 04 Nov 2009) | 1 line
Changed paths:
M /branches/zip/dict/dict0dict.c
branches/zip: dict_index_too_big_for_undo(): Correct a typo.
------------------------------------------------------------------------
r6153 | vasil | 2009-11-10 15:33:22 +0200 (Tue, 10 Nov 2009) | 145 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: Merge r6125:6152 from branches/5.1:
(everything except the last white-space change was skipped as it is already
in branches/zip)
------------------------------------------------------------------------
r6127 | vasil | 2009-10-30 11:18:25 +0200 (Fri, 30 Oct 2009) | 18 lines
Changed paths:
M /branches/5.1/Makefile.am
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Backport c6121 from branches/zip:
------------------------------------------------------------------------
r6121 | sunny | 2009-10-30 01:42:11 +0200 (Fri, 30 Oct 2009) | 7 lines
Changed paths:
M /branches/zip/mysql-test/innodb-autoinc.result
branches/zip: This test has been problematic for some time now. The underlying
bug is that the data dictionaries get out of sync. In the AUTOINC code we
try and apply salve to the symptoms. In the past MySQL made some unrelated
change and the dictionaries stopped getting out of sync and this test started
to fail. Now, it seems they have reverted that change and the test is
passing again. I suspect this is not the last time that this test will change.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6129 | vasil | 2009-10-30 17:14:22 +0200 (Fri, 30 Oct 2009) | 4 lines
Changed paths:
M /branches/5.1/Makefile.am
branches/5.1:
Revert a change to Makefile.am that sneaked unnoticed in c6127.
------------------------------------------------------------------------
r6136 | marko | 2009-11-04 12:28:10 +0200 (Wed, 04 Nov 2009) | 15 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/include/ha_prototypes.h
M /branches/5.1/ut/ut0ut.c
branches/5.1: Port r6134 from branches/zip:
------------------------------------------------------------------------
r6134 | marko | 2009-11-04 07:57:29 +0000 (Wed, 04 Nov 2009) | 5 lines
branches/zip: innobase_convert_identifier(): Convert table names with
explain_filename() to address Bug #32430: 'show innodb status'
causes errors Invalid (old?) table or database name in logs.
rb://134 approved by Sunny Bains
------------------------------------------------------------------------
innobase_print_identifier(): Replace with innobase_convert_name().
innobase_convert_identifier(): New function, called by innobase_convert_name().
------------------------------------------------------------------------
r6149 | vasil | 2009-11-09 11:15:01 +0200 (Mon, 09 Nov 2009) | 5 lines
Changed paths:
M /branches/5.1/CMakeLists.txt
branches/5.1:
Followup to r5700: Adjust the changes so they are the same as in the BZR
repository.
------------------------------------------------------------------------
r6150 | vasil | 2009-11-09 11:43:31 +0200 (Mon, 09 Nov 2009) | 58 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a part of r2911.5.5 from MySQL:
(the other part of this was merged in c5700)
------------------------------------------------------------
revno: 2911.5.5
committer: Vladislav Vaintroub <vvaintroub@mysql.com>
branch nick: 5.1-innodb_plugin
timestamp: Wed 2009-06-10 10:59:49 +0200
message:
Backport WL#3653 to 5.1 to enable bundled innodb plugin.
Remove custom DLL loader code from innodb plugin code, use
symbols exported from mysqld.
removed:
storage/innodb_plugin/handler/handler0vars.h
storage/innodb_plugin/handler/win_delay_loader.cc
added:
storage/mysql_storage_engine.cmake
win/create_def_file.js
modified:
CMakeLists.txt
include/m_ctype.h
include/my_global.h
include/my_sys.h
include/mysql/plugin.h
libmysqld/CMakeLists.txt
mysql-test/mysql-test-run.pl
mysql-test/t/plugin.test
mysql-test/t/plugin_load-master.opt
mysys/charset.c
sql/CMakeLists.txt
sql/handler.h
sql/mysql_priv.h
sql/mysqld.cc
sql/sql_class.cc
sql/sql_class.h
sql/sql_list.h
sql/sql_profile.h
storage/Makefile.am
storage/archive/CMakeLists.txt
storage/blackhole/CMakeLists.txt
storage/csv/CMakeLists.txt
storage/example/CMakeLists.txt
storage/federated/CMakeLists.txt
storage/heap/CMakeLists.txt
storage/innobase/CMakeLists.txt
storage/innobase/handler/ha_innodb.cc
storage/innodb_plugin/CMakeLists.txt
storage/innodb_plugin/handler/ha_innodb.cc
storage/innodb_plugin/handler/handler0alter.cc
storage/innodb_plugin/handler/i_s.cc
storage/innodb_plugin/plug.in
storage/myisam/CMakeLists.txt
storage/myisammrg/CMakeLists.txt
win/Makefile.am
win/configure.js
------------------------------------------------------------------------
r6152 | vasil | 2009-11-10 15:30:20 +0200 (Tue, 10 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
White space fixup.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6157 | jyang | 2009-11-11 14:27:09 +0200 (Wed, 11 Nov 2009) | 10 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
A /branches/zip/mysql-test/innodb_bug47167.result
A /branches/zip/mysql-test/innodb_bug47167.test
M /branches/zip/mysql-test/innodb_file_format.result
branches/zip: Fix an issue where a local variable defined
in innodb_file_format_check_validate() is referenced
from another function, innodb_file_format_check_update().
In addition, fix "set global innodb_file_format_check =
DEFAULT" call.
Bug #47167: "set global innodb_file_format_check" cannot
set value by User-Defined Variable."
rb://169 approved by Sunny Bains and Marko.
------------------------------------------------------------------------
r6159 | vasil | 2009-11-11 15:13:01 +0200 (Wed, 11 Nov 2009) | 37 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/handler/ha_innodb.h
branches/zip:
Merge a change from MySQL:
(this has been reviewed by Calvin and Marko, and Calvin says Luis has
incorporated Marko's suggestions)
------------------------------------------------------------
revno: 3092.5.1
committer: Luis Soares <luis.soares@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Thu 2009-09-24 15:52:52 +0100
message:
BUG#42829: binlogging enabled for all schemas regardless of
binlog-do-db / binlog-ignore-db
InnoDB will return an error if statement based replication is used
along with transaction isolation level READ-COMMITTED (or weaker),
even if the statement in question is filtered out according to the
binlog-do-db rules set. In this case, an error should not be printed.
This patch addresses this issue by extending the existing check in
external_lock to take into account the filter rules before deciding to
print an error. Furthermore, it also changes decide_logging_format to
take into consideration whether the statement is filtered out from
binlog before decision is made.
added:
mysql-test/suite/binlog/r/binlog_stm_do_db.result
mysql-test/suite/binlog/t/binlog_stm_do_db-master.opt
mysql-test/suite/binlog/t/binlog_stm_do_db.test
modified:
sql/sql_base.cc
sql/sql_class.cc
storage/innobase/handler/ha_innodb.cc
storage/innobase/handler/ha_innodb.h
storage/innodb_plugin/handler/ha_innodb.cc
storage/innodb_plugin/handler/ha_innodb.h
------------------------------------------------------------------------
r6160 | vasil | 2009-11-11 15:33:49 +0200 (Wed, 11 Nov 2009) | 72 lines
Changed paths:
M /branches/zip/include/os0file.h
M /branches/zip/os/os0file.c
branches/zip: Merge r6152:6159 from branches/5.1:
(r6158 was skipped as an equivalent change has already been merged from MySQL)
------------------------------------------------------------------------
r6154 | calvin | 2009-11-11 02:51:17 +0200 (Wed, 11 Nov 2009) | 17 lines
Changed paths:
M /branches/5.1/include/os0file.h
M /branches/5.1/os/os0file.c
branches/5.1: fix bug#3139: Mysql crashes: 'windows error 995'
after several selects on a large DB
Under stress, Windows AIO may fail with error code
ERROR_OPERATION_ABORTED. InnoDB does not handle the error, but rather
crashes. The cause of the error is unknown, but likely due to
faulty hardware or driver.
This patch introduces a new error code OS_FILE_OPERATION_ABORTED,
which maps to Windows ERROR_OPERATION_ABORTED (995). When the error
is detected during AIO, the InnoDB will issue a synchronous retry
(read/write).
This patch has been extensively tested by MySQL support.
Approved by: Marko
rb://196
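
The retry behaviour described above can be sketched as follows. This is a minimal Python model, not the actual os0file.c change: the exception class and the do_io() helper are hypothetical names introduced for illustration, and only the error number 995 comes from the message.

```python
# Illustrative model only; names below are hypothetical, not InnoDB's API.
ERROR_OPERATION_ABORTED = 995  # Windows error code named in the message above


class OperationAborted(OSError):
    """Stand-in for an asynchronous request that failed with error 995."""


def do_io(async_io, sync_io):
    """Try the asynchronous path; on an aborted operation, retry synchronously."""
    try:
        return async_io()
    except OperationAborted:
        # Only the aborted case falls back to the synchronous call;
        # any other error still propagates unchanged.
        return sync_io()
```

The point of the sketch is that only the aborted case is retried; other failures are left to the existing error handling.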
------------------------------------------------------------------------
r6158 | vasil | 2009-11-11 14:52:14 +0200 (Wed, 11 Nov 2009) | 37 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/handler/ha_innodb.h
branches/5.1:
Merge a change from MySQL:
(this has been reviewed by Calvin and Marko, and Calvin says Luis has
incorporated Marko's suggestions)
------------------------------------------------------------
revno: 3092.5.1
committer: Luis Soares <luis.soares@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Thu 2009-09-24 15:52:52 +0100
message:
BUG#42829: binlogging enabled for all schemas regardless of
binlog-do-db / binlog-ignore-db
InnoDB will return an error if statement based replication is used
along with transaction isolation level READ-COMMITTED (or weaker),
even if the statement in question is filtered out according to the
binlog-do-db rules set. In this case, an error should not be printed.
This patch addresses this issue by extending the existing check in
external_lock to take into account the filter rules before deciding to
print an error. Furthermore, it also changes decide_logging_format to
take into consideration whether the statement is filtered out from
binlog before decision is made.
added:
mysql-test/suite/binlog/r/binlog_stm_do_db.result
mysql-test/suite/binlog/t/binlog_stm_do_db-master.opt
mysql-test/suite/binlog/t/binlog_stm_do_db.test
modified:
sql/sql_base.cc
sql/sql_class.cc
storage/innobase/handler/ha_innodb.cc
storage/innobase/handler/ha_innodb.h
storage/innodb_plugin/handler/ha_innodb.cc
storage/innodb_plugin/handler/ha_innodb.h
------------------------------------------------------------------------
------------------------------------------------------------------------
r6161 | vasil | 2009-11-11 15:36:16 +0200 (Wed, 11 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add changelog entry for r6160.
------------------------------------------------------------------------
r6162 | vasil | 2009-11-11 16:00:12 +0200 (Wed, 11 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog for r6157.
------------------------------------------------------------------------
r6163 | calvin | 2009-11-11 17:53:20 +0200 (Wed, 11 Nov 2009) | 8 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/handler/ha_innodb.h
branches/zip: Exclude thd_binlog_filter_ok() when building
with older version of MySQL.
thd_binlog_filter_ok() is introduced in MySQL 5.1.41. But the
plugin can be built with MySQL prior to 5.1.41.
Approved by Heikki (on IM).
------------------------------------------------------------------------
r6169 | calvin | 2009-11-12 14:40:43 +0200 (Thu, 12 Nov 2009) | 6 lines
Changed paths:
A /branches/zip/mysql-test/innodb_bug46676.result
A /branches/zip/mysql-test/innodb_bug46676.test
branches/zip: add test case for bug#46676
This crash is reproducible with InnoDB plugin 1.0.4 + MySQL 5.1.37.
But no longer reproducible after MySQL 5.1.38 (with plugin 1.0.5).
Add test case to catch future regression.
------------------------------------------------------------------------
r6170 | marko | 2009-11-12 15:49:08 +0200 (Thu, 12 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/db0err.h
M /branches/zip/row/row0merge.c
M /branches/zip/row/row0mysql.c
branches/zip: Allow CREATE INDEX to be interrupted. (Issue #354)
rb://183 approved by Heikki Tuuri
------------------------------------------------------------------------
r6175 | vasil | 2009-11-16 20:07:39 +0200 (Mon, 16 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Wrap line at 78th char in the ChangeLog
------------------------------------------------------------------------
r6177 | calvin | 2009-11-16 20:20:38 +0200 (Mon, 16 Nov 2009) | 2 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip: add an entry to ChangeLog for r6065
------------------------------------------------------------------------
r6179 | marko | 2009-11-17 10:19:34 +0200 (Tue, 17 Nov 2009) | 2 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: ha_innobase::change_active_index(): When the history is
missing, report it to the client, not to the error log.
------------------------------------------------------------------------
r6181 | vasil | 2009-11-17 12:21:41 +0200 (Tue, 17 Nov 2009) | 33 lines
Changed paths:
M /branches/zip/mysql-test/innodb-index.test
branches/zip:
At the end of innodb-index.test: restore the environment as it was before
the test was started to silence this warning:
MTR's internal check of the test case 'main.innodb-index' failed.
This means that the test case does not preserve the state that existed
before the test case was executed. Most likely the test case did not
do a proper clean-up.
This is the diff of the states of the servers before and after the
test case was executed:
mysqltest: Logging to '/tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.log'.
mysqltest: Results saved in '/tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.result'.
mysqltest: Connecting to server localhost:13000 (socket /tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/mysqld.1.sock) as 'root', connection 'default', attempt 0 ...
mysqltest: ... Connected.
mysqltest: Start processing test commands from './include/check-testcase.test' ...
mysqltest: ... Done processing test commands.
--- /tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.result 2009-11-17 13:10:40.000000000 +0300
+++ /tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.reject 2009-11-17 13:10:54.000000000 +0300
@@ -84,7 +84,7 @@
INNODB_DOUBLEWRITE ON
INNODB_FAST_SHUTDOWN 1
INNODB_FILE_FORMAT Antelope
-INNODB_FILE_FORMAT_CHECK Antelope
+INNODB_FILE_FORMAT_CHECK Barracuda
INNODB_FILE_PER_TABLE OFF
INNODB_FLUSH_LOG_AT_TRX_COMMIT 1
INNODB_FLUSH_METHOD
mysqltest: Result content mismatch
not ok
------------------------------------------------------------------------
r6182 | marko | 2009-11-17 13:49:15 +0200 (Tue, 17 Nov 2009) | 1 line
Changed paths:
M /branches/zip/mysql-test/innodb-consistent-master.opt
M /branches/zip/mysql-test/innodb-consistent.result
M /branches/zip/mysql-test/innodb-consistent.test
M /branches/zip/mysql-test/innodb-use-sys-malloc-master.opt
M /branches/zip/mysql-test/innodb-use-sys-malloc.result
M /branches/zip/mysql-test/innodb-use-sys-malloc.test
M /branches/zip/mysql-test/innodb_bug21704.result
M /branches/zip/mysql-test/innodb_bug21704.test
M /branches/zip/mysql-test/innodb_bug40360.test
M /branches/zip/mysql-test/innodb_bug40565.result
M /branches/zip/mysql-test/innodb_bug40565.test
M /branches/zip/mysql-test/innodb_bug41904.result
M /branches/zip/mysql-test/innodb_bug41904.test
M /branches/zip/mysql-test/innodb_bug42101-nonzero-master.opt
M /branches/zip/mysql-test/innodb_bug42101-nonzero.result
M /branches/zip/mysql-test/innodb_bug42101-nonzero.test
M /branches/zip/mysql-test/innodb_bug42101.result
M /branches/zip/mysql-test/innodb_bug42101.test
M /branches/zip/mysql-test/innodb_bug44032.result
M /branches/zip/mysql-test/innodb_bug44032.test
M /branches/zip/mysql-test/innodb_bug44369.result
M /branches/zip/mysql-test/innodb_bug44369.test
M /branches/zip/mysql-test/innodb_bug44571.result
M /branches/zip/mysql-test/innodb_bug44571.test
M /branches/zip/mysql-test/innodb_bug45357.test
M /branches/zip/mysql-test/innodb_bug46000.result
M /branches/zip/mysql-test/innodb_bug46000.test
M /branches/zip/mysql-test/innodb_bug46676.result
M /branches/zip/mysql-test/innodb_bug46676.test
M /branches/zip/mysql-test/innodb_bug47167.result
M /branches/zip/mysql-test/innodb_bug47167.test
M /branches/zip/mysql-test/innodb_bug47777.result
M /branches/zip/mysql-test/innodb_bug47777.test
M /branches/zip/mysql-test/innodb_file_format.result
M /branches/zip/mysql-test/innodb_file_format.test
branches/zip: Set svn:eol-style on mysql-test files.
------------------------------------------------------------------------
r6183 | marko | 2009-11-17 13:51:16 +0200 (Tue, 17 Nov 2009) | 1 line
Changed paths:
M /branches/zip/mysql-test/innodb-consistent-master.opt
M /branches/zip/mysql-test/innodb-master.opt
M /branches/zip/mysql-test/innodb-semi-consistent-master.opt
M /branches/zip/mysql-test/innodb-use-sys-malloc-master.opt
M /branches/zip/mysql-test/innodb_bug42101-nonzero-master.opt
branches/zip: Prepend loose_ to plugin-only mysql-test options.
------------------------------------------------------------------------
r6184 | marko | 2009-11-17 13:52:01 +0200 (Tue, 17 Nov 2009) | 1 line
Changed paths:
M /branches/zip/mysql-test/innodb-index.result
M /branches/zip/mysql-test/innodb-index.test
branches/zip: innodb-index.test: Restore innodb_file_format_check.
------------------------------------------------------------------------
r6185 | marko | 2009-11-17 16:44:20 +0200 (Tue, 17 Nov 2009) | 16 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/mysql-test/innodb.result
M /branches/zip/mysql-test/innodb.test
M /branches/zip/mysql-test/innodb_bug44369.result
M /branches/zip/mysql-test/innodb_bug44369.test
D /branches/zip/mysql-test/patches/innodb-index.diff
M /branches/zip/row/row0mysql.c
branches/zip: Report duplicate table names
to the client connection, not to the error log. This change will allow
innodb-index.test to be re-enabled. It was previously disabled, because
mysql-test-run does not like output in the error log.
row_create_table_for_mysql(): Do not output anything to the error log
when reporting DB_DUPLICATE_KEY. Let the caller report the error.
Add a TODO comment that the dict_table_t object is apparently not freed
when an error occurs.
create_table_def(): Convert InnoDB table names to the character set
of the client connection for reporting. Use my_error(ER_WRONG_COLUMN_NAME)
for reporting reserved column names. Report my_error(ER_TABLE_EXISTS_ERROR)
when row_create_table_for_mysql() returns DB_DUPLICATE_KEY.
rb://206
------------------------------------------------------------------------
r6186 | vasil | 2009-11-17 16:48:14 +0200 (Tue, 17 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog entry for r6185.
------------------------------------------------------------------------
r6189 | marko | 2009-11-18 11:36:18 +0200 (Wed, 18 Nov 2009) | 5 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/handler0alter.cc
branches/zip: ha_innobase::add_index(): When creating the primary key
and the table is being locked by another transaction,
do not attempt to drop the table. (Bug #48782)
Approved by Sunny Bains over IM
------------------------------------------------------------------------
r6194 | vasil | 2009-11-19 09:24:45 +0200 (Thu, 19 Nov 2009) | 5 lines
Changed paths:
M /branches/zip/include/univ.i
branches/zip:
Increment version number from 1.0.5 to 1.0.6 since 1.0.5 was just released
by MySQL and we will soon release 1.0.6.
------------------------------------------------------------------------
r6197 | calvin | 2009-11-19 09:32:55 +0200 (Thu, 19 Nov 2009) | 6 lines
Changed paths:
M /branches/zip/CMakeLists.txt
branches/zip: merge the fix of bug#48317 (CMake file)
Due to MySQL changes to the CMake files, it is no longer possible
to build the InnoDB plugin as a static library on Windows.
The fix is proposed by Vlad of MySQL.
------------------------------------------------------------------------
r6198 | vasil | 2009-11-19 09:44:31 +0200 (Thu, 19 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog entry for r6197.
------------------------------------------------------------------------
r6199 | vasil | 2009-11-19 12:10:12 +0200 (Thu, 19 Nov 2009) | 31 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/btr/btr0btr.c
M /branches/zip/data/data0type.c
branches/zip: Merge r6159:6198 from branches/5.1:
------------------------------------------------------------------------
r6187 | jyang | 2009-11-18 05:27:30 +0200 (Wed, 18 Nov 2009) | 9 lines
Changed paths:
M /branches/5.1/btr/btr0btr.c
branches/5.1: Fix bug #48469 "when innodb tablespace is
configured too small, crash and corruption!". Function
btr_create() did not check the return status of fseg_create(),
and continued the index creation even when there was insufficient
space.
rb://205 Approved by Marko
------------------------------------------------------------------------
r6188 | jyang | 2009-11-18 07:14:23 +0200 (Wed, 18 Nov 2009) | 8 lines
Changed paths:
M /branches/5.1/data/data0type.c
branches/5.1: Fix bug #48526 "Data type for float and
double is incorrectly reported in InnoDB table monitor".
Certain datatypes are not printed correctly in
dtype_print().
rb://204 Approved by Marko.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6201 | marko | 2009-11-19 14:09:11 +0200 (Thu, 19 Nov 2009) | 2 lines
Changed paths:
M /branches/zip/handler/handler0alter.cc
branches/zip: ha_innobase::add_index(): Clarify the comment
on orphaned tables when creating a primary key.
------------------------------------------------------------------------
r6202 | jyang | 2009-11-19 15:01:00 +0200 (Thu, 19 Nov 2009) | 8 lines
Changed paths:
M /branches/zip/btr/btr0btr.c
branches/zip: Function fseg_free() is no longer defined
in branches/zip. To port the fix for bug #48469 to zip,
we can use btr_free_root(), which frees the page
and also does not require a mini-transaction.
Approved by Marko.
------------------------------------------------------------------------
r6207 | vasil | 2009-11-20 10:19:14 +0200 (Fri, 20 Nov 2009) | 54 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: Merge r6198:6206 from branches/5.1:
(r6203 was skipped as it is already in branches/zip)
------------------------------------------------------------------------
r6200 | vasil | 2009-11-19 12:14:23 +0200 (Thu, 19 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/btr/btr0btr.c
branches/5.1:
White space fixup - indent under the opening (
------------------------------------------------------------------------
r6203 | jyang | 2009-11-19 15:12:22 +0200 (Thu, 19 Nov 2009) | 8 lines
Changed paths:
M /branches/5.1/btr/btr0btr.c
branches/5.1: Use btr_free_root() instead of fseg_free() for
the fix of bug #48469, because fseg_free() is not defined
in the zip branch. This also saves one mini-transaction that
would be started by fseg_free().
Approved by Marko.
------------------------------------------------------------------------
r6205 | jyang | 2009-11-20 07:55:48 +0200 (Fri, 20 Nov 2009) | 11 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Add a special case to handle the Duplicated Key error
and return DB_ERROR instead. This is to avoid a possible SIGSEGV
caused by MySQL error handling re-entering the storage layer for dup key
info without a proper table handle.
This is to prevent a server crash when the error situation in bug
#45961 "DDL on partitioned innodb tables leaves data dictionary
in an inconsistent state" happens.
rb://157 approved by Sunny Bains.
------------------------------------------------------------------------
r6206 | jyang | 2009-11-20 09:38:43 +0200 (Fri, 20 Nov 2009) | 5 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Fix a minor code formatting issue for
the parenthesis placement of the if condition in
rename_table().
------------------------------------------------------------------------
------------------------------------------------------------------------
r6208 | vasil | 2009-11-20 10:49:24 +0200 (Fri, 20 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog entry for c6207.
------------------------------------------------------------------------
r6210 | vasil | 2009-11-20 23:39:48 +0200 (Fri, 20 Nov 2009) | 3 lines
Changed paths:
M /branches/zip/trx/trx0i_s.c
branches/zip:
Whitespace fixup.
------------------------------------------------------------------------
r6248 | marko | 2009-11-30 12:19:50 +0200 (Mon, 30 Nov 2009) | 1 line
Changed paths:
M /branches/zip/ChangeLog
branches/zip: ChangeLog: Document r4922 that was forgotten.
------------------------------------------------------------------------
r6252 | marko | 2009-11-30 12:50:11 +0200 (Mon, 30 Nov 2009) | 23 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/dict/dict0boot.c
M /branches/zip/dict/dict0crea.c
M /branches/zip/dict/dict0load.c
M /branches/zip/dict/dict0mem.c
M /branches/zip/fil/fil0fil.c
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/dict0mem.h
M /branches/zip/row/row0mysql.c
branches/zip: Suppress errors about non-found temporary tables.
Write the is_temp flag to SYS_TABLES.MIX_LEN.
dict_table_t::flags: Add a flag for is_temporary, DICT_TF2_TEMPORARY.
Unlike other flags, this will not be written to the tablespace flags
or SYS_TABLES.TYPE, but only to SYS_TABLES.MIX_LEN.
dict_build_table_def_step(): Only pass DICT_TF_BITS to tablespaces.
dict_check_tablespaces_and_store_max_id(), dict_load_table():
Suppress errors about temporary tables not being found.
dict_create_sys_tables_tuple(): Write the DICT_TF2_TEMPORARY flag
to SYS_TABLES.MIX_LEN.
fil_space_create(), fil_create_new_single_table_tablespace(): Add assertions
about space->flags.
row_drop_table_for_mysql(): Do not complain about non-found temporary tables.
rb://160 approved by Heikki Tuuri. This addresses the second part of
Bug #41609 Crash recovery does not work for InnoDB temporary tables.
------------------------------------------------------------------------
r6263 | vasil | 2009-12-01 14:49:05 +0200 (Tue, 01 Dec 2009) | 4 lines
Changed paths:
M /branches/zip/include/univ.i
branches/zip: Increment version number from 1.0.6 to 1.0.7
1.0.6 has been released
------------------------------------------------------------------------
r6264 | vasil | 2009-12-01 16:19:44 +0200 (Tue, 01 Dec 2009) | 1 line
Changed paths:
M /branches/zip/ChangeLog
branches/zip: Add ChangeLog entry for the release of 1.0.6.
------------------------------------------------------------------------
r6269 | marko | 2009-12-02 11:35:22 +0200 (Wed, 02 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/srv/srv0start.c
branches/zip: innobase_start_or_create_for_mysql(): UNIV_IBUF_DEBUG
should not break crash recovery, but UNIV_IBUF_COUNT_DEBUG will.
------------------------------------------------------------------------
r6270 | marko | 2009-12-02 11:36:47 +0200 (Wed, 02 Dec 2009) | 1 line
Changed paths:
M /branches/zip/srv/srv0start.c
branches/zip: innobase_start_or_create_for_mysql(): Log the zlib version.
------------------------------------------------------------------------
r6271 | marko | 2009-12-02 11:43:49 +0200 (Wed, 02 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/Makefile.am
M /branches/zip/include/univ.i
M /branches/zip/plug.in
branches/zip: ChangeLog: Document that since r6270, the zlib version number
will be displayed at start-up.
------------------------------------------------------------------------
r6272 | marko | 2009-12-02 11:46:05 +0200 (Wed, 02 Dec 2009) | 1 line
Changed paths:
M /branches/zip/Makefile.am
M /branches/zip/include/univ.i
M /branches/zip/plug.in
branches/zip: Revert changes that were accidentally committed in r6271.
------------------------------------------------------------------------
r6274 | marko | 2009-12-03 14:47:12 +0200 (Thu, 03 Dec 2009) | 6 lines
Changed paths:
M /branches/zip/dict/dict0dict.c
branches/zip: dict_table_check_for_dup_indexes(): Assert that the
data dictionary mutex is being held while table->indexes is accessed.
This is already the case.
Currently, only dict_table_get_next_index() and dict_table_get_first_index()
are being invoked without holding dict_sys->mutex.
------------------------------------------------------------------------
r6275 | pekka | 2009-12-03 18:32:47 +0200 (Thu, 03 Dec 2009) | 10 lines
Changed paths:
M /branches/zip/include/log0recv.h
M /branches/zip/include/trx0sys.h
M /branches/zip/log/log0recv.c
M /branches/zip/trx/trx0sys.c
branches/zip: Minor changes which allow build with UNIV_HOTBACKUP
defined to succeed:
include/trx0sys.h: Allow Hot Backup build to see some
TRX_SYS_DOUBLEWRITE_... macros.
trx/trx0sys.c: Exclude trx_sys_close() function from Hot Backup build.
log/log0recv.[ch]: Exclude recv_sys_var_init() function from Hot Backup build.
This change should not affect !UNIV_HOTBACKUP build.
------------------------------------------------------------------------
r6277 | marko | 2009-12-08 11:13:36 +0200 (Tue, 08 Dec 2009) | 1 line
Changed paths:
M /branches/zip/fsp/fsp0fsp.c
branches/zip: fsp0fsp.c: Add some missing in/out and const qualifiers.
------------------------------------------------------------------------
r6285 | marko | 2009-12-09 09:24:50 +0200 (Wed, 09 Dec 2009) | 13 lines
Changed paths:
M /branches/zip/row/row0sel.c
branches/zip: row_sel_fetch_columns(): Remove redundant code that was
accidentally added in r1591, which introduced dfield_t::ext in order
to make the merge sort of fast index creation support externally
stored columns.
Initially, I tried to allocate the bit for dfield_t::ext from
dfield_t::len by making the length 31 bits and mapping UNIV_SQL_NULL
to something that would fit in it. Then I decided that it would be
too risky. The redundant check was part of the mapping. The
condition may have been dfield_is_null() initially.
This redundant code was noticed by Sergey Petrunya on the MySQL
internals list.
------------------------------------------------------------------------
r6288 | marko | 2009-12-09 09:51:00 +0200 (Wed, 09 Dec 2009) | 15 lines
Changed paths:
M /branches/zip/row/row0upd.c
branches/zip: row_upd_copy_columns(): Remove redundant code that was
accidentally added in r1591, which introduced dfield_t::ext in order
to make the merge sort of fast index creation support externally
stored columns.
Initially, I tried to allocate the bit for dfield_t::ext from
dfield_t::len by making the length 31 bits and mapping UNIV_SQL_NULL
to something that would fit in it. Then I decided that it would be
too risky. The redundant check was part of the mapping. The
condition may have been dfield_is_null() initially.
This is similar to the redundant code in row_sel_fetch_columns() that
was noticed by Sergey Petrunya on the MySQL internals list and removed
in r6285. As far as I can tell, there are no redundant UNIV_SQL_NULL
assignments remaining after this change.
------------------------------------------------------------------------
r6305 | marko | 2009-12-14 13:03:57 +0200 (Mon, 14 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/row/row0umod.c
branches/zip: row_undo_mod_del_unmark_sec_and_undo_update(): Add a missing
const qualifier.
------------------------------------------------------------------------
r6309 | marko | 2009-12-15 14:05:50 +0200 (Tue, 15 Dec 2009) | 3 lines
Changed paths:
M /branches/zip/lock/lock0lock.c
branches/zip: lock_rec_insert_check_and_lock(): Avoid casting away constness.
Use page_rec_get_next_const() instead. This silences a gcc 4.2.4 warning.
Reported by Sunny Bains.
------------------------------------------------------------------------
r6312 | marko | 2009-12-16 10:10:36 +0200 (Wed, 16 Dec 2009) | 6 lines
Changed paths:
M /branches/zip/fil/fil0fil.c
branches/zip: fil_close(): Add #ifndef UNIV_HOTBACKUP around a debug
assertion on mutex.magic_n. InnoDB Hot Backup is a single-threaded
program and does not contain mutexes. This change allows InnoDB Hot
Backup to be compiled with UNIV_DEBUG.
Suggested by Michael Izioumtchenko.
------------------------------------------------------------------------
r6321 | marko | 2009-12-16 16:16:33 +0200 (Wed, 16 Dec 2009) | 4 lines
Changed paths:
M /branches/zip/row/row0merge.c
branches/zip: row_merge_drop_temp_indexes(): Revert a hack to
transaction isolation level that was made unnecessary by r5826 (Issue #337).
When this function is called, any active data dictionary transaction
should have been rolled back.
------------------------------------------------------------------------
r6345 | marko | 2009-12-21 10:46:14 +0200 (Mon, 21 Dec 2009) | 7 lines
Changed paths:
M /branches/zip/log/log0recv.c
branches/zip: recv_scan_log_recs(): Non-functional change: Replace a
debug assertion ut_ad(len > 0) with ut_ad(len >= OS_FILE_LOG_BLOCK_SIZE).
This change is only for readability, for Issue #428. Another
assertion on len being an integer multiple of OS_FILE_LOG_BLOCK_SIZE
already ensured together with the old ut_ad(len > 0) that actually len
must be at least OS_FILE_LOG_BLOCK_SIZE.
------------------------------------------------------------------------
r6346 | marko | 2009-12-21 12:03:25 +0200 (Mon, 21 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/log/log0recv.c
branches/zip: recv_recovery_from_checkpoint_finish():
Revert a change that was accidentally committed in r6345.
------------------------------------------------------------------------
r6348 | marko | 2009-12-22 11:04:34 +0200 (Tue, 22 Dec 2009) | 37 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/ha_prototypes.h
M /branches/zip/include/trx0trx.h
M /branches/zip/lock/lock0lock.c
M /branches/zip/trx/trx0i_s.c
M /branches/zip/trx/trx0trx.c
branches/zip: Merge a change from MySQL:
------------------------------------------------------------
revno: 3236
committer: Satya B <satya.bn@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Tue 2009-12-01 17:48:57 +0530
message:
merge to mysql-5.1-bugteam
------------------------------------------------------------
revno: 3234.1.1
committer: Gleb Shchepa <gshchepa@mysql.com>
branch nick: mysql-5.1-bugteam
timestamp: Tue 2009-12-01 14:38:40 +0400
message:
Bug #38883 (reopened): thd_security_context is not thread safe, crashes?
manual merge 5.0-->5.1, updating InnoDB plugin.
------------------------------------------------------------
revno: 1810.3968.13
committer: Gleb Shchepa <gshchepa@mysql.com>
branch nick: mysql-5.0-bugteam
timestamp: Tue 2009-12-01 14:24:44 +0400
message:
Bug #38883 (reopened): thd_security_context is not thread safe, crashes?
The bug 38816 changed the lock that protects THD::query from
LOCK_thread_count to LOCK_thd_data, but didn't update the associated
InnoDB functions.
1. The innobase_mysql_prepare_print_arbitrary_thd and the
innobase_mysql_end_print_arbitrary_thd InnoDB functions have been
removed, since now we have a per-thread mutex: now we don't need to wrap
several inter-thread access tries to THD::query with a single global
LOCK_thread_count lock, so we can simplify the code.
2. The innobase_mysql_print_thd function has been modified to lock
LOCK_thd_data directly.
------------------------------------------------------------------------
r6351 | marko | 2009-12-22 11:11:18 +0200 (Tue, 22 Dec 2009) | 1 line
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: Remove an obsolete declaration of LOCK_thread_count.
------------------------------------------------------------------------
r6352 | marko | 2009-12-22 12:33:01 +0200 (Tue, 22 Dec 2009) | 104 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/lock0lock.h
M /branches/zip/include/srv0srv.h
M /branches/zip/lock/lock0lock.c
M /branches/zip/log/log0log.c
M /branches/zip/mysql-test/innodb-autoinc.result
M /branches/zip/mysql-test/innodb-autoinc.test
M /branches/zip/row/row0sel.c
M /branches/zip/srv/srv0srv.c
M /branches/zip/srv/srv0start.c
branches/zip: Merge revisions 6206:6350 from branches/5.1,
except r6347, r6349, r6350 which were committed separately
to both branches, and r6310, which was backported from zip to 5.1.
------------------------------------------------------------------------
r6206 | jyang | 2009-11-20 09:38:43 +0200 (Fri, 20 Nov 2009) | 3 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Non-functional change, fix formatting.
------------------------------------------------------------------------
r6230 | sunny | 2009-11-24 23:52:43 +0200 (Tue, 24 Nov 2009) | 3 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
branches/5.1: Fix autoinc failing test results.
(this should be skipped when merging 5.1 into zip)
------------------------------------------------------------------------
r6231 | sunny | 2009-11-25 10:26:27 +0200 (Wed, 25 Nov 2009) | 7 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
M /branches/5.1/row/row0sel.c
branches/5.1: Fix BUG#49032 - auto_increment field does not initialize to last value in InnoDB Storage Engine.
We use the appropriate function to read the column value for non-integer
autoinc column types, namely float and double.
rb://208. Approved by Marko.
------------------------------------------------------------------------
r6232 | sunny | 2009-11-25 10:27:39 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix white space errors.
------------------------------------------------------------------------
r6233 | sunny | 2009-11-25 10:28:35 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/include/mach0data.h
M /branches/5.1/include/mach0data.ic
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix tests and make read float/double arg const.
------------------------------------------------------------------------
r6234 | sunny | 2009-11-25 10:29:03 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix whitespace issues.
------------------------------------------------------------------------
r6235 | sunny | 2009-11-26 01:14:42 +0200 (Thu, 26 Nov 2009) | 9 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Fix Bug#47720 - REPLACE INTO Autoincrement column with negative values.
This bug is similar to the negative autoinc filter patch from earlier,
with the additional handling of filtering out the negative column values
set explicitly by the user.
rb://184
Approved by Heikki.
------------------------------------------------------------------------
r6242 | vasil | 2009-11-27 22:07:12 +0200 (Fri, 27 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/export.sh
branches/5.1:
Minor changes to support plugin snapshots.
------------------------------------------------------------------------
r6306 | calvin | 2009-12-14 15:12:46 +0200 (Mon, 14 Dec 2009) | 5 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: fix bug#49267: innodb-autoinc.test fails on Windows
because of different case mode
There is no change to the InnoDB code; only the test case is fixed, by
changing "T1" to "t1".
------------------------------------------------------------------------
r6324 | jyang | 2009-12-17 06:54:24 +0200 (Thu, 17 Dec 2009) | 8 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/include/lock0lock.h
M /branches/5.1/include/srv0srv.h
M /branches/5.1/lock/lock0lock.c
M /branches/5.1/log/log0log.c
M /branches/5.1/srv/srv0srv.c
M /branches/5.1/srv/srv0start.c
branches/5.1: Fix bug #47814 - Diagnostics are frequently not
printed after a long lock wait in InnoDB. Separate out the
lock wait timeout check thread from monitor information
printing thread.
rb://200 Approved by Marko.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6364 | marko | 2009-12-26 21:06:31 +0200 (Sat, 26 Dec 2009) | 4 lines
Changed paths:
M /branches/zip/ibuf/ibuf0ibuf.c
branches/zip: ibuf_bitmap_get_map_page():
Define a wrapper macro that passes __FILE__, __LINE__ of the caller
to buf_page_get_gen().
This will ease the diagnosis of the likes of Issue #135.
------------------------------------------------------------------------
 branches/innodb+: Merge revisions 6130:6364 from branches/zip:
------------------------------------------------------------------------
r6130 | marko | 2009-11-02 11:42:56 +0200 (Mon, 02 Nov 2009) | 9 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/btr/btr0sea.c
M /branches/zip/buf/buf0buf.c
M /branches/zip/dict/dict0dict.c
M /branches/zip/fil/fil0fil.c
M /branches/zip/ibuf/ibuf0ibuf.c
M /branches/zip/include/btr0sea.h
M /branches/zip/include/dict0dict.h
M /branches/zip/include/fil0fil.h
M /branches/zip/include/ibuf0ibuf.h
M /branches/zip/include/lock0lock.h
M /branches/zip/include/log0log.h
M /branches/zip/include/log0recv.h
M /branches/zip/include/mem0mem.h
M /branches/zip/include/mem0pool.h
M /branches/zip/include/os0file.h
M /branches/zip/include/pars0pars.h
M /branches/zip/include/srv0srv.h
M /branches/zip/include/thr0loc.h
M /branches/zip/include/trx0i_s.h
M /branches/zip/include/trx0purge.h
M /branches/zip/include/trx0rseg.h
M /branches/zip/include/trx0sys.h
M /branches/zip/include/trx0undo.h
M /branches/zip/include/usr0sess.h
M /branches/zip/lock/lock0lock.c
M /branches/zip/log/log0log.c
M /branches/zip/log/log0recv.c
M /branches/zip/mem/mem0dbg.c
M /branches/zip/mem/mem0pool.c
M /branches/zip/os/os0file.c
M /branches/zip/os/os0sync.c
M /branches/zip/os/os0thread.c
M /branches/zip/pars/lexyy.c
M /branches/zip/pars/pars0lex.l
M /branches/zip/que/que0que.c
M /branches/zip/srv/srv0srv.c
M /branches/zip/srv/srv0start.c
M /branches/zip/sync/sync0arr.c
M /branches/zip/sync/sync0sync.c
M /branches/zip/thr/thr0loc.c
M /branches/zip/trx/trx0i_s.c
M /branches/zip/trx/trx0purge.c
M /branches/zip/trx/trx0rseg.c
M /branches/zip/trx/trx0sys.c
M /branches/zip/trx/trx0undo.c
M /branches/zip/usr/usr0sess.c
M /branches/zip/ut/ut0mem.c
branches/zip: Free all resources at shutdown. Set pointers to NULL, so
that Valgrind will not complain about freed data structures that are
reachable via pointers. This addresses Bug #45992 and Bug #46656.
This patch is mostly based on changes copied from branches/embedded-1.0,
mainly c5432, c3439, c3134, c2994, c2978, but also some other code was
copied. Some added cleanup code is specific to MySQL/InnoDB.
rb://199 approved by Sunny Bains
------------------------------------------------------------------------
r6134 | marko | 2009-11-04 09:57:29 +0200 (Wed, 04 Nov 2009) | 5 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/ha_innodb.cc
branches/zip: innobase_convert_identifier(): Convert table names with
explain_filename() to address Bug #32430: 'show innodb status'
causes errors Invalid (old?) table or database name in logs.
rb://134 approved by Sunny Bains
------------------------------------------------------------------------
r6137 | marko | 2009-11-04 15:24:28 +0200 (Wed, 04 Nov 2009) | 1 line
Changed paths:
M /branches/zip/dict/dict0dict.c
branches/zip: dict_index_too_big_for_undo(): Correct a typo.
------------------------------------------------------------------------
r6153 | vasil | 2009-11-10 15:33:22 +0200 (Tue, 10 Nov 2009) | 145 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: Merge r6125:6152 from branches/5.1:
(everything except the last white-space change was skipped as it is already
in branches/zip)
------------------------------------------------------------------------
r6127 | vasil | 2009-10-30 11:18:25 +0200 (Fri, 30 Oct 2009) | 18 lines
Changed paths:
M /branches/5.1/Makefile.am
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Backport c6121 from branches/zip:
------------------------------------------------------------------------
r6121 | sunny | 2009-10-30 01:42:11 +0200 (Fri, 30 Oct 2009) | 7 lines
Changed paths:
M /branches/zip/mysql-test/innodb-autoinc.result
branches/zip: This test has been problematic for some time now. The underlying
bug is that the data dictionaries get out of sync. In the AUTOINC code we
try and apply salve to the symptoms. In the past MySQL made some unrelated
change and the dictionaries stopped getting out of sync and this test started
to fail. Now, it seems they have reverted that change and the test is
passing again. I suspect this is not the last time that this test will change.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6129 | vasil | 2009-10-30 17:14:22 +0200 (Fri, 30 Oct 2009) | 4 lines
Changed paths:
M /branches/5.1/Makefile.am
branches/5.1:
Revert a change to Makefile.am that sneaked unnoticed in c6127.
------------------------------------------------------------------------
r6136 | marko | 2009-11-04 12:28:10 +0200 (Wed, 04 Nov 2009) | 15 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/include/ha_prototypes.h
M /branches/5.1/ut/ut0ut.c
branches/5.1: Port r6134 from branches/zip:
------------------------------------------------------------------------
r6134 | marko | 2009-11-04 07:57:29 +0000 (Wed, 04 Nov 2009) | 5 lines
branches/zip: innobase_convert_identifier(): Convert table names with
explain_filename() to address Bug #32430: 'show innodb status'
causes errors Invalid (old?) table or database name in logs.
rb://134 approved by Sunny Bains
------------------------------------------------------------------------
innobase_print_identifier(): Replace with innobase_convert_name().
innobase_convert_identifier(): New function, called by innobase_convert_name().
------------------------------------------------------------------------
r6149 | vasil | 2009-11-09 11:15:01 +0200 (Mon, 09 Nov 2009) | 5 lines
Changed paths:
M /branches/5.1/CMakeLists.txt
branches/5.1:
Followup to r5700: Adjust the changes so they are the same as in the BZR
repository.
------------------------------------------------------------------------
r6150 | vasil | 2009-11-09 11:43:31 +0200 (Mon, 09 Nov 2009) | 58 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a part of r2911.5.5 from MySQL:
(the other part of this was merged in c5700)
------------------------------------------------------------
revno: 2911.5.5
committer: Vladislav Vaintroub <vvaintroub@mysql.com>
branch nick: 5.1-innodb_plugin
timestamp: Wed 2009-06-10 10:59:49 +0200
message:
Backport WL#3653 to 5.1 to enable bundled innodb plugin.
Remove custom DLL loader code from innodb plugin code, use
symbols exported from mysqld.
removed:
storage/innodb_plugin/handler/handler0vars.h
storage/innodb_plugin/handler/win_delay_loader.cc
added:
storage/mysql_storage_engine.cmake
win/create_def_file.js
modified:
CMakeLists.txt
include/m_ctype.h
include/my_global.h
include/my_sys.h
include/mysql/plugin.h
libmysqld/CMakeLists.txt
mysql-test/mysql-test-run.pl
mysql-test/t/plugin.test
mysql-test/t/plugin_load-master.opt
mysys/charset.c
sql/CMakeLists.txt
sql/handler.h
sql/mysql_priv.h
sql/mysqld.cc
sql/sql_class.cc
sql/sql_class.h
sql/sql_list.h
sql/sql_profile.h
storage/Makefile.am
storage/archive/CMakeLists.txt
storage/blackhole/CMakeLists.txt
storage/csv/CMakeLists.txt
storage/example/CMakeLists.txt
storage/federated/CMakeLists.txt
storage/heap/CMakeLists.txt
storage/innobase/CMakeLists.txt
storage/innobase/handler/ha_innodb.cc
storage/innodb_plugin/CMakeLists.txt
storage/innodb_plugin/handler/ha_innodb.cc
storage/innodb_plugin/handler/handler0alter.cc
storage/innodb_plugin/handler/i_s.cc
storage/innodb_plugin/plug.in
storage/myisam/CMakeLists.txt
storage/myisammrg/CMakeLists.txt
win/Makefile.am
win/configure.js
------------------------------------------------------------------------
r6152 | vasil | 2009-11-10 15:30:20 +0200 (Tue, 10 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
White space fixup.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6157 | jyang | 2009-11-11 14:27:09 +0200 (Wed, 11 Nov 2009) | 10 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
A /branches/zip/mysql-test/innodb_bug47167.result
A /branches/zip/mysql-test/innodb_bug47167.test
M /branches/zip/mysql-test/innodb_file_format.result
branches/zip: Fix an issue where a local variable defined
in innodb_file_format_check_validate() is referenced
across functions, in innodb_file_format_check_update().
In addition, fix the "set global innodb_file_format_check =
DEFAULT" call.
Bug #47167: "set global innodb_file_format_check" cannot
set value by User-Defined Variable.
rb://169 approved by Sunny Bains and Marko.
------------------------------------------------------------------------
r6159 | vasil | 2009-11-11 15:13:01 +0200 (Wed, 11 Nov 2009) | 37 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/handler/ha_innodb.h
branches/zip:
Merge a change from MySQL:
(this has been reviewed by Calvin and Marko, and Calvin says Luis has
incorporated Marko's suggestions)
------------------------------------------------------------
revno: 3092.5.1
committer: Luis Soares <luis.soares@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Thu 2009-09-24 15:52:52 +0100
message:
BUG#42829: binlogging enabled for all schemas regardless of
binlog-do-db / binlog-ignore-db
InnoDB will return an error if statement based replication is used
along with transaction isolation level READ-COMMITTED (or weaker),
even if the statement in question is filtered out according to the
binlog-do-db rules set. In this case, an error should not be printed.
This patch addresses this issue by extending the existing check in
external_lock to take into account the filter rules before deciding to
print an error. Furthermore, it also changes decide_logging_format to
take into consideration whether the statement is filtered out from
binlog before decision is made.
added:
mysql-test/suite/binlog/r/binlog_stm_do_db.result
mysql-test/suite/binlog/t/binlog_stm_do_db-master.opt
mysql-test/suite/binlog/t/binlog_stm_do_db.test
modified:
sql/sql_base.cc
sql/sql_class.cc
storage/innobase/handler/ha_innodb.cc
storage/innobase/handler/ha_innodb.h
storage/innodb_plugin/handler/ha_innodb.cc
storage/innodb_plugin/handler/ha_innodb.h
------------------------------------------------------------------------
r6160 | vasil | 2009-11-11 15:33:49 +0200 (Wed, 11 Nov 2009) | 72 lines
Changed paths:
M /branches/zip/include/os0file.h
M /branches/zip/os/os0file.c
branches/zip: Merge r6152:6159 from branches/5.1:
(r6158 was skipped as an equivalent change has already been merged from MySQL)
------------------------------------------------------------------------
r6154 | calvin | 2009-11-11 02:51:17 +0200 (Wed, 11 Nov 2009) | 17 lines
Changed paths:
M /branches/5.1/include/os0file.h
M /branches/5.1/os/os0file.c
branches/5.1: fix bug#3139: Mysql crashes: 'windows error 995'
after several selects on a large DB
Under stress, Windows AIO may fail with error code
ERROR_OPERATION_ABORTED. InnoDB does not handle the error and
crashes instead. The cause of the error is unknown, but is likely
faulty hardware or a faulty driver.
This patch introduces a new error code OS_FILE_OPERATION_ABORTED,
which maps to Windows ERROR_OPERATION_ABORTED (995). When the error
is detected during AIO, InnoDB will issue a synchronous retry
(read/write).
This patch has been extensively tested by MySQL support.
Approved by: Marko
rb://196
------------------------------------------------------------------------
r6158 | vasil | 2009-11-11 14:52:14 +0200 (Wed, 11 Nov 2009) | 37 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/handler/ha_innodb.h
branches/5.1:
Merge a change from MySQL:
(this has been reviewed by Calvin and Marko, and Calvin says Luis has
incorporated Marko's suggestions)
------------------------------------------------------------
revno: 3092.5.1
committer: Luis Soares <luis.soares@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Thu 2009-09-24 15:52:52 +0100
message:
BUG#42829: binlogging enabled for all schemas regardless of
binlog-do-db / binlog-ignore-db
InnoDB will return an error if statement based replication is used
along with transaction isolation level READ-COMMITTED (or weaker),
even if the statement in question is filtered out according to the
binlog-do-db rules set. In this case, an error should not be printed.
This patch addresses this issue by extending the existing check in
external_lock to take into account the filter rules before deciding to
print an error. Furthermore, it also changes decide_logging_format to
take into consideration whether the statement is filtered out from
binlog before decision is made.
added:
mysql-test/suite/binlog/r/binlog_stm_do_db.result
mysql-test/suite/binlog/t/binlog_stm_do_db-master.opt
mysql-test/suite/binlog/t/binlog_stm_do_db.test
modified:
sql/sql_base.cc
sql/sql_class.cc
storage/innobase/handler/ha_innodb.cc
storage/innobase/handler/ha_innodb.h
storage/innodb_plugin/handler/ha_innodb.cc
storage/innodb_plugin/handler/ha_innodb.h
------------------------------------------------------------------------
------------------------------------------------------------------------
r6161 | vasil | 2009-11-11 15:36:16 +0200 (Wed, 11 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add changelog entry for r6160.
------------------------------------------------------------------------
r6162 | vasil | 2009-11-11 16:00:12 +0200 (Wed, 11 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog for r6157.
------------------------------------------------------------------------
r6163 | calvin | 2009-11-11 17:53:20 +0200 (Wed, 11 Nov 2009) | 8 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/handler/ha_innodb.h
branches/zip: Exclude thd_binlog_filter_ok() when building
with older version of MySQL.
thd_binlog_filter_ok() is introduced in MySQL 5.1.41. But the
plugin can be built with MySQL prior to 5.1.41.
Approved by Heikki (on IM).
------------------------------------------------------------------------
r6169 | calvin | 2009-11-12 14:40:43 +0200 (Thu, 12 Nov 2009) | 6 lines
Changed paths:
A /branches/zip/mysql-test/innodb_bug46676.result
A /branches/zip/mysql-test/innodb_bug46676.test
branches/zip: add test case for bug#46676
This crash is reproducible with InnoDB plugin 1.0.4 + MySQL 5.1.37.
But no longer reproducible after MySQL 5.1.38 (with plugin 1.0.5).
Add test case to catch future regression.
------------------------------------------------------------------------
r6170 | marko | 2009-11-12 15:49:08 +0200 (Thu, 12 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/db0err.h
M /branches/zip/row/row0merge.c
M /branches/zip/row/row0mysql.c
branches/zip: Allow CREATE INDEX to be interrupted. (Issue #354)
rb://183 approved by Heikki Tuuri
------------------------------------------------------------------------
r6175 | vasil | 2009-11-16 20:07:39 +0200 (Mon, 16 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Wrap line at 78th char in the ChangeLog
------------------------------------------------------------------------
r6177 | calvin | 2009-11-16 20:20:38 +0200 (Mon, 16 Nov 2009) | 2 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip: add an entry to ChangeLog for r6065
------------------------------------------------------------------------
r6179 | marko | 2009-11-17 10:19:34 +0200 (Tue, 17 Nov 2009) | 2 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: ha_innobase::change_active_index(): When the history is
missing, report it to the client, not to the error log.
------------------------------------------------------------------------
r6181 | vasil | 2009-11-17 12:21:41 +0200 (Tue, 17 Nov 2009) | 33 lines
Changed paths:
M /branches/zip/mysql-test/innodb-index.test
branches/zip:
At the end of innodb-index.test: restore the environment as it was before
the test was started to silence this warning:
MTR's internal check of the test case 'main.innodb-index' failed.
This means that the test case does not preserve the state that existed
before the test case was executed. Most likely the test case did not
do a proper clean-up.
This is the diff of the states of the servers before and after the
test case was executed:
mysqltest: Logging to '/tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.log'.
mysqltest: Results saved in '/tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.result'.
mysqltest: Connecting to server localhost:13000 (socket /tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/mysqld.1.sock) as 'root', connection 'default', attempt 0 ...
mysqltest: ... Connected.
mysqltest: Start processing test commands from './include/check-testcase.test' ...
mysqltest: ... Done processing test commands.
--- /tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.result 2009-11-17 13:10:40.000000000 +0300
+++ /tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.reject 2009-11-17 13:10:54.000000000 +0300
@@ -84,7 +84,7 @@
INNODB_DOUBLEWRITE ON
INNODB_FAST_SHUTDOWN 1
INNODB_FILE_FORMAT Antelope
-INNODB_FILE_FORMAT_CHECK Antelope
+INNODB_FILE_FORMAT_CHECK Barracuda
INNODB_FILE_PER_TABLE OFF
INNODB_FLUSH_LOG_AT_TRX_COMMIT 1
INNODB_FLUSH_METHOD
mysqltest: Result content mismatch
not ok
------------------------------------------------------------------------
r6182 | marko | 2009-11-17 13:49:15 +0200 (Tue, 17 Nov 2009) | 1 line
Changed paths:
M /branches/zip/mysql-test/innodb-consistent-master.opt
M /branches/zip/mysql-test/innodb-consistent.result
M /branches/zip/mysql-test/innodb-consistent.test
M /branches/zip/mysql-test/innodb-use-sys-malloc-master.opt
M /branches/zip/mysql-test/innodb-use-sys-malloc.result
M /branches/zip/mysql-test/innodb-use-sys-malloc.test
M /branches/zip/mysql-test/innodb_bug21704.result
M /branches/zip/mysql-test/innodb_bug21704.test
M /branches/zip/mysql-test/innodb_bug40360.test
M /branches/zip/mysql-test/innodb_bug40565.result
M /branches/zip/mysql-test/innodb_bug40565.test
M /branches/zip/mysql-test/innodb_bug41904.result
M /branches/zip/mysql-test/innodb_bug41904.test
M /branches/zip/mysql-test/innodb_bug42101-nonzero-master.opt
M /branches/zip/mysql-test/innodb_bug42101-nonzero.result
M /branches/zip/mysql-test/innodb_bug42101-nonzero.test
M /branches/zip/mysql-test/innodb_bug42101.result
M /branches/zip/mysql-test/innodb_bug42101.test
M /branches/zip/mysql-test/innodb_bug44032.result
M /branches/zip/mysql-test/innodb_bug44032.test
M /branches/zip/mysql-test/innodb_bug44369.result
M /branches/zip/mysql-test/innodb_bug44369.test
M /branches/zip/mysql-test/innodb_bug44571.result
M /branches/zip/mysql-test/innodb_bug44571.test
M /branches/zip/mysql-test/innodb_bug45357.test
M /branches/zip/mysql-test/innodb_bug46000.result
M /branches/zip/mysql-test/innodb_bug46000.test
M /branches/zip/mysql-test/innodb_bug46676.result
M /branches/zip/mysql-test/innodb_bug46676.test
M /branches/zip/mysql-test/innodb_bug47167.result
M /branches/zip/mysql-test/innodb_bug47167.test
M /branches/zip/mysql-test/innodb_bug47777.result
M /branches/zip/mysql-test/innodb_bug47777.test
M /branches/zip/mysql-test/innodb_file_format.result
M /branches/zip/mysql-test/innodb_file_format.test
branches/zip: Set svn:eol-style on mysql-test files.
------------------------------------------------------------------------
r6183 | marko | 2009-11-17 13:51:16 +0200 (Tue, 17 Nov 2009) | 1 line
Changed paths:
M /branches/zip/mysql-test/innodb-consistent-master.opt
M /branches/zip/mysql-test/innodb-master.opt
M /branches/zip/mysql-test/innodb-semi-consistent-master.opt
M /branches/zip/mysql-test/innodb-use-sys-malloc-master.opt
M /branches/zip/mysql-test/innodb_bug42101-nonzero-master.opt
branches/zip: Prepend loose_ to plugin-only mysql-test options.
------------------------------------------------------------------------
r6184 | marko | 2009-11-17 13:52:01 +0200 (Tue, 17 Nov 2009) | 1 line
Changed paths:
M /branches/zip/mysql-test/innodb-index.result
M /branches/zip/mysql-test/innodb-index.test
branches/zip: innodb-index.test: Restore innodb_file_format_check.
------------------------------------------------------------------------
r6185 | marko | 2009-11-17 16:44:20 +0200 (Tue, 17 Nov 2009) | 16 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/mysql-test/innodb.result
M /branches/zip/mysql-test/innodb.test
M /branches/zip/mysql-test/innodb_bug44369.result
M /branches/zip/mysql-test/innodb_bug44369.test
D /branches/zip/mysql-test/patches/innodb-index.diff
M /branches/zip/row/row0mysql.c
branches/zip: Report duplicate table names
to the client connection, not to the error log. This change will allow
innodb-index.test to be re-enabled. It was previously disabled, because
mysql-test-run does not like output in the error log.
row_create_table_for_mysql(): Do not output anything to the error log
when reporting DB_DUPLICATE_KEY. Let the caller report the error.
Add a TODO comment that the dict_table_t object is apparently not freed
when an error occurs.
create_table_def(): Convert InnoDB table names to the character set
of the client connection for reporting. Use my_error(ER_WRONG_COLUMN_NAME)
for reporting reserved column names. Report my_error(ER_TABLE_EXISTS_ERROR)
when row_create_table_for_mysql() returns DB_DUPLICATE_KEY.
rb://206
------------------------------------------------------------------------
r6186 | vasil | 2009-11-17 16:48:14 +0200 (Tue, 17 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog entry for r6185.
------------------------------------------------------------------------
r6189 | marko | 2009-11-18 11:36:18 +0200 (Wed, 18 Nov 2009) | 5 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/handler0alter.cc
branches/zip: ha_innobase::add_index(): When creating the primary key
and the table is being locked by another transaction,
do not attempt to drop the table. (Bug #48782)
Approved by Sunny Bains over IM
------------------------------------------------------------------------
r6194 | vasil | 2009-11-19 09:24:45 +0200 (Thu, 19 Nov 2009) | 5 lines
Changed paths:
M /branches/zip/include/univ.i
branches/zip:
Increment version number from 1.0.5 to 1.0.6 since 1.0.5 was just released
by MySQL and we will soon release 1.0.6.
------------------------------------------------------------------------
r6197 | calvin | 2009-11-19 09:32:55 +0200 (Thu, 19 Nov 2009) | 6 lines
Changed paths:
M /branches/zip/CMakeLists.txt
branches/zip: merge the fix of bug#48317 (CMake file)
Due to MySQL changes to the CMake files, it is no longer possible
to build the InnoDB plugin as a static library on Windows.
The fix is proposed by Vlad of MySQL.
------------------------------------------------------------------------
r6198 | vasil | 2009-11-19 09:44:31 +0200 (Thu, 19 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog entry for r6197.
------------------------------------------------------------------------
r6199 | vasil | 2009-11-19 12:10:12 +0200 (Thu, 19 Nov 2009) | 31 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/btr/btr0btr.c
M /branches/zip/data/data0type.c
branches/zip: Merge r6159:6198 from branches/5.1:
------------------------------------------------------------------------
r6187 | jyang | 2009-11-18 05:27:30 +0200 (Wed, 18 Nov 2009) | 9 lines
Changed paths:
M /branches/5.1/btr/btr0btr.c
branches/5.1: Fix bug #48469 "when innodb tablespace is
configured too small, crash and corruption!". Function
btr_create() did not check the return status of fseg_create(),
and continued the index creation even when there was insufficient
space.
rb://205 Approved by Marko
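The class of bug can be sketched as follows, in Python rather than
InnoDB's C. All names here (Tablespace, allocate_segment,
create_index_root) are hypothetical stand-ins and not the real
fseg_create() or btr_create() signatures; the point is only that the
caller has to test the allocation result before continuing.

```python
class Tablespace:
    """Toy tablespace with a fixed number of free pages (illustration only)."""

    def __init__(self, free_pages):
        self.free_pages = free_pages


def allocate_segment(space, pages_needed):
    """Stand-in for a segment allocator: returns None when space runs out."""
    if space.free_pages < pages_needed:
        return None
    space.free_pages -= pages_needed
    return {"pages": pages_needed}


def create_index_root(space):
    """Stand-in for index creation: stop early if the allocation failed."""
    segment = allocate_segment(space, pages_needed=1)
    if segment is None:
        # Without this check the caller would carry on with a missing segment.
        return None
    return {"root_segment": segment}
```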
------------------------------------------------------------------------
r6188 | jyang | 2009-11-18 07:14:23 +0200 (Wed, 18 Nov 2009) | 8 lines
Changed paths:
M /branches/5.1/data/data0type.c
branches/5.1: Fix bug #48526 "Data type for float and
double is incorrectly reported in InnoDB table monitor".
Certain datatypes are not printed correctly in
dtype_print().
rb://204 Approved by Marko.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6201 | marko | 2009-11-19 14:09:11 +0200 (Thu, 19 Nov 2009) | 2 lines
Changed paths:
M /branches/zip/handler/handler0alter.cc
branches/zip: ha_innobase::add_index(): Clarify the comment
on orphaned tables when creating a primary key.
------------------------------------------------------------------------
r6202 | jyang | 2009-11-19 15:01:00 +0200 (Thu, 19 Nov 2009) | 8 lines
Changed paths:
M /branches/zip/btr/btr0btr.c
branches/zip: Function fseg_free() is no longer defined
in branches/zip. To port fix for bug #48469 to zip,
we can use btr_free_root() which frees the page,
and also does not require mini-transaction.
Approved by Marko.
------------------------------------------------------------------------
r6207 | vasil | 2009-11-20 10:19:14 +0200 (Fri, 20 Nov 2009) | 54 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: Merge r6198:6206 from branches/5.1:
(r6203 was skipped as it is already in branches/zip)
------------------------------------------------------------------------
r6200 | vasil | 2009-11-19 12:14:23 +0200 (Thu, 19 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/btr/btr0btr.c
branches/5.1:
White space fixup - indent under the opening (
------------------------------------------------------------------------
r6203 | jyang | 2009-11-19 15:12:22 +0200 (Thu, 19 Nov 2009) | 8 lines
Changed paths:
M /branches/5.1/btr/btr0btr.c
branches/5.1: Use btr_free_root() instead of fseg_free() for
the fix of bug #48469, because fseg_free() is not defined
in the zip branch. And we could save one mini-transaction started
by fseg_free().
Approved by Marko.
------------------------------------------------------------------------
r6205 | jyang | 2009-11-20 07:55:48 +0200 (Fri, 20 Nov 2009) | 11 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Add a special case to handle the Duplicated Key error
and return DB_ERROR instead. This is to avoid a possible SIGSEGV
by mysql error handling re-entering the storage layer for dup key
info without proper table handle.
This is to prevent a server crash when the error situation in bug
#45961 "DDL on partitioned innodb tables leaves data dictionary
in an inconsistent state" happens.
rb://157 approved by Sunny Bains.
------------------------------------------------------------------------
r6206 | jyang | 2009-11-20 09:38:43 +0200 (Fri, 20 Nov 2009) | 5 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Fix a minor code formatting issue for
the parenthesis placement of the if condition in
rename_table().
------------------------------------------------------------------------
------------------------------------------------------------------------
r6208 | vasil | 2009-11-20 10:49:24 +0200 (Fri, 20 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog entry for r6207.
------------------------------------------------------------------------
r6210 | vasil | 2009-11-20 23:39:48 +0200 (Fri, 20 Nov 2009) | 3 lines
Changed paths:
M /branches/zip/trx/trx0i_s.c
branches/zip:
Whitespace fixup.
------------------------------------------------------------------------
r6248 | marko | 2009-11-30 12:19:50 +0200 (Mon, 30 Nov 2009) | 1 line
Changed paths:
M /branches/zip/ChangeLog
branches/zip: ChangeLog: Document r4922 that was forgotten.
------------------------------------------------------------------------
r6252 | marko | 2009-11-30 12:50:11 +0200 (Mon, 30 Nov 2009) | 23 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/dict/dict0boot.c
M /branches/zip/dict/dict0crea.c
M /branches/zip/dict/dict0load.c
M /branches/zip/dict/dict0mem.c
M /branches/zip/fil/fil0fil.c
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/dict0mem.h
M /branches/zip/row/row0mysql.c
branches/zip: Suppress errors about non-found temporary tables.
Write the is_temp flag to SYS_TABLES.MIX_LEN.
dict_table_t::flags: Add a flag for is_temporary, DICT_TF2_TEMPORARY.
Unlike other flags, this will not be written to the tablespace flags
or SYS_TABLES.TYPE, but only to SYS_TABLES.MIX_LEN.
dict_build_table_def_step(): Only pass DICT_TF_BITS to tablespaces.
dict_check_tablespaces_and_store_max_id(), dict_load_table():
Suppress errors about temporary tables not being found.
dict_create_sys_tables_tuple(): Write the DICT_TF2_TEMPORARY flag
to SYS_TABLES.MIX_LEN.
fil_space_create(), fil_create_new_single_table_tablespace(): Add assertions
about space->flags.
row_drop_table_for_mysql(): Do not complain about non-found temporary tables.
rb://160 approved by Heikki Tuuri. This addresses the second part of
Bug #41609 Crash recovery does not work for InnoDB temporary tables.
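A rough sketch of the flag split described above, in Python rather than
InnoDB's C. The bit width and shift used here are assumptions made for
illustration; the real constants are defined in dict0mem.h and may
differ. The low bits are the only ones a tablespace may see, while the
table-only bits above them are what would go to SYS_TABLES.MIX_LEN.

```python
# Hypothetical bit layout, for illustration only.
DICT_TF_BITS = 6                # low bits shared with the tablespace flags
DICT_TF2_SHIFT = DICT_TF_BITS   # table-only flags start above them
DICT_TF2_TEMPORARY = 1          # first table-only flag


def tablespace_flags(table_flags):
    """Keep only the low DICT_TF_BITS bits, which a tablespace may store."""
    return table_flags & ((1 << DICT_TF_BITS) - 1)


def mix_len_flags(table_flags):
    """Extract the table-only flags destined for SYS_TABLES.MIX_LEN."""
    return table_flags >> DICT_TF2_SHIFT


def is_temporary(table_flags):
    """Test the temporary-table flag among the table-only bits."""
    return bool(mix_len_flags(table_flags) & DICT_TF2_TEMPORARY)
```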
------------------------------------------------------------------------
r6263 | vasil | 2009-12-01 14:49:05 +0200 (Tue, 01 Dec 2009) | 4 lines
Changed paths:
M /branches/zip/include/univ.i
branches/zip: Increment version number from 1.0.6 to 1.0.7
1.0.6 has been released
------------------------------------------------------------------------
r6264 | vasil | 2009-12-01 16:19:44 +0200 (Tue, 01 Dec 2009) | 1 line
Changed paths:
M /branches/zip/ChangeLog
branches/zip: Add ChangeLog entry for the release of 1.0.6.
------------------------------------------------------------------------
r6269 | marko | 2009-12-02 11:35:22 +0200 (Wed, 02 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/srv/srv0start.c
branches/zip: innobase_start_or_create_for_mysql(): UNIV_IBUF_DEBUG
should not break crash recovery, but UNIV_IBUF_COUNT_DEBUG will.
------------------------------------------------------------------------
r6270 | marko | 2009-12-02 11:36:47 +0200 (Wed, 02 Dec 2009) | 1 line
Changed paths:
M /branches/zip/srv/srv0start.c
branches/zip: innobase_start_or_create_for_mysql(): Log the zlib version.
------------------------------------------------------------------------
r6271 | marko | 2009-12-02 11:43:49 +0200 (Wed, 02 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/Makefile.am
M /branches/zip/include/univ.i
M /branches/zip/plug.in
branches/zip: ChangeLog: Document that since r6270, the zlib version number
will be displayed at start-up.
------------------------------------------------------------------------
r6272 | marko | 2009-12-02 11:46:05 +0200 (Wed, 02 Dec 2009) | 1 line
Changed paths:
M /branches/zip/Makefile.am
M /branches/zip/include/univ.i
M /branches/zip/plug.in
branches/zip: Revert changes that were accidentally committed in r6271.
------------------------------------------------------------------------
r6274 | marko | 2009-12-03 14:47:12 +0200 (Thu, 03 Dec 2009) | 6 lines
Changed paths:
M /branches/zip/dict/dict0dict.c
branches/zip: dict_table_check_for_dup_indexes(): Assert that the
data dictionary mutex is being held while table->indexes is accessed.
This is already the case.
Currently, only dict_table_get_next_index() and dict_table_get_first_index()
are being invoked without holding dict_sys->mutex.
------------------------------------------------------------------------
r6275 | pekka | 2009-12-03 18:32:47 +0200 (Thu, 03 Dec 2009) | 10 lines
Changed paths:
M /branches/zip/include/log0recv.h
M /branches/zip/include/trx0sys.h
M /branches/zip/log/log0recv.c
M /branches/zip/trx/trx0sys.c
branches/zip: Minor changes which allow build with UNIV_HOTBACKUP
defined to succeed:
include/trx0sys.h: Allow Hot Backup build to see some
TRX_SYS_DOUBLEWRITE_... macros.
trx/trx0sys.c: Exclude trx_sys_close() function from Hot Backup build.
log/log0recv.[ch]: Exclude recv_sys_var_init() function from Hot Backup build.
This change should not affect !UNIV_HOTBACKUP build.
------------------------------------------------------------------------
r6277 | marko | 2009-12-08 11:13:36 +0200 (Tue, 08 Dec 2009) | 1 line
Changed paths:
M /branches/zip/fsp/fsp0fsp.c
branches/zip: fsp0fsp.c: Add some missing in/out and const qualifiers.
------------------------------------------------------------------------
r6285 | marko | 2009-12-09 09:24:50 +0200 (Wed, 09 Dec 2009) | 13 lines
Changed paths:
M /branches/zip/row/row0sel.c
branches/zip: row_sel_fetch_columns(): Remove redundant code that was
accidentally added in r1591, which introduced dfield_t::ext in order
to make the merge sort of fast index creation support externally
stored columns.
Initially, I tried to allocate the bit for dfield_t::ext from
dfield_t::len by making the length 31 bits and mapping UNIV_SQL_NULL
to something that would fit in it. Then I decided that it would be
too risky. The redundant check was part of the mapping. The
condition may have been dfield_is_null() initially.
This redundant code was noticed by Sergey Petrunya on the MySQL
internals list.
------------------------------------------------------------------------
r6288 | marko | 2009-12-09 09:51:00 +0200 (Wed, 09 Dec 2009) | 15 lines
Changed paths:
M /branches/zip/row/row0upd.c
branches/zip: row_upd_copy_columns(): Remove redundant code that was
accidentally added in r1591, which introduced dfield_t::ext in order
to make the merge sort of fast index creation support externally
stored columns.
Initially, I tried to allocate the bit for dfield_t::ext from
dfield_t::len by making the length 31 bits and mapping UNIV_SQL_NULL
to something that would fit in it. Then I decided that it would be
too risky. The redundant check was part of the mapping. The
condition may have been dfield_is_null() initially.
This is similar to the redundant code in row_sel_fetch_columns() that
was noticed by Sergey Petrunya on the MySQL internals list and removed
in r6285. As far as I can tell, there are no redundant UNIV_SQL_NULL
assignments remaining after this change.
------------------------------------------------------------------------
r6305 | marko | 2009-12-14 13:03:57 +0200 (Mon, 14 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/row/row0umod.c
branches/zip: row_undo_mod_del_unmark_sec_and_undo_update(): Add a missing
const qualifier.
------------------------------------------------------------------------
r6309 | marko | 2009-12-15 14:05:50 +0200 (Tue, 15 Dec 2009) | 3 lines
Changed paths:
M /branches/zip/lock/lock0lock.c
branches/zip: lock_rec_insert_check_and_lock(): Avoid casting away constness.
Use page_rec_get_next_const() instead. This silences a gcc 4.2.4 warning.
Reported by Sunny Bains.
------------------------------------------------------------------------
r6312 | marko | 2009-12-16 10:10:36 +0200 (Wed, 16 Dec 2009) | 6 lines
Changed paths:
M /branches/zip/fil/fil0fil.c
branches/zip: fil_close(): Add #ifndef UNIV_HOTBACKUP around a debug
assertion on mutex.magic_n. InnoDB Hot Backup is a single-threaded
program and does not contain mutexes. This change allows InnoDB Hot
Backup to be compiled with UNIV_DEBUG.
Suggested by Michael Izioumtchenko.
------------------------------------------------------------------------
r6321 | marko | 2009-12-16 16:16:33 +0200 (Wed, 16 Dec 2009) | 4 lines
Changed paths:
M /branches/zip/row/row0merge.c
branches/zip: row_merge_drop_temp_indexes(): Revert a hack to
transaction isolation level that was made unnecessary by r5826 (Issue #337).
When this function is called, any active data dictionary transaction
should have been rolled back.
------------------------------------------------------------------------
r6345 | marko | 2009-12-21 10:46:14 +0200 (Mon, 21 Dec 2009) | 7 lines
Changed paths:
M /branches/zip/log/log0recv.c
branches/zip: recv_scan_log_recs(): Non-functional change: Replace a
debug assertion ut_ad(len > 0) with ut_ad(len >= OS_FILE_LOG_BLOCK_SIZE).
This change is only for readability, for Issue #428. Another
assertion on len being an integer multiple of OS_FILE_LOG_BLOCK_SIZE
already ensured together with the old ut_ad(len > 0) that actually len
must be at least OS_FILE_LOG_BLOCK_SIZE.
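The equivalence claimed above can be checked with a small sketch, in
Python rather than C. The block size of 512 is an assumed value for this
illustration, and the helper names are hypothetical; the sketch compares
the old pair of assertions with the new pair over a range of lengths.

```python
OS_FILE_LOG_BLOCK_SIZE = 512  # assumed value for this sketch


def old_assertions_hold(length):
    """Old form: len > 0, plus len is a multiple of the block size."""
    return length > 0 and length % OS_FILE_LOG_BLOCK_SIZE == 0


def new_assertions_hold(length):
    """New form: len >= block size, plus len is a multiple of the block size."""
    return (
        length >= OS_FILE_LOG_BLOCK_SIZE
        and length % OS_FILE_LOG_BLOCK_SIZE == 0
    )


def assertions_agree(max_length):
    """Check that both forms accept exactly the same non-negative lengths."""
    return all(
        old_assertions_hold(n) == new_assertions_hold(n)
        for n in range(max_length + 1)
    )
```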
------------------------------------------------------------------------
r6346 | marko | 2009-12-21 12:03:25 +0200 (Mon, 21 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/log/log0recv.c
branches/zip: recv_recovery_from_checkpoint_finish():
Revert a change that was accidentally committed in r6345.
------------------------------------------------------------------------
r6348 | marko | 2009-12-22 11:04:34 +0200 (Tue, 22 Dec 2009) | 37 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/ha_prototypes.h
M /branches/zip/include/trx0trx.h
M /branches/zip/lock/lock0lock.c
M /branches/zip/trx/trx0i_s.c
M /branches/zip/trx/trx0trx.c
branches/zip: Merge a change from MySQL:
------------------------------------------------------------
revno: 3236
committer: Satya B <satya.bn@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Tue 2009-12-01 17:48:57 +0530
message:
merge to mysql-5.1-bugteam
------------------------------------------------------------
revno: 3234.1.1
committer: Gleb Shchepa <gshchepa@mysql.com>
branch nick: mysql-5.1-bugteam
timestamp: Tue 2009-12-01 14:38:40 +0400
message:
Bug #38883 (reopened): thd_security_context is not thread safe, crashes?
manual merge 5.0-->5.1, updating InnoDB plugin.
------------------------------------------------------------
revno: 1810.3968.13
committer: Gleb Shchepa <gshchepa@mysql.com>
branch nick: mysql-5.0-bugteam
timestamp: Tue 2009-12-01 14:24:44 +0400
message:
Bug #38883 (reopened): thd_security_context is not thread safe, crashes?
The bug 38816 changed the lock that protects THD::query from
LOCK_thread_count to LOCK_thd_data, but didn't update the associated
InnoDB functions.
1. The innobase_mysql_prepare_print_arbitrary_thd and the
innobase_mysql_end_print_arbitrary_thd InnoDB functions have been
removed, since now we have a per-thread mutex: now we don't need to wrap
several inter-thread access tries to THD::query with a single global
LOCK_thread_count lock, so we can simplify the code.
2. The innobase_mysql_print_thd function has been modified to lock
LOCK_thd_data directly.
------------------------------------------------------------------------
r6351 | marko | 2009-12-22 11:11:18 +0200 (Tue, 22 Dec 2009) | 1 line
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: Remove an obsolete declaration of LOCK_thread_count.
------------------------------------------------------------------------
r6352 | marko | 2009-12-22 12:33:01 +0200 (Tue, 22 Dec 2009) | 104 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/lock0lock.h
M /branches/zip/include/srv0srv.h
M /branches/zip/lock/lock0lock.c
M /branches/zip/log/log0log.c
M /branches/zip/mysql-test/innodb-autoinc.result
M /branches/zip/mysql-test/innodb-autoinc.test
M /branches/zip/row/row0sel.c
M /branches/zip/srv/srv0srv.c
M /branches/zip/srv/srv0start.c
branches/zip: Merge revisions 6206:6350 from branches/5.1,
except r6347, r6349, r6350 which were committed separately
to both branches, and r6310, which was backported from zip to 5.1.
------------------------------------------------------------------------
r6206 | jyang | 2009-11-20 09:38:43 +0200 (Fri, 20 Nov 2009) | 3 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Non-functional change, fix formatting.
------------------------------------------------------------------------
r6230 | sunny | 2009-11-24 23:52:43 +0200 (Tue, 24 Nov 2009) | 3 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
branches/5.1: Fix autoinc failing test results.
(this should be skipped when merging 5.1 into zip)
------------------------------------------------------------------------
r6231 | sunny | 2009-11-25 10:26:27 +0200 (Wed, 25 Nov 2009) | 7 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
M /branches/5.1/row/row0sel.c
branches/5.1: Fix BUG#49032 - auto_increment field does not initialize to last value in InnoDB Storage Engine.
We use the appropriate function to read the column value for non-integer
autoinc column types, namely float and double.
rb://208. Approved by Marko.
------------------------------------------------------------------------
r6232 | sunny | 2009-11-25 10:27:39 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix white space errors.
------------------------------------------------------------------------
r6233 | sunny | 2009-11-25 10:28:35 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/include/mach0data.h
M /branches/5.1/include/mach0data.ic
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix tests and make read float/double arg const.
------------------------------------------------------------------------
r6234 | sunny | 2009-11-25 10:29:03 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix whitespace issues.
------------------------------------------------------------------------
r6235 | sunny | 2009-11-26 01:14:42 +0200 (Thu, 26 Nov 2009) | 9 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Fix Bug#47720 - REPLACE INTO Autoincrement column with negative values.
This bug is similar to the negative autoinc filter patch from earlier,
with the additional handling of filtering out the negative column values
set explicitly by the user.
rb://184
Approved by Heikki.
------------------------------------------------------------------------
r6242 | vasil | 2009-11-27 22:07:12 +0200 (Fri, 27 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/export.sh
branches/5.1:
Minor changes to support plugin snapshots.
------------------------------------------------------------------------
r6306 | calvin | 2009-12-14 15:12:46 +0200 (Mon, 14 Dec 2009) | 5 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: fix bug#49267: innodb-autoinc.test fails on windows
because of different case mode
There is no change to the InnoDB code, only to fix test case by
changing "T1" to "t1".
------------------------------------------------------------------------
r6324 | jyang | 2009-12-17 06:54:24 +0200 (Thu, 17 Dec 2009) | 8 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/include/lock0lock.h
M /branches/5.1/include/srv0srv.h
M /branches/5.1/lock/lock0lock.c
M /branches/5.1/log/log0log.c
M /branches/5.1/srv/srv0srv.c
M /branches/5.1/srv/srv0start.c
branches/5.1: Fix bug #47814 - Diagnostics are frequently not
printed after a long lock wait in InnoDB. Separate out the
lock wait timeout check thread from monitor information
printing thread.
rb://200 Approved by Marko.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6364 | marko | 2009-12-26 21:06:31 +0200 (Sat, 26 Dec 2009) | 4 lines
Changed paths:
M /branches/zip/ibuf/ibuf0ibuf.c
branches/zip: ibuf_bitmap_get_map_page():
Define a wrapper macro that passes __FILE__, __LINE__ of the caller
to buf_page_get_gen().
This will ease the diagnosis of the likes of Issue #135.
------------------------------------------------------------------------
MDEV-29911 InnoDB recovery and mariadb-backup --prepare fail to report detailed progress
The progress reporting of InnoDB crash recovery was rather intermittent.
Nothing was reported during the single-threaded log record parsing, which
could consume minutes when parsing a large log. During log application,
there only was progress reporting in background threads that would be
invoked on data page read completion.
The progress reporting here will be detailed like this:
InnoDB: Starting crash recovery from checkpoint LSN=503549688
InnoDB: Parsed redo log up to LSN=1990840177; to recover: 124806 pages
InnoDB: Parsed redo log up to LSN=2729777071; to recover: 186123 pages
InnoDB: Parsed redo log up to LSN=3488599173; to recover: 248397 pages
InnoDB: Parsed redo log up to LSN=4177856618; to recover: 306469 pages
InnoDB: Multi-batch recovery needed at LSN 4189599815
InnoDB: End of log at LSN=4483551634
InnoDB: To recover: LSN 4189599815/4483551634; 307490 pages
InnoDB: To recover: LSN 4189599815/4483551634; 197159 pages
InnoDB: To recover: LSN 4189599815/4483551634; 67623 pages
InnoDB: Parsed redo log up to LSN=4353924218; to recover: 102083 pages
...
InnoDB: log sequence number 4483551634 ...
The previous messages "Starting a batch to recover" or
"Starting a final batch to recover" will be replaced by
"To recover: ... pages" messages.
If a batch lasts longer than 15 seconds, then there will be
progress reports every 15 seconds, showing the number of remaining pages.
For the non-final batch, the "To recover:" message includes two end LSNs:
that of the batch, and of the recovered log. This is the primary measure
of progress. The batch will end once the number of pages to recover
reaches 0.
If recovery is possible in a single batch, the output will look like this,
with a shorter "To recover:" message that counts only the remaining pages:
InnoDB: Starting crash recovery from checkpoint LSN=503549688
InnoDB: Parsed redo log up to LSN=1998701027; to recover: 125560 pages
InnoDB: Parsed redo log up to LSN=2734136874; to recover: 186446 pages
InnoDB: Parsed redo log up to LSN=3499505504; to recover: 249378 pages
InnoDB: Parsed redo log up to LSN=4183247844; to recover: 306964 pages
InnoDB: End of log at LSN=4483551634
...
InnoDB: To recover: 331797 pages
...
InnoDB: log sequence number 4483551634 ...
We will also speed up recovery by improving the memory management and
implementing multi-threaded recovery of data pages that will not need
to be read into the buffer pool ("fake read"). Log application in the
"fake read" threads will be protected by an atomic being_recovered field
and exclusive buf_page_t::latch.
Recovery will reserve for data pages two thirds of the buffer pool,
or 256 pages, whichever is smaller. Previously, we could only use at most
one third of the buffer pool for buffered log records. This would typically
mean that with large buffer pools, recovery unnecessarily consisted of
multiple batches.
If recovery runs out of memory, it will "roll back" or "rewind" the current
mini-transaction. The recv_sys.lsn and recv_sys.pages will correspond
to the "out of memory LSN", at the end of the previous complete
mini-transaction.
If recovery runs out of memory while executing the final recovery batch,
we can simply invoke recv_sys.apply(false) to make room, and resume
parsing.
If recovery runs out of memory before the final batch, we will scan
the redo log to the end (recv_sys.scanned_lsn) and check for any missing
or inconsistent files. If recv_init_crash_recovery_spaces() does not
report any potentially missing tablespaces, we can make use of the
already stored recv_sys.pages and only rewind to the "out of memory LSN".
Else, we must keep parsing and invoking recv_validate_tablespace()
until an error has been found or everything has been resolved, and
ultimately rewind to the checkpoint LSN.
recv_sys_t::pages_it: A cached iterator to recv_sys.pages
recv_sys_t::parse_mtr(): Remove an ATTRIBUTE_NOINLINE that would
prevent tail call optimization in recv_sys_t::parse_pmem().
recv_sys_t::parse(), recv_sys_t::parse_mtr(), recv_sys_t::parse_pmem():
Add template<bool store> parameter. Redo log record parsing
(store=false) is better specialized from store=true
(with bool if_exists) so that we can avoid some conditional branches
in frequently invoked low-level code.
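As an illustration of the template<bool store> technique (not the actual recv_sys_t code), the following sketch uses a hypothetical parser type whose parse() member either buffers records or merely counts them. The types record and parser are assumptions made for this example. It requires C++17 for if constexpr.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a parsed redo log record; not an InnoDB type.
struct record { uint64_t lsn; uint32_t page_no; };

// Hypothetical parser state: counts records, and optionally keeps them.
struct parser
{
  std::vector<record> stored;
  size_t parsed= 0;

  // store=true buffers each record for later application;
  // store=false only counts it. The condition is evaluated at
  // compile time, so the store=false instantiation contains no
  // buffering code and no run-time branch for it.
  template<bool store>
  void parse(const record *begin, const record *end)
  {
    for (const record *r= begin; r != end; r++)
    {
      parsed++;
      if constexpr (store)
        stored.push_back(*r);
    }
  }
};
```

A caller chooses the variant once, as in p.parse<false>(begin, end) for a scan-only pass, instead of testing a flag for every record.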
recv_sys_t::is_memory_exhausted(): Remove. The special parse() status
GOT_OOM will report an out-of-memory situation at the low level.
recv_sys_t::rewind(), page_recv_t::recs_t::rewind():
Remove all log starting with a specific LSN.
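The effect of rewinding can be sketched on a simplified model. Here page_log, page_map and rewind_log() are hypothetical names, and each record is reduced to its LSN; the real data structures are more involved.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <vector>

using lsn_t= uint64_t;

// Hypothetical per-page record list, kept in ascending LSN order.
struct page_log { std::vector<lsn_t> recs; };

// Hypothetical stand-in for recv_sys.pages, keyed by page number.
using page_map= std::map<uint32_t, page_log>;

// Drop every buffered record whose LSN is at or after rewind_lsn,
// and erase pages that are left without any records.
inline void rewind_log(page_map &pages, lsn_t rewind_lsn)
{
  for (auto it= pages.begin(); it != pages.end(); )
  {
    auto &recs= it->second.recs;
    // recs is sorted, so everything from the lower bound on is discarded.
    recs.erase(std::lower_bound(recs.begin(), recs.end(), rewind_lsn),
               recs.end());
    if (recs.empty())
      it= pages.erase(it);
    else
      ++it;
  }
}
```

After rewind_log(pages, lsn), the map describes exactly the log that precedes lsn, which is the state needed when resuming from the "out of memory LSN".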
recv_scan_log(): Separate some code for only parsing, not storing log.
In rewound_lsn, remember the LSN at which last_phase=false recovery
ran out of memory. This is where the next call to recv_scan_log()
will resume storing the log. This replaces recv_sys.last_stored_lsn.
recv_sys_t::parse(): Evaluate the template parameter store in a few more
cases, to allow dead code to be eliminated at compile time.
recv_sys_t::scanned_lsn: The end of the log found by recv_scan_log().
The special value 1 means that recv_sys has been initialized but
no log has been parsed.
IORequest::write_complete(), IORequest::read_complete():
Replaces fil_aio_callback().
read_io_callback(), write_io_callback(): Replaces io_callback().
IORequest::fake_read_complete(), fake_io_callback(), os_fake_read():
Process a "fake read" request for concurrent recovery.
recv_sys_t::apply_batch(): Choose a number of successive pages
for a recovery batch.
recv_sys_t::erase(recv_sys_t::map::iterator): Remove log records for a
page whose recovery is not in progress. Log application threads
will not invoke this; they will only set being_recovered=-1 to indicate
that the entry is no longer needed.
recv_sys_t::garbage_collect(): Remove all being_recovered=-1 entries.
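The division of labour between erase(), the log application threads and garbage_collect() can be sketched as follows. Only the value being_recovered=-1 is taken from the description above; the type page_entry, the helper mark_done() and the initial value 0 are assumptions of this sketch.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical per-page entry. Only the value -1 ("no longer needed")
// comes from the commit message; the initial value 0 is an assumption.
struct page_entry
{
  std::atomic<int8_t> being_recovered{0};
};

using page_map= std::map<uint32_t, page_entry>;

// Called by a log application thread: it never erases the entry itself,
// it only flags it so that the owner of the map can remove it later.
inline void mark_done(page_entry &e)
{
  e.being_recovered.store(-1, std::memory_order_release);
}

// Called by the thread that owns the map: remove all flagged entries
// and return how many were removed.
inline size_t garbage_collect(page_map &pages)
{
  size_t removed= 0;
  for (auto it= pages.begin(); it != pages.end(); )
  {
    if (it->second.being_recovered.load(std::memory_order_acquire) == -1)
    {
      it= pages.erase(it);
      removed++;
    }
    else
      ++it;
  }
  return removed;
}
```

Deferring the removal this way means that worker threads never modify the structure of the map, so only the flag itself needs to be atomic.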
recv_sys_t::wait_for_pool(): Wait for some space to become available
in the buffer pool.
mlog_init_t::mark_ibuf_exist(): Avoid calls to
recv_sys::recover_low() via ibuf_page_exists() and buf_page_get_low().
Such calls would lead to double locking of recv_sys.mutex, which
depending on implementation could cause a deadlock. We will use
lower-level calls to look up index pages.
buf_LRU_block_remove_hashed(): Disable consistency checks for freed
ROW_FORMAT=COMPRESSED pages. Their contents could be uninitialized garbage.
This fixes an occasional failure of the test
innodb.innodb_bulk_create_index_debug.
Tested by: Matthias Leich
2 years ago  MDEV-27593: Crashing on I/O error is unhelpful
buf_page_t::write_complete(), buf_page_write_complete(),
IORequest::write_complete(): Add a parameter for passing
an error code. If an error occurred, we will release the
io-fix, buffer-fix and page latch but not reset the
oldest_modification field. The block would remain in
buf_pool.LRU and possibly buf_pool.flush_list, to be written
again later, by buf_flush_page_cleaner(). If all page writes
start consistently failing, all write threads should eventually
hang in log_free_check() because the log checkpoint cannot
be advanced to make room in the circular write-ahead-log ib_logfile0.
IORequest::read_complete(): Add a parameter for passing
an error code. If a read operation fails, we report the error
and discard the page, just like we would do if the page checksum
was not validated or the page could not be decrypted.
This only affects asynchronous reads, due to linear or random read-ahead
or crash recovery. When buf_page_get_low() invokes buf_read_page(),
that will be a synchronous read, not involving this code.
This was tested by randomly injecting errors in
write_io_callback() and read_io_callback(), like this:
  if (!ut_rnd_interval(100))
    cb->m_err= 42;
2 years ago  MDEV-14425 Improve the redo log for concurrency
The InnoDB redo log used to be formatted in blocks of 512 bytes.
The log blocks were encrypted and the checksum was calculated while
holding log_sys.mutex, creating a serious scalability bottleneck.
We remove the fixed-size redo log block structure altogether and
essentially turn every mini-transaction into a log block of its own.
This allows encryption and checksum calculations to be performed
on local mtr_t::m_log buffers, before acquiring log_sys.mutex.
The mutex only protects a memcpy() of the data to the shared
log_sys.buf, as well as the padding of the log, in case the
to-be-written part of the log would not end in a block boundary of
the underlying storage. For now, the "padding" consists of writing
a single NUL byte, to allow recovery and mariadb-backup to detect
the end of the circular log faster.
Like the previous implementation, we will overwrite the last log block
over and over again, until it has been completely filled. It would be
possible to write only up to the last completed block (if no more
recent write was requested), or to write dummy FILE_CHECKPOINT records
to fill the incomplete block, by invoking the currently disabled
function log_pad(). This would require adjustments to some logic around
log checkpoints, page flushing, and shutdown.
An upgrade after a crash of any previous version is not supported.
Logically empty log files from a previous version will be upgraded.
An attempt to start up InnoDB without a valid ib_logfile0 will be
refused. Previously, the redo log used to be created automatically
if it was missing. Only with innodb_force_recovery=6 is it
possible to start InnoDB in read-only mode even if the log file
does not exist. This allows the contents of a possibly corrupted
database to be dumped.
Because a prepared backup from an earlier version of mariadb-backup
will create a 0-sized log file, we will allow an upgrade from such
log files, provided that the FIL_PAGE_FILE_FLUSH_LSN in the system
tablespace looks valid.
The 512-byte log checkpoint blocks at 0x200 and 0x600 will be replaced
with 64-byte log checkpoint blocks at 0x1000 and 0x2000.
The start of log records will move from 0x800 to 0x3000. This allows us
to use 4096-byte aligned blocks for all I/O in a future revision.
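The new offsets can be written down as constants. The function lsn_to_offset() below is an assumption for illustration: the commit message only states where the records start and that the log is circular, not how an LSN is mapped to a file offset.

```cpp
#include <cstdint>

// Offsets stated in the commit message.
constexpr uint64_t CHECKPOINT_1= 0x1000;
constexpr uint64_t CHECKPOINT_2= 0x2000;
constexpr uint64_t START_OFFSET= 0x3000;

// All three offsets are multiples of 4096, which is what makes
// 4096-byte aligned I/O possible.
static_assert(CHECKPOINT_1 % 4096 == 0, "aligned");
static_assert(CHECKPOINT_2 % 4096 == 0, "aligned");
static_assert(START_OFFSET % 4096 == 0, "aligned");

// Assumed mapping for a circular log: the record area spans
// [START_OFFSET, file_size), and first_lsn is the LSN that is
// stored at START_OFFSET. Requires lsn >= first_lsn.
constexpr uint64_t lsn_to_offset(uint64_t lsn, uint64_t first_lsn,
                                 uint64_t file_size)
{
  return START_OFFSET + (lsn - first_lsn) % (file_size - START_OFFSET);
}
```

With a record area of 1000 bytes, the LSN that is 1000 bytes past first_lsn maps back to START_OFFSET, which is the wrap-around that the sequence bit described below is meant to detect.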
We extend the MDEV-12353 redo log record format as follows.
(1) Empty mini-transactions or extra NUL bytes will not be allowed.
(2) The end-of-minitransaction marker (a NUL byte) will be replaced
with a 1-bit sequence number, which will be toggled each time when the
circular log file wraps back to the beginning.
(3) After the sequence bit, a CRC-32C checksum of all data
(excluding the sequence bit) will be written.
(4) If the log is encrypted, 8 bytes will be written before
the checksum and included in it. This is part of the
initialization vector (IV) of encrypted log data.
(5) File names, page numbers, and checkpoint information will not be
encrypted. Only the payload bytes of page-level log will be encrypted.
The tablespace ID and page number will form part of the IV.
(6) For padding, arbitrary-length FILE_CHECKPOINT records may be written,
with all-zero payload, and with the normal end marker and checksum.
The minimum size is 7 bytes, or 7+8 with innodb_encrypt_log=ON.
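Items (2) and (3) can be illustrated with a short sketch for the unencrypted case. The trailer layout used here (one byte holding the sequence bit, followed by the checksum in big-endian byte order) is an assumption of this example, and seal() and verify() are hypothetical helpers. The crc32c() function is a plain bitwise CRC-32C, not the hardware-accelerated implementation a server would use.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
inline uint32_t crc32c(const uint8_t *data, size_t len)
{
  uint32_t crc= 0xFFFFFFFFU;
  for (size_t i= 0; i < len; i++)
  {
    crc^= data[i];
    for (int k= 0; k < 8; k++)
      crc= (crc >> 1) ^ (0x82F63B78U & (0U - (crc & 1U)));
  }
  return ~crc;
}

// Append the trailer to a mini-transaction payload. The checksum covers
// the payload only, not the sequence bit. Layout is an assumption.
inline std::vector<uint8_t> seal(std::vector<uint8_t> payload, bool seq)
{
  const uint32_t crc= crc32c(payload.data(), payload.size());
  payload.push_back(seq ? 1 : 0);
  for (int shift= 24; shift >= 0; shift-= 8)
    payload.push_back(uint8_t(crc >> shift));
  return payload;
}

// Accept a record only if it carries the expected sequence bit
// and its stored checksum matches the payload.
inline bool verify(const std::vector<uint8_t> &rec, bool expected_seq)
{
  if (rec.size() < 5)
    return false;
  const size_t n= rec.size() - 5;
  if (rec[n] != (expected_seq ? 1 : 0))
    return false;
  const uint32_t stored= uint32_t(rec[n + 1]) << 24 |
    uint32_t(rec[n + 2]) << 16 | uint32_t(rec[n + 3]) << 8 |
    uint32_t(rec[n + 4]);
  return stored == crc32c(rec.data(), n);
}
```

A record left over from the previous pass through the circular file still has a valid checksum but the opposite sequence bit, so verify() rejects it and the parser can treat that position as the end of the log.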
In mariadb-backup and in Galera snapshot transfer (SST) scripts, we will
no longer remove ib_logfile0 or create an empty ib_logfile0. Server startup
will require a valid log file. When resizing the log, we will create
a logically empty ib_logfile101 at the current LSN and use an atomic rename
to replace ib_logfile0 with it. See the test innodb.log_file_size.
Because there is no mandatory padding in the log file, we are able
to create a dummy log file as of an arbitrary log sequence number.
See the test mariabackup.huge_lsn.
The parameter innodb_log_write_ahead_size and the
INFORMATION_SCHEMA.INNODB_METRICS counter log_padded will be removed.
The minimum value of innodb_log_buffer_size will be increased to 2MiB
(because log_sys.buf will replace recv_sys.buf) and the increment
adjusted to 4096 bytes (the maximum log block size).
The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed:
os_log_fsyncs
os_log_pending_fsyncs
log_pending_log_flushes
log_pending_checkpoint_writes
The following status variables will be removed:
Innodb_os_log_fsyncs (this is included in Innodb_data_fsyncs)
Innodb_os_log_pending_fsyncs (this was limited to at most 1 by design)
log_sys.get_block_size(): Return the physical block size of the log file.
This is only implemented on Linux and Microsoft Windows for now, and for
the power-of-2 block sizes between 64 and 4096 bytes (the minimum and
maximum size of a checkpoint block). If the block size is anything else,
the traditional 512-byte size will be used via normal file system
buffering.
If the file system buffers can be bypassed, a message like the following
will be issued:
InnoDB: File system buffers for log disabled (block size=512 bytes)
InnoDB: File system buffers for log disabled (block size=4096 bytes)
This has been tested on Linux and Microsoft Windows with both sizes.
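The block size rule described above can be written down as a small decision function. The sketch below is in Python and uses hypothetical names (choose_log_block_size, bypass_message); it only encodes the stated rule and does not query any device. The second return value says whether bypassing file system buffers is possible for that block size, not whether the server will actually do so.

```python
def choose_log_block_size(physical_block_size):
    """Return (block_size, may_bypass_buffers) for the log file.

    physical_block_size is None when it cannot be determined.
    """
    size = physical_block_size
    if size is not None and 64 <= size <= 4096 and (size & (size - 1)) == 0:
        # A power of 2 between 64 and 4096 bytes: use it as is.
        return size, True
    # Anything else: traditional 512-byte blocks via normal buffering.
    return 512, False


def bypass_message(block_size: int) -> str:
    """Format the start-up message quoted in the text above."""
    return ("InnoDB: File system buffers for log disabled "
            f"(block size={block_size} bytes)")
```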
On Linux, only enable O_DIRECT on the log for innodb_flush_method=O_DSYNC.
Tests in 3 different environments where the log is stored in a device
with a physical block size of 512 bytes are yielding better throughput
without O_DIRECT. This could be due to the fact that in the event the
last log block is being overwritten (if multiple transactions would
become durable at the same time, and each of them will write a small
number of bytes to the last log block), it should be faster to re-copy
data from log_sys.buf or log_sys.flush_buf to the kernel buffer,
to be finally written at fdatasync() time.
The parameter innodb_flush_method=O_DSYNC will imply O_DIRECT for
data files. This option will enable O_DIRECT on the log file on Linux.
It may be unsafe to use when the storage device does not support
FUA (Force Unit Access) mode.
When the server is compiled WITH_PMEM=ON, we will use memory-mapped
I/O for the log file if the log resides on a "mount -o dax" device.
We will identify PMEM in a start-up message:
InnoDB: log sequence number 0 (memory-mapped); transaction id 3
On Linux, we will also invoke mmap() on any ib_logfile0 that resides
in /dev/shm, effectively treating the log file as persistent memory.
This should speed up "./mtr --mem" and increase the test coverage of
PMEM on non-PMEM hardware. It also allows users to estimate how much
the performance would be improved by installing persistent memory.
On other tmpfs file systems such as /run, we will not use mmap().
mariadb-backup: Eliminated several variables. We will refer
directly to recv_sys and log_sys.
backup_wait_for_lsn(): Detect non-progress of
xtrabackup_copy_logfile(). In this new log format with
arbitrary-sized blocks, we can only detect log file overrun
indirectly, by observing that the scanned log sequence number
is not advancing.
xtrabackup_copy_logfile(): On PMEM, do not modify the sequence bit,
because we are not allowed to modify the server's log file, and our
memory mapping is read-only.
trx_flush_log_if_needed_low(): Do not use the callback on PMEM.
Using neither flush_lock nor write_lock around PMEM writes seems
to yield the best performance. The pmem_persist() calls may
still be somewhat slower than the pwrite() and fdatasync() based
interface (PMEM mounted without -o dax).
recv_sys_t::buf: Remove. We will use log_sys.buf for parsing.
recv_sys_t::MTR_SIZE_MAX: Replaces RECV_SCAN_SIZE.
recv_sys_t::file_checkpoint: Renamed from mlog_checkpoint_lsn.
recv_sys_t, log_sys_t: Removed many data members.
recv_sys.lsn: Renamed from recv_sys.recovered_lsn.
recv_sys.offset: Renamed from recv_sys.recovered_offset.
log_sys.buf_size: Replaces srv_log_buffer_size.
recv_buf: A smart pointer that wraps log_sys.buf[recv_sys.offset]
when the buffer is being allocated from the memory heap.
recv_ring: A smart pointer that wraps a circular log_sys.buf[] that is
backed by ib_logfile0. The pointer will wrap from recv_sys.len
(log_sys.file_size) to log_sys.START_OFFSET. For the record that
wraps around, we may copy file name or record payload data to
the auxiliary buffer decrypt_buf in order to have a contiguous
block of memory. The maximum size of a record is less than
innodb_page_size bytes.
recv_sys_t::parse(): Take the smart pointer as a template parameter.
Do not temporarily add a trailing NUL byte to FILE_ records, because
we are not supposed to modify the memory-mapped log file. (It is
attached in read-write mode already during recovery.)
recv_sys_t::parse_mtr(): Wrapper for recv_sys_t::parse().
recv_sys_t::parse_pmem(): Like parse_mtr(), but if PREMATURE_EOF would be
returned on PMEM, use recv_ring to wrap around the buffer to the start.
mtr_t::finish_write(), log_close(): Do not enforce log_sys.max_buf_free
on PMEM, because it has no meaning on the mmap-based log.
log_sys.write_to_buf: Count writes to log_sys.buf. Replaces
srv_stats.log_write_requests and export_vars.innodb_log_write_requests.
Protected by log_sys.mutex. Updated consistently in log_close().
Previously, mtr_t::commit() conditionally updated the count,
which was inconsistent.
log_sys.write_to_log: Count swaps of log_sys.buf and log_sys.flush_buf,
for writing to log_sys.log (the ib_logfile0). Replaces
srv_stats.log_writes and export_vars.innodb_log_writes.
Protected by log_sys.mutex.
log_sys.waits: Count waits in append_prepare(). Replaces
srv_stats.log_waits and export_vars.innodb_log_waits.
recv_recover_page(): Do not unnecessarily acquire
log_sys.flush_order_mutex. We are inserting the blocks in arbitrary
order anyway, to be adjusted in recv_sys.apply(true).
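The wrap-around behaviour described for recv_ring above can be illustrated with a small model. The following Python sketch is not the smart pointer itself: RingReader and its fields are hypothetical, with start playing the role of log_sys.START_OFFSET and the buffer length playing the role of log_sys.file_size. A read that crosses the end of the buffer is copied into one contiguous result, much as a record that wraps around would be copied to an auxiliary buffer.

```python
class RingReader:
    """Read bytes from a circular buffer whose data area is [start, end)."""

    def __init__(self, buf: bytes, start: int, pos: int):
        self.buf = buf
        self.start = start      # plays the role of log_sys.START_OFFSET
        self.end = len(buf)     # plays the role of log_sys.file_size
        self.pos = pos

    def read(self, n: int) -> bytes:
        """Return n contiguous bytes, copying if the span wraps around."""
        if n > self.end - self.start:
            raise ValueError("span longer than the circular area")
        first = min(n, self.end - self.pos)
        out = self.buf[self.pos:self.pos + first]
        if first < n:
            # The span crosses the end: continue from the start of the area.
            out += self.buf[self.start:self.start + (n - first)]
            self.pos = self.start + (n - first)
        else:
            self.pos += n
            if self.pos == self.end:
                self.pos = self.start
        return out
```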
We will change the definition of flush_lock and write_lock to
avoid potential false sharing. Depending on sizeof(log_sys) and
CPU_LEVEL1_DCACHE_LINESIZE, the flush_lock and write_lock could
share a cache line with each other or with the last data members
of log_sys.
Thanks to Matthias Leich for providing https://rr-project.org traces
for various failures during the development, and to
Thirunarayanan Balathandayuthapani for his help in debugging
some of the recovery code. And thanks to the developers of the
rr debugger for a tool without which extensive changes to InnoDB
would be very challenging to get right.
Thanks to Vladislav Vaintroub for useful feedback and
to him, Axel Schwenke and Krunal Bauskar for testing the performance.
4 years ago  MDEV-29911 InnoDB recovery and mariadb-backup --prepare fail to report detailed progress
The progress reporting of InnoDB crash recovery was rather intermittent.
Nothing was reported during the single-threaded log record parsing, which
could consume minutes when parsing a large log. During log application,
progress was reported only by background threads that would be
invoked on data page read completion.
The progress reporting here will be detailed like this:
InnoDB: Starting crash recovery from checkpoint LSN=503549688
InnoDB: Parsed redo log up to LSN=1990840177; to recover: 124806 pages
InnoDB: Parsed redo log up to LSN=2729777071; to recover: 186123 pages
InnoDB: Parsed redo log up to LSN=3488599173; to recover: 248397 pages
InnoDB: Parsed redo log up to LSN=4177856618; to recover: 306469 pages
InnoDB: Multi-batch recovery needed at LSN 4189599815
InnoDB: End of log at LSN=4483551634
InnoDB: To recover: LSN 4189599815/4483551634; 307490 pages
InnoDB: To recover: LSN 4189599815/4483551634; 197159 pages
InnoDB: To recover: LSN 4189599815/4483551634; 67623 pages
InnoDB: Parsed redo log up to LSN=4353924218; to recover: 102083 pages
...
InnoDB: log sequence number 4483551634 ...
The previous messages "Starting a batch to recover" or
"Starting a final batch to recover" will be replaced by
"To recover: ... pages" messages.
If a batch lasts longer than 15 seconds, then there will be
progress reports every 15 seconds, showing the number of remaining pages.
For the non-final batch, the "To recover:" message includes two end LSNs:
that of the batch and that of the recovered log. This is the primary measure
of progress. The batch will end once the number of pages to recover
reaches 0.
If recovery is possible in a single batch, the output will look like this,
with a shorter "To recover:" message that counts only the remaining pages:
InnoDB: Starting crash recovery from checkpoint LSN=503549688
InnoDB: Parsed redo log up to LSN=1998701027; to recover: 125560 pages
InnoDB: Parsed redo log up to LSN=2734136874; to recover: 186446 pages
InnoDB: Parsed redo log up to LSN=3499505504; to recover: 249378 pages
InnoDB: Parsed redo log up to LSN=4183247844; to recover: 306964 pages
InnoDB: End of log at LSN=4483551634
...
InnoDB: To recover: 331797 pages
...
InnoDB: log sequence number 4483551634 ...
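The two "To recover:" message shapes and the 15-second reporting interval can be modelled as follows. This Python sketch uses hypothetical names (to_recover_message, ProgressReporter) and passes the clock in explicitly so that the behaviour is easy to check; how the server actually tracks time is not stated above and is not modelled.

```python
def to_recover_message(pages: int, batch_lsn=None, end_lsn=None) -> str:
    """Format the long (multi-batch) or short (single-batch) message."""
    if batch_lsn is not None and end_lsn is not None:
        return f"InnoDB: To recover: LSN {batch_lsn}/{end_lsn}; {pages} pages"
    return f"InnoDB: To recover: {pages} pages"


class ProgressReporter:
    INTERVAL = 15  # seconds between reports, per the text above

    def __init__(self, now: float):
        # Assumption of this sketch: the interval is counted from creation.
        self.last_report = now

    def maybe_report(self, now: float, pages: int,
                     batch_lsn=None, end_lsn=None):
        """Return a message if the interval has elapsed, else None."""
        if now - self.last_report < self.INTERVAL:
            return None
        self.last_report = now
        return to_recover_message(pages, batch_lsn, end_lsn)
```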
We will also speed up recovery by improving the memory management and
implementing multi-threaded recovery of data pages that will not need
to be read into the buffer pool ("fake read"). Log application in the
"fake read" threads will be protected by an atomic being_recovered field
and exclusive buf_page_t::latch.
Recovery will reserve for data pages two thirds of the buffer pool,
or 256 pages, whichever is smaller. Previously, we could only use at most
one third of the buffer pool for buffered log records. This would typically
mean that with large buffer pools, recovery unnecessarily consisted of
multiple batches.
If recovery runs out of memory, it will "roll back" or "rewind" the current
mini-transaction. The recv_sys.lsn and recv_sys.pages will correspond
to the "out of memory LSN", at the end of the previous complete
mini-transaction.
If recovery runs out of memory while executing the final recovery batch,
we can simply invoke recv_sys.apply(false) to make room, and resume
parsing.
If recovery runs out of memory before the final batch, we will scan
the redo log to the end (recv_sys.scanned_lsn) and check for any missing
or inconsistent files. If recv_init_crash_recovery_spaces() does not
report any potentially missing tablespaces, we can make use of the
already stored recv_sys.pages and only rewind to the "out of memory LSN".
Else, we must keep parsing and invoking recv_validate_tablespace()
until an error has been found or everything has been resolved, and
ultimately rewind to the checkpoint LSN.
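The "rewind" to the out-of-memory LSN amounts to discarding every buffered record at or after that LSN. A minimal Python model of this idea is shown below. The data layout (a dict from page identifier to a list of (lsn, payload) tuples) is an assumption of this sketch and is much simpler than recv_sys.pages.

```python
def rewind(pages: dict, lsn: int) -> None:
    """Drop every buffered record whose LSN is >= lsn, in place.

    pages maps a (space_id, page_no) tuple to a list of (lsn, payload)
    tuples in ascending LSN order.
    """
    for page_id in list(pages):
        kept = [rec for rec in pages[page_id] if rec[0] < lsn]
        if kept:
            pages[page_id] = kept
        else:
            # Nothing older than lsn remains for this page: drop the entry.
            del pages[page_id]
```

After such a rewind, the stored records correspond to the end of the previous complete mini-transaction, so parsing can resume from that LSN once memory has been freed.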
recv_sys_t::pages_it: A cached iterator to recv_sys.pages
recv_sys_t::parse_mtr(): Remove an ATTRIBUTE_NOINLINE that would
prevent tail call optimization in recv_sys_t::parse_pmem().
recv_sys_t::parse(), recv_sys_t::parse_mtr(), recv_sys_t::parse_pmem():
Add template<bool store> parameter. Redo log record parsing
(store=false) is better specialized from store=true
(with bool if_exists) so that we can avoid some conditional branches
in frequently invoked low-level code.
recv_sys_t::is_memory_exhausted(): Remove. The special parse() status
GOT_OOM will report an out-of-memory situation at the low level.
recv_sys_t::rewind(), page_recv_t::recs_t::rewind():
Remove all log starting with a specific LSN.
recv_scan_log(): Separate some code for only parsing, not storing log.
In rewound_lsn, remember the LSN at which last_phase=false recovery
ran out of memory. This is where the next call to recv_scan_log()
will resume storing the log. This replaces recv_sys.last_stored_lsn.
recv_sys_t::parse(): Evaluate the template parameter store in a few more
cases, to allow dead code to be eliminated at compile time.
recv_sys_t::scanned_lsn: The end of the log found by recv_scan_log().
The special value 1 means that recv_sys has been initialized but
no log has been parsed.
IORequest::write_complete(), IORequest::read_complete():
Replaces fil_aio_callback().
read_io_callback(), write_io_callback(): Replaces io_callback().
IORequest::fake_read_complete(), fake_io_callback(), os_fake_read():
Process a "fake read" request for concurrent recovery.
recv_sys_t::apply_batch(): Choose a number of successive pages
for a recovery batch.
recv_sys_t::erase(recv_sys_t::map::iterator): Remove log records for a
page whose recovery is not in progress. Log application threads
will not invoke this; they will only set being_recovered=-1 to indicate
that the entry is no longer needed.
recv_sys_t::garbage_collect(): Remove all being_recovered=-1 entries.
recv_sys_t::wait_for_pool(): Wait for some space to become available
in the buffer pool.
mlog_init_t::mark_ibuf_exist(): Avoid calls to
recv_sys::recover_low() via ibuf_page_exists() and buf_page_get_low().
Such calls would lead to double locking of recv_sys.mutex, which
depending on implementation could cause a deadlock. We will use
lower-level calls to look up index pages.
buf_LRU_block_remove_hashed(): Disable consistency checks for freed
ROW_FORMAT=COMPRESSED pages. Their contents could be uninitialized garbage.
This fixes an occasional failure of the test
innodb.innodb_bulk_create_index_debug.
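The division of labour between recv_sys_t::erase() and recv_sys_t::garbage_collect() described above can be modelled as follows: worker threads only mark an entry as finished, and the actual removal happens later in one place. This Python sketch is single-threaded and uses hypothetical names; in particular, the values 0 (idle) and 1 (in progress) for being_recovered are assumptions, as the text above only mentions the value -1.

```python
class PageLog:
    def __init__(self, records):
        self.records = records
        # Assumed encoding: 0 = idle, 1 = in progress, -1 = no longer needed.
        self.being_recovered = 0


def finish_in_worker(entry: PageLog) -> None:
    """What a log application thread does: mark the entry, never erase it."""
    entry.being_recovered = -1


def erase(pages: dict, page_id) -> None:
    """Remove log records for a page whose recovery is not in progress."""
    if pages[page_id].being_recovered == 1:
        raise RuntimeError("recovery of this page is in progress")
    del pages[page_id]


def garbage_collect(pages: dict) -> int:
    """Remove all entries marked -1 and return how many were removed."""
    done = [pid for pid, e in pages.items() if e.being_recovered == -1]
    for pid in done:
        del pages[pid]
    return len(done)
```

Deferring the removal in this way means that worker threads never modify the shape of the shared map; they only update a per-entry field.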
Tested by: Matthias Leich
2 years ago  MDEV-29911 InnoDB recovery and mariadb-backup --prepare fail to report detailed progress
The progress reporting of InnoDB crash recovery was rather intermittent.
Nothing was reported during the single-threaded log record parsing, which
could consume minutes when parsing a large log. During log application,
there only was progress reporting in background threads that would be
invoked on data page read completion.
The progress reporting here will be detailed like this:
InnoDB: Starting crash recovery from checkpoint LSN=503549688
InnoDB: Parsed redo log up to LSN=1990840177; to recover: 124806 pages
InnoDB: Parsed redo log up to LSN=2729777071; to recover: 186123 pages
InnoDB: Parsed redo log up to LSN=3488599173; to recover: 248397 pages
InnoDB: Parsed redo log up to LSN=4177856618; to recover: 306469 pages
InnoDB: Multi-batch recovery needed at LSN 4189599815
InnoDB: End of log at LSN=4483551634
InnoDB: To recover: LSN 4189599815/4483551634; 307490 pages
InnoDB: To recover: LSN 4189599815/4483551634; 197159 pages
InnoDB: To recover: LSN 4189599815/4483551634; 67623 pages
InnoDB: Parsed redo log up to LSN=4353924218; to recover: 102083 pages
...
InnoDB: log sequence number 4483551634 ...
The previous messages "Starting a batch to recover" or
"Starting a final batch to recover" will be replaced by
"To recover: ... pages" messages.
If a batch lasts longer than 15 seconds, then there will be
progress reports every 15 seconds, showing the number of remaining pages.
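The 15-second reporting rule can be sketched roughly as follows. This is only an illustration, not the code of this commit: the class progress_throttle and its member maybe_report() are hypothetical names, and the clock is passed in explicitly so that the behaviour can be checked without waiting.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>

// Hypothetical helper (not an InnoDB API): emits at most one
// "To recover" line per interval while a batch is running.
class progress_throttle
{
  using clock = std::chrono::steady_clock;
  clock::time_point last_;
  const std::chrono::seconds interval_;
public:
  explicit progress_throttle(clock::time_point start,
                             std::chrono::seconds interval =
                               std::chrono::seconds(15))
    : last_(start), interval_(interval) {}

  // Returns true and prints a line if at least one full interval has
  // elapsed since the previous report (or since the start).
  bool maybe_report(clock::time_point now, std::size_t pages_left)
  {
    if (now - last_ < interval_)
      return false;
    last_ = now;
    std::printf("InnoDB: To recover: %zu pages\n", pages_left);
    return true;
  }
};
```

A batch that finishes in under 15 seconds would never call the printing branch, which matches the description above.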
For the non-final batch, the "To recover:" message includes two end LSNs:
that of the batch and that of the recovered log. This is the primary measure
of progress. The batch will end once the number of pages to recover
reaches 0.
If recovery is possible in a single batch, the output will look like this,
with a shorter "To recover:" message that counts only the remaining pages:
InnoDB: Starting crash recovery from checkpoint LSN=503549688
InnoDB: Parsed redo log up to LSN=1998701027; to recover: 125560 pages
InnoDB: Parsed redo log up to LSN=2734136874; to recover: 186446 pages
InnoDB: Parsed redo log up to LSN=3499505504; to recover: 249378 pages
InnoDB: Parsed redo log up to LSN=4183247844; to recover: 306964 pages
InnoDB: End of log at LSN=4483551634
...
InnoDB: To recover: 331797 pages
...
InnoDB: log sequence number 4483551634 ...
We will also speed up recovery by improving the memory management and
implementing multi-threaded recovery of data pages that will not need
to be read into the buffer pool ("fake read"). Log application in the
"fake read" threads will be protected by an atomic being_recovered field
and exclusive buf_page_t::latch.
Recovery will reserve for data pages two thirds of the buffer pool,
or 256 pages, whichever is smaller. Previously, we could only use at most
one third of the buffer pool for buffered log records. This would typically
mean that with large buffer pools, recovery unnecessarily consisted of
multiple batches.
If recovery runs out of memory, it will "roll back" or "rewind" the current
mini-transaction. The recv_sys.lsn and recv_sys.pages will correspond
to the "out of memory LSN", at the end of the previous complete
mini-transaction.
If recovery runs out of memory while executing the final recovery batch,
we can simply invoke recv_sys.apply(false) to make room, and resume
parsing.
If recovery runs out of memory before the final batch, we will scan
the redo log to the end (recv_sys.scanned_lsn) and check for any missing
or inconsistent files. If recv_init_crash_recovery_spaces() does not
report any potentially missing tablespaces, we can make use of the
already stored recv_sys.pages and only rewind to the "out of memory LSN".
Else, we must keep parsing and invoking recv_validate_tablespace()
until an error has been found or everything has been resolved, and
ultimately rewind to the checkpoint LSN.
recv_sys_t::pages_it: A cached iterator to recv_sys.pages
recv_sys_t::parse_mtr(): Remove an ATTRIBUTE_NOINLINE that would
prevent tail call optimization in recv_sys_t::parse_pmem().
recv_sys_t::parse(), recv_sys_t::parse_mtr(), recv_sys_t::parse_pmem():
Add template<bool store> parameter. Redo log record parsing
(store=false) is better specialized from store=true
(with bool if_exists) so that we can avoid some conditional branches
in frequently invoked low-level code.
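The idea behind the template<bool store> parameter can be illustrated with a toy parser. The record format below (a length byte followed by payload, terminated by a zero length) and the name parse_records() are made up for this sketch and have nothing to do with the real redo log format; only the technique of branching on a compile-time constant is taken from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy parser. With store=false the records are only
// counted; with store=true their payload is also copied to *out.
template<bool store>
std::size_t parse_records(const std::uint8_t *p, const std::uint8_t *end,
                          std::vector<std::vector<std::uint8_t>> *out)
{
  std::size_t n = 0;
  while (p < end && *p != 0)
  {
    const std::size_t len = *p++;
    if (len > std::size_t(end - p))
      break; // truncated record: stop parsing
    // 'store' is a compile-time constant, so in the store=false
    // instantiation the compiler can discard this branch entirely.
    if (store)
      out->emplace_back(p, p + len);
    p += len;
    ++n;
  }
  return n;
}
```

Calling parse_records<false>(buf, end, nullptr) only scans, while parse_records<true>(buf, end, &records) scans and stores; each instantiation contains only the code that it needs.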
recv_sys_t::is_memory_exhausted(): Remove. The special parse() status
GOT_OOM will report an out-of-memory situation at the low level.
recv_sys_t::rewind(), page_recv_t::recs_t::rewind():
Remove all log starting with a specific LSN.
recv_scan_log(): Separate some code for only parsing, not storing log.
In rewound_lsn, remember the LSN at which last_phase=false recovery
ran out of memory. This is where the next call to recv_scan_log()
will resume storing the log. This replaces recv_sys.last_stored_lsn.
recv_sys_t::parse(): Evaluate the template parameter store in a few more
cases, to allow dead code to be eliminated at compile time.
recv_sys_t::scanned_lsn: The end of the log found by recv_scan_log().
The special value 1 means that recv_sys has been initialized but
no log has been parsed.
IORequest::write_complete(), IORequest::read_complete():
Replaces fil_aio_callback().
read_io_callback(), write_io_callback(): Replaces io_callback().
IORequest::fake_read_complete(), fake_io_callback(), os_fake_read():
Process a "fake read" request for concurrent recovery.
recv_sys_t::apply_batch(): Choose a number of successive pages
for a recovery batch.
recv_sys_t::erase(recv_sys_t::map::iterator): Remove log records for a
page whose recovery is not in progress. Log application threads
will not invoke this; they will only set being_recovered=-1 to indicate
that the entry is no longer needed.
recv_sys_t::garbage_collect(): Remove all being_recovered=-1 entries.
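A minimal sketch of the being_recovered protocol, under stated assumptions: the struct page_entry, its member functions and the free function garbage_collect() are hypothetical stand-ins, and only the meaning of -1 ("no longer needed") comes from the text above. The values 0 (idle) and 1 (claimed by one thread) are assumptions made for the illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical stand-in for one entry of a page map.
struct page_entry
{
  // 0 = idle (assumed), 1 = claimed by a thread (assumed),
  // -1 = applied, entry may be discarded (from the commit message).
  std::atomic<std::int8_t> being_recovered{0};

  // Exactly one thread can move the entry from 0 to 1.
  bool try_claim()
  {
    std::int8_t expected = 0;
    return being_recovered.compare_exchange_strong(
      expected, 1, std::memory_order_acquire, std::memory_order_relaxed);
  }
  // Applier threads do not erase; they only flag the entry.
  void mark_done() { being_recovered.store(-1, std::memory_order_release); }
  bool is_garbage() const
  { return being_recovered.load(std::memory_order_acquire) == -1; }
};

// Removes all flagged entries; the caller is assumed to hold whatever
// lock protects the map itself. Returns the number of erased entries.
inline std::size_t garbage_collect(std::map<std::uint64_t, page_entry> &pages)
{
  std::size_t removed = 0;
  for (auto it = pages.begin(); it != pages.end(); )
  {
    if (it->second.is_garbage())
    {
      it = pages.erase(it);
      ++removed;
    }
    else
      ++it;
  }
  return removed;
}
```

Deferring the erase to a single collector means that the applier threads never modify the map structure, only the atomic field of their own entry.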
recv_sys_t::wait_for_pool(): Wait for some space to become available
in the buffer pool.
mlog_init_t::mark_ibuf_exist(): Avoid calls to
recv_sys::recover_low() via ibuf_page_exists() and buf_page_get_low().
Such calls would lead to double locking of recv_sys.mutex, which
depending on implementation could cause a deadlock. We will use
lower-level calls to look up index pages.
buf_LRU_block_remove_hashed(): Disable consistency checks for freed
ROW_FORMAT=COMPRESSED pages. Their contents could be uninitialized garbage.
This fixes an occasional failure of the test
innodb.innodb_bulk_create_index_debug.
Tested by: Matthias Leich
2 years ago  MDEV-14425 Improve the redo log for concurrency
The InnoDB redo log used to be formatted in blocks of 512 bytes.
The log blocks were encrypted and the checksum was calculated while
holding log_sys.mutex, creating a serious scalability bottleneck.
We remove the fixed-size redo log block structure altogether and
essentially turn every mini-transaction into a log block of its own.
This allows encryption and checksum calculations to be performed
on local mtr_t::m_log buffers, before acquiring log_sys.mutex.
The mutex only protects a memcpy() of the data to the shared
log_sys.buf, as well as the padding of the log, in case the
to-be-written part of the log would not end at a block boundary of
the underlying storage. For now, the "padding" consists of writing
a single NUL byte, to allow recovery and mariadb-backup to detect
the end of the circular log faster.
Like the previous implementation, we will overwrite the last log block
over and over again, until it has been completely filled. It would be
possible to write only up to the last completed block (if no more
recent write was requested), or to write dummy FILE_CHECKPOINT records
to fill the incomplete block, by invoking the currently disabled
function log_pad(). This would require adjustments to some logic around
log checkpoints, page flushing, and shutdown.
An upgrade after a crash of any previous version is not supported.
Logically empty log files from a previous version will be upgraded.
An attempt to start up InnoDB without a valid ib_logfile0 will be
refused. Previously, the redo log used to be created automatically
if it was missing. Only with innodb_force_recovery=6, it is
possible to start InnoDB in read-only mode even if the log file
does not exist. This allows the contents of a possibly corrupted
database to be dumped.
Because a prepared backup from an earlier version of mariadb-backup
will create a 0-sized log file, we will allow an upgrade from such
log files, provided that the FIL_PAGE_FILE_FLUSH_LSN in the system
tablespace looks valid.
The 512-byte log checkpoint blocks at 0x200 and 0x600 will be replaced
with 64-byte log checkpoint blocks at 0x1000 and 0x2000.
The start of log records will move from 0x800 to 0x3000. This allows us
to use 4096-byte aligned blocks for all I/O in a future revision.
We extend the MDEV-12353 redo log record format as follows.
(1) Empty mini-transactions or extra NUL bytes will not be allowed.
(2) The end-of-minitransaction marker (a NUL byte) will be replaced
with a 1-bit sequence number, which will be toggled each time when the
circular log file wraps back to the beginning.
(3) After the sequence bit, a CRC-32C checksum of all data
(excluding the sequence bit) will be written.
(4) If the log is encrypted, 8 bytes will be written before
the checksum and included in it. This is part of the
initialization vector (IV) of encrypted log data.
(5) File names, page numbers, and checkpoint information will not be
encrypted. Only the payload bytes of page-level log will be encrypted.
The tablespace ID and page number will form part of the IV.
(6) For padding, arbitrary-length FILE_CHECKPOINT records may be written,
with all-zero payload, and with the normal end marker and checksum.
The minimum size is 7 bytes, or 7+8 with innodb_encrypt_log=ON.
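The role of the sequence bit in item (2) can be sketched as follows. This is an illustration under assumptions, not the implementation: the struct circular_log and its members are hypothetical, the bit is assumed to be derived from how many times the LSN range has wrapped around the usable part of the file, and the polarity (1 on the first pass) is an arbitrary choice for the example. Only the start offset 0x3000 is taken from the text above.

```cpp
#include <cassert>
#include <cstdint>

// Start of log records in the file, as stated in the commit message.
constexpr std::uint64_t START_OFFSET = 0x3000;

// Hypothetical model of a circular log file.
struct circular_log
{
  std::uint64_t first_lsn; // LSN stored at START_OFFSET on the first pass
  std::uint64_t file_size; // total size of the file in bytes

  // Bytes available for log records.
  std::uint64_t capacity() const { return file_size - START_OFFSET; }

  // Toggles each time the log wraps back to the beginning.
  unsigned sequence_bit(std::uint64_t lsn) const
  { return !(((lsn - first_lsn) / capacity()) & 1); }

  // File offset at which the byte for this LSN is stored.
  std::uint64_t offset(std::uint64_t lsn) const
  { return START_OFFSET + (lsn - first_lsn) % capacity(); }
};
```

When a reader parses a mini-transaction whose end marker carries the bit of the previous pass, it knows that it has reached stale data, that is, the end of the log.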
In mariadb-backup and in Galera snapshot transfer (SST) scripts, we will
no longer remove ib_logfile0 or create an empty ib_logfile0. Server startup
will require a valid log file. When resizing the log, we will create
a logically empty ib_logfile101 at the current LSN and use an atomic rename
to replace ib_logfile0 with it. See the test innodb.log_file_size.
Because there is no mandatory padding in the log file, we are able
to create a dummy log file as of an arbitrary log sequence number.
See the test mariabackup.huge_lsn.
The parameter innodb_log_write_ahead_size and the
INFORMATION_SCHEMA.INNODB_METRICS counter log_padded will be removed.
The minimum value of innodb_log_buffer_size will be increased to 2MiB
(because log_sys.buf will replace recv_sys.buf) and the increment
adjusted to 4096 bytes (the maximum log block size).
The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed:
os_log_fsyncs
os_log_pending_fsyncs
log_pending_log_flushes
log_pending_checkpoint_writes
The following status variables will be removed:
Innodb_os_log_fsyncs (this is included in Innodb_data_fsyncs)
Innodb_os_log_pending_fsyncs (this was limited to at most 1 by design)
log_sys.get_block_size(): Return the physical block size of the log file.
This is only implemented on Linux and Microsoft Windows for now, and for
the power-of-2 block sizes between 64 and 4096 bytes (the minimum and
maximum size of a checkpoint block). If the block size is anything else,
the traditional 512-byte size will be used via normal file system
buffering.
If the file system buffers can be bypassed, a message like the following
will be issued:
InnoDB: File system buffers for log disabled (block size=512 bytes)
InnoDB: File system buffers for log disabled (block size=4096 bytes)
This has been tested on Linux and Microsoft Windows with both sizes.
On Linux, only enable O_DIRECT on the log for innodb_flush_method=O_DSYNC.
Tests in 3 different environments where the log is stored in a device
with a physical block size of 512 bytes are yielding better throughput
without O_DIRECT. This could be due to the fact that in the event the
last log block is being overwritten (if multiple transactions would
become durable at the same time, and each of them will write a small
number of bytes to the last log block), it should be faster to re-copy
data from log_sys.buf or log_sys.flush_buf to the kernel buffer,
to be finally written at fdatasync() time.
The parameter innodb_flush_method=O_DSYNC will imply O_DIRECT for
data files. This option will enable O_DIRECT on the log file on Linux.
It may be unsafe to use when the storage device does not support
FUA (Force Unit Access) mode.
When the server is compiled WITH_PMEM=ON, we will use memory-mapped
I/O for the log file if the log resides on a "mount -o dax" device.
We will identify PMEM in a start-up message:
InnoDB: log sequence number 0 (memory-mapped); transaction id 3
On Linux, we will also invoke mmap() on any ib_logfile0 that resides
in /dev/shm, effectively treating the log file as persistent memory.
This should speed up "./mtr --mem" and increase the test coverage of
PMEM on non-PMEM hardware. It also allows users to estimate how much
the performance would be improved by installing persistent memory.
On other tmpfs file systems such as /run, we will not use mmap().
mariadb-backup: Eliminated several variables. We will refer
directly to recv_sys and log_sys.
backup_wait_for_lsn(): Detect non-progress of
xtrabackup_copy_logfile(). In this new log format with
arbitrary-sized blocks, we can only detect log file overrun
indirectly, by observing that the scanned log sequence number
is not advancing.
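Indirect overrun detection of this kind could look roughly like the following. The class stall_detector and the poll limit are hypothetical and only illustrate "the scanned LSN is not advancing"; the actual criteria used by backup_wait_for_lsn() are not shown in this message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: reports a stall once the observed LSN has not
// advanced for 'limit' consecutive polls.
class stall_detector
{
  std::uint64_t last_lsn_ = 0;
  unsigned idle_polls_ = 0;
  const unsigned limit_;
public:
  explicit stall_detector(unsigned limit) : limit_(limit) {}

  // Returns true while the copier may still be making progress and
  // false once 'limit' polls in a row have seen no advance.
  bool poll(std::uint64_t scanned_lsn)
  {
    if (scanned_lsn > last_lsn_)
    {
      last_lsn_ = scanned_lsn;
      idle_polls_ = 0;
      return true;
    }
    return ++idle_polls_ < limit_;
  }
};
```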
xtrabackup_copy_logfile(): On PMEM, do not modify the sequence bit,
because we are not allowed to modify the server's log file, and our
memory mapping is read-only.
trx_flush_log_if_needed_low(): Do not use the callback on pmem.
Using neither flush_lock nor write_lock around PMEM writes seems
to yield the best performance. The pmem_persist() calls may
still be somewhat slower than the pwrite() and fdatasync() based
interface (PMEM mounted without -o dax).
recv_sys_t::buf: Remove. We will use log_sys.buf for parsing.
recv_sys_t::MTR_SIZE_MAX: Replaces RECV_SCAN_SIZE.
recv_sys_t::file_checkpoint: Renamed from mlog_checkpoint_lsn.
recv_sys_t, log_sys_t: Removed many data members.
recv_sys.lsn: Renamed from recv_sys.recovered_lsn.
recv_sys.offset: Renamed from recv_sys.recovered_offset.
log_sys.buf_size: Replaces srv_log_buffer_size.
recv_buf: A smart pointer that wraps log_sys.buf[recv_sys.offset]
when the buffer is being allocated from the memory heap.
recv_ring: A smart pointer that wraps a circular log_sys.buf[] that is
backed by ib_logfile0. The pointer will wrap from recv_sys.len
(log_sys.file_size) to log_sys.START_OFFSET. For the record that
wraps around, we may copy file name or record payload data to
the auxiliary buffer decrypt_buf in order to have a contiguous
block of memory. The maximum size of a record is less than
innodb_page_size bytes.
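The wrap-around behaviour described for recv_ring can be sketched with a small cursor type. The struct ring_cursor below is hypothetical; start and end play the roles that log_sys.START_OFFSET and the file size play in the description, and copy_to() stands for copying a wrapped record into an auxiliary contiguous buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical read-only cursor over a circular region [start, end)
// of a buffer; bytes before 'start' model the file header.
struct ring_cursor
{
  const std::uint8_t *buf;
  std::size_t start, end, pos; // invariant: start <= pos < end

  std::uint8_t operator*() const { return buf[pos]; }

  // Advance by one byte, wrapping from 'end' back to 'start'.
  ring_cursor &operator++()
  {
    if (++pos == end)
      pos = start;
    return *this;
  }

  // Copy n bytes starting at the cursor into a contiguous buffer,
  // following the wrap-around if the data crosses 'end'.
  void copy_to(std::uint8_t *out, std::size_t n) const
  {
    assert(n <= end - start);
    const std::size_t first = std::min(n, end - pos);
    std::memcpy(out, buf + pos, first);
    std::memcpy(out + first, buf + start, n - first);
  }
};
```

The source buffer is never modified, which is the property needed when it is a read-only memory mapping.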
recv_sys_t::parse(): Take the smart pointer as a template parameter.
Do not temporarily add a trailing NUL byte to FILE_ records, because
we are not supposed to modify the memory-mapped log file. (It is
attached in read-write mode already during recovery.)
recv_sys_t::parse_mtr(): Wrapper for recv_sys_t::parse().
recv_sys_t::parse_pmem(): Like parse_mtr(), but if PREMATURE_EOF would be
returned on PMEM, use recv_ring to wrap around the buffer to the start.
mtr_t::finish_write(), log_close(): Do not enforce log_sys.max_buf_free
on PMEM, because it has no meaning on the mmap-based log.
log_sys.write_to_buf: Count writes to log_sys.buf. Replaces
srv_stats.log_write_requests and export_vars.innodb_log_write_requests.
Protected by log_sys.mutex. Updated consistently in log_close().
Previously, mtr_t::commit() conditionally updated the count,
which was inconsistent.
log_sys.write_to_log: Count swaps of log_sys.buf and log_sys.flush_buf,
for writing to log_sys.log (the ib_logfile0). Replaces
srv_stats.log_writes and export_vars.innodb_log_writes.
Protected by log_sys.mutex.
log_sys.waits: Count waits in append_prepare(). Replaces
srv_stats.log_waits and export_vars.innodb_log_waits.
recv_recover_page(): Do not unnecessarily acquire
log_sys.flush_order_mutex. We are inserting the blocks in arbitrary
order anyway, to be adjusted in recv_sys.apply(true).
We will change the definition of flush_lock and write_lock to
avoid potential false sharing. Depending on sizeof(log_sys) and
CPU_LEVEL1_DCACHE_LINESIZE, the flush_lock and write_lock could
share a cache line with each other or with the last data members
of log_sys.
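One common way to avoid this kind of false sharing is to give each lock its own cache line. The sketch below is not the definition used in this commit: the types padded_lock and log_locks are hypothetical, std::mutex stands in for the real lock type, and the line size of 64 bytes is an assumption in place of CPU_LEVEL1_DCACHE_LINESIZE.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Assumed cache line size for the illustration.
constexpr std::size_t CACHE_LINE = 64;

// The alignment also rounds sizeof() up to a multiple of the line
// size, so two adjacent padded_lock objects never share a line.
struct alignas(CACHE_LINE) padded_lock
{
  std::mutex m;
};

// Hypothetical container for the two locks.
struct log_locks
{
  padded_lock flush_lock;
  padded_lock write_lock;
};

static_assert(alignof(padded_lock) == CACHE_LINE,
              "each lock starts on a cache line boundary");
static_assert(sizeof(padded_lock) % CACHE_LINE == 0,
              "each lock occupies whole cache lines");
```

The cost is some unused padding per lock, which is negligible for a singleton such as log_sys.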
Thanks to Matthias Leich for providing https://rr-project.org traces
for various failures during the development, and to
Thirunarayanan Balathandayuthapani for his help in debugging
some of the recovery code. And thanks to the developers of the
rr debugger for a tool without which extensive changes to InnoDB
would be very challenging to get right.
Thanks to Vladislav Vaintroub for useful feedback and
to him, Axel Schwenke and Krunal Bauskar for testing the performance.
4 years ago  TSAN: unprotected global variable
WARNING: ThreadSanitizer: data race (pid=1510842)
Write of size 8 at 0x0000067b1e98 by main thread:
#0 os_file_pwrite(IORequest const&, int, unsigned char const*, unsigned long, unsigned long, dberr_t*) /storage/innobase/os/os0file.cc:2928:2 (mariadbd+0x234c5ac)
#1 os_file_write_func(IORequest const&, char const*, int, void const*, unsigned long, unsigned long) /storage/innobase/os/os0file.cc:2963:20 (mariadbd+0x234c019)
#2 file_os_io::write(char const*, unsigned long, st_::span<unsigned char const>) /storage/innobase/log/log0log.cc:320:10 (mariadbd+0x22eaa50)
#3 log_file_t::write(unsigned long, st_::span<unsigned char const>) /storage/innobase/log/log0log.cc:434:18 (mariadbd+0x22eb1d8)
#4 log_t::file::write(unsigned long, st_::span<unsigned char>) /storage/innobase/log/log0log.cc:496:29 (mariadbd+0x22ebb55)
#5 log_write_buf(unsigned char*, unsigned long, unsigned long, unsigned long, unsigned long) /storage/innobase/log/log0log.cc:614:14 (mariadbd+0x22f1b51)
#6 log_write(bool) /storage/innobase/log/log0log.cc:755:2 (mariadbd+0x22ed2ec)
#7 log_write_up_to(unsigned long, bool, bool, completion_callback const*) /storage/innobase/log/log0log.cc:817:5 (mariadbd+0x22eca44)
#8 log_checkpoint_low(unsigned long, unsigned long) /storage/innobase/buf/buf0flu.cc:1734:5 (mariadbd+0x20d37c1)
#9 log_checkpoint() /storage/innobase/buf/buf0flu.cc:1787:10 (mariadbd+0x20cd155)
#10 buf_flush_wait_flushed(unsigned long) /storage/innobase/buf/buf0flu.cc:1867:5 (mariadbd+0x20ccf8f)
#11 log_make_checkpoint() /storage/innobase/buf/buf0flu.cc:1793:3 (mariadbd+0x20cc4c9)
#12 buf_dblwr_t::create() /storage/innobase/buf/buf0dblwr.cc:216:3 (mariadbd+0x209076a)
#13 srv_start(bool) /storage/innobase/srv/srv0start.cc:1685:20 (mariadbd+0x256b4aa)
#14 innodb_init(void*) /storage/innobase/handler/ha_innodb.cc:4188:8 (mariadbd+0x1ed40da)
#15 ha_initialize_handlerton(st_plugin_int*) /sql/handler.cc:659:31 (mariadbd+0xf7c2b6)
#16 plugin_initialize(st_mem_root*, st_plugin_int*, int*, char**, bool) /sql/sql_plugin.cc:1463:9 (mariadbd+0x160fedb)
#17 plugin_init(int*, char**, int) /sql/sql_plugin.cc:1756:15 (mariadbd+0x160f53f)
#18 init_server_components() /sql/mysqld.cc:5043:7 (mariadbd+0xd71462)
#19 mysqld_main(int, char**) /sql/mysqld.cc:5655:7 (mariadbd+0xd6ae87)
#20 main /sql/main.cc:34:10 (mariadbd+0xd661c8)
Previous write of size 8 at 0x0000067b1e98 by thread T3:
#0 os_file_pwrite(IORequest const&, int, unsigned char const*, unsigned long, unsigned long, dberr_t*) /storage/innobase/os/os0file.cc:2928:2 (mariadbd+0x234c5ac)
#1 os_file_write_func(IORequest const&, char const*, int, void const*, unsigned long, unsigned long) /storage/innobase/os/os0file.cc:2963:20 (mariadbd+0x234c019)
#2 file_os_io::write(char const*, unsigned long, st_::span<unsigned char const>) /storage/innobase/log/log0log.cc:320:10 (mariadbd+0x22eaa50)
#3 log_file_t::write(unsigned long, st_::span<unsigned char const>) /storage/innobase/log/log0log.cc:434:18 (mariadbd+0x22eb1d8)
#4 log_t::file::write(unsigned long, st_::span<unsigned char>) /storage/innobase/log/log0log.cc:496:29 (mariadbd+0x22ebb55)
#5 log_write_checkpoint_info(unsigned long) /storage/innobase/log/log0log.cc:911:14 (mariadbd+0x22edd4e)
#6 log_checkpoint_low(unsigned long, unsigned long) /storage/innobase/buf/buf0flu.cc:1755:3 (mariadbd+0x20d3a3d)
#7 buf_flush_sync_for_checkpoint(unsigned long) /storage/innobase/buf/buf0flu.cc:1947:7 (mariadbd+0x20d4163)
#8 buf_flush_page_cleaner() /storage/innobase/buf/buf0flu.cc:2186:9 (mariadbd+0x20cdab1)
#9 void std::__invoke_impl<void, void (*)()>(std::__invoke_other, void (*&&)()) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61:14 (mariadbd+0x20c3aaa)
#10 std::__invoke_result<void (*)()>::type std::__invoke<void (*)()>(void (*&&)()) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14 (mariadbd+0x20c39bd)
#11 void std::thread::_Invoker<std::tuple<void (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:253:13 (mariadbd+0x20c3965)
#12 std::thread::_Invoker<std::tuple<void (*)()> >::operator()() /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:260:11 (mariadbd+0x20c3905)
#13 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)()> > >::_M_run() /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211:13 (mariadbd+0x20c37f9)
#14 <null> <null> (libstdc++.so.6+0xd230f)
Location is global 'os_n_file_writes' of size 8 at 0x0000067b1e98 (mariadbd+0x67b1e98)
Make variable atomic.
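For a pure statistics counter, a fix of this kind usually amounts to something like the following sketch. The names n_file_writes, count_write() and run_writers() are hypothetical and are not the identifiers touched by this commit; relaxed ordering is assumed to be sufficient because the counter does not synchronize any other data.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a global counter such as os_n_file_writes.
static std::atomic<std::size_t> n_file_writes{0};

// Race-free increment; relaxed ordering is enough for a counter.
inline void count_write()
{
  n_file_writes.fetch_add(1, std::memory_order_relaxed);
}

// Starts several threads that each count some writes and returns by
// how much the counter grew once all of them have finished.
inline std::size_t run_writers(unsigned threads, unsigned writes_per_thread)
{
  const std::size_t before = n_file_writes.load(std::memory_order_relaxed);
  std::vector<std::thread> pool;
  for (unsigned i = 0; i < threads; i++)
    pool.emplace_back([writes_per_thread] {
      for (unsigned j = 0; j < writes_per_thread; j++)
        count_write();
    });
  for (auto &t : pool)
    t.join();
  return n_file_writes.load(std::memory_order_relaxed) - before;
}
```

With a plain integer, concurrent increments from two threads could be lost and ThreadSanitizer would report the access; with std::atomic, every increment is counted.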
4 years ago  TSAN: data race on a global counter
WARNING: ThreadSanitizer: data race (pid=1503350)
Write of size 8 at 0x0000067b1f20 by thread T3:
#0 os_file_sync_posix(int) /storage/innobase/os/os0file.cc:895:5 (mariadbd+0x23493f6)
#1 os_file_flush_func(int) /storage/innobase/os/os0file.cc:983:8 (mariadbd+0x2349204)
#2 file_os_io::flush() /storage/innobase/log/log0log.cc:326:10 (mariadbd+0x22eaaa9)
#3 log_file_t::flush() /storage/innobase/log/log0log.cc:440:18 (mariadbd+0x22eb2d0)
#4 log_t::file::flush() /storage/innobase/log/log0log.cc:507:29 (mariadbd+0x22ebe69)
#5 log_write_flush_to_disk_low(unsigned long) /storage/innobase/log/log0log.cc:629:17 (mariadbd+0x22ed3f3)
#6 log_write_up_to(unsigned long, bool, bool, completion_callback const*) /storage/innobase/log/log0log.cc:829:3 (mariadbd+0x22ecb04)
#7 log_checkpoint_low(unsigned long, unsigned long) /storage/innobase/buf/buf0flu.cc:1734:5 (mariadbd+0x20d37f1)
#8 buf_flush_sync_for_checkpoint(unsigned long) /storage/innobase/buf/buf0flu.cc:1947:7 (mariadbd+0x20d4193)
#9 buf_flush_page_cleaner() /storage/innobase/buf/buf0flu.cc:2186:9 (mariadbd+0x20cdad7)
#10 void std::__invoke_impl<void, void (*)()>(std::__invoke_other, void (*&&)()) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61:14 (mariadbd+0x20c3aaa)
#11 std::__invoke_result<void (*)()>::type std::__invoke<void (*)()>(void (*&&)()) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14 (mariadbd+0x20c39bd)
#12 void std::thread::_Invoker<std::tuple<void (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:253:13 (mariadbd+0x20c3965)
#13 std::thread::_Invoker<std::tuple<void (*)()> >::operator()() /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:260:11 (mariadbd+0x20c3905)
#14 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)()> > >::_M_run() /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211:13 (mariadbd+0x20c37f9)
#15 <null> <null> (libstdc++.so.6+0xd230f)
Previous write of size 8 at 0x0000067b1f20 by main thread:
#0 os_file_sync_posix(int) /storage/innobase/os/os0file.cc:895:5 (mariadbd+0x23493f6)
#1 os_file_flush_func(int) /storage/innobase/os/os0file.cc:983:8 (mariadbd+0x2349204)
#2 fil_space_t::flush_low() /storage/innobase/fil/fil0fil.cc:504:5 (mariadbd+0x205cad5)
#3 fil_flush_file_spaces() /storage/innobase/fil/fil0fil.cc:2947:13 (mariadbd+0x206523f)
#4 log_checkpoint() /storage/innobase/buf/buf0flu.cc:1777:5 (mariadbd+0x20cd069)
#5 buf_flush_wait_flushed(unsigned long) /storage/innobase/buf/buf0flu.cc:1867:5 (mariadbd+0x20ccf95)
#6 log_make_checkpoint() /storage/innobase/buf/buf0flu.cc:1793:3 (mariadbd+0x20cc4c9)
#7 buf_dblwr_t::create() /storage/innobase/buf/buf0dblwr.cc:216:3 (mariadbd+0x209076a)
#8 srv_start(bool) /storage/innobase/srv/srv0start.cc:1685:20 (mariadbd+0x256b514)
#9 innodb_init(void*) /storage/innobase/handler/ha_innodb.cc:4188:8 (mariadbd+0x1ed406a)
#10 ha_initialize_handlerton(st_plugin_int*) /sql/handler.cc:659:31 (mariadbd+0xf7c246)
#11 plugin_initialize(st_mem_root*, st_plugin_int*, int*, char**, bool) /sql/sql_plugin.cc:1463:9 (mariadbd+0x160fe6b)
#12 plugin_init(int*, char**, int) /sql/sql_plugin.cc:1756:15 (mariadbd+0x160f4cf)
#13 init_server_components() /sql/mysqld.cc:5043:7 (mariadbd+0xd713f2)
#14 mysqld_main(int, char**) /sql/mysqld.cc:5655:7 (mariadbd+0xd6ae17)
#15 main /sql/main.cc:34:10 (mariadbd+0xd66158)
This is a correct report by TSAN for an obvious case: an unprotected
global counter. Fix it by making the counter std::atomic.
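A minimal sketch of this kind of fix, not the actual patch. The names n_file_writes, count_file_write() and run_writers() are hypothetical stand-ins for a global statistics counter such as os_n_file_writes and its callers. Relaxed memory ordering is an assumption here, chosen because the value is only a statistic.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for a global I/O counter such as os_n_file_writes.
// std::atomic makes concurrent increments well defined. Relaxed ordering
// suffices in this sketch because the counter does not order other data.
static std::atomic<std::uint64_t> n_file_writes{0};

// Called by any thread that has completed a write.
inline void count_file_write()
{
  n_file_writes.fetch_add(1, std::memory_order_relaxed);
}

// Start n_threads threads that each count per_thread writes,
// wait for them, and return the final counter value.
std::uint64_t run_writers(unsigned n_threads, unsigned per_thread)
{
  n_file_writes.store(0, std::memory_order_relaxed);
  std::vector<std::thread> threads;
  for (unsigned i = 0; i < n_threads; i++)
    threads.emplace_back([per_thread] {
      for (unsigned j = 0; j < per_thread; j++)
        count_file_write();
    });
  for (auto &t : threads)
    t.join();
  return n_file_writes.load(std::memory_order_relaxed);
}
```

With a plain integer instead of std::atomic, the concurrent increments would be a data race and updates could be lost. With the atomic counter, every increment is counted.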
4 years ago  MDEV-17380 innodb_flush_neighbors=ON should be ignored on SSD
For tablespaces that do not reside on spinning storage, it does
not make sense to attempt to write nearby pages when writing out
dirty pages from the InnoDB buffer pool. It is actually detrimental
to performance and to the life span of flash ROM storage.
With this change, MariaDB will detect whether an InnoDB file resides
on solid-state storage. The detection has been implemented for Linux
and Microsoft Windows. For other systems, we will err on the safe side
and assume that files reside on SSD.
As part of this change, we will reduce the number of fstat() calls
when opening data files on POSIX systems and slightly clean up some
file I/O code.
FIXME: os_is_sparse_file_supported() on POSIX works in a destructive
manner. Thus, we can only invoke it when creating files, not when
opening them.
For diagnostics, we introduce the column ON_SSD to the table
INFORMATION_SCHEMA.INNODB_TABLESPACES_SCRUBBING. The table
INNODB_SYS_TABLESPACES might seem more appropriate, but its purpose
is to reflect the contents of the InnoDB system table SYS_TABLESPACES,
which we would like to remove at some point.
On Microsoft Windows, querying StorageDeviceSeekPenaltyProperty
sometimes returns ERROR_GEN_FAILURE instead of ERROR_INVALID_FUNCTION
or ERROR_NOT_SUPPORTED. We will also silently ignore this error,
and assume that the file does not reside on SSD.
On Linux, the detection will be based on the files
/sys/block/*/queue/rotational and /sys/block/*/dev.
Especially for USB storage, it is possible that
/sys/block/*/queue/rotational will wrongly report 1 instead of 0.
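The Linux check described above can be sketched roughly as follows. This is an illustration, not the code of this commit. The function names are hypothetical, and treating an unreadable or unexpected file as "not SSD" is an assumption of the sketch.

```cpp
#include <fstream>
#include <string>

// Interpret the contents of a queue/rotational file. "0" (optionally
// followed by a newline) means that the kernel reports the device as
// non-rotational. Anything else is treated as "not SSD" in this sketch.
bool rotational_text_means_ssd(const std::string &text)
{
  return !text.empty() && text[0] == '0' &&
    (text.size() == 1 || text[1] == '\n');
}

// Read /sys/block/<name>/queue/rotational for a device name such as "sda".
// Returns false if the file cannot be read (an assumption of this sketch).
bool block_device_is_ssd(const std::string &name)
{
  std::ifstream f("/sys/block/" + name + "/queue/rotational");
  std::string line;
  if (!std::getline(f, line))
    return false;
  return rotational_text_means_ssd(line);
}
```

As noted above, the kernel may report 1 for some solid-state devices, so a false result does not prove that the device is a spinning disk.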
fil_node_t::on_ssd: Whether the InnoDB data file resides on
solid-state storage.
fil_system_t::ssd: Collection of Linux block devices that reside on
non-rotational storage.
fil_system_t::create(): Detect ssd on Linux based on the contents
of /sys/block/*/queue/rotational and /sys/block/*/dev.
fil_system_t::is_ssd(dev_t): Determine if a Linux block device is
non-rotational. Partitions will be identified with the containing
block device by assuming that the least significant 4 bits of the
minor number identify a partition, and that the "partition number"
of the entire device is 0.
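The partition rule stated for fil_system_t::is_ssd(dev_t) can be illustrated with a small sketch. The type ssd_registry and the function whole_device() are hypothetical and do not come from the commit. The sketch assumes Linux with glibc, where major(), minor() and makedev() are provided by <sys/sysmacros.h>.

```cpp
#include <sys/types.h>
#include <sys/sysmacros.h> // major(), minor(), makedev() on Linux/glibc
#include <set>

// Map a device number to its containing whole device by clearing the
// least significant 4 bits of the minor number, so that the
// "partition number" becomes 0.
dev_t whole_device(dev_t dev)
{
  return makedev(major(dev), minor(dev) & ~15U);
}

// Hypothetical collection of non-rotational block devices.
struct ssd_registry
{
  std::set<dev_t> ssd;

  // Register a device that was found to be non-rotational.
  void add(dev_t dev) { ssd.insert(whole_device(dev)); }

  // Check a device or partition against the registered whole devices.
  bool is_ssd(dev_t dev) const
  {
    return ssd.count(whole_device(dev)) != 0;
  }
};
```

For example, if device 8:0 is registered, then 8:1 through 8:15 are treated as partitions of it, while 8:16 is a different device. The 4-bit rule is the assumption stated above and need not hold for every Linux block driver.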
7 years ago  MDEV-17380 innodb_flush_neighbors=ON should be ignored on SSD
For tablespaces that do not reside on spinning storage, it does
not make sense to attempt to write nearby pages when writing out
dirty pages from the InnoDB buffer pool. It is actually detrimental
to performance and to the life span of flash ROM storage.
With this change, MariaDB will detect whether an InnoDB file resides
on solid-state storage. The detection has been implemented for Linux
and Microsoft Windows. For other systems, we will err on the safe side
and assume that files reside on SSD.
As part of this change, we will reduce the number of fstat() calls
when opening data files on POSIX systems and slightly clean up some
file I/O code.
FIXME: os_is_sparse_file_supported() on POSIX works in a destructive
manner. Thus, we can only invoke it when creating files, not when
opening them.
For diagnostics, we introduce the column ON_SSD to the table
INFORMATION_SCHEMA.INNODB_TABLESPACES_SCRUBBING. The table
INNODB_SYS_TABLESPACES might seem more appropriate, but its purpose
is to reflect the contents of the InnoDB system table SYS_TABLESPACES,
which we would like to remove at some point.
On Microsoft Windows, querying StorageDeviceSeekPenaltyProperty
sometimes returns ERROR_GEN_FAILURE instead of ERROR_INVALID_FUNCTION
or ERROR_NOT_SUPPORTED. We will silently ignore also this error,
and assume that the file does not reside on SSD.
On Linux, the detection will be based on the files
/sys/block/*/queue/rotational and /sys/block/*/dev.
Especially for USB storage, it is possible that
/sys/block/*/queue/rotational will wrongly report 1 instead of 0.
fil_node_t::on_ssd: Whether the InnoDB data file resides on
solid-state storage.
fil_system_t::ssd: Collection of Linux block devices that reside on
non-rotational storage.
fil_system_t::create(): Detect ssd on Linux based on the contents
of /sys/block/*/queue/rotational and /sys/block/*/dev.
fil_system_t::is_ssd(dev_t): Determine if a Linux block device is
non-rotational. Partitions will be identified with the containing
block device by assuming that the least significant 4 bits of the
minor number identify a partition, and that the "partition number"
of the entire device is 0.
7 years ago  MDEV-17380 innodb_flush_neighbors=ON should be ignored on SSD
For tablespaces that do not reside on spinning storage, it does
not make sense to attempt to write nearby pages when writing out
dirty pages from the InnoDB buffer pool. It is actually detrimental
to performance and to the life span of flash ROM storage.
With this change, MariaDB will detect whether an InnoDB file resides
on solid-state storage. The detection has been implemented for Linux
and Microsoft Windows. For other systems, we will err on the safe side
and assume that files reside on SSD.
As part of this change, we will reduce the number of fstat() calls
when opening data files on POSIX systems and slightly clean up some
file I/O code.
FIXME: os_is_sparse_file_supported() on POSIX works in a destructive
manner. Thus, we can only invoke it when creating files, not when
opening them.
For diagnostics, we introduce the column ON_SSD to the table
INFORMATION_SCHEMA.INNODB_TABLESPACES_SCRUBBING. The table
INNODB_SYS_TABLESPACES might seem more appropriate, but its purpose
is to reflect the contents of the InnoDB system table SYS_TABLESPACES,
which we would like to remove at some point.
On Microsoft Windows, querying StorageDeviceSeekPenaltyProperty
sometimes returns ERROR_GEN_FAILURE instead of ERROR_INVALID_FUNCTION
or ERROR_NOT_SUPPORTED. We will silently ignore also this error,
and assume that the file does not reside on SSD.
On Linux, the detection will be based on the files
/sys/block/*/queue/rotational and /sys/block/*/dev.
Especially for USB storage, it is possible that
/sys/block/*/queue/rotational will wrongly report 1 instead of 0.
fil_node_t::on_ssd: Whether the InnoDB data file resides on
solid-state storage.
fil_system_t::ssd: Collection of Linux block devices that reside on
non-rotational storage.
fil_system_t::create(): Detect ssd on Linux based on the contents
of /sys/block/*/queue/rotational and /sys/block/*/dev.
fil_system_t::is_ssd(dev_t): Determine if a Linux block device is
non-rotational. Partitions will be identified with the containing
block device by assuming that the least significant 4 bits of the
minor number identify a partition, and that the "partition number"
of the entire device is 0.
7 years ago  MDEV-17380 innodb_flush_neighbors=ON should be ignored on SSD
For tablespaces that do not reside on spinning storage, it does
not make sense to attempt to write nearby pages when writing out
dirty pages from the InnoDB buffer pool. It is actually detrimental
to performance and to the life span of flash ROM storage.
With this change, MariaDB will detect whether an InnoDB file resides
on solid-state storage. The detection has been implemented for Linux
and Microsoft Windows. For other systems, we will err on the safe side
and assume that files reside on SSD.
As part of this change, we will reduce the number of fstat() calls
when opening data files on POSIX systems and slightly clean up some
file I/O code.
FIXME: os_is_sparse_file_supported() on POSIX works in a destructive
manner. Thus, we can only invoke it when creating files, not when
opening them.
For diagnostics, we introduce the column ON_SSD to the table
INFORMATION_SCHEMA.INNODB_TABLESPACES_SCRUBBING. The table
INNODB_SYS_TABLESPACES might seem more appropriate, but its purpose
is to reflect the contents of the InnoDB system table SYS_TABLESPACES,
which we would like to remove at some point.
On Microsoft Windows, querying StorageDeviceSeekPenaltyProperty
sometimes returns ERROR_GEN_FAILURE instead of ERROR_INVALID_FUNCTION
or ERROR_NOT_SUPPORTED. We will silently ignore also this error,
and assume that the file does not reside on SSD.
On Linux, the detection will be based on the files
/sys/block/*/queue/rotational and /sys/block/*/dev.
Especially for USB storage, it is possible that
/sys/block/*/queue/rotational will wrongly report 1 instead of 0.
fil_node_t::on_ssd: Whether the InnoDB data file resides on
solid-state storage.
fil_system_t::ssd: Collection of Linux block devices that reside on
non-rotational storage.
fil_system_t::create(): Detect ssd on Linux based on the contents
of /sys/block/*/queue/rotational and /sys/block/*/dev.
fil_system_t::is_ssd(dev_t): Determine if a Linux block device is
non-rotational. Partitions will be identified with the containing
block device by assuming that the least significant 4 bits of the
minor number identify a partition, and that the "partition number"
of the entire device is 0.
7 years ago  MDEV-17380 innodb_flush_neighbors=ON should be ignored on SSD
For tablespaces that do not reside on spinning storage, it does
not make sense to attempt to write nearby pages when writing out
dirty pages from the InnoDB buffer pool. It is actually detrimental
to performance and to the life span of flash ROM storage.
With this change, MariaDB will detect whether an InnoDB file resides
on solid-state storage. The detection has been implemented for Linux
and Microsoft Windows. For other systems, we will err on the safe side
and assume that files reside on SSD.
As part of this change, we will reduce the number of fstat() calls
when opening data files on POSIX systems and slightly clean up some
file I/O code.
FIXME: os_is_sparse_file_supported() on POSIX works in a destructive
manner. Thus, we can only invoke it when creating files, not when
opening them.
For diagnostics, we introduce the column ON_SSD to the table
INFORMATION_SCHEMA.INNODB_TABLESPACES_SCRUBBING. The table
INNODB_SYS_TABLESPACES might seem more appropriate, but its purpose
is to reflect the contents of the InnoDB system table SYS_TABLESPACES,
which we would like to remove at some point.
On Microsoft Windows, querying StorageDeviceSeekPenaltyProperty
sometimes returns ERROR_GEN_FAILURE instead of ERROR_INVALID_FUNCTION
or ERROR_NOT_SUPPORTED. We will silently ignore also this error,
and assume that the file does not reside on SSD.
On Linux, the detection will be based on the files
/sys/block/*/queue/rotational and /sys/block/*/dev.
Especially for USB storage, it is possible that
/sys/block/*/queue/rotational will wrongly report 1 instead of 0.
fil_node_t::on_ssd: Whether the InnoDB data file resides on
solid-state storage.
fil_system_t::ssd: Collection of Linux block devices that reside on
non-rotational storage.
fil_system_t::create(): Detect ssd on Linux based on the contents
of /sys/block/*/queue/rotational and /sys/block/*/dev.
fil_system_t::is_ssd(dev_t): Determine if a Linux block device is
non-rotational. Partitions will be identified with the containing
block device by assuming that the least significant 4 bits of the
minor number identify a partition, and that the "partition number"
of the entire device is 0.
7 years ago  MDEV-17380 innodb_flush_neighbors=ON should be ignored on SSD
For tablespaces that do not reside on spinning storage, it does
not make sense to attempt to write nearby pages when writing out
dirty pages from the InnoDB buffer pool. It is actually detrimental
to performance and to the life span of flash ROM storage.
With this change, MariaDB will detect whether an InnoDB file resides
on solid-state storage. The detection has been implemented for Linux
and Microsoft Windows. For other systems, we will err on the safe side
and assume that files reside on SSD.
As part of this change, we will reduce the number of fstat() calls
when opening data files on POSIX systems and slightly clean up some
file I/O code.
FIXME: os_is_sparse_file_supported() on POSIX works in a destructive
manner. Thus, we can only invoke it when creating files, not when
opening them.
For diagnostics, we introduce the column ON_SSD to the table
INFORMATION_SCHEMA.INNODB_TABLESPACES_SCRUBBING. The table
INNODB_SYS_TABLESPACES might seem more appropriate, but its purpose
is to reflect the contents of the InnoDB system table SYS_TABLESPACES,
which we would like to remove at some point.
On Microsoft Windows, querying StorageDeviceSeekPenaltyProperty
sometimes returns ERROR_GEN_FAILURE instead of ERROR_INVALID_FUNCTION
or ERROR_NOT_SUPPORTED. We will silently ignore also this error,
and assume that the file does not reside on SSD.
On Linux, the detection will be based on the files
/sys/block/*/queue/rotational and /sys/block/*/dev.
Especially for USB storage, it is possible that
/sys/block/*/queue/rotational will wrongly report 1 instead of 0.
fil_node_t::on_ssd: Whether the InnoDB data file resides on
solid-state storage.
fil_system_t::ssd: Collection of Linux block devices that reside on
non-rotational storage.
fil_system_t::create(): Detect ssd on Linux based on the contents
of /sys/block/*/queue/rotational and /sys/block/*/dev.
fil_system_t::is_ssd(dev_t): Determine if a Linux block device is
non-rotational. Partitions will be identified with the containing
block device by assuming that the least significant 4 bits of the
minor number identify a partition, and that the "partition number"
of the entire device is 0.
7 years ago  MDEV-17380 innodb_flush_neighbors=ON should be ignored on SSD
For tablespaces that do not reside on spinning storage, it does
not make sense to attempt to write nearby pages when writing out
dirty pages from the InnoDB buffer pool. It is actually detrimental
to performance and to the life span of flash ROM storage.
With this change, MariaDB will detect whether an InnoDB file resides
on solid-state storage. The detection has been implemented for Linux
and Microsoft Windows. For other systems, we will err on the safe side
and assume that files reside on SSD.
As part of this change, we will reduce the number of fstat() calls
when opening data files on POSIX systems and slightly clean up some
file I/O code.
FIXME: os_is_sparse_file_supported() on POSIX works in a destructive
manner. Thus, we can only invoke it when creating files, not when
opening them.
For diagnostics, we introduce the column ON_SSD to the table
INFORMATION_SCHEMA.INNODB_TABLESPACES_SCRUBBING. The table
INNODB_SYS_TABLESPACES might seem more appropriate, but its purpose
is to reflect the contents of the InnoDB system table SYS_TABLESPACES,
which we would like to remove at some point.
On Microsoft Windows, querying StorageDeviceSeekPenaltyProperty
sometimes returns ERROR_GEN_FAILURE instead of ERROR_INVALID_FUNCTION
or ERROR_NOT_SUPPORTED. We will silently ignore also this error,
and assume that the file does not reside on SSD.
On Linux, the detection will be based on the files
/sys/block/*/queue/rotational and /sys/block/*/dev.
Especially for USB storage, it is possible that
/sys/block/*/queue/rotational will wrongly report 1 instead of 0.
fil_node_t::on_ssd: Whether the InnoDB data file resides on
solid-state storage.
fil_system_t::ssd: Collection of Linux block devices that reside on
non-rotational storage.
fil_system_t::create(): Detect ssd on Linux based on the contents
of /sys/block/*/queue/rotational and /sys/block/*/dev.
fil_system_t::is_ssd(dev_t): Determine if a Linux block device is
non-rotational. Partitions will be identified with the containing
block device by assuming that the least significant 4 bits of the
minor number identify a partition, and that the "partition number"
of the entire device is 0.
7 years ago  MDEV-17380 innodb_flush_neighbors=ON should be ignored on SSD
For tablespaces that do not reside on spinning storage, it does
not make sense to attempt to write nearby pages when writing out
dirty pages from the InnoDB buffer pool. It is actually detrimental
to performance and to the life span of flash ROM storage.
With this change, MariaDB will detect whether an InnoDB file resides
on solid-state storage. The detection has been implemented for Linux
and Microsoft Windows. For other systems, we will err on the safe side
and assume that files reside on SSD.
As part of this change, we will reduce the number of fstat() calls
when opening data files on POSIX systems and slightly clean up some
file I/O code.
FIXME: os_is_sparse_file_supported() on POSIX works in a destructive
manner. Thus, we can only invoke it when creating files, not when
opening them.
For diagnostics, we introduce the column ON_SSD to the table
INFORMATION_SCHEMA.INNODB_TABLESPACES_SCRUBBING. The table
INNODB_SYS_TABLESPACES might seem more appropriate, but its purpose
is to reflect the contents of the InnoDB system table SYS_TABLESPACES,
which we would like to remove at some point.
On Microsoft Windows, querying StorageDeviceSeekPenaltyProperty
sometimes returns ERROR_GEN_FAILURE instead of ERROR_INVALID_FUNCTION
or ERROR_NOT_SUPPORTED. We will silently ignore also this error,
and assume that the file does not reside on SSD.
On Linux, the detection will be based on the files
/sys/block/*/queue/rotational and /sys/block/*/dev.
Especially for USB storage, it is possible that
/sys/block/*/queue/rotational will wrongly report 1 instead of 0.
fil_node_t::on_ssd: Whether the InnoDB data file resides on
solid-state storage.
fil_system_t::ssd: Collection of Linux block devices that reside on
non-rotational storage.
fil_system_t::create(): Detect ssd on Linux based on the contents
of /sys/block/*/queue/rotational and /sys/block/*/dev.
fil_system_t::is_ssd(dev_t): Determine if a Linux block device is
non-rotational. Partitions will be identified with the containing
block device by assuming that the least significant 4 bits of the
minor number identify a partition, and that the "partition number"
of the entire device is 0.
7 years ago  MDEV-17380 innodb_flush_neighbors=ON should be ignored on SSD
For tablespaces that do not reside on spinning storage, it does
not make sense to attempt to write nearby pages when writing out
dirty pages from the InnoDB buffer pool. It is actually detrimental
to performance and to the life span of flash ROM storage.
With this change, MariaDB will detect whether an InnoDB file resides
on solid-state storage. The detection has been implemented for Linux
and Microsoft Windows. For other systems, we will err on the safe side
and assume that files reside on SSD.
As part of this change, we will reduce the number of fstat() calls
when opening data files on POSIX systems and slightly clean up some
file I/O code.
FIXME: os_is_sparse_file_supported() on POSIX works in a destructive
manner. Thus, we can only invoke it when creating files, not when
opening them.
For diagnostics, we introduce the column ON_SSD to the table
INFORMATION_SCHEMA.INNODB_TABLESPACES_SCRUBBING. The table
INNODB_SYS_TABLESPACES might seem more appropriate, but its purpose
is to reflect the contents of the InnoDB system table SYS_TABLESPACES,
which we would like to remove at some point.
On Microsoft Windows, querying StorageDeviceSeekPenaltyProperty
sometimes returns ERROR_GEN_FAILURE instead of ERROR_INVALID_FUNCTION
or ERROR_NOT_SUPPORTED. We will silently ignore also this error,
and assume that the file does not reside on SSD.
On Linux, the detection will be based on the files
/sys/block/*/queue/rotational and /sys/block/*/dev.
Especially for USB storage, it is possible that
/sys/block/*/queue/rotational will wrongly report 1 instead of 0.
fil_node_t::on_ssd: Whether the InnoDB data file resides on
solid-state storage.
fil_system_t::ssd: Collection of Linux block devices that reside on
non-rotational storage.
fil_system_t::create(): Detect ssd on Linux based on the contents
of /sys/block/*/queue/rotational and /sys/block/*/dev.
fil_system_t::is_ssd(dev_t): Determine if a Linux block device is
non-rotational. Partitions will be identified with the containing
block device by assuming that the least significant 4 bits of the
minor number identify a partition, and that the "partition number"
of the entire device is 0.
7 years ago  MDEV-17380 innodb_flush_neighbors=ON should be ignored on SSD
For tablespaces that do not reside on spinning storage, it does
not make sense to attempt to write nearby pages when writing out
dirty pages from the InnoDB buffer pool. It is actually detrimental
to performance and to the life span of flash ROM storage.
With this change, MariaDB will detect whether an InnoDB file resides
on solid-state storage. The detection has been implemented for Linux
and Microsoft Windows. For other systems, we will err on the safe side
and assume that files reside on SSD.
As part of this change, we will reduce the number of fstat() calls
when opening data files on POSIX systems and slightly clean up some
file I/O code.
FIXME: os_is_sparse_file_supported() on POSIX works in a destructive
manner. Thus, we can only invoke it when creating files, not when
opening them.
For diagnostics, we introduce the column ON_SSD to the table
INFORMATION_SCHEMA.INNODB_TABLESPACES_SCRUBBING. The table
INNODB_SYS_TABLESPACES might seem more appropriate, but its purpose
is to reflect the contents of the InnoDB system table SYS_TABLESPACES,
which we would like to remove at some point.
On Microsoft Windows, querying StorageDeviceSeekPenaltyProperty
sometimes returns ERROR_GEN_FAILURE instead of ERROR_INVALID_FUNCTION
or ERROR_NOT_SUPPORTED. We will silently ignore also this error,
and assume that the file does not reside on SSD.
On Linux, the detection will be based on the files
/sys/block/*/queue/rotational and /sys/block/*/dev.
Especially for USB storage, it is possible that
/sys/block/*/queue/rotational will wrongly report 1 instead of 0.
fil_node_t::on_ssd: Whether the InnoDB data file resides on
solid-state storage.
fil_system_t::ssd: Collection of Linux block devices that reside on
non-rotational storage.
fil_system_t::create(): Detect ssd on Linux based on the contents
of /sys/block/*/queue/rotational and /sys/block/*/dev.
fil_system_t::is_ssd(dev_t): Determine if a Linux block device is
non-rotational. Partitions will be identified with the containing
block device by assuming that the least significant 4 bits of the
minor number identify a partition, and that the "partition number"
of the entire device is 0.
7 years ago  MDEV-17380 innodb_flush_neighbors=ON should be ignored on SSD
For tablespaces that do not reside on spinning storage, it does
not make sense to attempt to write nearby pages when writing out
dirty pages from the InnoDB buffer pool. It is actually detrimental
to performance and to the life span of flash ROM storage.
With this change, MariaDB will detect whether an InnoDB file resides
on solid-state storage. The detection has been implemented for Linux
and Microsoft Windows. For other systems, we will err on the safe side
and assume that files reside on SSD.
As part of this change, we will reduce the number of fstat() calls
when opening data files on POSIX systems and slightly clean up some
file I/O code.
FIXME: os_is_sparse_file_supported() on POSIX works in a destructive
manner. Thus, we can only invoke it when creating files, not when
opening them.
For diagnostics, we introduce the column ON_SSD to the table
INFORMATION_SCHEMA.INNODB_TABLESPACES_SCRUBBING. The table
INNODB_SYS_TABLESPACES might seem more appropriate, but its purpose
is to reflect the contents of the InnoDB system table SYS_TABLESPACES,
which we would like to remove at some point.
On Microsoft Windows, querying StorageDeviceSeekPenaltyProperty
sometimes returns ERROR_GEN_FAILURE instead of ERROR_INVALID_FUNCTION
or ERROR_NOT_SUPPORTED. We will silently ignore also this error,
and assume that the file does not reside on SSD.
On Linux, the detection will be based on the files
/sys/block/*/queue/rotational and /sys/block/*/dev.
Especially for USB storage, it is possible that
/sys/block/*/queue/rotational will wrongly report 1 instead of 0.
fil_node_t::on_ssd: Whether the InnoDB data file resides on
solid-state storage.
fil_system_t::ssd: Collection of Linux block devices that reside on
non-rotational storage.
fil_system_t::create(): Detect ssd on Linux based on the contents
of /sys/block/*/queue/rotational and /sys/block/*/dev.
fil_system_t::is_ssd(dev_t): Determine if a Linux block device is
non-rotational. Partitions will be identified with the containing
block device by assuming that the least significant 4 bits of the
minor number identify a partition, and that the "partition number"
of the entire device is 0.
7 years ago  MDEV-17380 innodb_flush_neighbors=ON should be ignored on SSD
For tablespaces that do not reside on spinning storage, it does
not make sense to attempt to write nearby pages when writing out
dirty pages from the InnoDB buffer pool. It is actually detrimental
to performance and to the life span of flash ROM storage.
With this change, MariaDB will detect whether an InnoDB file resides
on solid-state storage. The detection has been implemented for Linux
and Microsoft Windows. For other systems, we will err on the safe side
and assume that files reside on SSD.
As part of this change, we will reduce the number of fstat() calls
when opening data files on POSIX systems and slightly clean up some
file I/O code.
FIXME: os_is_sparse_file_supported() on POSIX works in a destructive
manner. Thus, we can only invoke it when creating files, not when
opening them.
For diagnostics, we introduce the column ON_SSD to the table
INFORMATION_SCHEMA.INNODB_TABLESPACES_SCRUBBING. The table
INNODB_SYS_TABLESPACES might seem more appropriate, but its purpose
is to reflect the contents of the InnoDB system table SYS_TABLESPACES,
which we would like to remove at some point.
On Microsoft Windows, querying StorageDeviceSeekPenaltyProperty
sometimes returns ERROR_GEN_FAILURE instead of ERROR_INVALID_FUNCTION
or ERROR_NOT_SUPPORTED. We will silently ignore also this error,
and assume that the file does not reside on SSD.
On Linux, the detection will be based on the files
/sys/block/*/queue/rotational and /sys/block/*/dev.
Especially for USB storage, it is possible that
/sys/block/*/queue/rotational will wrongly report 1 instead of 0.
fil_node_t::on_ssd: Whether the InnoDB data file resides on
solid-state storage.
fil_system_t::ssd: Collection of Linux block devices that reside on
non-rotational storage.
fil_system_t::create(): Detect ssd on Linux based on the contents
of /sys/block/*/queue/rotational and /sys/block/*/dev.
fil_system_t::is_ssd(dev_t): Determine if a Linux block device is
non-rotational. Partitions will be identified with the containing
block device by assuming that the least significant 4 bits of the
minor number identify a partition, and that the "partition number"
of the entire device is 0.
7 years ago  MDEV-17380 innodb_flush_neighbors=ON should be ignored on SSD
For tablespaces that do not reside on spinning storage, it does
not make sense to attempt to write nearby pages when writing out
dirty pages from the InnoDB buffer pool. It is actually detrimental
to performance and to the life span of flash ROM storage.
With this change, MariaDB will detect whether an InnoDB file resides
on solid-state storage. The detection has been implemented for Linux
and Microsoft Windows. For other systems, we will err on the safe side
and assume that files reside on SSD.
As part of this change, we will reduce the number of fstat() calls
when opening data files on POSIX systems and slightly clean up some
file I/O code.
FIXME: os_is_sparse_file_supported() on POSIX works in a destructive
manner. Thus, we can only invoke it when creating files, not when
opening them.
For diagnostics, we introduce the column ON_SSD to the table
INFORMATION_SCHEMA.INNODB_TABLESPACES_SCRUBBING. The table
INNODB_SYS_TABLESPACES might seem more appropriate, but its purpose
is to reflect the contents of the InnoDB system table SYS_TABLESPACES,
which we would like to remove at some point.
On Microsoft Windows, querying StorageDeviceSeekPenaltyProperty
sometimes returns ERROR_GEN_FAILURE instead of ERROR_INVALID_FUNCTION
or ERROR_NOT_SUPPORTED. We will also silently ignore this error
and assume that the file does not reside on SSD.
On Linux, the detection will be based on the files
/sys/block/*/queue/rotational and /sys/block/*/dev.
Especially for USB storage, it is possible that
/sys/block/*/queue/rotational will wrongly report 1 instead of 0.
fil_node_t::on_ssd: Whether the InnoDB data file resides on
solid-state storage.
fil_system_t::ssd: Collection of Linux block devices that reside on
non-rotational storage.
fil_system_t::create(): Detect ssd on Linux based on the contents
of /sys/block/*/queue/rotational and /sys/block/*/dev.
fil_system_t::is_ssd(dev_t): Determine if a Linux block device is
non-rotational. Partitions will be identified with the containing
block device by assuming that the least significant 4 bits of the
minor number identify a partition, and that the "partition number"
of the entire device is 0.
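The partition rule described for fil_system_t::is_ssd(dev_t) can be sketched as follows. This is a minimal illustration, not the MariaDB implementation: the type ssd_registry and the helpers whole_device() and parse_dev() are hypothetical names introduced here, and the sketch simply assumes, as the commit message does, that the least significant 4 bits of the minor number hold the partition number and that the whole device has partition number 0.

```cpp
#include <sys/types.h>
#include <sys/sysmacros.h>  // makedev(), major(), minor() on Linux
#include <cassert>
#include <cstdio>
#include <set>
#include <string>

// Hypothetical helper: parse the "major:minor" text that Linux exposes
// in /sys/block/*/dev. Returns false if the text is malformed.
static bool parse_dev(const std::string &text, dev_t &out)
{
  unsigned maj, min;
  if (std::sscanf(text.c_str(), "%u:%u", &maj, &min) != 2)
    return false;
  out = makedev(maj, min);
  return true;
}

// Hypothetical stand-in for the collection of non-rotational devices.
struct ssd_registry
{
  std::set<dev_t> ssd;

  // Assumption: the low 4 bits of the minor number identify a partition,
  // so clearing them yields the containing block device.
  static dev_t whole_device(dev_t dev)
  {
    return makedev(major(dev), minor(dev) & ~15U);
  }

  // Register a whole device whose queue/rotational file contained 0.
  void add(dev_t dev) { ssd.insert(dev); }

  // A file is considered to be on SSD if the device that contains its
  // partition was registered as non-rotational.
  bool is_ssd(dev_t dev) const
  {
    return ssd.count(whole_device(dev)) != 0;
  }
};
```

In such a sketch, the dev_t passed to is_ssd() would typically be the st_dev field reported by fstat() for the data file. Devices that do not follow the 16-minors-per-disk convention would not be matched by this masking.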
7 years ago  MDEV-32939 If tables are frequently created, renamed, dropped, a backup cannot be restored
During mariadb-backup --backup, a table could be renamed, created and
dropped. We could have both oldname.ibd and oldname.new, and one of
the files would be deleted before the InnoDB recovery starts. The desired
end result would be that we will recover both oldname.ibd and newname.ibd.
During normal crash recovery, at most one file operation (create, rename,
delete) may need to be replayed from the write-ahead log before the
DDL recovery starts.
deferred_spaces.create(): In mariadb-backup --prepare, try to create the
file in case it does not exist.
fil_name_process(): Display a message about files that were not found,
not only if innodb_force_recovery is set, but also in mariadb-backup --prepare.
If we are processing a FILE_RENAME for a tablespace whose recovery is
deferred, suppress the message and adjust the file name in case
fil_ibd_load() returns FIL_LOAD_NOT_FOUND or FIL_LOAD_DEFER.
fil_ibd_load(): Remove a redundant file name comparison.
The caller has already checked that the file names differ.
We used to wrongly return FIL_LOAD_OK instead of FIL_LOAD_ID_CHANGED
if only the schema name differed, such as a/t1.ibd and b/t1.ibd.
Tested by: Matthias Leich
Reviewed by: Thirunarayanan Balathandayuthapani
/***********************************************************************
Copyright (c) 1995, 2019, Oracle and/or its affiliates. All Rights Reserved.
Copyright (c) 2009, Percona Inc.
Copyright (c) 2013, 2022, MariaDB Corporation.
Portions of this file contain modifications contributed and copyrighted by Percona Inc.. Those modifications are gratefully acknowledged and are described briefly in the InnoDB documentation. The contributions by Percona Inc. are incorporated with their permission, and subject to the conditions contained in the file COPYING.Percona.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1335 USA
***********************************************************************/
/**************************************************//**
@file os/os0file.cc The interface to the operating system file i/o primitives
Created 10/21/1995 Heikki Tuuri *******************************************************/
#include "os0file.h"
#include "sql_const.h"
#include "log.h"
#ifdef __linux__
# include <sys/types.h>
# include <sys/stat.h>
# include <sys/sysmacros.h>
#endif
#include "srv0mon.h"
#include "srv0srv.h"
#include "srv0start.h"
#include "fil0fil.h"
#include "fsp0fsp.h"
#ifdef HAVE_LINUX_UNISTD_H
#include "unistd.h"
#endif
#include "buf0dblwr.h"
#include <tpool_structs.h>
#ifdef LINUX_NATIVE_AIO
#include <libaio.h>
#endif /* LINUX_NATIVE_AIO */
#ifdef HAVE_FALLOC_PUNCH_HOLE_AND_KEEP_SIZE
# include <fcntl.h>
# include <linux/falloc.h>
#endif /* HAVE_FALLOC_PUNCH_HOLE_AND_KEEP_SIZE */
#ifdef _WIN32
#include <winioctl.h>
#endif
// my_test_if_atomic_write() , my_win_secattr()
#include <my_sys.h>
#include <thread>
#include <chrono>
/* Per-IO operation environment*/
class io_slots
{
private:
	tpool::cache<tpool::aiocb> m_cache;
	tpool::task_group m_group;
	int m_max_aio;
public:
	io_slots(int max_submitted_io, int max_callback_concurrency) :
		m_cache(max_submitted_io),
		m_group(max_callback_concurrency, false),
		m_max_aio(max_submitted_io)
	{
	}
	/* Get cached AIO control block */
	tpool::aiocb* acquire()
	{
		return m_cache.get();
	}
	/* Release AIO control block back to cache */
	void release(tpool::aiocb* aiocb)
	{
		m_cache.put(aiocb);
	}

	bool contains(tpool::aiocb* aiocb)
	{
		return m_cache.contains(aiocb);
	}

	/* Wait for completions of all AIO operations */
	void wait(mysql_mutex_t &m)
	{
		m_cache.wait(m);
	}

	void wait()
	{
		m_cache.wait();
	}

	size_t pending_io_count()
	{
		return m_cache.pos();
	}

	tpool::task_group* get_task_group()
	{
		return &m_group;
	}

	~io_slots()
	{
		wait();
	}

	mysql_mutex_t& mutex()
	{
		return m_cache.mutex();
	}

	void resize(int max_submitted_io, int max_callback_concurrency)
	{
		m_cache.resize(max_submitted_io);
		m_group.set_max_tasks(max_callback_concurrency);
		m_max_aio = max_submitted_io;
	}

	tpool::task_group& task_group()
	{
		return m_group;
	}
};
static io_slots *read_slots; static io_slots *write_slots;
/** Number of retries for partial I/O's */ constexpr ulint NUM_RETRIES_ON_PARTIAL_IO = 10;
/* This specifies the file permissions InnoDB uses when it creates files in
Unix; the value of os_innodb_umask is initialized in ha_innodb.cc to my_umask */
#ifndef _WIN32
/** Umask for creating files */
static ulint	os_innodb_umask = S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP;
#else
/** Umask for creating files */
static ulint	os_innodb_umask = 0;
#endif /* _WIN32 */
Atomic_counter<ulint> os_n_file_reads; static ulint os_bytes_read_since_printout; Atomic_counter<size_t> os_n_file_writes; Atomic_counter<size_t> os_n_fsyncs; static ulint os_n_file_reads_old; static ulint os_n_file_writes_old; static ulint os_n_fsyncs_old;
static time_t os_last_printout; bool os_has_said_disk_full;
/** Default Zip compression level */ extern uint page_zip_level;
#ifdef UNIV_PFS_IO
/* Keys to register InnoDB I/O with performance schema */
mysql_pfs_key_t	innodb_data_file_key;
mysql_pfs_key_t	innodb_temp_file_key;
#endif
/** Handle errors for file operations.
@param[in] name name of a file or NULL @param[in] operation operation @param[in] should_abort whether to abort on an unknown error @param[in] on_error_silent whether to suppress reports of non-fatal errors @return true if we should retry the operation */ static bool os_file_handle_error_cond_exit( const char* name, const char* operation, bool should_abort, bool on_error_silent);
/** Does error handling when a file operation fails.
@param operation name of operation that failed */ static void os_file_handle_error(const char *operation) { os_file_handle_error_cond_exit(nullptr, operation, true, false); }
/** Does error handling when a file operation fails.
@param[in] name name of a file or NULL @param[in] operation operation name that failed @param[in] on_error_silent if true then don't print any message to the log. @return true if we should retry the operation */ static bool os_file_handle_error_no_exit( const char* name, const char* operation, bool on_error_silent) { /* Don't exit in case of unknown error */ return(os_file_handle_error_cond_exit( name, operation, false, on_error_silent)); }
/** Handle RENAME error.
@param name old name of the file @param new_name new name of the file */ static void os_file_handle_rename_error(const char* name, const char* new_name) { if (os_file_get_last_error(true) != OS_FILE_DISK_FULL) { ib::error() << "Cannot rename file '" << name << "' to '" << new_name << "'"; } else if (!os_has_said_disk_full) { os_has_said_disk_full = true; /* Disk full error is reported irrespective of the
on_error_silent setting. */ ib::error() << "Full disk prevents renaming file '" << name << "' to '" << new_name << "'"; } }
#ifdef _WIN32
/**
Wrapper around Windows DeviceIoControl() function.
Works synchronously, even when the handle was opened for asynchronous
access (i.e. with FILE_FLAG_OVERLAPPED).
Accepts the same parameters as DeviceIoControl(), except for the last
parameter (OVERLAPPED). */
static
BOOL
os_win32_device_io_control(
	HANDLE handle,
	DWORD code,
	LPVOID inbuf,
	DWORD inbuf_size,
	LPVOID outbuf,
	DWORD outbuf_size,
	LPDWORD bytes_returned
)
{
	OVERLAPPED overlapped = { 0 };
	overlapped.hEvent = tpool::win_get_syncio_event();
	BOOL result = DeviceIoControl(handle, code, inbuf, inbuf_size, outbuf,
		outbuf_size, NULL, &overlapped);

	if (result || (GetLastError() == ERROR_IO_PENDING)) {
		/* Wait for async io to complete */
		result = GetOverlappedResult(handle, &overlapped,
			bytes_returned, TRUE);
	}

	return result;
}
#endif
/** Helper class for doing synchronous file IO. Currently, the objective
is to hide the OS specific code, so that the higher level functions aren't
peppered with #ifdef, which would make the code flow difficult to follow. */
class SyncFileIO
{
public:
	/** Constructor
@param[in] fh File handle @param[in,out] buf Buffer to read/write @param[in] n Number of bytes to read/write @param[in] offset Offset where to read or write */ SyncFileIO(os_file_t fh, void *buf, ulint n, os_offset_t offset) : m_fh(fh), m_buf(buf), m_n(static_cast<ssize_t>(n)), m_offset(offset) { ut_ad(m_n > 0); }
/** Do the read/write
@param[in] request The IO context and type @return the number of bytes read/written or negative value on error */ ssize_t execute(const IORequest &request);
/** Move the read/write offset up to where the partial IO succeeded.
@param[in] n_bytes The number of bytes to advance */ void advance(ssize_t n_bytes) { m_offset+= n_bytes; ut_ad(m_n >= n_bytes); m_n-= n_bytes; m_buf= reinterpret_cast<uchar*>(m_buf) + n_bytes; }
private: /** Open file handle */ const os_file_t m_fh; /** Buffer to read/write */ void *m_buf; /** Number of bytes to read/write */ ssize_t m_n; /** Offset from where to read/write */ os_offset_t m_offset;
/** Do the read/write
@param request The IO context and type @param n Number of bytes to read/write @return the number of bytes read/written or negative value on error */ ssize_t execute_low(const IORequest& request, ssize_t n); };
#ifndef _WIN32 /* On Microsoft Windows, mandatory locking is used */
/** Obtain an exclusive lock on a file.
@param fd      file descriptor
@param name    file name
@return 0 on success */
int os_file_lock(int fd, const char *name)
{
	struct flock lk;

	lk.l_type = F_WRLCK;
	lk.l_whence = SEEK_SET;
	lk.l_start = lk.l_len = 0;

	if (fcntl(fd, F_SETLK, &lk) == -1) {

		ib::error()
			<< "Unable to lock " << name
			<< " error: " << errno;

		if (errno == EAGAIN || errno == EACCES) {

			ib::info()
				<< "Check that you do not already have"
				" another mariadbd process using the"
				" same InnoDB data or log files.";
		}

		return(-1);
	}

	return(0);
}
#endif /* !_WIN32 */
/** Create a temporary file. This function is like tmpfile(3), but
the temporary file is created in the directory given by the server
configuration parameter --tmpdir.
@return temporary file handle, or NULL on error */
FILE*
os_file_create_tmpfile()
{
	FILE*	file	= NULL;
	File	fd	= mysql_tmpfile("ib");

	if (fd >= 0) {
		file = my_fdopen(fd, 0, O_RDWR|O_TRUNC|O_CREAT|FILE_BINARY,
				 MYF(MY_WME));
		if (!file) {
			my_close(fd, MYF(MY_WME));
		}
	}

	if (file == NULL) {

		ib::error()
			<< "Unable to create temporary file; errno: "
			<< errno;
	}

	return(file);
}
/** Rewind file to its start, read at most size - 1 bytes from it to str, and
NUL-terminate str. All errors are silently ignored. This function is mostly meant to be used with temporary files. @param[in,out] file File to read from @param[in,out] str Buffer where to read @param[in] size Size of buffer */ void os_file_read_string( FILE* file, char* str, ulint size) { if (size != 0) { rewind(file);
size_t flen = fread(str, 1, size - 1, file);
str[flen] = '\0'; } }
/** This function reduces a null-terminated full remote path name into
the path that is sent by MySQL for DATA DIRECTORY clause. It replaces the 'databasename/tablename.ibd' found at the end of the path with just 'tablename'.
Since the result is always smaller than the path sent in, no new memory is allocated. The caller should allocate memory for the path sent in. This function manipulates that path in place.
If the path format is not as expected, just return. The result is used to inform a SHOW CREATE TABLE command. @param[in,out] data_dir_path Full path/data_dir_path */ void os_file_make_data_dir_path( char* data_dir_path) { /* Replace the period before the extension with a null byte. */ char* ptr = strrchr(data_dir_path, '.');
if (ptr == NULL) { return; }
ptr[0] = '\0';
/* The tablename starts after the last slash. */ ptr = strrchr(data_dir_path, '/');
if (ptr == NULL) { return; }
ptr[0] = '\0';
char* tablename = ptr + 1;
	/* The databasename starts after the next to last slash. */
	ptr = strrchr(data_dir_path, '/');
#ifdef _WIN32
	if (char *aptr = strrchr(data_dir_path, '\\')) {
		if (aptr > ptr) {
			ptr = aptr;
		}
	}
#endif
if (ptr == NULL) { return; }
ulint tablename_len = strlen(tablename);
memmove(++ptr, tablename, tablename_len);
ptr[tablename_len] = '\0'; }
/** Check if the path refers to the root of a drive using a pointer
to the last directory separator that the caller has fixed.
@param[in]	path		path name
@param[in]	last_slash	last directory separator in the path
@return true if this path is a drive root, false if not */
UNIV_INLINE
bool
os_file_is_root(
	const char*	path,
	const char*	last_slash)
{
	return(
#ifdef _WIN32
	       (last_slash == path + 2 && path[1] == ':') ||
#endif /* _WIN32 */
	       last_slash == path);
}
/** Return the parent directory component of a null-terminated path.
Return a new buffer containing the string up to, but not including, the final component of the path. The path returned will not contain a trailing separator. Do not return a root path, return NULL instead. The final component trimmed off may be a filename or a directory name. If the final component is the only component of the path, return NULL. It is the caller's responsibility to free the returned string after it is no longer needed. @param[in] path Path name @return own: parent directory of the path */ static char* os_file_get_parent_dir( const char* path) { /* Find the offset of the last slash */ const char* last_slash = strrchr(path, '/');
#ifdef _WIN32
	if (const char *last = strrchr(path, '\\')) {
		if (last > last_slash) {
			last_slash = last;
		}
	}
#endif
if (!last_slash) { /* No slash in the path, return NULL */ return(NULL); }
/* Ok, there is a slash. Is there anything after it? */ const bool has_trailing_slash = last_slash[1] == '\0';
/* Reduce repetitive slashes. */ while (last_slash > path && (IF_WIN(last_slash[-1] == '\\' ||,) last_slash[-1] == '/')) { last_slash--; }
/* Check for the root of a drive. */ if (os_file_is_root(path, last_slash)) { return(NULL); }
/* If a trailing slash prevented the first strrchr() from trimming
the last component of the path, trim that component now. */ if (has_trailing_slash) { /* Back up to the previous slash. */ last_slash--; while (last_slash > path && (IF_WIN(last_slash[0] != '\\' &&,) last_slash[0] != '/')) { last_slash--; }
/* Reduce repetitive slashes. */ while (last_slash > path && (IF_WIN(last_slash[-1] == '\\' ||,) last_slash[-1] == '/')) { last_slash--; } }
/* Check for the root of a drive. */ if (os_file_is_root(path, last_slash)) { return(NULL); }
if (last_slash - path < 0) { /* Sanity check, it prevents gcc from trying to handle this case which
* results in warnings for some optimized builds */ return (NULL); }
/* Non-trivial directory component */
	return(mem_strdupl(path, ulint(last_slash - path)));
}
#ifdef UNIV_ENABLE_UNIT_TEST_GET_PARENT_DIR
/* Test the function os_file_get_parent_dir. */ void test_os_file_get_parent_dir( const char* child_dir, const char* expected_dir) { char* child = mem_strdup(child_dir); char* expected = expected_dir == NULL ? NULL : mem_strdup(expected_dir);
char* parent = os_file_get_parent_dir(child);
bool unexpected = (expected == NULL ? (parent != NULL) : (0 != strcmp(parent, expected))); if (unexpected) { ib::fatal() << "os_file_get_parent_dir('" << child << "') returned '" << parent << "', instead of '" << expected << "'."; } ut_free(parent); ut_free(child); ut_free(expected); }
/* Test the function os_file_get_parent_dir. */
void
unit_test_os_file_get_parent_dir()
{
	test_os_file_get_parent_dir("/usr/lib/a", "/usr/lib");
	test_os_file_get_parent_dir("/usr/", NULL);
	test_os_file_get_parent_dir("//usr//", NULL);
	test_os_file_get_parent_dir("usr", NULL);
	test_os_file_get_parent_dir("usr//", NULL);
	test_os_file_get_parent_dir("/", NULL);
	test_os_file_get_parent_dir("//", NULL);
	test_os_file_get_parent_dir(".", NULL);
	test_os_file_get_parent_dir("..", NULL);
# ifdef _WIN32
	test_os_file_get_parent_dir("D:", NULL);
	test_os_file_get_parent_dir("D:/", NULL);
	test_os_file_get_parent_dir("D:\\", NULL);
	test_os_file_get_parent_dir("D:/data", NULL);
	test_os_file_get_parent_dir("D:/data/", NULL);
	test_os_file_get_parent_dir("D:\\data\\", NULL);
	test_os_file_get_parent_dir("D:///data/////", NULL);
	test_os_file_get_parent_dir("D:\\\\\\data\\\\\\\\", NULL);
	test_os_file_get_parent_dir("D:/data//a", "D:/data");
	test_os_file_get_parent_dir("D:\\data\\\\a", "D:\\data");
	test_os_file_get_parent_dir("D:///data//a///b/", "D:///data//a");
	test_os_file_get_parent_dir("D:\\\\\\data\\\\a\\\\\\b\\", "D:\\\\\\data\\\\a");
#endif  /* _WIN32 */
}
#endif /* UNIV_ENABLE_UNIT_TEST_GET_PARENT_DIR */
/** Creates all missing subdirectories along the given path.
@param[in] path Path name @return DB_SUCCESS if OK, otherwise error code. */ dberr_t os_file_create_subdirs_if_needed( const char* path) { if (srv_read_only_mode) {
ib::error() << "read only mode set. Can't create " << "subdirectories '" << path << "'";
return(DB_READ_ONLY);
}
char* subdir = os_file_get_parent_dir(path);
if (subdir == NULL) { /* subdir is root or cwd, nothing to do */ return(DB_SUCCESS); }
/* Test if subdir exists */ os_file_type_t type; bool subdir_exists; bool success = os_file_status(subdir, &subdir_exists, &type);
if (success && !subdir_exists) {
/* Subdir does not exist, create it */ dberr_t err = os_file_create_subdirs_if_needed(subdir);
if (err != DB_SUCCESS) {
ut_free(subdir);
return(err); }
success = os_file_create_directory(subdir, false); }
ut_free(subdir);
return(success ? DB_SUCCESS : DB_ERROR); }
/** Do the read/write
@param[in] request The IO context and type @param[in] n Number of bytes to read/write @return the number of bytes read/written or negative value on error */ ssize_t SyncFileIO::execute_low(const IORequest& request, ssize_t n) { ut_ad(n > 0); ut_ad(size_t(n) <= os_file_request_size_max);
if (request.is_read()) return IF_WIN(tpool::pread(m_fh, m_buf, n, m_offset), pread(m_fh, m_buf, n, m_offset)); return IF_WIN(tpool::pwrite(m_fh, m_buf, n, m_offset), pwrite(m_fh, m_buf, n, m_offset)); }
/** Do the read/write
@param[in] request The IO context and type @return the number of bytes read/written or negative value on error */ ssize_t SyncFileIO::execute(const IORequest& request) { ssize_t n_bytes= 0; ut_ad(m_n > 0);
while (size_t(m_n) > os_file_request_size_max) { ssize_t n_partial_bytes= execute_low(request, os_file_request_size_max); if (n_partial_bytes < 0) return n_partial_bytes; n_bytes+= n_partial_bytes; if (n_partial_bytes != os_file_request_size_max) return n_bytes; advance(os_file_request_size_max); }
if (ssize_t n= execute_low(request, m_n)) { if (n < 0) return n; n_bytes += n; } return n_bytes; }
#ifndef _WIN32
/** Free storage space associated with a section of the file.
@param[in] fh Open file handle @param[in] off Starting offset (SEEK_SET) @param[in] len Size of the hole @return DB_SUCCESS or error code */ static dberr_t os_file_punch_hole_posix( os_file_t fh, os_offset_t off, os_offset_t len) {
#ifdef HAVE_FALLOC_PUNCH_HOLE_AND_KEEP_SIZE
const int mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE;
int ret = fallocate(fh, mode, off, len);
if (ret == 0) { return(DB_SUCCESS); }
if (errno == ENOTSUP) { return(DB_IO_NO_PUNCH_HOLE); }
ib::warn() << "fallocate(" << fh << ", FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, " << off << ", " << len << ") returned errno: " << errno;
return(DB_IO_ERROR);
#elif defined __sun__
// Use F_FREESP
#endif /* HAVE_FALLOC_PUNCH_HOLE_AND_KEEP_SIZE */
return(DB_IO_NO_PUNCH_HOLE); }
/** Retrieves the last error number if an error occurs in a file io function.
The number should be retrieved before any other OS calls (because they may overwrite the error number). If the number is not known to this program, the OS error number + 100 is returned. @param[in] report_all_errors true if we want an error message printed of all errors @param[in] on_error_silent true then don't print any diagnostic to the log @return error number, or OS error number + 100 */ ulint os_file_get_last_error(bool report_all_errors, bool on_error_silent) { int err = errno;
if (err == 0) { return(0); }
if (report_all_errors || (err != ENOSPC && err != EEXIST && err != ENOENT && !on_error_silent)) {
ib::error() << "Operating system error number " << err << " in a file operation.";
if (err == EACCES) {
ib::error() << "The error means mariadbd does not have" " the access rights to the directory.";
} else { if (strerror(err) != NULL) {
ib::error() << "Error number " << err << " means '" << strerror(err) << "'"; }
ib::info() << OPERATING_SYSTEM_ERROR_MSG; } }
switch (err) { case ENOSPC: return(OS_FILE_DISK_FULL); case ENOENT: return(OS_FILE_NOT_FOUND); case EEXIST: return(OS_FILE_ALREADY_EXISTS); case EXDEV: case ENOTDIR: case EISDIR: case EPERM: return(OS_FILE_PATH_ERROR); case EAGAIN: if (srv_use_native_aio) { return(OS_FILE_AIO_RESOURCES_RESERVED); } break; case EINTR: if (srv_use_native_aio) { return(OS_FILE_AIO_INTERRUPTED); } break; case EACCES: return(OS_FILE_ACCESS_VIOLATION); } return(OS_FILE_ERROR_MAX + err); }
/** Wrapper to fsync() or fdatasync() that retries the call on some errors.
Returns the value 0 if successful; otherwise the value -1 is returned and
the global variable errno is set to indicate the error.
@param[in]	file		open file handle
@return 0 if success, -1 otherwise */
static int os_file_sync_posix(os_file_t file)
{
#if !defined(HAVE_FDATASYNC) || HAVE_DECL_FDATASYNC == 0
	auto func= fsync;
	auto func_name= "fsync()";
#else
	auto func= fdatasync;
	auto func_name= "fdatasync()";
#endif
ulint failures= 0;
for (;;) { ++os_n_fsyncs;
int ret= func(file);
if (ret == 0) return ret;
switch (errno) { case ENOLCK: ++failures; ut_a(failures < 1000);
if (!(failures % 100)) ib::warn() << func_name << ": No locks available; retrying";
std::this_thread::sleep_for(std::chrono::milliseconds(200)); break;
case EINTR: ++failures; ut_a(failures < 2000); break;
default: ib::fatal() << func_name << " returned " << errno; } } }
/** Check the existence and type of the given file.
@param[in] path path name of file @param[out] exists true if the file exists @param[out] type Type of the file, if it exists @return true if call succeeded */ static bool os_file_status_posix( const char* path, bool* exists, os_file_type_t* type) { struct stat statinfo;
int ret = stat(path, &statinfo);
*exists = !ret;
if (!ret) { /* file exists, everything OK */ MSAN_STAT_WORKAROUND(&statinfo); } else if (errno == ENOENT || errno == ENOTDIR || errno == ENAMETOOLONG) { /* file does not exist */ return(true);
} else { /* file exists, but stat call failed */ os_file_handle_error_no_exit(path, "stat", false); return(false); }
if (S_ISDIR(statinfo.st_mode)) { *type = OS_FILE_TYPE_DIR;
} else if (S_ISLNK(statinfo.st_mode)) { *type = OS_FILE_TYPE_LINK;
} else if (S_ISREG(statinfo.st_mode)) { *type = OS_FILE_TYPE_FILE; } else { *type = OS_FILE_TYPE_UNKNOWN; }
return(true); }
/** NOTE! Use the corresponding macro os_file_flush(), not directly this
function! Flushes the write buffers of a given file to the disk. @param[in] file handle to a file @return true if success */ bool os_file_flush_func( os_file_t file) { int ret;
ret = os_file_sync_posix(file);
if (ret == 0) { return(true); }
/* Since Linux returns EINVAL if the 'file' is actually a raw device,
we choose to ignore that error if we are using raw disks */
if (srv_start_raw_disk_in_use && errno == EINVAL) {
return(true); }
ib::error() << "The OS said file flush did not succeed";
os_file_handle_error("flush");
/* It is a fatal error if a file flush does not succeed, because then
the database can get corrupt on disk */ ut_error;
return(false); }
/** NOTE! Use the corresponding macro os_file_create_simple(), not directly
this function! A simple function to open or create a file. @param[in] name name of the file or path as a null-terminated string @param[in] create_mode create mode @param[in] access_type OS_FILE_READ_ONLY or OS_FILE_READ_WRITE @param[in] read_only if true, read only checks are enforced @param[out] success true if succeed, false if error @return handle to the file, not defined if error, error number can be retrieved with os_file_get_last_error */ pfs_os_file_t os_file_create_simple_func( const char* name, os_file_create_t create_mode, ulint access_type, bool read_only, bool* success) { pfs_os_file_t file;
*success = false;
int create_flag = O_RDONLY | O_CLOEXEC;
if (read_only) { } else if (create_mode == OS_FILE_CREATE) { create_flag = O_RDWR | O_CREAT | O_EXCL | O_CLOEXEC; } else { ut_ad(create_mode == OS_FILE_OPEN); if (access_type != OS_FILE_READ_ONLY) { create_flag = O_RDWR | O_CLOEXEC; } }
bool retry;
#ifdef O_DIRECT
	int		direct_flag = 0;
	/* This function is always called for data files, we should disable
	OS caching (O_DIRECT) here as we do in os_file_create_func(), so
	we open the same file in the same mode, see man page of open(2). */
	switch (srv_file_flush_method) {
	case SRV_O_DSYNC:
	case SRV_O_DIRECT:
	case SRV_O_DIRECT_NO_FSYNC:
		direct_flag = O_DIRECT;
		break;
	}
#else
	constexpr int direct_flag = 0;
#endif
	do {
		file = open(name, create_flag | direct_flag, os_innodb_umask);

		if (file == -1) {
#ifdef O_DIRECT
			if (direct_flag && errno == EINVAL) {
				direct_flag = 0;
				retry = true;
				continue;
			}
#endif
			*success = false;
			retry = os_file_handle_error_no_exit(
				name,
				create_mode == OS_FILE_CREATE
				? "create" : "open",
				false);
		} else {
			*success = true;
			retry = false;
		}

	} while (retry);
if (!read_only && *success && access_type == OS_FILE_READ_WRITE && !my_disable_locking && os_file_lock(file, name)) {
*success = false; close(file); file = -1; }
return(file); }
/** This function attempts to create a directory named pathname. The new
directory gets default permissions. On Unix the permissions are
(0770 & ~umask). If the directory exists already, nothing is done and
the call succeeds, unless the fail_if_exists argument is true.
If another error occurs, such as a permission error, this does not crash,
but reports the error and returns false.
@param[in]	pathname	directory name as null-terminated string
@param[in]	fail_if_exists	if true, pre-existing directory is treated as
				an error.
@return true if call succeeds, false on error */
bool
os_file_create_directory(
	const char*	pathname,
	bool		fail_if_exists)
{
	int	rcode;
rcode = mkdir(pathname, 0770);
if (!(rcode == 0 || (errno == EEXIST && !fail_if_exists))) { /* failure */ os_file_handle_error_no_exit(pathname, "mkdir", false);
return(false); }
return(true); }
#ifdef O_DIRECT
# if defined __linux
/** Note that the log file uses buffered I/O. */
static ATTRIBUTE_COLD void os_file_log_buffered()
{
	log_sys.log_maybe_unbuffered= false;
	log_sys.log_buffered= true;
}
# endif

/** @return whether the log file may work with unbuffered I/O. */
static ATTRIBUTE_COLD bool os_file_log_maybe_unbuffered(const struct stat &st)
{
	MSAN_STAT_WORKAROUND(&st);
# ifdef __linux__
	char b[20 + sizeof "/sys/dev/block/" ":" "/../queue/physical_block_size"];
	if (snprintf(b, sizeof b,
		     "/sys/dev/block/%u:%u/queue/physical_block_size",
		     major(st.st_dev), minor(st.st_dev)) >=
	    static_cast<int>(sizeof b))
	{
	fallback:
		log_sys.set_block_size(512);
		return false;
	}
	int f= open(b, O_RDONLY);
	if (f == -1)
	{
		if (snprintf(b, sizeof b, "/sys/dev/block/%u:%u/../queue/"
			     "physical_block_size",
			     major(st.st_dev), minor(st.st_dev)) >=
		    static_cast<int>(sizeof b))
			goto fallback;
		f= open(b, O_RDONLY);
	}
	unsigned long s= 0;
	if (f != -1)
	{
		ssize_t l= read(f, b, sizeof b);
		if (l > 0 && size_t(l) < sizeof b && b[l - 1] == '\n')
		{
			char *end= b;
			s= strtoul(b, &end, 10);
			if (b == end || *end != '\n')
				s = 0;
		}
		close(f);
	}
	if (s > 4096 || s < 64 || !ut_is_2pow(s))
		goto fallback;
	log_sys.set_block_size(uint32_t(s));
# else
	constexpr unsigned long s= 4096;
# endif

	return !(st.st_size & (s - 1));
}
#endif
/** NOTE! Use the corresponding macro os_file_create(), not directly
this function!
Opens an existing file or creates a new one.
@param[in]	name		name of the file or path as a null-terminated
				string
@param[in]	create_mode	create mode
@param[in]	purpose		OS_FILE_AIO, if asynchronous, non-buffered I/O
				is desired, OS_FILE_NORMAL, if any normal file;
				NOTE that it also depends on type, os_aio_..
				and srv_.. variables whether we really use async
				I/O or unbuffered I/O: look in the function
				source code for the exact rules
@param[in]	type		OS_DATA_FILE or OS_LOG_FILE
@param[in]	read_only	true, if read only checks should be enforced
@param[out]	success		true if succeeded
@return handle to the file, not defined if error, error number
	can be retrieved with os_file_get_last_error */
pfs_os_file_t
os_file_create_func(
	const char*	name,
	os_file_create_t create_mode,
	ulint		purpose,
	ulint		type,
	bool		read_only,
	bool*		success)
{
	*success = false;
DBUG_EXECUTE_IF( "ib_create_table_fail_disk_full", errno = ENOSPC; return(OS_FILE_CLOSED); );
int create_flag;
if (read_only) { create_flag = O_RDONLY | O_CLOEXEC; } else if (create_mode == OS_FILE_CREATE || create_mode == OS_FILE_CREATE_SILENT) { create_flag = O_RDWR | O_CREAT | O_EXCL | O_CLOEXEC; } else { ut_ad(create_mode == OS_FILE_OPEN || create_mode == OS_FILE_OPEN_SILENT || create_mode == OS_FILE_OPEN_RETRY || create_mode == OS_FILE_OPEN_RETRY_SILENT || create_mode == OS_FILE_OPEN_RAW); create_flag = O_RDWR | O_CLOEXEC; }
#ifdef O_DIRECT
	struct stat st;
	ut_a(type == OS_LOG_FILE
	     || type == OS_DATA_FILE
	     || type == OS_DATA_FILE_NO_O_DIRECT);
	int	direct_flag = 0;

	if (type == OS_DATA_FILE) {
		switch (srv_file_flush_method) {
		case SRV_O_DSYNC:
		case SRV_O_DIRECT:
		case SRV_O_DIRECT_NO_FSYNC:
			direct_flag = O_DIRECT;
			break;
		default:
			break;
		}
# ifdef __linux__
	} else if (type == OS_LOG_FILE && create_mode != OS_FILE_CREATE
		   && create_mode != OS_FILE_CREATE_SILENT
		   && !log_sys.is_opened()) {
		if (stat(name, &st)) {
			if (errno == ENOENT) {
				if (create_mode & OS_FILE_ON_ERROR_SILENT) {
					goto not_found;
				}
				sql_print_error(
					"InnoDB: File %s was not found", name);
				goto not_found;
			}
			log_sys.set_block_size(512);
			goto skip_o_direct;
		} else if (!os_file_log_maybe_unbuffered(st)
			   || log_sys.log_buffered) {
skip_o_direct:
			os_file_log_buffered();
		} else {
			direct_flag = O_DIRECT;
			log_sys.log_maybe_unbuffered = true;
		}
# endif
	}
#else
	ut_a(type == OS_LOG_FILE || type == OS_DATA_FILE);
	constexpr int direct_flag = 0;
#endif
ut_a(purpose == OS_FILE_AIO || purpose == OS_FILE_NORMAL);
/* We let O_DSYNC only affect log files */
	if (!read_only
	    && type == OS_LOG_FILE
	    && srv_file_flush_method == SRV_O_DSYNC) {
#ifdef O_DSYNC
		create_flag |= O_DSYNC;
#else
		create_flag |= O_SYNC;
#endif
	}
os_file_t file;
	for (;;) {
		file = open(name, create_flag | direct_flag, os_innodb_umask);

		if (file == -1) {
#ifdef O_DIRECT
			if (direct_flag && errno == EINVAL) {
				direct_flag = 0;
# ifdef __linux__
				if (type == OS_LOG_FILE) {
					os_file_log_buffered();
				}
# endif
				if (create_mode == OS_FILE_CREATE
				    || create_mode == OS_FILE_CREATE_SILENT) {
					/* Linux may create the file
					before rejecting the O_DIRECT. */
					unlink(name);
				}
				continue;
			}
#endif
			if (!os_file_handle_error_no_exit(
				    name, (create_flag & O_CREAT)
				    ? "create" : "open",
				    create_mode & OS_FILE_ON_ERROR_SILENT)) {
				break;
			}
		} else {
			*success = true;
			break;
		}
	}

	if (!*success) {
#ifdef __linux__
not_found:
#endif
		return OS_FILE_CLOSED;
	}
#ifdef __linux__
	if ((create_flag & O_CREAT) && type == OS_LOG_FILE) {
		if (fstat(file, &st) || !os_file_log_maybe_unbuffered(st)) {
			os_file_log_buffered();
		} else {
			close(file);
			return os_file_create_func(name, OS_FILE_OPEN,
						   purpose, type,
						   false, success);
		}
	}
#endif
if (!read_only && create_mode != OS_FILE_OPEN_RAW && !my_disable_locking && os_file_lock(file, name)) {
if (create_mode == OS_FILE_OPEN_RETRY || create_mode == OS_FILE_OPEN_RETRY_SILENT) { ib::info() << "Retrying to lock the first data file";
for (int i = 0; i < 100; i++) { std::this_thread::sleep_for( std::chrono::seconds(1));
if (!os_file_lock(file, name)) { *success = true; return(file); } }
ib::info() << "Unable to open the first data file"; }
*success = false; close(file); file = -1; }
return(file); }
/** NOTE! Use the corresponding macro
os_file_create_simple_no_error_handling(), not directly this function! A simple function to open or create a file. @param[in] name name of the file or path as a null-terminated string @param[in] create_mode OS_FILE_CREATE or OS_FILE_OPEN @param[in] access_type OS_FILE_READ_ONLY, OS_FILE_READ_WRITE, or OS_FILE_READ_ALLOW_DELETE; the last option is used by a backup program reading the file @param[in] read_only if true read only mode checks are enforced @param[out] success true if succeeded @return own: handle to the file, not defined if error, error number can be retrieved with os_file_get_last_error */ pfs_os_file_t os_file_create_simple_no_error_handling_func( const char* name, os_file_create_t create_mode, ulint access_type, bool read_only, bool* success) { os_file_t file; int create_flag = O_RDONLY | O_CLOEXEC;
*success = false;
if (read_only) { } else if (create_mode == OS_FILE_CREATE) { create_flag = O_RDWR | O_CREAT | O_EXCL | O_CLOEXEC; } else { ut_ad(create_mode == OS_FILE_OPEN); if (access_type != OS_FILE_READ_ONLY) { ut_a(access_type == OS_FILE_READ_WRITE || access_type == OS_FILE_READ_ALLOW_DELETE);
create_flag = O_RDWR; } }
file = open(name, create_flag, os_innodb_umask);
*success = (file != -1);
if (!read_only && *success && access_type == OS_FILE_READ_WRITE && !my_disable_locking && os_file_lock(file, name)) {
*success = false; close(file); file = -1;
}
return(file); }
/** Deletes a file if it exists. The file has to be closed before calling this.
@param[in] name file path as a null-terminated string @param[out] exist indicate if file pre-exist @return true if success */ bool os_file_delete_if_exists_func( const char* name, bool* exist) { if (exist != NULL) { *exist = true; }
int ret;
ret = unlink(name);
if (ret != 0 && errno == ENOENT) { if (exist != NULL) { *exist = false; } } else if (ret != 0 && errno != ENOENT) { os_file_handle_error_no_exit(name, "delete", false);
return(false); }
return(true); }
/** Deletes a file. The file has to be closed before calling this.
@param[in] name file path as a null-terminated string @return true if success */ bool os_file_delete_func( const char* name) { int ret;
ret = unlink(name);
if (ret != 0) { os_file_handle_error_no_exit(name, "delete", FALSE);
return(false); }
return(true); }
/** NOTE! Use the corresponding macro os_file_rename(), not directly this
function! Renames a file (can also move it to another directory). It is
safest that the file is closed before calling this function.
@param[in]	oldpath		old file path as a null-terminated string
@param[in]	newpath		new file path
@return true if success */
bool
os_file_rename_func(
	const char*	oldpath,
	const char*	newpath)
{
#ifdef UNIV_DEBUG
	os_file_type_t	type;
	bool		exists;

	/* New path must not exist. */
	ut_ad(os_file_status(newpath, &exists, &type));
	ut_ad(!exists);

	/* Old path must exist. */
	ut_ad(os_file_status(oldpath, &exists, &type));
	ut_ad(exists);
#endif /* UNIV_DEBUG */

	int	ret;

	ret = rename(oldpath, newpath);

	if (ret != 0) {
		os_file_handle_rename_error(oldpath, newpath);

		return(false);
	}

	return(true);
}

/** NOTE! Use the corresponding macro os_file_close(), not directly this
function! Closes a file handle. In case of error, error number can be retrieved with os_file_get_last_error. @param[in] file Handle to close @return true if success */ bool os_file_close_func(os_file_t file) { int ret= close(file);
if (!ret) return true;
os_file_handle_error("close"); return false; }
/** Gets a file size.
@param[in] file handle to an open file @return file size, or (os_offset_t) -1 on failure */ os_offset_t os_file_get_size(os_file_t file) { struct stat statbuf; if (fstat(file, &statbuf)) return os_offset_t(-1); MSAN_STAT_WORKAROUND(&statbuf); return statbuf.st_size; }
/** Gets a file size.
@param[in] filename Full path to the filename to check @return file size if OK, else set m_total_size to ~0 and m_alloc_size to errno */ os_file_size_t os_file_get_size( const char* filename) { struct stat s; os_file_size_t file_size;
int ret = stat(filename, &s);
if (ret == 0) { MSAN_STAT_WORKAROUND(&s); file_size.m_total_size = s.st_size; /* st_blocks is in 512 byte sized blocks */ file_size.m_alloc_size = s.st_blocks * 512; } else { file_size.m_total_size = ~0U; file_size.m_alloc_size = (os_offset_t) errno; }
return(file_size); }
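/* Illustration only, not part of os0file.cc: a minimal standalone sketch of
how the logical size (st_size) and the allocated size (st_blocks * 512) can
differ, which is what the two fields of os_file_size_t capture. FileSize,
get_file_size() and demo_sparse_size() are hypothetical names for this
example, and the 512-byte unit for st_blocks is the Linux convention that the
comment above also assumes. */

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical stand-in for os_file_size_t.
struct FileSize {
  uint64_t total;      // logical size in bytes
  uint64_t allocated;  // bytes actually backed by storage
};

// Returns false if stat() fails, leaving errno set by stat().
bool get_file_size(const char* path, FileSize* out)
{
  assert(path != nullptr && out != nullptr);
  struct stat s;
  if (stat(path, &s) != 0) return false;
  out->total = uint64_t(s.st_size);
  out->allocated = uint64_t(s.st_blocks) * 512;  // 512-byte units
  return true;
}

// Extend an empty temporary file to 1 MiB with ftruncate() and report
// its sizes; on a file system with sparse files, little is allocated.
bool demo_sparse_size(FileSize* out)
{
  char path[] = "/tmp/file_size_XXXXXX";
  int fd = mkstemp(path);
  if (fd == -1) return false;
  bool ok = ftruncate(fd, 1 << 20) == 0 && get_file_size(path, out);
  close(fd);
  unlink(path);
  return ok;
}
```

/* The sketch reports failure through a bool instead of encoding errno in the
size fields, which avoids mixing error codes and sizes in one member. */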
/** This function returns information about the specified file
@param[in] path pathname of the file @param[out] stat_info information of a file in a directory @param[in,out] statinfo information of a file in a directory @param[in] check_rw_perm for testing whether the file can be opened in RW mode @param[in] read_only if true read only mode checks are enforced @return DB_SUCCESS if all OK */ static dberr_t os_file_get_status_posix( const char* path, os_file_stat_t* stat_info, struct stat* statinfo, bool check_rw_perm, bool read_only) { int ret = stat(path, statinfo);
if (ret && (errno == ENOENT || errno == ENOTDIR || errno == ENAMETOOLONG)) { /* file does not exist */
return(DB_NOT_FOUND);
} else if (ret) { /* file exists, but stat call failed */
os_file_handle_error_no_exit(path, "stat", false);
return(DB_FAIL); }
MSAN_STAT_WORKAROUND(statinfo);
switch (statinfo->st_mode & S_IFMT) { case S_IFDIR: stat_info->type = OS_FILE_TYPE_DIR; break; case S_IFLNK: stat_info->type = OS_FILE_TYPE_LINK; break; case S_IFBLK: /* Handle block device as regular file. */ case S_IFCHR: /* Handle character device as regular file. */ case S_IFREG: stat_info->type = OS_FILE_TYPE_FILE; break; default: stat_info->type = OS_FILE_TYPE_UNKNOWN; }
stat_info->size = statinfo->st_size; stat_info->block_size = statinfo->st_blksize; stat_info->alloc_size = statinfo->st_blocks * 512;
if (check_rw_perm && (stat_info->type == OS_FILE_TYPE_FILE || stat_info->type == OS_FILE_TYPE_BLOCK)) {
stat_info->rw_perm = !access(path, read_only ? R_OK : R_OK | W_OK); }
return(DB_SUCCESS); }
/** Truncates a file to a specified size in bytes.
Do nothing if the size to preserve is greater or equal to the current size of the file. @param[in] pathname file path @param[in] file file to be truncated @param[in] size size to preserve in bytes @return true if success */ static bool os_file_truncate_posix( const char* pathname, os_file_t file, os_offset_t size) { int res = ftruncate(file, size);
if (res == -1) {
bool retry;
retry = os_file_handle_error_no_exit( pathname, "truncate", false);
if (retry) { ib::warn() << "Truncate failed for '" << pathname << "'"; } }
return(res == 0); }
/** Truncates a file at its current position.
@param[in]	file	file to be truncated
@return true if success */
bool
os_file_set_eof(
	FILE*		file)
{
	return(!ftruncate(fileno(file), ftell(file)));
}

#else /* !_WIN32 */
#include <WinIoCtl.h>
/** Free storage space associated with a section of the file.
@param[in]	fh	Open file handle
@param[in]	off	Starting offset (SEEK_SET)
@param[in]	len	Size of the hole
@return DB_SUCCESS, or DB_IO_NO_PUNCH_HOLE if the request failed */
static
dberr_t
os_file_punch_hole_win32(
	os_file_t	fh,
	os_offset_t	off,
	os_offset_t	len)
{
	FILE_ZERO_DATA_INFORMATION	punch;

punch.FileOffset.QuadPart = off; punch.BeyondFinalZero.QuadPart = off + len;
/* If lpOverlapped is NULL, lpBytesReturned cannot be NULL,
therefore we pass a dummy parameter. */ DWORD temp; BOOL success = os_win32_device_io_control( fh, FSCTL_SET_ZERO_DATA, &punch, sizeof(punch), NULL, 0, &temp);
return(success ? DB_SUCCESS: DB_IO_NO_PUNCH_HOLE); }
/** Check the existence and type of the given file.
@param[in] path path name of file @param[out] exists true if the file exists @param[out] type Type of the file, if it exists @return true if call succeeded */ static bool os_file_status_win32( const char* path, bool* exists, os_file_type_t* type) { int ret; struct _stat64 statinfo;
ret = _stat64(path, &statinfo);
*exists = !ret;
if (!ret) { /* file exists, everything OK */
} else if (errno == ENOENT || errno == ENOTDIR || errno == ENAMETOOLONG) { /* file does not exist */ return(true);
} else { /* file exists, but stat call failed */ os_file_handle_error_no_exit(path, "stat", false); return(false); }
if (_S_IFDIR & statinfo.st_mode) { *type = OS_FILE_TYPE_DIR;
} else if (_S_IFREG & statinfo.st_mode) { *type = OS_FILE_TYPE_FILE;
} else { *type = OS_FILE_TYPE_UNKNOWN; }
return(true); }
/* Dynamically load NtFlushBuffersFileEx, used in os_file_flush_func */
#include <winternl.h>
typedef NTSTATUS(WINAPI* pNtFlushBuffersFileEx)( HANDLE FileHandle, ULONG Flags, PVOID Parameters, ULONG ParametersSize, PIO_STATUS_BLOCK IoStatusBlock);
static pNtFlushBuffersFileEx my_NtFlushBuffersFileEx = (pNtFlushBuffersFileEx)GetProcAddress(GetModuleHandle("ntdll"), "NtFlushBuffersFileEx");
/** NOTE! Use the corresponding macro os_file_flush(), not directly this
function! Flushes the write buffers of a given file to the disk. @param[in] file handle to a file @return true if success */ bool os_file_flush_func(os_file_t file) { ++os_n_fsyncs; static bool disable_datasync;
	if (my_NtFlushBuffersFileEx && !disable_datasync) {
		IO_STATUS_BLOCK iosb{};
		NTSTATUS status= my_NtFlushBuffersFileEx(
			file, FLUSH_FLAGS_FILE_DATA_SYNC_ONLY,
			nullptr, 0, &iosb);
		if (!status)
			return true;
		/* NtFlushBuffersFileEx(FLUSH_FLAGS_FILE_DATA_SYNC_ONLY)
		may fail on Windows versions older than 10, and possibly
		on non-NTFS file systems. Fall back to FlushFileBuffers()
		from now on. */
		disable_datasync= true;
	}

if (FlushFileBuffers(file)) return true;
/* Since Windows returns ERROR_INVALID_FUNCTION if the 'file' is
actually a raw device, we choose to ignore that error if we are using raw disks */ if (srv_start_raw_disk_in_use && GetLastError() == ERROR_INVALID_FUNCTION) return true;
os_file_handle_error("flush");
/* It is a fatal error if a file flush does not succeed, because then
the database can get corrupt on disk */ ut_error;
return false; }
/** Retrieves the last error number if an error occurs in a file io function.
The number should be retrieved before any other OS calls (because they may
overwrite the error number). If the number is not known to this program,
then OS error number + OS_FILE_ERROR_MAX is returned.
@param[in]	report_all_errors	true if we want an error message
					printed of all errors
@param[in]	on_error_silent		if true, do not print any diagnostic
					to the log
@return error number, or OS error number + OS_FILE_ERROR_MAX */
ulint os_file_get_last_error(bool report_all_errors, bool on_error_silent)
{ ulint err = (ulint) GetLastError();
if (err == ERROR_SUCCESS) { return(0); }
if (report_all_errors || (!on_error_silent && err != ERROR_DISK_FULL && err != ERROR_FILE_NOT_FOUND && err != ERROR_FILE_EXISTS)) {
ib::error() << "Operating system error number " << err << " in a file operation.";
switch (err) { case ERROR_PATH_NOT_FOUND: break; case ERROR_ACCESS_DENIED: ib::error() << "The error means mariadbd does not have" " the access rights to" " the directory. It may also be" " you have created a subdirectory" " of the same name as a data file."; break; case ERROR_SHARING_VIOLATION: case ERROR_LOCK_VIOLATION: ib::error() << "The error means that another program" " is using InnoDB's files." " This might be a backup or antivirus" " software or another instance" " of MariaDB." " Please close it to get rid of this error."; break; case ERROR_WORKING_SET_QUOTA: case ERROR_NO_SYSTEM_RESOURCES: ib::error() << "The error means that there are no" " sufficient system resources or quota to" " complete the operation."; break; case ERROR_OPERATION_ABORTED: ib::error() << "The error means that the I/O" " operation has been aborted" " because of either a thread exit" " or an application request." " Retry attempt is made."; break; default: ib::info() << OPERATING_SYSTEM_ERROR_MSG; } }
if (err == ERROR_FILE_NOT_FOUND) { return(OS_FILE_NOT_FOUND); } else if (err == ERROR_DISK_FULL) { return(OS_FILE_DISK_FULL); } else if (err == ERROR_FILE_EXISTS) { return(OS_FILE_ALREADY_EXISTS); } else if (err == ERROR_SHARING_VIOLATION || err == ERROR_LOCK_VIOLATION) { return(OS_FILE_SHARING_VIOLATION); } else if (err == ERROR_WORKING_SET_QUOTA || err == ERROR_NO_SYSTEM_RESOURCES) { return(OS_FILE_INSUFFICIENT_RESOURCE); } else if (err == ERROR_OPERATION_ABORTED) { return(OS_FILE_OPERATION_ABORTED); } else if (err == ERROR_ACCESS_DENIED) { return(OS_FILE_ACCESS_VIOLATION); }
return(OS_FILE_ERROR_MAX + err); }
/** NOTE! Use the corresponding macro os_file_create_simple(), not directly
this function! A simple function to open or create a file. @param[in] name name of the file or path as a null-terminated string @param[in] create_mode create mode @param[in] access_type OS_FILE_READ_ONLY or OS_FILE_READ_WRITE @param[in] read_only if true read only mode checks are enforced @param[out] success true if succeed, false if error @return handle to the file, not defined if error, error number can be retrieved with os_file_get_last_error */ pfs_os_file_t os_file_create_simple_func( const char* name, os_file_create_t create_mode, ulint access_type, bool read_only, bool* success) { os_file_t file;
*success = false;
DWORD access = GENERIC_READ; DWORD create_flag; DWORD attributes = 0;
ut_ad(srv_operation == SRV_OPERATION_NORMAL);
if (read_only || create_mode == OS_FILE_OPEN) { create_flag = OPEN_EXISTING; } else { ut_ad(create_mode == OS_FILE_CREATE); create_flag = CREATE_NEW; }
if (access_type == OS_FILE_READ_ONLY) { } else if (read_only) { ib::info() << "Read only mode set. Unable to" " open file '" << name << "' in RW mode, " << "trying RO mode"; } else { ut_ad(access_type == OS_FILE_READ_WRITE); access = GENERIC_READ | GENERIC_WRITE; }
for (;;) { /* Use default security attributes and no template file. */
file = CreateFile( (LPCTSTR) name, access, FILE_SHARE_READ | FILE_SHARE_DELETE, my_win_file_secattr(), create_flag, attributes, NULL);
if (file != INVALID_HANDLE_VALUE) { *success = true; break; }
if (!os_file_handle_error_no_exit(name, create_flag == CREATE_NEW ? "create" : "open", false)) { break; } }
return(file); }
/** This function attempts to create a directory named pathname. The new
directory gets default permissions. On Unix the permissions are
(0770 & ~umask). If the directory exists already, nothing is done and
the call succeeds, unless the fail_if_exists argument is true.
If another error occurs, such as a permission error, this does not crash,
but reports the error and returns false.
@param[in]	pathname	directory name as null-terminated string
@param[in]	fail_if_exists	if true, pre-existing directory is treated as
				an error.
@return true if call succeeds, false on error */
bool
os_file_create_directory(
	const char*	pathname,
	bool		fail_if_exists)
{
	BOOL	rcode;

rcode = CreateDirectory((LPCTSTR) pathname, NULL); if (!(rcode != 0 || (GetLastError() == ERROR_ALREADY_EXISTS && !fail_if_exists))) {
os_file_handle_error_no_exit( pathname, "CreateDirectory", false);
return(false); }
return(true); }
/** Get disk sector size for a file. */ static size_t get_sector_size(HANDLE file) { FILE_STORAGE_INFO fsi; ULONG s= 4096; if (GetFileInformationByHandleEx(file, FileStorageInfo, &fsi, sizeof fsi)) { s= fsi.PhysicalBytesPerSectorForPerformance; if (s > 4096 || s < 64 || !ut_is_2pow(s)) return 4096; } return s; }
/** NOTE! Use the corresponding macro os_file_create(), not directly
this function! Opens an existing file or creates a new one.
@param[in]	name		name of the file or path as a null-terminated
				string
@param[in]	create_mode	create mode
@param[in]	purpose		OS_FILE_AIO, if asynchronous, non-buffered I/O
				is desired, OS_FILE_NORMAL, if any normal file;
				NOTE that it also depends on type, os_aio_..
				and srv_.. variables whether we really use
				async I/O or unbuffered I/O: look in the
				function source code for the exact rules
@param[in]	type		OS_DATA_FILE or OS_LOG_FILE
@param[in]	read_only	if true read only mode checks are enforced
@param[out]	success		true if succeeded
@return handle to the file, not defined if error, error number
	can be retrieved with os_file_get_last_error */
pfs_os_file_t
os_file_create_func(
	const char*	name,
	os_file_create_t create_mode,
	ulint		purpose,
	ulint		type,
	bool		read_only,
	bool*		success)
{
	os_file_t	file;

*success = false;
DBUG_EXECUTE_IF( "ib_create_table_fail_disk_full", *success = false; SetLastError(ERROR_DISK_FULL); return(OS_FILE_CLOSED); );
DWORD create_flag = OPEN_EXISTING; DWORD share_mode = read_only ? FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE : FILE_SHARE_READ | FILE_SHARE_DELETE;
switch (create_mode) { case OS_FILE_OPEN_RAW: ut_a(!read_only); /* On Windows Physical devices require admin privileges and
have to have the write-share mode set. See the remarks section for the CreateFile() function documentation in MSDN. */
share_mode |= FILE_SHARE_WRITE; break; case OS_FILE_CREATE_SILENT: case OS_FILE_CREATE: create_flag = CREATE_NEW; break; default: ut_ad(create_mode == OS_FILE_OPEN || create_mode == OS_FILE_OPEN_SILENT || create_mode == OS_FILE_OPEN_RETRY_SILENT || create_mode == OS_FILE_OPEN_RETRY); break; }
DWORD attributes = (purpose == OS_FILE_AIO && srv_use_native_aio) ? FILE_FLAG_OVERLAPPED : 0;
if (type == OS_LOG_FILE) { if (!log_sys.is_opened() && !log_sys.log_buffered) { attributes|= FILE_FLAG_NO_BUFFERING; } if (srv_file_flush_method == SRV_O_DSYNC) attributes|= FILE_FLAG_WRITE_THROUGH; } else if (type == OS_DATA_FILE) { switch (srv_file_flush_method) { case SRV_FSYNC: case SRV_LITTLESYNC: case SRV_NOSYNC: break; default: attributes|= FILE_FLAG_NO_BUFFERING; } }
DWORD access = GENERIC_READ;
if (!read_only) { access |= GENERIC_WRITE; }
for (;;) { const char *operation;
/* Use default security attributes and no template file. */ file = CreateFile( name, access, share_mode, my_win_file_secattr(), create_flag, attributes, NULL);
*success = file != INVALID_HANDLE_VALUE;
if (*success && type == OS_LOG_FILE) { uint32_t s = uint32_t(get_sector_size(file)); log_sys.set_block_size(s); if (attributes & FILE_FLAG_NO_BUFFERING) { if (os_file_get_size(file) % s) { attributes &= ~FILE_FLAG_NO_BUFFERING; create_flag = OPEN_ALWAYS; CloseHandle(file); continue; } log_sys.log_buffered = false; } }
if (*success) { break; }
operation = create_flag == CREATE_NEW ? "create" : "open";
if (!os_file_handle_error_no_exit(name, operation, create_mode & OS_FILE_ON_ERROR_SILENT)) { break; } }
if (*success && (attributes & FILE_FLAG_OVERLAPPED) && srv_thread_pool) { srv_thread_pool->bind(file); } return(file); }
/** NOTE! Use the corresponding macro os_file_create_simple_no_error_handling(),
not directly this function! A simple function to open or create a file.
@param[in]	name		name of the file or path as a null-terminated
				string
@param[in]	create_mode	create mode
@param[in]	access_type	OS_FILE_READ_ONLY, OS_FILE_READ_WRITE, or
				OS_FILE_READ_ALLOW_DELETE; the last option is
				used by a backup program reading the file
@param[in]	read_only	if true read only mode checks are enforced
@param[out]	success		true if succeeded
@return own: handle to the file, not defined if error, error number
	can be retrieved with os_file_get_last_error */
pfs_os_file_t os_file_create_simple_no_error_handling_func( const char* name, os_file_create_t create_mode, ulint access_type, bool read_only, bool* success) { os_file_t file;
DWORD access = GENERIC_READ; DWORD create_flag = OPEN_EXISTING; DWORD attributes = 0; DWORD share_mode = FILE_SHARE_READ | FILE_SHARE_DELETE;
ut_a(name);
if (read_only) { share_mode = FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE; } else { if (create_mode == OS_FILE_CREATE) { create_flag = CREATE_NEW; } else { ut_ad(create_mode == OS_FILE_OPEN); }
switch (access_type) { case OS_FILE_READ_ONLY: break; case OS_FILE_READ_WRITE: access = GENERIC_READ | GENERIC_WRITE; break; default: ut_ad(access_type == OS_FILE_READ_ALLOW_DELETE); /* A backup program has to give mariadbd the maximum
freedom to do what it likes with the file */ share_mode |= FILE_SHARE_DELETE | FILE_SHARE_WRITE | FILE_SHARE_READ; } }
file = CreateFile((LPCTSTR) name, access, share_mode, my_win_file_secattr(), create_flag, attributes, NULL); // No template file
*success = (file != INVALID_HANDLE_VALUE);
return(file); }
/** Deletes a file if it exists. The file has to be closed before calling this.
@param[in]	name	file path as a null-terminated string
@param[out]	exist	set to whether the file existed before the call;
			may be NULL
@return true if success */
bool
os_file_delete_if_exists_func(
	const char*	name,
	bool*		exist)
{
	ulint	count = 0;

	if (exist != NULL) {
		*exist = true;
	}

	for (;;) {
		/* In Windows, deleting an .ibd file may fail if
		the file is being accessed by an external program,
		such as a backup tool. */

		bool	ret = DeleteFile((LPCTSTR) name);

		if (ret) {
			return(true);
		}

		switch (GetLastError()) {
		case ERROR_FILE_NOT_FOUND:
		case ERROR_PATH_NOT_FOUND:
			/* the file does not exist, this is not an error */
			if (exist != NULL) {
				*exist = false;
			}
			/* fall through */
		case ERROR_ACCESS_DENIED:
			return(true);
		}

		++count;

		if (count > 100 && 0 == (count % 10)) {

			/* Print error information */
			os_file_get_last_error(true);

			ib::warn() << "Delete of file '" << name << "' failed.";
		}

		std::this_thread::sleep_for(std::chrono::seconds(1));

		if (count > 2000) {

			return(false);
		}
	}
}

/** Deletes a file. The file has to be closed before calling this.
@param[in] name File path as NUL terminated string @return true if success */ bool os_file_delete_func( const char* name) { ulint count = 0;
for (;;) { /* In Windows, deleting an .ibd file may fail if
the file is being accessed by an external program, such as a backup tool. */
BOOL ret = DeleteFile((LPCTSTR) name);
if (ret) { return(true); }
if (GetLastError() == ERROR_FILE_NOT_FOUND) { /* If the file does not exist, we classify this as
a 'mild' error and return */
return(false); }
++count;
if (count > 100 && 0 == (count % 10)) {
/* print error information */ os_file_get_last_error(true);
ib::warn() << "Cannot delete file '" << name << "'. Is " << "another program accessing it?"; }
std::this_thread::sleep_for(std::chrono::seconds(1));
if (count > 2000) {
return(false); } }
ut_error; return(false); }
/** NOTE! Use the corresponding macro os_file_rename(), not directly this
function! Renames a file (can also move it to another directory). It is
safest that the file is closed before calling this function.
@param[in]	oldpath		old file path as a null-terminated string
@param[in]	newpath		new file path
@return true if success */
bool
os_file_rename_func(
	const char*	oldpath,
	const char*	newpath)
{
#ifdef UNIV_DEBUG
	os_file_type_t	type;
	bool		exists;

	/* New path must not exist. */
	ut_ad(os_file_status(newpath, &exists, &type));
	ut_ad(!exists);

	/* Old path must exist. */
	ut_ad(os_file_status(oldpath, &exists, &type));
	ut_ad(exists);
#endif /* UNIV_DEBUG */

	if (MoveFileEx(oldpath, newpath, MOVEFILE_REPLACE_EXISTING)) {
		return(true);
	}

	os_file_handle_rename_error(oldpath, newpath);
	return(false);
}

/** NOTE! Use the corresponding macro os_file_close(), not directly
this function! Closes a file handle. In case of error, error number can be retrieved with os_file_get_last_error. @param[in,own] file Handle to a file @return true if success */ bool os_file_close_func(os_file_t file) { ut_ad(file); if (!CloseHandle(file)) { os_file_handle_error("close"); return false; }
if(srv_thread_pool) srv_thread_pool->unbind(file); return true; }
/** Gets a file size.
@param[in] file Handle to a file @return file size, or (os_offset_t) -1 on failure */ os_offset_t os_file_get_size(os_file_t file) { LARGE_INTEGER li; if (GetFileSizeEx(file, &li)) return li.QuadPart; return ((os_offset_t) -1); }
/** Gets a file size.
@param[in] filename Full path to the filename to check @return file size if OK, else set m_total_size to ~0 and m_alloc_size to errno */ os_file_size_t os_file_get_size( const char* filename) { struct __stat64 s; os_file_size_t file_size;
int ret = _stat64(filename, &s);
if (ret == 0) {
file_size.m_total_size = s.st_size;
DWORD low_size; DWORD high_size;
low_size = GetCompressedFileSize(filename, &high_size);
if (low_size != INVALID_FILE_SIZE) {
file_size.m_alloc_size = high_size; file_size.m_alloc_size <<= 32; file_size.m_alloc_size |= low_size;
} else { ib::error() << "GetCompressedFileSize(" << filename << ", ..) failed.";
file_size.m_alloc_size = (os_offset_t) -1; } } else { file_size.m_total_size = ~0; file_size.m_alloc_size = (os_offset_t) ret; }
return(file_size); }
/** This function returns information about the specified file
@param[in] path pathname of the file @param[out] stat_info information of a file in a directory @param[in,out] statinfo information of a file in a directory @param[in] check_rw_perm for testing whether the file can be opened in RW mode @param[in] read_only true if the file is opened in read-only mode @return DB_SUCCESS if all OK */ static dberr_t os_file_get_status_win32( const char* path, os_file_stat_t* stat_info, struct _stat64* statinfo, bool check_rw_perm, bool read_only) { int ret = _stat64(path, statinfo);
if (ret && (errno == ENOENT || errno == ENOTDIR || errno == ENAMETOOLONG)) { /* file does not exist */
return(DB_NOT_FOUND);
} else if (ret) { /* file exists, but stat call failed */
os_file_handle_error_no_exit(path, "STAT", false);
return(DB_FAIL);
} else if (_S_IFDIR & statinfo->st_mode) {
stat_info->type = OS_FILE_TYPE_DIR;
} else if (_S_IFREG & statinfo->st_mode) {
DWORD access = GENERIC_READ;
if (!read_only) { access |= GENERIC_WRITE; }
stat_info->type = OS_FILE_TYPE_FILE;
		/* Check whether the file can be opened with the
		requested access. */
if (check_rw_perm) { HANDLE fh;
fh = CreateFile( (LPCTSTR) path, // File to open
access, FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, // Full sharing
my_win_file_secattr(), OPEN_EXISTING, // Existing file only
FILE_ATTRIBUTE_NORMAL, // Normal file
NULL); // No attr. template
if (fh == INVALID_HANDLE_VALUE) { stat_info->rw_perm = false; } else { stat_info->rw_perm = true; CloseHandle(fh); } } } else { stat_info->type = OS_FILE_TYPE_UNKNOWN; }
return(DB_SUCCESS); }
#include <versionhelpers.h>

/**
Sets a sparse flag on Windows file.
@param[in]	file		file handle
@param[in]	is_sparse	whether to set (true) or clear (false) the flag
@return true on success, false on error */
bool os_file_set_sparse_win32(os_file_t file, bool is_sparse)
{
	if (!is_sparse && !IsWindows8OrGreater()) {
		/* Cannot unset the sparse flag on older Windows.
		Until Windows 8, doing so is documented to produce
		unpredictable results if there are unallocated ranges
		in the file. */
		return false;
	}
	DWORD temp;
	FILE_SET_SPARSE_BUFFER sparse_buffer;
	sparse_buffer.SetSparse = is_sparse;
	return os_win32_device_io_control(file,
		FSCTL_SET_SPARSE, &sparse_buffer, sizeof(sparse_buffer),
		0, 0, &temp);
}

/**
Change file size on Windows.
If file is extended, the bytes between old and new EOF are zeros.
If file is sparse, "virtual" block is added at the end of allocated area.
If file is normal, file system allocates storage.
@param[in] pathname file path @param[in] file file handle @param[in] size size to preserve in bytes @return true if success */ bool os_file_change_size_win32( const char* pathname, os_file_t file, os_offset_t size) { LARGE_INTEGER length;
length.QuadPart = size;
BOOL success = SetFilePointerEx(file, length, NULL, FILE_BEGIN);
if (!success) { os_file_handle_error_no_exit( pathname, "SetFilePointerEx", false); } else { success = SetEndOfFile(file); if (!success) { os_file_handle_error_no_exit( pathname, "SetEndOfFile", false); } } return(success); }
/** Truncates a file at its current position.
@param[in] file Handle to be truncated @return true if success */ bool os_file_set_eof( FILE* file) { HANDLE h = (HANDLE) _get_osfhandle(fileno(file));
return(SetEndOfFile(h)); }
#endif /* !_WIN32*/
/** Does a synchronous read or write depending upon the type specified.
In case of partial reads/writes the function tries
NUM_RETRIES_ON_PARTIAL_IO times to read/write the complete data.
@param[in]	in_type	IO flags
@param[in]	file	handle to an open file
@param[in,out]	buf	buffer to read into or to write from
@param[in]	n	number of bytes to read/write, starting from offset
@param[in]	offset	file offset from the start where to read/write
@param[out]	err	DB_SUCCESS or error code
@return n if the whole request completed; otherwise the number of bytes
	transferred before giving up, with *err set to DB_IO_ERROR */
static MY_ATTRIBUTE((warn_unused_result))
ssize_t
os_file_io(
	const IORequest&in_type,
	os_file_t	file,
	void*		buf,
	ulint		n,
	os_offset_t	offset,
	dberr_t*	err)
{
	ssize_t		original_n = ssize_t(n);
	IORequest	type = in_type;
	ssize_t		bytes_returned = 0;

SyncFileIO sync_file_io(file, buf, n, offset);
for (ulint i = 0; i < NUM_RETRIES_ON_PARTIAL_IO; ++i) {
ssize_t n_bytes = sync_file_io.execute(type);
/* Check for a hard error. Not much we can do now. */ if (n_bytes < 0) {
break;
} else if (n_bytes + bytes_returned == ssize_t(n)) {
bytes_returned += n_bytes;
*err = type.maybe_punch_hole(offset, n);
return(original_n); }
/* Handle partial read/write. */
ut_ad(ulint(n_bytes + bytes_returned) < n);
bytes_returned += n_bytes;
if (type.type != IORequest::READ_MAYBE_PARTIAL) { sql_print_warning("InnoDB: %zu bytes should have been" " %s at %llu from %s," " but got only %zd." " Retrying.", n, type.is_read() ? "read" : "written", offset, type.node ? type.node->name : "(unknown file)", bytes_returned); }
/* Advance the offset and buffer by n_bytes */ sync_file_io.advance(n_bytes); }
*err = DB_IO_ERROR;
if (type.type != IORequest::READ_MAYBE_PARTIAL) { ib::warn() << "Retry attempts for " << (type.is_read() ? "reading" : "writing") << " partial data failed."; }
return(bytes_returned); }
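/* Illustration only, not part of os0file.cc: a minimal standalone sketch of
the partial-I/O retry loop described above, written directly against
pread(). kMaxPartialRetries stands in for NUM_RETRIES_ON_PARTIAL_IO and its
value is an assumption; read_fully() and demo_read_fully() are hypothetical
names. Unlike os_file_io(), the sketch returns -1 on a hard error and stops
early at end-of-file. */

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <sys/types.h>
#include <unistd.h>

// Assumed retry limit for this sketch.
constexpr int kMaxPartialRetries = 10;

// Read n bytes at `offset`, resuming after short reads. Returns n on
// success, fewer bytes if end-of-file or the retry limit is reached,
// or -1 on a hard error (errno is set by pread()).
ssize_t read_fully(int fd, void* buf, size_t n, off_t offset)
{
  assert(buf != nullptr);
  char* p = static_cast<char*>(buf);
  size_t done = 0;
  for (int i = 0; i < kMaxPartialRetries; ++i) {
    ssize_t r = pread(fd, p + done, n - done, offset + off_t(done));
    if (r < 0) return -1;
    done += size_t(r);
    if (done == n) return ssize_t(n);  // complete transfer
    if (r == 0) break;                 // end of file: retrying cannot help
  }
  return ssize_t(done);
}

// Write 8 bytes to a temporary file, then read them back fully and
// attempt an over-long read that must come back short.
bool demo_read_fully()
{
  char path[] = "/tmp/read_fully_XXXXXX";
  int fd = mkstemp(path);
  if (fd == -1) return false;
  char buf[16] = {};
  bool ok = write(fd, "abcdefgh", 8) == 8
      && read_fully(fd, buf, 8, 0) == 8
      && std::memcmp(buf, "abcdefgh", 8) == 0
      && read_fully(fd, buf, 16, 0) == 8;
  close(fd);
  unlink(path);
  return ok;
}
```

/* Advancing both the buffer pointer and the file offset by the bytes already
transferred is the same bookkeeping that sync_file_io.advance() performs in
the function above. */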
/** Does a synchronous write operation in Posix.
@param[in]	type	IO context
@param[in]	file	handle to an open file
@param[in]	buf	buffer from which to write
@param[in]	n	number of bytes to write, starting from offset
@param[in]	offset	file offset from the start where to write
@param[out]	err	DB_SUCCESS or error code
@return number of bytes written
@retval -1 on error */
static MY_ATTRIBUTE((warn_unused_result))
ssize_t
os_file_pwrite(
	const IORequest&	type,
	os_file_t		file,
	const byte*		buf,
	ulint			n,
	os_offset_t		offset,
	dberr_t*		err)
{
	ut_ad(type.is_write());

++os_n_file_writes;
const bool monitor = MONITOR_IS_ON(MONITOR_OS_PENDING_WRITES); MONITOR_ATOMIC_INC_LOW(MONITOR_OS_PENDING_WRITES, monitor); ssize_t n_bytes = os_file_io(type, file, const_cast<byte*>(buf), n, offset, err); MONITOR_ATOMIC_DEC_LOW(MONITOR_OS_PENDING_WRITES, monitor);
return(n_bytes); }
/** NOTE! Use the corresponding macro os_file_write(), not directly
this function! Requests a synchronous write operation.
@param[in]	type	IO flags
@param[in]	name	name of the file, used in error messages
@param[in]	file	handle to an open file
@param[in]	buf	buffer from which to write
@param[in]	offset	file offset from the start where to write
@param[in]	n	number of bytes to write, starting from offset
@return error code
@retval DB_SUCCESS if the operation succeeded */
dberr_t
os_file_write_func(
	const IORequest&	type,
	const char*		name,
	os_file_t		file,
	const void*		buf,
	os_offset_t		offset,
	ulint			n)
{
	dberr_t		err;

ut_ad(n > 0);
ssize_t n_bytes = os_file_pwrite(type, file, (byte*)buf, n, offset, &err);
	if ((ulint) n_bytes != n && !os_has_said_disk_full) {

		ib::error()
			<< "Write to file " << name << " failed at offset "
			<< offset << ", " << n
			<< " bytes should have been written,"
			" only " << n_bytes << " were written."
			" Operating system error number "
			<< IF_WIN(GetLastError(),errno) << "."
			" Check that your OS and file system"
			" support files of this size."
			" Check also that the disk is not full"
			" or a disk quota exceeded.";
#ifndef _WIN32
		if (strerror(errno) != NULL) {

			ib::error()
				<< "Error number " << errno
				<< " means '" << strerror(errno) << "'";
		}

		ib::info() << OPERATING_SYSTEM_ERROR_MSG;
#endif
		os_has_said_disk_full = true;
	}

return(err); }
/** Does a synchronous read operation in Posix.
@param[in] type IO flags @param[in] file handle to an open file @param[out] buf buffer where to read @param[in] offset file offset from the start where to read @param[in] n number of bytes to read, starting from offset @param[out] err DB_SUCCESS or error code @return number of bytes read, -1 if error */ static MY_ATTRIBUTE((warn_unused_result)) ssize_t os_file_pread( const IORequest& type, os_file_t file, void* buf, ulint n, os_offset_t offset, dberr_t* err) { ut_ad(type.is_read());
++os_n_file_reads;
const bool monitor = MONITOR_IS_ON(MONITOR_OS_PENDING_READS); MONITOR_ATOMIC_INC_LOW(MONITOR_OS_PENDING_READS, monitor); ssize_t n_bytes = os_file_io(type, file, buf, n, offset, err); MONITOR_ATOMIC_DEC_LOW(MONITOR_OS_PENDING_READS, monitor);
return(n_bytes); }
/** Requests a synchronous positioned read operation.
@param[in]	type	IO flags
@param[in]	file	handle to an open file
@param[out]	buf	buffer where to read
@param[in]	offset	file offset from the start where to read
@param[in]	n	number of bytes to read, starting from offset
@param[out]	o	number of bytes actually read
@return DB_SUCCESS or error code */
dberr_t
os_file_read_func(
	const IORequest&	type,
	os_file_t		file,
	void*			buf,
	os_offset_t		offset,
	ulint			n,
	ulint*			o)
{
	ut_ad(!type.node || type.node->handle == file);
	ut_ad(n);

os_bytes_read_since_printout+= n;
dberr_t err; ssize_t n_bytes= os_file_pread(type, file, buf, n, offset, &err);
if (o) *o= ulint(n_bytes);
if (ulint(n_bytes) == n || err != DB_SUCCESS) return err;
os_file_handle_error_no_exit(type.node ? type.node->name : nullptr, "read", false); sql_print_error("InnoDB: Tried to read %zu bytes at offset %llu" " of file %s, but was only able to read %zd", n, offset, type.node ? type.node->name : "(unknown)", n_bytes);
return err ? err : DB_IO_ERROR; }
/** Handle errors for file operations.
@param[in] name name of a file or NULL @param[in] operation operation @param[in] should_abort whether to abort on an unknown error @param[in] on_error_silent whether to suppress reports of non-fatal errors @return true if we should retry the operation */ static MY_ATTRIBUTE((warn_unused_result)) bool os_file_handle_error_cond_exit( const char* name, const char* operation, bool should_abort, bool on_error_silent) { ulint err;
err = os_file_get_last_error(false, on_error_silent);
switch (err) { case OS_FILE_DISK_FULL: /* We only print a warning about disk full once */
if (os_has_said_disk_full) {
return(false); }
/* Disk full error is reported irrespective of the
on_error_silent setting. */
if (name) {
ib::error() << "Encountered a problem with file '" << name << "'"; }
ib::error() << "Disk is full. Try to clean the disk to free space.";
os_has_said_disk_full = true;
return(false);
case OS_FILE_AIO_RESOURCES_RESERVED: case OS_FILE_AIO_INTERRUPTED:
return(true);
case OS_FILE_PATH_ERROR: case OS_FILE_ALREADY_EXISTS: case OS_FILE_ACCESS_VIOLATION: return(false);
case OS_FILE_NOT_FOUND: if (!on_error_silent) { sql_print_error("InnoDB: File %s was not found", name); } return false;
case OS_FILE_SHARING_VIOLATION:
std::this_thread::sleep_for(std::chrono::seconds(10)); return(true);
case OS_FILE_OPERATION_ABORTED: case OS_FILE_INSUFFICIENT_RESOURCE:
std::this_thread::sleep_for(std::chrono::milliseconds(100)); return(true);
default:
/* If it is an operation that can crash on error then it
is better to ignore on_error_silent and print an error message to the log. */
if (should_abort || !on_error_silent) { ib::error() << "File " << (name != NULL ? name : "(unknown)") << ": '" << operation << "'" " returned OS error " << err << "." << (should_abort ? " Cannot continue operation" : ""); }
if (should_abort) { abort(); } }
return(false); }
/** Check if the file system supports sparse files.
@param fh file handle @return true if the file system supports sparse files */ static bool os_is_sparse_file_supported(os_file_t fh) { #ifdef _WIN32
FILE_ATTRIBUTE_TAG_INFO info; if (GetFileInformationByHandleEx(fh, FileAttributeTagInfo, &info, (DWORD)sizeof(info))) { if (info.FileAttributes != INVALID_FILE_ATTRIBUTES) { return (info.FileAttributes & FILE_ATTRIBUTE_SPARSE_FILE) != 0; } } return false; #else
/* We don't know the FS block size, use the sector size. The FS
will do the magic. */ return DB_SUCCESS == os_file_punch_hole_posix(fh, 0, srv_page_size); #endif /* _WIN32 */
}
/** Extend a file.
On Windows, extending a file allocates blocks for the file, unless the file is sparse.
On Unix, we will extend the file with ftruncate(), if file needs to be sparse. Otherwise posix_fallocate() is used when available, and if not, binary zeroes are added to the end of file.
@param[in] name file name
@param[in] file file handle
@param[in] size desired file size
@param[in] is_sparse whether to create a sparse file (no preallocating)
@return whether the operation succeeded */
bool os_file_set_size(const char* name, os_file_t file, os_offset_t size, bool is_sparse)
{
  ut_ad(!(size & 4095));
#ifdef _WIN32
/* On Windows, changing file size works well and as expected for both
sparse and normal files.
However, 10.2 up until 10.2.9 made every file sparse in InnoDB, causing
NTFS fragmentation issues (MDEV-13941). We try to undo the damage and
unsparse the file. */
if (!is_sparse && os_is_sparse_file_supported(file)) { if (!os_file_set_sparse_win32(file, false)) /* Unsparsing file failed. Fallback to writing binary
zeros, to avoid even higher fragmentation.*/ goto fallback; }
return os_file_change_size_win32(name, file, size);
fallback: #else
struct stat statbuf;
if (is_sparse) { bool success = !ftruncate(file, size); if (!success) { ib::error() << "ftruncate of file " << name << " to " << size << " bytes failed with error " << errno; } return(success); }
# ifdef HAVE_POSIX_FALLOCATE
int err; do { if (fstat(file, &statbuf)) { err = errno; } else { MSAN_STAT_WORKAROUND(&statbuf); os_offset_t current_size = statbuf.st_size; if (current_size >= size) { return true; } current_size &= ~4095ULL; # ifdef __linux__
if (!fallocate(file, 0, current_size, size - current_size)) { err = 0; break; }
err = errno; # else
err = posix_fallocate(file, current_size, size - current_size); # endif
} } while (err == EINTR && srv_shutdown_state <= SRV_SHUTDOWN_INITIATED);
switch (err) {
case 0:
  return true;
default:
  ib::error() << "preallocating " << size << " bytes for file " << name << " failed with error " << err;
  /* fall through */
case EINTR:
  errno = err;
  return false;
case EINVAL:
case EOPNOTSUPP:
  /* fall back to the code below */
  break;
}
# endif /* HAVE_POSIX_FALLOCATE */
#endif /* _WIN32*/
#ifdef _WIN32
os_offset_t current_size = os_file_get_size(file); FILE_STORAGE_INFO info; if (GetFileInformationByHandleEx(file, FileStorageInfo, &info, sizeof info)) { if (info.LogicalBytesPerSector) { current_size &= ~os_offset_t(info.LogicalBytesPerSector - 1); } } #else
if (fstat(file, &statbuf)) { return false; } os_offset_t current_size = statbuf.st_size & ~4095ULL; #endif
if (current_size >= size) { return true; }
/* Write up to 64 pages at a time (1 MiB with 16 KiB pages). */ ulint buf_size = ut_min(ulint(64), ulint(size >> srv_page_size_shift)) << srv_page_size_shift;
/* Align the buffer for possible raw i/o */ byte* buf = static_cast<byte*>(aligned_malloc(buf_size, srv_page_size)); /* Write buffer full of zeros */ memset(buf, 0, buf_size);
while (current_size < size && srv_shutdown_state <= SRV_SHUTDOWN_INITIATED) { ulint n_bytes;
if (size - current_size < (os_offset_t) buf_size) { n_bytes = (ulint) (size - current_size); } else { n_bytes = buf_size; }
if (os_file_write(IORequestWrite, name, file, buf, current_size, n_bytes) != DB_SUCCESS) { break; }
current_size += n_bytes; }
aligned_free(buf);
return(current_size >= size && os_file_flush(file)); }
/** Truncate a file to a specified size in bytes.
@param[in] pathname file path @param[in] file file to be truncated @param[in] size size preserved in bytes @param[in] allow_shrink whether to allow the file to become smaller @return true if success */ bool os_file_truncate( const char* pathname, os_file_t file, os_offset_t size, bool allow_shrink) { if (!allow_shrink) { /* Do nothing if the size preserved is larger than or
equal to the current size of file */ os_offset_t size_bytes = os_file_get_size(file);
if (size >= size_bytes) { return(true); } }
#ifdef _WIN32
return(os_file_change_size_win32(pathname, file, size)); #else /* _WIN32 */
return(os_file_truncate_posix(pathname, file, size)); #endif /* _WIN32 */
}
/** Check the existence and type of the given file.
@param[in] path path name of file @param[out] exists true if the file exists @param[out] type Type of the file, if it exists @return true if call succeeded */ bool os_file_status( const char* path, bool* exists, os_file_type_t* type) { #ifdef _WIN32
return(os_file_status_win32(path, exists, type)); #else
return(os_file_status_posix(path, exists, type)); #endif /* _WIN32 */
}
/** Free storage space associated with a section of the file.
@param[in] fh Open file handle @param[in] off Starting offset (SEEK_SET) @param[in] len Size of the hole @return DB_SUCCESS or error code */ dberr_t os_file_punch_hole( os_file_t fh, os_offset_t off, os_offset_t len) { #ifdef _WIN32
return os_file_punch_hole_win32(fh, off, len); #else
return os_file_punch_hole_posix(fh, off, len); #endif /* _WIN32 */
}
/** Free storage space associated with a section of the file.
@param off byte offset from the start (SEEK_SET) @param len size of the hole in bytes @return DB_SUCCESS or error code */ dberr_t IORequest::punch_hole(os_offset_t off, ulint len) const { ulint trim_len = bpage ? bpage->physical_size() - len : 0;
if (trim_len == 0) { return(DB_SUCCESS); }
off += len;
/* Check whether the file system supports punching holes for this
tablespace. */ if (!node->punch_hole) { return DB_IO_NO_PUNCH_HOLE; }
dberr_t err = os_file_punch_hole(node->handle, off, trim_len);
switch (err) { case DB_SUCCESS: srv_stats.page_compressed_trim_op.inc(); return err; case DB_IO_NO_PUNCH_HOLE: node->punch_hole = false; err = DB_SUCCESS; /* fall through */ default: return err; } }
/*
Get the file system block size, by path.
This is expensive on Windows and not very useful in general (it is only
shown in some I_S table), so we keep it out of the usual stat. */
size_t os_file_get_fs_block_size(const char *path)
{
#ifdef _WIN32
char volname[MAX_PATH]; if (!GetVolumePathName(path, volname, MAX_PATH)) return 0; DWORD sectorsPerCluster; DWORD bytesPerSector; DWORD numberOfFreeClusters; DWORD totalNumberOfClusters;
if (GetDiskFreeSpace(volname, &sectorsPerCluster, &bytesPerSector, &numberOfFreeClusters, &totalNumberOfClusters)) return ((size_t) bytesPerSector) * sectorsPerCluster;
#else
os_file_stat_t info; if (os_file_get_status(path, &info, false, false) == DB_SUCCESS) return info.block_size; #endif
return 0; }
/** This function returns information about the specified file
@param[in] path pathname of the file @param[out] stat_info information of a file in a directory @param[in] check_rw_perm for testing whether the file can be opened in RW mode @param[in] read_only true if file is opened in read-only mode @return DB_SUCCESS if all OK */ dberr_t os_file_get_status( const char* path, os_file_stat_t* stat_info, bool check_rw_perm, bool read_only) { dberr_t ret;
#ifdef _WIN32
struct _stat64 info;
ret = os_file_get_status_win32( path, stat_info, &info, check_rw_perm, read_only);
#else
struct stat info;
ret = os_file_get_status_posix( path, stat_info, &info, check_rw_perm, read_only);
#endif /* _WIN32 */
if (ret == DB_SUCCESS) { stat_info->ctime = info.st_ctime; stat_info->atime = info.st_atime; stat_info->mtime = info.st_mtime; stat_info->size = info.st_size; }
return(ret); }
static void fake_io_callback(void *c) { tpool::aiocb *cb= static_cast<tpool::aiocb*>(c); ut_ad(read_slots->contains(cb)); static_cast<const IORequest*>(static_cast<const void*>(cb->m_userdata))-> fake_read_complete(cb->m_offset); read_slots->release(cb); }
static void read_io_callback(void *c) { tpool::aiocb *cb= static_cast<tpool::aiocb*>(c); ut_ad(cb->m_opcode == tpool::aio_opcode::AIO_PREAD); ut_ad(read_slots->contains(cb)); const IORequest &request= *static_cast<const IORequest*> (static_cast<const void*>(cb->m_userdata)); request.read_complete(cb->m_err); read_slots->release(cb); }
static void write_io_callback(void *c) { tpool::aiocb *cb= static_cast<tpool::aiocb*>(c); ut_ad(cb->m_opcode == tpool::aio_opcode::AIO_PWRITE); ut_ad(write_slots->contains(cb)); const IORequest &request= *static_cast<const IORequest*> (static_cast<const void*>(cb->m_userdata));
if (UNIV_UNLIKELY(cb->m_err != 0)) ib::info () << "IO Error: " << cb->m_err << " during write of " << cb->m_len << " bytes, for file " << request.node->name << "(" << cb->m_fh << "), returned " << cb->m_ret_len;
request.write_complete(cb->m_err); write_slots->release(cb); }
#ifdef LINUX_NATIVE_AIO
/** Checks if the system supports native linux aio. On some kernel
versions where native aio is supported it won't work on tmpfs. In such cases we can't use native aio.
@return: true if supported, false otherwise. */ static bool is_linux_native_aio_supported() { File fd; io_context_t io_ctx; std::string log_file_path = get_log_file_path();
memset(&io_ctx, 0, sizeof(io_ctx)); if (io_setup(1, &io_ctx)) {
/* The platform does not support native aio. */
return(false);
} else if (!srv_read_only_mode) {
/* Now check if tmpdir supports native aio ops. */ fd = mysql_tmpfile("ib");
if (fd < 0) { ib::warn() << "Unable to create temp file to check" " native AIO support.";
int ret = io_destroy(io_ctx); ut_a(ret != -EINVAL); ut_ad(ret != -EFAULT);
return(false); } } else { fd = my_open(log_file_path.c_str(), O_RDONLY | O_CLOEXEC, MYF(0));
if (fd == -1) {
ib::warn() << "Unable to open \"" << log_file_path << "\" to check native" << " AIO read support.";
int ret = io_destroy(io_ctx); ut_a(ret != -EINVAL); ut_ad(ret != -EFAULT);
return(false); } }
struct io_event io_event;
memset(&io_event, 0x0, sizeof(io_event));
byte* ptr = static_cast<byte*>(aligned_malloc(srv_page_size, srv_page_size));
struct iocb iocb;
/* Suppress valgrind warning. */ memset(ptr, 0, srv_page_size); memset(&iocb, 0x0, sizeof(iocb));
struct iocb* p_iocb = &iocb;
if (!srv_read_only_mode) {
io_prep_pwrite(p_iocb, fd, ptr, srv_page_size, 0);
} else { ut_a(srv_page_size >= 512); io_prep_pread(p_iocb, fd, ptr, 512, 0); }
int err = io_submit(io_ctx, 1, &p_iocb);
if (err >= 1) { /* Now collect the submitted IO request. */ err = io_getevents(io_ctx, 1, 1, &io_event, NULL); }
aligned_free(ptr); my_close(fd, MYF(MY_WME));
switch (err) { case 1: { int ret = io_destroy(io_ctx); ut_a(ret != -EINVAL); ut_ad(ret != -EFAULT);
return(true); }
case -EINVAL: case -ENOSYS: ib::warn() << "Linux Native AIO not supported. You can either" " move " << (srv_read_only_mode ? log_file_path : "tmpdir") << " to a file system that supports native" " AIO or you can set innodb_use_native_aio to" " FALSE to avoid this message.";
/* fall through. */ default: ib::warn() << "Linux Native AIO check on " << (srv_read_only_mode ? log_file_path : "tmpdir") << " returned error[" << -err << "]"; }
int ret = io_destroy(io_ctx); ut_a(ret != -EINVAL); ut_ad(ret != -EFAULT);
return(false); } #endif
int os_aio_init()
{
  int max_write_events= int(srv_n_write_io_threads * OS_AIO_N_PENDING_IOS_PER_THREAD);
  int max_read_events= int(srv_n_read_io_threads * OS_AIO_N_PENDING_IOS_PER_THREAD);
  int max_events= max_read_events + max_write_events;
  int ret;
#ifdef LINUX_NATIVE_AIO
if (srv_use_native_aio && !is_linux_native_aio_supported()) goto disable; #endif
ret= srv_thread_pool->configure_aio(srv_use_native_aio, max_events);
#ifdef LINUX_NATIVE_AIO
if (ret) { ut_ad(srv_use_native_aio); disable: ib::warn() << "Linux Native AIO disabled."; srv_use_native_aio= false; ret= srv_thread_pool->configure_aio(false, max_events); } #endif
#ifdef HAVE_URING
if (ret) { ut_ad(srv_use_native_aio); ib::warn() << "liburing disabled: falling back to innodb_use_native_aio=OFF"; srv_use_native_aio= false; ret= srv_thread_pool->configure_aio(false, max_events); } #endif
if (!ret) { read_slots= new io_slots(max_read_events, srv_n_read_io_threads); write_slots= new io_slots(max_write_events, srv_n_write_io_threads); } return ret; }
/**
Change the reader or writer thread parameter on a running server.
This includes resizing the I/O slots, as we calculate the number of
outstanding I/Os based on these variables.

It is trickier when Linux AIO is involved (the io_context needs to be
recreated to account for a different number of max_events). With Linux AIO,
depending on the fs-max-aio number and the user and system wide max-aio
limitation, this can fail.

Otherwise, we just resize the slots, and allow for more concurrent threads
via the thread_group setting.

@param[in] n_reader_threads - max number of concurrently executing read callbacks
@param[in] n_writer_threads - max number of concurrently executing write callbacks
@return 0 for success, !=0 for error. */
int os_aio_resize(ulint n_reader_threads, ulint n_writer_threads)
{
  /* Lock the slots, and wait until all current IOs finish. */
  auto &lk_read= read_slots->mutex(), &lk_write= write_slots->mutex();
  mysql_mutex_lock(&lk_read);
  mysql_mutex_lock(&lk_write);
read_slots->wait(lk_read); write_slots->wait(lk_write);
/* Now, all IOs have finished and no new ones can start, due to locks. */ int max_read_events= int(n_reader_threads * OS_AIO_N_PENDING_IOS_PER_THREAD); int max_write_events= int(n_writer_threads * OS_AIO_N_PENDING_IOS_PER_THREAD); int events= max_read_events + max_write_events;
/* Do the Linux AIO dance (this will try to create a new
io context with the changed max_events, etc.) */
int ret= srv_thread_pool->reconfigure_aio(srv_use_native_aio, events);
if (ret) { /* Make a best effort. We cannot change the number of parallel I/Os,
but we can still adjust the number of concurrent completion handlers. */ read_slots->task_group().set_max_tasks(static_cast<int>(n_reader_threads)); write_slots->task_group().set_max_tasks(static_cast<int>(n_writer_threads)); } else { /* Allocation succeeded, resize the slots */ read_slots->resize(max_read_events, static_cast<int>(n_reader_threads)); write_slots->resize(max_write_events, static_cast<int>(n_writer_threads)); }
mysql_mutex_unlock(&lk_read); mysql_mutex_unlock(&lk_write); return ret; }
void os_aio_free() { srv_thread_pool->disable_aio(); delete read_slots; delete write_slots; read_slots= nullptr; write_slots= nullptr; }
/** Wait until there are no pending asynchronous writes. */ static void os_aio_wait_until_no_pending_writes_low(bool declare) { const bool notify_wait= declare && write_slots->pending_io_count();
if (notify_wait) tpool::tpool_wait_begin();
write_slots->wait();
if (notify_wait) tpool::tpool_wait_end(); }
/** Wait until there are no pending asynchronous writes.
@param declare whether the wait will be declared in tpool */ void os_aio_wait_until_no_pending_writes(bool declare) { os_aio_wait_until_no_pending_writes_low(declare); buf_dblwr.wait_flush_buffered_writes(); }
/** @return number of pending reads */ size_t os_aio_pending_reads() { mysql_mutex_lock(&read_slots->mutex()); size_t pending= read_slots->pending_io_count(); mysql_mutex_unlock(&read_slots->mutex()); return pending; }
/** @return approximate number of pending reads */ size_t os_aio_pending_reads_approx() { return read_slots->pending_io_count(); }
/** @return number of pending writes */ size_t os_aio_pending_writes() { mysql_mutex_lock(&write_slots->mutex()); size_t pending= write_slots->pending_io_count(); mysql_mutex_unlock(&write_slots->mutex()); return pending; }
/** Wait until all pending asynchronous reads have completed.
@param declare whether the wait will be declared in tpool */ void os_aio_wait_until_no_pending_reads(bool declare) { const bool notify_wait= declare && read_slots->pending_io_count();
if (notify_wait) tpool::tpool_wait_begin();
read_slots->wait();
if (notify_wait) tpool::tpool_wait_end(); }
/** Submit a fake read request during crash recovery.
@param type fake read request @param offset additional context */ void os_fake_read(const IORequest &type, os_offset_t offset) { tpool::aiocb *cb= read_slots->acquire();
cb->m_group= read_slots->get_task_group(); cb->m_fh= type.node->handle.m_file; cb->m_buffer= nullptr; cb->m_len= 0; cb->m_offset= offset; cb->m_opcode= tpool::aio_opcode::AIO_PREAD; new (cb->m_userdata) IORequest{type}; cb->m_internal_task.m_func= fake_io_callback; cb->m_internal_task.m_arg= cb; cb->m_internal_task.m_group= cb->m_group;
srv_thread_pool->submit_task(&cb->m_internal_task); }
/** Request a read or write.
@param type I/O request @param buf buffer @param offset file offset @param n number of bytes @retval DB_SUCCESS if request was queued successfully @retval DB_IO_ERROR on I/O error */ dberr_t os_aio(const IORequest &type, void *buf, os_offset_t offset, size_t n) { ut_ad(n > 0); ut_ad(!(n & 511)); /* payload of page_compressed tables */ ut_ad((offset % UNIV_ZIP_SIZE_MIN) == 0); ut_ad((reinterpret_cast<size_t>(buf) % UNIV_ZIP_SIZE_MIN) == 0); ut_ad(type.is_read() || type.is_write()); ut_ad(type.node); ut_ad(type.node->is_open());
#ifdef WIN_ASYNC_IO
ut_ad((n & 0xFFFFFFFFUL) == n); #endif /* WIN_ASYNC_IO */
#ifdef UNIV_PFS_IO
PSI_file_locker_state state; PSI_file_locker* locker= nullptr; register_pfs_file_io_begin(&state, locker, type.node->handle, n, type.is_write() ? PSI_FILE_WRITE : PSI_FILE_READ, __FILE__, __LINE__); #endif /* UNIV_PFS_IO */
dberr_t err = DB_SUCCESS;
if (!type.is_async()) { err = type.is_read() ? os_file_read_func(type, type.node->handle, buf, offset, n, nullptr) : os_file_write_func(type, type.node->name, type.node->handle, buf, offset, n); func_exit: #ifdef UNIV_PFS_IO
register_pfs_file_io_end(locker, n); #endif /* UNIV_PFS_IO */
return err; }
io_slots* slots; tpool::callback_func callback; tpool::aio_opcode opcode;
if (type.is_read()) { ++os_n_file_reads; slots = read_slots; callback = read_io_callback; opcode = tpool::aio_opcode::AIO_PREAD; } else { ++os_n_file_writes; slots = write_slots; callback = write_io_callback; opcode = tpool::aio_opcode::AIO_PWRITE; }
compile_time_assert(sizeof(IORequest) <= tpool::MAX_AIO_USERDATA_LEN); tpool::aiocb* cb = slots->acquire();
cb->m_buffer = buf; cb->m_callback = callback; cb->m_group = slots->get_task_group(); cb->m_fh = type.node->handle.m_file; cb->m_len = (int)n; cb->m_offset = offset; cb->m_opcode = opcode; new (cb->m_userdata) IORequest{type};
if (srv_thread_pool->submit_io(cb)) { slots->release(cb); os_file_handle_error_no_exit(type.node->name, type.is_read() ? "aio read" : "aio write", false); err = DB_IO_ERROR; type.node->space->release(); }
goto func_exit; }
/** Prints info of the aio arrays.
@param[in,out] file file where to print */ void os_aio_print(FILE* file) { time_t current_time; double time_elapsed;
current_time = time(NULL); time_elapsed = 0.001 + difftime(current_time, os_last_printout);
fprintf(file, "Pending flushes (fsync): " ULINTPF "\n" ULINTPF " OS file reads, %zu OS file writes, %zu OS fsyncs\n", ulint{fil_n_pending_tablespace_flushes}, ulint{os_n_file_reads}, static_cast<size_t>(os_n_file_writes), static_cast<size_t>(os_n_fsyncs));
const ulint n_reads = ulint(MONITOR_VALUE(MONITOR_OS_PENDING_READS)); const ulint n_writes = ulint(MONITOR_VALUE(MONITOR_OS_PENDING_WRITES));
if (n_reads != 0 || n_writes != 0) { fprintf(file, ULINTPF " pending reads, " ULINTPF " pending writes\n", n_reads, n_writes); }
ulint avg_bytes_read = (os_n_file_reads == os_n_file_reads_old) ? 0 : os_bytes_read_since_printout / (os_n_file_reads - os_n_file_reads_old);
fprintf(file, "%.2f reads/s, " ULINTPF " avg bytes/read," " %.2f writes/s, %.2f fsyncs/s\n", static_cast<double>(os_n_file_reads - os_n_file_reads_old) / time_elapsed, avg_bytes_read, static_cast<double>(os_n_file_writes - os_n_file_writes_old) / time_elapsed, static_cast<double>(os_n_fsyncs - os_n_fsyncs_old) / time_elapsed);
os_n_file_reads_old = os_n_file_reads; os_n_file_writes_old = os_n_file_writes; os_n_fsyncs_old = os_n_fsyncs; os_bytes_read_since_printout = 0;
os_last_printout = current_time; }
/** Refreshes the statistics used to print per-second averages. */
void os_aio_refresh_stats()
{
  os_n_fsyncs_old = os_n_fsyncs;
  os_bytes_read_since_printout = 0;
  os_n_file_reads_old = os_n_file_reads;
  os_n_file_writes_old = os_n_file_writes;
  os_last_printout = time(NULL);
}
/**
Set the file create umask @param[in] umask The umask to use for file creation. */ void os_file_set_umask(ulint umask) { os_innodb_umask = umask; }
#ifdef _WIN32
/* Checks whether physical drive is on SSD.*/ static bool is_drive_on_ssd(DWORD nr) { char physical_drive_path[32]; snprintf(physical_drive_path, sizeof(physical_drive_path), "\\\\.\\PhysicalDrive%lu", nr);
HANDLE h= CreateFile(physical_drive_path, 0, FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, nullptr, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, nullptr); if (h == INVALID_HANDLE_VALUE) return false;
DEVICE_SEEK_PENALTY_DESCRIPTOR seek_penalty; STORAGE_PROPERTY_QUERY storage_query{}; storage_query.PropertyId= StorageDeviceSeekPenaltyProperty; storage_query.QueryType= PropertyStandardQuery;
bool on_ssd= false; DWORD bytes_written; if (DeviceIoControl(h, IOCTL_STORAGE_QUERY_PROPERTY, &storage_query, sizeof storage_query, &seek_penalty, sizeof seek_penalty, &bytes_written, nullptr)) { on_ssd= !seek_penalty.IncursSeekPenalty; } else { on_ssd= false; } CloseHandle(h); return on_ssd; }
/*
Checks whether volume is on SSD, by checking all physical drives in that volume. */ static bool is_volume_on_ssd(const char *volume_mount_point) { char volume_name[MAX_PATH];
if (!GetVolumeNameForVolumeMountPoint(volume_mount_point, volume_name, array_elements(volume_name))) { /* This can fail, e.g if file is on network share */ return false; }
/* Chomp last backslash, this is needed to open volume.*/ size_t length= strlen(volume_name); if (length && volume_name[length - 1] == '\\') volume_name[length - 1]= 0;
/* Open volume handle */ HANDLE volume_handle= CreateFile( volume_name, 0, FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, nullptr, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, nullptr);
if (volume_handle == INVALID_HANDLE_VALUE) return false;
/*
Enumerate all volume extents, check whether all of them are on SSD */
/* Anticipate common case where there is only one extent.*/ VOLUME_DISK_EXTENTS single_extent;
/* But also have a place to manage allocated data.*/ std::unique_ptr<BYTE[]> lifetime;
DWORD bytes_written; VOLUME_DISK_EXTENTS *extents= nullptr; if (DeviceIoControl(volume_handle, IOCTL_VOLUME_GET_VOLUME_DISK_EXTENTS, nullptr, 0, &single_extent, sizeof(single_extent), &bytes_written, nullptr)) { /* Worked on the first try. Use the preallocated buffer.*/ extents= &single_extent; } else { VOLUME_DISK_EXTENTS *last_query= &single_extent; while (GetLastError() == ERROR_MORE_DATA) { DWORD extentCount= last_query->NumberOfDiskExtents; DWORD allocatedSize= FIELD_OFFSET(VOLUME_DISK_EXTENTS, Extents[extentCount]); lifetime.reset(new BYTE[allocatedSize]); last_query= (VOLUME_DISK_EXTENTS *) lifetime.get(); if (DeviceIoControl(volume_handle, IOCTL_VOLUME_GET_VOLUME_DISK_EXTENTS, nullptr, 0, last_query, allocatedSize, &bytes_written, nullptr)) { extents= last_query; break; } } } CloseHandle(volume_handle); if (!extents) return false;
for (DWORD i= 0; i < extents->NumberOfDiskExtents; i++) if (!is_drive_on_ssd(extents->Extents[i].DiskNumber)) return false;
return true; }
#include <unordered_map>
static bool is_path_on_ssd(char *file_path) { /* Preset result, in case something fails, e.g we're on network drive.*/ char volume_path[MAX_PATH]; if (!GetVolumePathName(file_path, volume_path, array_elements(volume_path))) return false; return is_volume_on_ssd(volume_path); }
static bool is_file_on_ssd(HANDLE handle, char *file_path) { ULONGLONG volume_serial_number; FILE_ID_INFO info; if(!GetFileInformationByHandleEx(handle, FileIdInfo, &info, sizeof(info))) return false; volume_serial_number= info.VolumeSerialNumber;
static std::unordered_map<ULONGLONG, bool> cache; static SRWLOCK lock= SRWLOCK_INIT; bool found; bool result; AcquireSRWLockShared(&lock); auto e= cache.find(volume_serial_number); if ((found= e != cache.end())) result= e->second; ReleaseSRWLockShared(&lock); if (!found) { result= is_path_on_ssd(file_path); /* Update cache */ AcquireSRWLockExclusive(&lock); cache[volume_serial_number]= result; ReleaseSRWLockExclusive(&lock); } return result; }
#endif
void fil_node_t::find_metadata(os_file_t file #ifndef _WIN32
, bool create, struct stat *statbuf #endif
) { if (!is_open()) { handle= file; ut_ad(is_open()); }
if (!space->is_compressed()) punch_hole= 0; else if (my_test_if_thinly_provisioned(file)) punch_hole= 2; else punch_hole= IF_WIN(, !create ||) os_is_sparse_file_supported(file);
#ifdef _WIN32
on_ssd= is_file_on_ssd(file, name); FILE_STORAGE_INFO info; if (GetFileInformationByHandleEx(file, FileStorageInfo, &info, sizeof info)) block_size= info.PhysicalBytesPerSectorForAtomicity; else block_size= 512; #else
struct stat sbuf; if (!statbuf && !fstat(file, &sbuf)) { MSAN_STAT_WORKAROUND(&sbuf); statbuf= &sbuf; } if (statbuf) block_size= statbuf->st_blksize; # ifdef __linux__
on_ssd= statbuf && fil_system.is_ssd(statbuf->st_dev); # endif
#endif
if (space->purpose != FIL_TYPE_TABLESPACE) { /* For temporary tablespace or during IMPORT TABLESPACE, we
disable neighbour flushing and do not care about atomicity. */ on_ssd= true; atomic_write= true; } else /* On Windows, all single sector writes are atomic, as per
WriteFile() documentation on MSDN. */ atomic_write= srv_use_atomic_writes && IF_WIN(srv_page_size == block_size, my_test_if_atomic_write(file, space->physical_size())); }
/** Read the first page of a data file.
@return whether the page was found valid */ bool fil_node_t::read_page0() { mysql_mutex_assert_owner(&fil_system.mutex); const unsigned psize= space->physical_size(); #ifndef _WIN32
struct stat statbuf; if (fstat(handle, &statbuf)) return false; MSAN_STAT_WORKAROUND(&statbuf); os_offset_t size_bytes= statbuf.st_size; #else
os_offset_t size_bytes= os_file_get_size(handle); ut_a(size_bytes != (os_offset_t) -1); #endif
const uint32_t min_size= FIL_IBD_FILE_INITIAL_SIZE * psize;
if (size_bytes < min_size) { ib::error() << "The size of the file " << name << " is only " << size_bytes << " bytes, should be at least " << min_size; return false; }
if (!deferred) { page_t *page= static_cast<byte*>(aligned_malloc(psize, psize)); if (os_file_read(IORequestRead, handle, page, 0, psize, nullptr) != DB_SUCCESS) { sql_print_error("InnoDB: Unable to read first page of file %s", name); aligned_free(page); return false; }
const ulint space_id= memcmp_aligned<2> (FIL_PAGE_SPACE_ID + page, FSP_HEADER_OFFSET + FSP_SPACE_ID + page, 4) ? ULINT_UNDEFINED : mach_read_from_4(FIL_PAGE_SPACE_ID + page); uint32_t flags= fsp_header_get_flags(page); const uint32_t size= fsp_header_get_field(page, FSP_SIZE); const uint32_t free_limit= fsp_header_get_field(page, FSP_FREE_LIMIT); const uint32_t free_len= flst_get_len(FSP_HEADER_OFFSET + FSP_FREE + page); if (!fil_space_t::is_valid_flags(flags, space->id)) { uint32_t cflags= fsp_flags_convert_from_101(flags); if (cflags != UINT32_MAX) { uint32_t cf= cflags & ~FSP_FLAGS_MEM_MASK; uint32_t sf= space->flags & ~FSP_FLAGS_MEM_MASK;
if (fil_space_t::is_flags_equal(cf, sf) || fil_space_t::is_flags_equal(sf, cf)) { flags= cflags; goto flags_ok; } }
goto invalid; }
if (!fil_space_t::is_flags_equal((flags & ~FSP_FLAGS_MEM_MASK), (space->flags & ~FSP_FLAGS_MEM_MASK)) && !fil_space_t::is_flags_equal((space->flags & ~FSP_FLAGS_MEM_MASK), (flags & ~FSP_FLAGS_MEM_MASK))) { invalid: /* Free the page buffer on every path that reports invalid flags. */ aligned_free(page); sql_print_error("InnoDB: Expected tablespace flags 0x%zx but found 0x%zx" " in the file %s", size_t{space->flags}, size_t{flags}, name); return false; }
flags_ok: ut_ad(!(flags & FSP_FLAGS_MEM_MASK));
/* Try to read crypt_data from page 0 if it is not yet read. */ if (!space->crypt_data) space->crypt_data= fil_space_read_crypt_data( fil_space_t::zip_size(flags), page); aligned_free(page);
if (UNIV_UNLIKELY(space_id != space->id)) { ib::error() << "Expected tablespace id " << space->id << " but found " << space_id << " in the file " << name; return false; }
space->flags= (space->flags & FSP_FLAGS_MEM_MASK) | flags; ut_ad(space->free_limit == 0 || space->free_limit == free_limit); ut_ad(space->free_len == 0 || space->free_len == free_len); space->size_in_header= size; space->free_limit= free_limit; space->free_len= free_len; }
IF_WIN(find_metadata(), find_metadata(handle, false, &statbuf)); /* Truncate the size to a multiple of extent size. */ ulint mask= psize * FSP_EXTENT_SIZE - 1;
if (size_bytes <= mask); /* .ibd files start smaller than an
extent size. Do not truncate valid data. */ else size_bytes&= ~os_offset_t(mask);
this->size= uint32_t(size_bytes / psize); space->set_sizes(this->size); return true; }