MDEV-37103 innodb_immediate_scrub_data_uncompressed=ON may break innodb_undo_log_truncate=ON

The test innodb.undo_truncate occasionally demonstrates a race condition
in which scrubbing writes zeroes to a freed undo page while
innodb_undo_log_truncate=ON is truncating the same tablespace.
Truncation is an exception to the rule that InnoDB tablespace file sizes
can only grow, never shrink.

The fields fil_space_t::size and fil_node_t::size are protected by
fil_system.mutex, which used to be a highly contended resource. We
do not want to go back to acquiring the mutex in fil_space_t::io(),
because that would introduce an obvious scalability bottleneck.

fil_space_t::flush_freed(): To prevent a race condition between io()
and undo tablespace truncation, do not try to scrub pages of an undo
tablespace.
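
The decision order described above can be sketched in isolation. This is
only an illustration under stated assumptions: SpaceSketch and
flush_freed_sketch() are hypothetical names that do not exist in the
server, the three fields stand in for state that the real function reads
from fil_space_t and from srv_is_undo_tablespace(id), and the real
function writes or hole-punches page ranges rather than returning a
counter.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for the state that
// fil_space_t::flush_freed() consults. Not a server type.
struct SpaceSketch
{
  bool punch_hole;       // tablespace files support hole punching
  bool undo_tablespace;  // what srv_is_undo_tablespace(id) would report
  uint32_t freed_pages;  // freed pages waiting to be scrubbed or punched
};

// Returns how many freed pages would be written or hole-punched.
uint32_t flush_freed_sketch(const SpaceSketch &space, bool immediate_scrub)
{
  if (!space.punch_hole && !immediate_scrub)
    return 0;  // neither hole punching nor scrubbing applies
  if (space.undo_tablespace)
    return 0;  // leave freed undo pages to undo log truncation
  return space.freed_pages;
}
```

With scrubbing enabled, flush_freed_sketch(SpaceSketch{false, true, 8}, true)
returns 0 for the undo tablespace, while the same call with
undo_tablespace set to false returns 8.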

fil_space_t::io(): Prevent a null pointer dereference when reporting
an out-of-bounds access to the non-first file of the system or
temporary tablespace. Do not invoke set_corrupted() after an
out-of-bounds asynchronous read.
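
The null pointer fix amounts to checking for a successor before
advancing, so that the last file of the chain is still available when
the error message is built. A minimal standalone sketch of that lookup
follows; FileSketch, LookupSketch and locate_page() are hypothetical
names, and a std::vector stands in for the UT_LIST chain of fil_node_t.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for fil_node_t: a file name and its size in pages.
struct FileSketch
{
  std::string name;
  uint32_t size;
};

struct LookupSketch
{
  const FileSketch *node;  // file holding the page, or the last file examined
  uint32_t page;           // page offset relative to the start of *node
  bool found;              // false if the page is beyond the last file
};

// Walks the chain of files to find the one that contains page p.
LookupSketch locate_page(const std::vector<FileSketch> &chain, uint32_t p)
{
  if (chain.empty())
    return {nullptr, p, false};
  std::size_t i = 0;
  while (chain[i].size <= p)
  {
    p -= chain[i].size;
    // Check for a successor BEFORE advancing, so that a failed lookup
    // still points at a valid file whose name can be reported.
    if (i + 1 == chain.size())
      return {&chain[i], p, false};
    ++i;
  }
  return {&chain[i], p, true};
}
```

For a chain of two files of 100 and 50 pages, page 120 resolves to page 20
of the second file, and page 150 fails while still reporting the second
file rather than a null pointer.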

Note: fil_space_t::flush_freed() may only invoke PUNCH_RANGE on
page_compressed tablespaces, never on an undo tablespace.
pull/4203/head
Marko Mäkelä 3 months ago
parent
commit
024c7e881f
  1. storage/innobase/buf/buf0flu.cc (9 changed lines)
  2. storage/innobase/fil/fil0fil.cc (17 changed lines)

storage/innobase/buf/buf0flu.cc (9 changed lines)

@@ -985,12 +985,15 @@ MY_ATTRIBUTE((warn_unused_result))
 @return number of pages written or hole-punched */
 uint32_t fil_space_t::flush_freed(bool writable) noexcept
 {
-  mysql_mutex_assert_not_owner(&buf_pool.flush_list_mutex);
-  mysql_mutex_assert_not_owner(&buf_pool.mutex);
   const bool punch_hole= chain.start->punch_hole == 1;
   if (!punch_hole && !srv_immediate_scrub_data_uncompressed)
     return 0;
+  mysql_mutex_assert_not_owner(&buf_pool.flush_list_mutex);
+  mysql_mutex_assert_not_owner(&buf_pool.mutex);
+  if (srv_is_undo_tablespace(id))
+    /* innodb_undo_log_truncate=ON can take care of these better */
+    return 0;
   for (;;)
   {

storage/innobase/fil/fil0fil.cc (17 changed lines)

@@ -2696,23 +2696,30 @@ fil_io_t fil_space_t::io(const IORequest &type, os_offset_t offset, size_t len,
 		while (node->size <= p) {
 			p -= node->size;
-			node = UT_LIST_GET_NEXT(chain, node);
-			if (!node) {
+			if (!UT_LIST_GET_NEXT(chain, node)) {
 fail:
-				if (type.type != IORequest::READ_ASYNC) {
+				switch (type.type) {
+				case IORequest::READ_ASYNC:
+					/* Read-ahead may be requested for
+					non-existing pages. Ignore such
+					requests. */
+					break;
+				default:
 					fil_invalid_page_access_msg(
 						node->name,
 						offset, len,
 						type.is_read());
-				}
 #ifndef DBUG_OFF
 io_error:
 #endif
-				set_corrupted();
+					set_corrupted();
+				}
 				err = DB_CORRUPTION;
 				node = nullptr;
 				goto release;
 			}
+			node = UT_LIST_GET_NEXT(chain, node);
 		}
 		offset = os_offset_t{p} << srv_page_size_shift;
