3725 lines
97 KiB

12 years ago
Shut down InnoDB after aborted startup.

This fixes memory leaks in tests that cause InnoDB startup to fail.

buf_pool_free_instance(): Also free buf_pool->flush_rbt, which would normally be freed when crash recovery finishes.

fil_node_close_file(), fil_space_free_low(), fil_close_all_files(): Relax some debug assertions to tolerate !srv_was_started.

innodb_shutdown(): Renamed from innobase_shutdown_for_mysql(). Changed the return type to void. Do not assume that all subsystems were started.

que_init(), que_close(): Remove (empty functions).

srv_init(), srv_general_init(): Remove as global functions.

srv_free(): Allow srv_sys=NULL.

srv_get_active_thread_type(): Only return SRV_PURGE if purge really is running.

srv_shutdown_all_bg_threads(): Do not reset srv_start_state. It will be needed by innodb_shutdown().

innobase_start_or_create_for_mysql(): Always call srv_boot() so that innodb_shutdown() can assume that it was called. Make more subsystems dependent on SRV_START_STATE_STAT.

srv_shutdown_bg_undo_sources(): Require SRV_START_STATE_STAT.

trx_sys_close(): Do not assume purge_sys!=NULL. Do not call buf_dblwr_free(), because the doublewrite buffer can exist while the transaction system does not.

logs_empty_and_mark_files_at_shutdown(): Do a faster shutdown if !srv_was_started.

recv_sys_close(): Invoke dblwr.pages.clear() which would normally be invoked by buf_dblwr_process().

recv_recovery_from_checkpoint_start(): Always release log_sys->mutex.

row_mysql_close(): Allow the subsystem not to exist.
9 years ago
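The shutdown rule described in the commit above (do not assume that all subsystems were started) can be pictured with a start-state bitmask. The following is only a minimal sketch under stated assumptions: the flag names, the `Engine` struct, and `shutdown_engine()` are hypothetical and are not the InnoDB definitions. Clearing each bit as its subsystem is released is a choice made for this sketch so that a second call is harmless.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical start-state bits. The real InnoDB flags differ in name and number.
enum StartState : std::uint32_t {
  kStartNone = 0,
  kStartLog = 1u << 0,
  kStartLock = 1u << 1,
  kStartIo = 1u << 2,
  kStartStat = 1u << 3,
};

// Hypothetical engine state: which subsystems came up, and a record of
// which ones were released during shutdown.
struct Engine {
  std::uint32_t start_state = kStartNone;
  std::vector<std::string> shutdown_log;
};

// Release only the subsystems whose start bit is set, in reverse start order.
// A startup that was aborted halfway therefore shuts down cleanly.
inline void shutdown_engine(Engine& e) {
  struct Step {
    std::uint32_t flag;
    const char* name;
  };
  static const Step kSteps[] = {
      {kStartStat, "stat"},
      {kStartIo, "io"},
      {kStartLock, "lock"},
      {kStartLog, "log"},
  };
  for (const Step& s : kSteps) {
    if (e.start_state & s.flag) {
      e.shutdown_log.push_back(s.name);  // stands in for the real cleanup call
      e.start_state &= ~s.flag;          // makes a repeated shutdown a no-op
    }
  }
}
```

With only `kStartLog | kStartIo` set, the sketch releases "io" and then "log" and skips the subsystems that never started.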
9 years ago
12 years ago
MDEV-11782: Redefine the innodb_encrypt_log format

Write only one encryption key to the checkpoint page. Use 4 bytes of nonce. Encrypt more of each redo log block, only skipping the 4-byte field LOG_BLOCK_HDR_NO which the initialization vector is derived from.

Issue notes, not warning messages for rewriting the redo log files.

recv_recovery_from_checkpoint_finish(): Do not generate any redo log, because we must avoid that before rewriting the redo log files, or otherwise a crash during a redo log rewrite (removing or adding encryption) may end up making the database unrecoverable. Instead, do these tasks in innobase_start_or_create_for_mysql().

Issue a firm "Missing MLOG_CHECKPOINT" error message. Remove some unreachable code and duplicated error messages for log corruption.

LOG_HEADER_FORMAT_ENCRYPTED: A flag for identifying an encrypted redo log format.

log_group_t::is_encrypted(), log_t::is_encrypted(): Determine if the redo log is in encrypted format.

recv_find_max_checkpoint(): Interpret LOG_HEADER_FORMAT_ENCRYPTED.

srv_prepare_to_delete_redo_log_files(): Display NOTE messages about adding or removing encryption. Do not issue warnings for redo log resizing any more.

innobase_start_or_create_for_mysql(): Rebuild the redo logs also when the encryption changes.

innodb_log_checksums_func_update(): Always use the CRC-32C checksum if innodb_encrypt_log. If needed, issue a warning that innodb_encrypt_log implies innodb_log_checksums.

log_group_write_buf(): Compute the checksum on the encrypted block contents, so that transmission errors or incomplete blocks can be detected without decrypting.

Rewrite most of the redo log encryption code. Only remember one encryption key at a time (but remember up to 5 when upgrading from the MariaDB 10.1 format).
9 years ago
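To illustrate the ordering described in the commit above (encrypt the block except for the 4-byte block number, then compute CRC-32C over the encrypted contents), here is a minimal sketch. Everything about the layout is an assumption made for the example: the 512-byte block size, the checksum living in the last 4 bytes, and the big-endian fields. `toy_crypt()` is a deliberately trivial XOR keystream standing in for the real cipher; it is not AES and offers no security. The helper names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed layout, for illustration only: the first 4 bytes hold the block
// number (left in clear text), the last 4 bytes hold the checksum.
constexpr std::size_t kBlockSize = 512;
constexpr std::size_t kHdrNoSize = 4;
constexpr std::size_t kChecksumSize = 4;
using Block = std::array<std::uint8_t, kBlockSize>;

// Bitwise CRC-32C (Castagnoli, reflected polynomial 0x82F63B78).
inline std::uint32_t crc32c(const std::uint8_t* data, std::size_t len) {
  std::uint32_t crc = 0xFFFFFFFFu;
  for (std::size_t i = 0; i < len; ++i) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; ++bit) {
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
  }
  return ~crc;
}

inline std::uint32_t load_be32(const std::uint8_t* p) {
  return (std::uint32_t{p[0]} << 24) | (std::uint32_t{p[1]} << 16) |
         (std::uint32_t{p[2]} << 8) | std::uint32_t{p[3]};
}

inline void store_be32(std::uint8_t* p, std::uint32_t v) {
  p[0] = static_cast<std::uint8_t>(v >> 24);
  p[1] = static_cast<std::uint8_t>(v >> 16);
  p[2] = static_cast<std::uint8_t>(v >> 8);
  p[3] = static_cast<std::uint8_t>(v);
}

// Toy stand-in for the cipher: XOR the payload with an xorshift32 keystream
// seeded from the clear-text block number and a nonce. Applying it twice
// restores the original bytes. NOT a real cipher.
inline void toy_crypt(Block& b, std::uint32_t nonce) {
  std::uint32_t x = (load_be32(b.data()) * 2654435761u) ^ nonce;
  if (x == 0) x = 1;  // xorshift must not start from zero
  for (std::size_t i = kHdrNoSize; i < kBlockSize - kChecksumSize; ++i) {
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    b[i] ^= static_cast<std::uint8_t>(x);
  }
}

// The checksum covers the block as stored, so no key is needed to verify it.
inline bool verify_block(const Block& b) {
  const std::size_t covered = kBlockSize - kChecksumSize;
  return load_be32(b.data() + covered) == crc32c(b.data(), covered);
}

// Encrypt first, then checksum the encrypted bytes.
inline void seal_block(Block& b, std::uint32_t nonce) {
  toy_crypt(b, nonce);
  const std::size_t covered = kBlockSize - kChecksumSize;
  store_be32(b.data() + covered, crc32c(b.data(), covered));
  assert(verify_block(b));
}
```

Because `verify_block()` never touches the key or nonce, a reader without the key can still reject a damaged or incomplete block, which is the property the commit message describes.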
Merge Google encryption

commit 195158e9889365dc3298f8c1f3bcaa745992f27f
Author: Minli Zhu <minliz@google.com>
Date: Mon Nov 25 11:05:55 2013 -0800

Innodb redo log encryption/decryption.

Use the start LSN of a log block as part of the AES CTR counter.

Record the key version with each checkpoint. Internally, key version 0 means no encryption.

Tests done (see test_innodb_log_encryption.sh for detail):
- Verify that the flag innodb_encrypt_log, on or off, combined with various key versions passed through the CLI and dynamically set after startup, will not corrupt the database. This includes going from unencrypted to encrypted and from encrypted to unencrypted.
- Verify start-up with no redo logs succeeds.
- Verify fresh start-up succeeds.

Change-Id: I4ce4c2afdf3076be2fce90ebbc2a7ce01184b612

commit c1b97273659f07866758c25f4a56f680a1fbad24
Author: Jonas Oreland <jonaso@google.com>
Date: Tue Dec 3 18:47:27 2013 +0100

Encryption of aria data & index files

This patch implements encryption of aria data & index files. It is implemented as follows:
1) Add read/write hooks (renamed from callbacks) that encrypt/decrypt (also add pre_read and post_write hooks).
2) Modify the page headers for data/index to contain the key version (making the data-page header size differ with and without encryption).
3) Modify index page 0 to contain the IV (and crypt header).
4) AES CTR crypt functions.
5) The counter block is implemented using a combination of page no, lsn and a table-specific id.

NOTE:
1) Log files are not encrypted. This is not needed if aria is only used for internal temporary tables and they are not transactional (i.e. not logged).
2) All encrypted tables use PAGE_CHECKSUM (crc). Normal internal temporary tables are (currently) not CHECKSUM:ed.
3) This patch adds insert-order semantics to aria block_format. The default behaviour of aria block-format is best-fit, meaning that rows get allocated to pages so as to fill the pages as much as possible. However, certain sql constructs materialize temporary results in tmp-tables and expect that a table scan will later return the rows in the same order they were inserted. This implementation of insert-order is only enabled when explicitly requested by the sql-layer.

CHANGES:
1) Found a bug in ma_write that made the code try to abort a record that was never written. Unsure why this is not exposed.

Change-Id: Ia82bbaa92e2c0629c08693c5add2f56b815c0509

commit 89dc1ab651fe0205d55b4eb588f62df550aa65fc
Author: Jonas Oreland <jonaso@google.com>
Date: Mon Feb 17 08:04:50 2014 -0800

Implement encryption of innodb datafiles.

Pages are encrypted before being written to disk and decrypted when read from disk. Each page except the first page (page 0) in a tablespace is encrypted. Page 0 is unencrypted and contains the IV for the tablespace.

FIL_PAGE_FILE_FLUSH_LSN on each page (except page 0) is used to store a 32-bit key-version, so that multiple keys can be active in a tablespace simultaneously. The other 32 bits of the FIL_PAGE_FILE_FLUSH_LSN field contain a checksum that is computed after encryption. This checksum is used by innochecksum and when restoring from the double-write-buffer.

The encryption is performed using AES CTR.

Monitoring of encryption is enabled using the new IS-table INNODB_TABLESPACES_ENCRYPTION. In addition, the new status variables innodb_encryption_rotation_{pages_read_from_cache, pages_read_from_disk, pages_modified, pages_flushed} have been added.

The following tunables are introduced:
- innodb_encrypt_tables
- innodb_encryption_threads
- innodb_encryption_rotate_key_age
- innodb_encryption_rotation_iops

Change-Id: I8f651795a30b52e71b16d6bc9cb7559be349d0b2

commit a17eef2f6948e58219c9e26fc35633d6fd4de1de
Author: Andrew Ford <andrewford@google.com>
Date: Thu Jan 2 15:43:09 2014 -0800

Key management skeleton with debug hooks.

Change-Id: Ifd6aa3743d7ea291c70083f433a059c439aed866

commit 68a399838ad72264fd61b3dc67fecd29bbdb0af1
Author: Andrew Ford <andrewford@google.com>
Date: Mon Oct 28 16:27:44 2013 -0700

Add AES-128 CTR and GCM encryption classes.

Change-Id: I116305eced2a233db15306bc2ef5b9d398d1a3a2
11 years ago
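The datafile commit above describes one 8-byte page field carrying two 32-bit values: a key version and a checksum computed after encryption. A minimal sketch of such packing follows. Which half holds which value, the big-endian byte order, and the names `PageCryptInfo`, `store_crypt_info()` and `load_crypt_info()` are assumptions made for illustration; they are not taken from the InnoDB sources.

```cpp
#include <cstdint>

// Hypothetical in-memory view of the two 32-bit halves of the 8-byte field.
struct PageCryptInfo {
  std::uint32_t key_version;
  std::uint32_t post_encryption_checksum;
};

inline void put_u32_be(std::uint8_t* p, std::uint32_t v) {
  p[0] = static_cast<std::uint8_t>(v >> 24);
  p[1] = static_cast<std::uint8_t>(v >> 16);
  p[2] = static_cast<std::uint8_t>(v >> 8);
  p[3] = static_cast<std::uint8_t>(v);
}

inline std::uint32_t get_u32_be(const std::uint8_t* p) {
  return (std::uint32_t{p[0]} << 24) | (std::uint32_t{p[1]} << 16) |
         (std::uint32_t{p[2]} << 8) | std::uint32_t{p[3]};
}

// Assumption for this sketch: key version in the first 4 bytes,
// checksum in the last 4 bytes of the 8-byte field.
inline void store_crypt_info(std::uint8_t* field8, const PageCryptInfo& info) {
  put_u32_be(field8, info.key_version);
  put_u32_be(field8 + 4, info.post_encryption_checksum);
}

inline PageCryptInfo load_crypt_info(const std::uint8_t* field8) {
  return PageCryptInfo{get_u32_be(field8), get_u32_be(field8 + 4)};
}
```

Storing the key version per page is what lets pages written under different keys coexist in one tablespace: a reader looks up the key by the version it finds on the page.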
MDEV-12103 Reduce the time of looking for MLOG_CHECKPOINT during crash recovery

This fixes MySQL Bug#80788 in MariaDB 10.2.5.

When I made the InnoDB crash recovery more robust by implementing WL#7142, I also introduced an extra redo log scan pass that can be shortened.

This fix will slightly extend the InnoDB redo log format that I introduced in MySQL 5.7.9 by writing the start LSN of the MLOG_CHECKPOINT mini-transaction to the end of the log checkpoint page, so that recovery can jump straight to it without scanning all the preceding redo log.

LOG_CHECKPOINT_END_LSN: At the end of the checkpoint page, the start LSN of the MLOG_CHECKPOINT mini-transaction. Previously, these bytes were written as 0.

log_write_checkpoint_info(), log_group_checkpoint(): Add the parameter end_lsn for writing LOG_CHECKPOINT_END_LSN.

log_checkpoint(): Remember the LSN at which the MLOG_CHECKPOINT mini-transaction is starting (or at which the redo log ends on shutdown).

recv_init_crash_recovery(): Remove.

recv_group_scan_log_recs(): Add the parameter checkpoint_lsn.

recv_recovery_from_checkpoint_start(): Read LOG_CHECKPOINT_END_LSN and if it is set, start the first scan from it instead of the checkpoint LSN. Improve some messages and remove bogus assertions.

recv_parse_log_recs(): Do not skip DBUG_PRINT("ib_log") for some file-level redo log records.

recv_parse_or_apply_log_rec_body(): If we have not parsed all redo log between the checkpoint and the corresponding MLOG_CHECKPOINT record, defer the check for MLOG_FILE_DELETE or MLOG_FILE_NAME records to recv_init_crash_recovery_spaces().

recv_init_crash_recovery_spaces(): Refuse recovery if MLOG_FILE_NAME or MLOG_FILE_DELETE records are missing.
9 years ago
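The decision described above for recv_recovery_from_checkpoint_start() (use the end LSN from the checkpoint page when it is set, otherwise fall back to the checkpoint LSN) can be sketched as below. The page size, both field offsets, and the helper names are assumptions chosen for the example rather than the actual InnoDB constants. The only rule taken from the commit message is that a field written as 0 means "not set".

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed layout of a checkpoint page, for illustration only.
constexpr std::size_t kCheckpointPageSize = 512;
constexpr std::size_t kCheckpointLsnOffset = 8;
constexpr std::size_t kCheckpointEndLsnOffset = kCheckpointPageSize - 16;
using CheckpointPage = std::array<std::uint8_t, kCheckpointPageSize>;

// Read an 8-byte big-endian LSN.
inline std::uint64_t read_lsn_be(const std::uint8_t* p) {
  std::uint64_t v = 0;
  for (int i = 0; i < 8; ++i) {
    v = (v << 8) | p[i];
  }
  return v;
}

// Write an 8-byte big-endian LSN.
inline void write_lsn_be(std::uint8_t* p, std::uint64_t v) {
  for (int i = 7; i >= 0; --i) {
    p[i] = static_cast<std::uint8_t>(v);
    v >>= 8;
  }
}

// Pick where the first redo log scan should begin. Pages written by an older
// format carry zeros in the end-LSN field, so they use the checkpoint LSN.
inline std::uint64_t first_scan_start_lsn(const CheckpointPage& page) {
  const std::uint64_t checkpoint_lsn =
      read_lsn_be(page.data() + kCheckpointLsnOffset);
  const std::uint64_t end_lsn =
      read_lsn_be(page.data() + kCheckpointEndLsnOffset);
  return end_lsn != 0 ? end_lsn : checkpoint_lsn;
}
```

Treating zero as "absent" keeps the extended format readable alongside checkpoint pages written before the field existed, since those bytes were previously written as 0.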
MDEV-11254: innodb-use-trim has no effect in 10.2

The problem was that the implementation merged from 10.1 was incompatible with InnoDB 5.7.

buf0buf.cc: Add functions that return whether a hole should be punched and how big it should be.

buf0flu.cc: Add the written page to IORequest.

fil0fil.cc: Remove an unneeded status call. When a tablespace is created, test whether the file system supports sparse files and punch hole. Add a call to get the file system block size. Add the file node in use to IORequest. Add functions to check whether punch hole is supported and to set punch hole.

ha_innodb.cc: Remove unneeded status variables (trim512-32768) and trim_op_saved. Deprecate innodb_use_trim and set it ON by default. Add a function to set innodb-use-trim dynamically.

dberr.h: Add the error code DB_IO_NO_PUNCH_HOLE for a failed punch hole operation.

fil0fil.h: Add the punch_hole variable to fil_space_t and the block size to fil_node_t.

os0api.h: Header for helper functions in buf0buf.cc and fil0fil.cc, for use by os0file.h.

os0file.h: Remove the unneeded m_block_size from IORequest. Add bpage to IORequest to know the actual size of the block, and m_fil_node to know the file system block size of the tablespace and whether it supports punch hole.

os0file.cc: Add the function punch_hole() to IORequest to do the punch hole operation, get the file system block size, and determine whether the file system supports sparse files (for punch hole).

page0size.h: Remove the implicit copy disable and use the implicit copy to implement the copy_from() function.

buf0dblwr.cc, buf0flu.cc, buf0rea.cc, fil0fil.cc, fil0fil.h, os0file.h, os0file.cc, log0log.cc, log0recv.cc: Remove the unneeded write_size parameter from fil_io calls.

srv0mon.h, srv0srv.h, srv0mon.cc: Remove the unneeded trim512-trim32768 status variables. Remove these from the monitor tests.
9 years ago
MDEV-11782: Redefine the innodb_encrypt_log format Write only one encryption key to the checkpoint page. Use 4 bytes of nonce. Encrypt more of each redo log block, only skipping the 4-byte field LOG_BLOCK_HDR_NO which the initialization vector is derived from. Issue notes, not warning messages for rewriting the redo log files. recv_recovery_from_checkpoint_finish(): Do not generate any redo log, because we must avoid that before rewriting the redo log files, or otherwise a crash during a redo log rewrite (removing or adding encryption) may end up making the database unrecoverable. Instead, do these tasks in innobase_start_or_create_for_mysql(). Issue a firm "Missing MLOG_CHECKPOINT" error message. Remove some unreachable code and duplicated error messages for log corruption. LOG_HEADER_FORMAT_ENCRYPTED: A flag for identifying an encrypted redo log format. log_group_t::is_encrypted(), log_t::is_encrypted(): Determine if the redo log is in encrypted format. recv_find_max_checkpoint(): Interpret LOG_HEADER_FORMAT_ENCRYPTED. srv_prepare_to_delete_redo_log_files(): Display NOTE messages about adding or removing encryption. Do not issue warnings for redo log resizing any more. innobase_start_or_create_for_mysql(): Rebuild the redo logs also when the encryption changes. innodb_log_checksums_func_update(): Always use the CRC-32C checksum if innodb_encrypt_log. If needed, issue a warning that innodb_encrypt_log implies innodb_log_checksums. log_group_write_buf(): Compute the checksum on the encrypted block contents, so that transmission errors or incomplete blocks can be detected without decrypting. Rewrite most of the redo log encryption code. Only remember one encryption key at a time (but remember up to 5 when upgrading from the MariaDB 10.1 format.)
9 years ago
MDEV-12103 Reduce the time of looking for MLOG_CHECKPOINT during crash recovery This fixes MySQL Bug#80788 in MariaDB 10.2.5. When I made the InnoDB crash recovery more robust by implementing WL#7142, I also introduced an extra redo log scan pass that can be shortened. This fix will slightly extend the InnoDB redo log format that I introduced in MySQL 5.7.9 by writing the start LSN of the MLOG_CHECKPOINT mini-transaction to the end of the log checkpoint page, so that recovery can jump straight to it without scanning all the preceding redo log. LOG_CHECKPOINT_END_LSN: At the end of the checkpoint page, the start LSN of the MLOG_CHECKPOINT mini-transaction. Previously, these bytes were written as 0. log_write_checkpoint_info(), log_group_checkpoint(): Add the parameter end_lsn for writing LOG_CHECKPOINT_END_LSN. log_checkpoint(): Remember the LSN at which the MLOG_CHECKPOINT mini-transaction is starting (or at which the redo log ends on shutdown). recv_init_crash_recovery(): Remove. recv_group_scan_log_recs(): Add the parameter checkpoint_lsn. recv_recovery_from_checkpoint_start(): Read LOG_CHECKPOINT_END_LSN and if it is set, start the first scan from it instead of the checkpoint LSN. Improve some messages and remove bogus assertions. recv_parse_log_recs(): Do not skip DBUG_PRINT("ib_log") for some file-level redo log records. recv_parse_or_apply_log_rec_body(): If we have not parsed all redo log between the checkpoint and the corresponding MLOG_CHECKPOINT record, defer the check for MLOG_FILE_DELETE or MLOG_FILE_NAME records to recv_init_crash_recovery_spaces(). recv_init_crash_recovery_spaces(): Refuse recovery if MLOG_FILE_NAME or MLOG_FILE_DELETE records are missing.
9 years ago
Merge Google encryption

commit 195158e9889365dc3298f8c1f3bcaa745992f27f
Author: Minli Zhu <minliz@google.com>
Date: Mon Nov 25 11:05:55 2013 -0800

Innodb redo log encryption/decryption. Use start lsn of a log block as part of AES CTR counter. Record key version with each checkpoint. Internally key version 0 means no encryption.

Tests done (see test_innodb_log_encryption.sh for detail):
- Verify flag innodb_encrypt_log on or off, combined with various key versions passed through CLI, and dynamically set after startup, will not corrupt database. This includes tests from being unencrypted to encrypted, and encrypted to unencrypted.
- Verify start-up with no redo logs succeeds.
- Verify fresh start-up succeeds.

Change-Id: I4ce4c2afdf3076be2fce90ebbc2a7ce01184b612

commit c1b97273659f07866758c25f4a56f680a1fbad24
Author: Jonas Oreland <jonaso@google.com>
Date: Tue Dec 3 18:47:27 2013 +0100

encryption of aria data&index files

This patch implements encryption of aria data & index files. This is implemented as
1) add read/write hooks (renamed from callbacks) that do encrypt/decrypt (also add pre_read and post_write hooks)
2) modify page headers for data/index to contain key version (making the data-page header size different for with/without encryption)
3) modify index page 0 to contain IV (and crypt header)
4) AES CTR crypt functions
5) counter block is implemented using combination of page no, lsn and table specific id

NOTE:
1) log files are not encrypted; this is not needed if aria is only used for internal temporary tables and they are not transactional (i.e. not logged)
2) all encrypted tables are using PAGE_CHECKSUM (crc); normal internal temporary tables are (currently) not CHECKSUM:ed
3) This patch adds insert-order semantics to aria block_format. The default behaviour of aria block-format is best-fit, meaning that rows get allocated to pages trying to fill the pages as much as possible. However, certain sql constructs materialize temporary results in tmp-tables, and expect that a table scan will later return the rows in the same order they were inserted. This implementation of insert-order is only enabled when explicitly requested by sql-layer.

CHANGES:
1) found bug in ma_write that made code try to abort a record that was never written; unsure why this is not exposed

Change-Id: Ia82bbaa92e2c0629c08693c5add2f56b815c0509

commit 89dc1ab651fe0205d55b4eb588f62df550aa65fc
Author: Jonas Oreland <jonaso@google.com>
Date: Mon Feb 17 08:04:50 2014 -0800

Implement encryption of innodb datafiles.

Pages are encrypted before being written to disk and decrypted when read from disk. Each page except the first page (page 0) in a tablespace is encrypted. Page 0 is unencrypted and contains the IV for the tablespace. FIL_PAGE_FILE_FLUSH_LSN on each page (except page 0) is used to store a 32-bit key-version, so that multiple keys can be active in a tablespace simultaneously. The other 32 bits of the FIL_PAGE_FILE_FLUSH_LSN field contain a checksum that is computed after encryption. This checksum is used by innochecksum and when restoring from the double-write-buffer. The encryption is performed using AES CTR.

Monitoring of encryption is enabled using the new IS-table INNODB_TABLESPACES_ENCRYPTION. In addition to that, new status variables innodb_encryption_rotation_{ pages_read_from_cache, pages_read_from_disk, pages_modified, pages_flushed } have been added.

The following tunables are introduced:
- innodb_encrypt_tables
- innodb_encryption_threads
- innodb_encryption_rotate_key_age
- innodb_encryption_rotation_iops

Change-Id: I8f651795a30b52e71b16d6bc9cb7559be349d0b2

commit a17eef2f6948e58219c9e26fc35633d6fd4de1de
Author: Andrew Ford <andrewford@google.com>
Date: Thu Jan 2 15:43:09 2014 -0800

Key management skeleton with debug hooks.

Change-Id: Ifd6aa3743d7ea291c70083f433a059c439aed866

commit 68a399838ad72264fd61b3dc67fecd29bbdb0af1
Author: Andrew Ford <andrewford@google.com>
Date: Mon Oct 28 16:27:44 2013 -0700

Add AES-128 CTR and GCM encryption classes.

Change-Id: I116305eced2a233db15306bc2ef5b9d398d1a3a2
11 years ago
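The reuse of the 8-byte FIL_PAGE_FILE_FLUSH_LSN field described in the datafile encryption commit can be pictured with a small sketch. The commit message says only that one 32-bit half holds the key version and the other holds a post-encryption checksum. Placing the key version first and storing both values big-endian are assumptions made for this illustration, and the names `Field8`, `write_be32()`, `read_be32()` and `pack_key_version_and_checksum()` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 8-byte buffer standing in for the FIL_PAGE_FILE_FLUSH_LSN field.
using Field8 = std::array<std::uint8_t, 8>;

// Store a 32-bit value at buf[pos..pos+3], most significant byte first (assumed order).
inline void write_be32(Field8& buf, std::size_t pos, std::uint32_t v) {
    buf[pos]     = static_cast<std::uint8_t>(v >> 24);
    buf[pos + 1] = static_cast<std::uint8_t>(v >> 16);
    buf[pos + 2] = static_cast<std::uint8_t>(v >> 8);
    buf[pos + 3] = static_cast<std::uint8_t>(v);
}

// Read back a 32-bit value written by write_be32().
inline std::uint32_t read_be32(const Field8& buf, std::size_t pos) {
    return (static_cast<std::uint32_t>(buf[pos]) << 24)
         | (static_cast<std::uint32_t>(buf[pos + 1]) << 16)
         | (static_cast<std::uint32_t>(buf[pos + 2]) << 8)
         |  static_cast<std::uint32_t>(buf[pos + 3]);
}

// Assumed layout: key version in bytes 0..3, post-encryption checksum in bytes 4..7.
inline Field8 pack_key_version_and_checksum(std::uint32_t key_version,
                                            std::uint32_t checksum) {
    Field8 field{};
    write_be32(field, 0, key_version);
    write_be32(field, 4, checksum);
    return field;
}
```

Because each page carries its own key version, a reader can select the right key per page, which is what allows several keys to be active in one tablespace during rotation.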
MDEV-11738: Mariadb uses 100% of several of my 8 cpus doing nothing
MDEV-11581: Mariadb starts InnoDB encryption threads when key has not changed or data scrubbing turned off

Background: Key rotation is based on background threads (innodb-encryption-threads) periodically going through all tablespaces on fil_system. For each tablespace, the currently used key version is compared to the max key age (innodb-encryption-rotate-key-age). This process naturally takes CPU. At the same time, the need for scrubbing is investigated. Currently, key rotation is fully supported on the Amazon AWS key management plugin only, but InnoDB does not have knowledge of which key management plugin is used.

This patch re-purposes innodb-encryption-rotate-key-age=0 to disable key rotation and background data scrubbing. All new tables are added to a special list for key rotation, and key rotation is based on sending an event to the background encryption threads instead of using periodic checking (i.e. timeout).

fil0fil.cc: Added functions
fil_space_acquire_low() to acquire a tablespace when it could be dropped concurrently. This function is used from fil_space_acquire() or fil_space_acquire_silent(), which will not print any messages if we try to acquire a space that does not exist.
fil_space_release() to release an acquired tablespace.
fil_space_next() to iterate tablespaces in fil_system using fil_space_acquire() and fil_space_release().
Similarly, fil_space_keyrotation_next() to iterate the new list fil_system->rotation_list, where new tables are added if key rotation is disabled.
Removed unnecessary functions fil_get_first_space_safe() and fil_get_next_space_safe().

fil_node_open_file(): After page 0 is read, also read crypt_info if it is not yet read.

btr_scrub_lock_dict_func()
buf_page_check_corrupt()
buf_page_encrypt_before_write()
buf_merge_or_delete_for_page()
lock_print_info_all_transactions()
row_fts_psort_info_init()
row_truncate_table_for_mysql()
row_drop_table_for_mysql()
Use fil_space_acquire()/release() to access fil_space_t.

buf_page_decrypt_after_read(): Use fil_space_get_crypt_data() because at this point we might not yet have read page 0.

fil0crypt.cc/fil0fil.h: Many changes. Pass fil_space_t* directly to functions needing it and store fil_space_t* to rotation state. Use fil_space_acquire()/release() when iterating tablespaces and removed unnecessary is_closing from fil_crypt_t. Use fil_space_t::is_stopping() to detect when access to tablespace should be stopped. Removed unnecessary fil_space_get_crypt_data().

fil_space_create(): Inform key rotation that there could be something to do if key rotation is disabled and a new table with encryption enabled is created. Remove unnecessary functions fil_get_first_space_safe() and fil_get_next_space_safe(). fil_space_acquire() and fil_space_release() are used instead. Moved fil_space_get_crypt_data() and fil_space_set_crypt_data() to fil0crypt.cc.

fsp_header_init(): Acquire fil_space_t*, write crypt_data and release space.

check_table_options(): Renamed FIL_SPACE_ENCRYPTION_* to FIL_ENCRYPTION_*.

i_s.cc: Added ROTATING_OR_FLUSHING field to information_schema.innodb_tablespaces_encryption to show current status of key rotation.
9 years ago
MDEV-11556 InnoDB redo log apply fails to adjust data file sizes

fil_space_t::recv_size: New member: recovered tablespace size in pages; 0 if no size change was read from the redo log, or if the size change was implemented.

fil_space_set_recv_size(): New function for setting space->recv_size.

innodb_data_file_size_debug: A debug parameter for setting the system tablespace size in recovery even when the redo log does not contain any size changes. It is hard to write a small test case that would cause the system tablespace to be extended at the critical moment.

recv_parse_log_rec(): Note those tablespaces whose size is being changed by the redo log, by invoking fil_space_set_recv_size().

innobase_init(): Correct an error message, and do not require a larger innodb_buffer_pool_size when starting up with a smaller innodb_page_size.

innobase_start_or_create_for_mysql(): Allow startup with any initial size of the ibdata1 file if the autoextend attribute is set. Require the minimum size of fixed-size system tablespaces to be 640 pages, not 10 megabytes. Implement innodb_data_file_size_debug.

open_or_create_data_files(): Round the system tablespace size down to pages, not to full megabytes. (Our test truncates the system tablespace to more than 800 pages with innodb_page_size=4k. InnoDB should not imagine that it was truncated to 768 pages and then overwrite good pages in the tablespace.)

fil_flush_low(): Refactored from fil_flush().

fil_space_extend_must_retry(): Refactored from fil_extend_space_to_desired_size().

fil_mutex_enter_and_prepare_for_io(): Extend the tablespace if fil_space_set_recv_size() was called.

The test case has been successfully run with all the innodb_page_size values 4k, 8k, 16k, 32k, 64k.
9 years ago
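The 768-page figure in the open_or_create_data_files() note follows from simple arithmetic, sketched below. The helper names are hypothetical, and the 800-page file is an illustrative value taken as the lower bound of the "more than 800 pages" in the commit message.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t MiB = 1024 * 1024;

// Size in pages when the file size is rounded down to whole pages.
constexpr std::uint64_t pages_rounded_to_pages(std::uint64_t file_bytes,
                                               std::uint64_t page_size) {
    return file_bytes / page_size;
}

// Size in pages when the file size is first rounded down to whole megabytes.
constexpr std::uint64_t pages_rounded_to_megabytes(std::uint64_t file_bytes,
                                                   std::uint64_t page_size) {
    return (file_bytes / MiB) * MiB / page_size;
}

// 800 pages of 4 KiB are 3.125 MiB. Rounding to pages keeps all 800 pages.
static_assert(pages_rounded_to_pages(800 * 4096, 4096) == 800, "page rounding");
// Rounding to full megabytes gives 3 MiB, which is only 768 pages.
static_assert(pages_rounded_to_megabytes(800 * 4096, 4096) == 768, "MiB rounding");
```

With megabyte rounding, InnoDB would believe the last 32 pages do not exist and could later overwrite them, which is the failure the commit message warns about.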
MDEV-11782: Redefine the innodb_encrypt_log format Write only one encryption key to the checkpoint page. Use 4 bytes of nonce. Encrypt more of each redo log block, only skipping the 4-byte field LOG_BLOCK_HDR_NO which the initialization vector is derived from. Issue notes, not warning messages for rewriting the redo log files. recv_recovery_from_checkpoint_finish(): Do not generate any redo log, because we must avoid that before rewriting the redo log files, or otherwise a crash during a redo log rewrite (removing or adding encryption) may end up making the database unrecoverable. Instead, do these tasks in innobase_start_or_create_for_mysql(). Issue a firm "Missing MLOG_CHECKPOINT" error message. Remove some unreachable code and duplicated error messages for log corruption. LOG_HEADER_FORMAT_ENCRYPTED: A flag for identifying an encrypted redo log format. log_group_t::is_encrypted(), log_t::is_encrypted(): Determine if the redo log is in encrypted format. recv_find_max_checkpoint(): Interpret LOG_HEADER_FORMAT_ENCRYPTED. srv_prepare_to_delete_redo_log_files(): Display NOTE messages about adding or removing encryption. Do not issue warnings for redo log resizing any more. innobase_start_or_create_for_mysql(): Rebuild the redo logs also when the encryption changes. innodb_log_checksums_func_update(): Always use the CRC-32C checksum if innodb_encrypt_log. If needed, issue a warning that innodb_encrypt_log implies innodb_log_checksums. log_group_write_buf(): Compute the checksum on the encrypted block contents, so that transmission errors or incomplete blocks can be detected without decrypting. Rewrite most of the redo log encryption code. Only remember one encryption key at a time (but remember up to 5 when upgrading from the MariaDB 10.1 format.)
9 years ago
MDEV-12103 Reduce the time of looking for MLOG_CHECKPOINT during crash recovery This fixes MySQL Bug#80788 in MariaDB 10.2.5. When I made the InnoDB crash recovery more robust by implementing WL#7142, I also introduced an extra redo log scan pass that can be shortened. This fix will slightly extend the InnoDB redo log format that I introduced in MySQL 5.7.9 by writing the start LSN of the MLOG_CHECKPOINT mini-transaction to the end of the log checkpoint page, so that recovery can jump straight to it without scanning all the preceding redo log. LOG_CHECKPOINT_END_LSN: At the end of the checkpoint page, the start LSN of the MLOG_CHECKPOINT mini-transaction. Previously, these bytes were written as 0. log_write_checkpoint_info(), log_group_checkpoint(): Add the parameter end_lsn for writing LOG_CHECKPOINT_END_LSN. log_checkpoint(): Remember the LSN at which the MLOG_CHECKPOINT mini-transaction is starting (or at which the redo log ends on shutdown). recv_init_crash_recovery(): Remove. recv_group_scan_log_recs(): Add the parameter checkpoint_lsn. recv_recovery_from_checkpoint_start(): Read LOG_CHECKPOINT_END_LSN and if it is set, start the first scan from it instead of the checkpoint LSN. Improve some messages and remove bogus assertions. recv_parse_log_recs(): Do not skip DBUG_PRINT("ib_log") for some file-level redo log records. recv_parse_or_apply_log_rec_body(): If we have not parsed all redo log between the checkpoint and the corresponding MLOG_CHECKPOINT record, defer the check for MLOG_FILE_DELETE or MLOG_FILE_NAME records to recv_init_crash_recovery_spaces(). recv_init_crash_recovery_spaces(): Refuse recovery if MLOG_FILE_NAME or MLOG_FILE_DELETE records are missing.
9 years ago
MDEV-11782: Redefine the innodb_encrypt_log format Write only one encryption key to the checkpoint page. Use 4 bytes of nonce. Encrypt more of each redo log block, only skipping the 4-byte field LOG_BLOCK_HDR_NO which the initialization vector is derived from. Issue notes, not warning messages for rewriting the redo log files. recv_recovery_from_checkpoint_finish(): Do not generate any redo log, because we must avoid that before rewriting the redo log files, or otherwise a crash during a redo log rewrite (removing or adding encryption) may end up making the database unrecoverable. Instead, do these tasks in innobase_start_or_create_for_mysql(). Issue a firm "Missing MLOG_CHECKPOINT" error message. Remove some unreachable code and duplicated error messages for log corruption. LOG_HEADER_FORMAT_ENCRYPTED: A flag for identifying an encrypted redo log format. log_group_t::is_encrypted(), log_t::is_encrypted(): Determine if the redo log is in encrypted format. recv_find_max_checkpoint(): Interpret LOG_HEADER_FORMAT_ENCRYPTED. srv_prepare_to_delete_redo_log_files(): Display NOTE messages about adding or removing encryption. Do not issue warnings for redo log resizing any more. innobase_start_or_create_for_mysql(): Rebuild the redo logs also when the encryption changes. innodb_log_checksums_func_update(): Always use the CRC-32C checksum if innodb_encrypt_log. If needed, issue a warning that innodb_encrypt_log implies innodb_log_checksums. log_group_write_buf(): Compute the checksum on the encrypted block contents, so that transmission errors or incomplete blocks can be detected without decrypting. Rewrite most of the redo log encryption code. Only remember one encryption key at a time (but remember up to 5 when upgrading from the MariaDB 10.1 format.)
9 years ago
MDEV-12103 Reduce the time of looking for MLOG_CHECKPOINT during crash recovery This fixes MySQL Bug#80788 in MariaDB 10.2.5. When I made the InnoDB crash recovery more robust by implementing WL#7142, I also introduced an extra redo log scan pass that can be shortened. This fix will slightly extend the InnoDB redo log format that I introduced in MySQL 5.7.9 by writing the start LSN of the MLOG_CHECKPOINT mini-transaction to the end of the log checkpoint page, so that recovery can jump straight to it without scanning all the preceding redo log. LOG_CHECKPOINT_END_LSN: At the end of the checkpoint page, the start LSN of the MLOG_CHECKPOINT mini-transaction. Previously, these bytes were written as 0. log_write_checkpoint_info(), log_group_checkpoint(): Add the parameter end_lsn for writing LOG_CHECKPOINT_END_LSN. log_checkpoint(): Remember the LSN at which the MLOG_CHECKPOINT mini-transaction is starting (or at which the redo log ends on shutdown). recv_init_crash_recovery(): Remove. recv_group_scan_log_recs(): Add the parameter checkpoint_lsn. recv_recovery_from_checkpoint_start(): Read LOG_CHECKPOINT_END_LSN and if it is set, start the first scan from it instead of the checkpoint LSN. Improve some messages and remove bogus assertions. recv_parse_log_recs(): Do not skip DBUG_PRINT("ib_log") for some file-level redo log records. recv_parse_or_apply_log_rec_body(): If we have not parsed all redo log between the checkpoint and the corresponding MLOG_CHECKPOINT record, defer the check for MLOG_FILE_DELETE or MLOG_FILE_NAME records to recv_init_crash_recovery_spaces(). recv_init_crash_recovery_spaces(): Refuse recovery if MLOG_FILE_NAME or MLOG_FILE_DELETE records are missing.
9 years ago
MDEV-12103 Reduce the time of looking for MLOG_CHECKPOINT during crash recovery This fixes MySQL Bug#80788 in MariaDB 10.2.5. When I made the InnoDB crash recovery more robust by implementing WL#7142, I also introduced an extra redo log scan pass that can be shortened. This fix will slightly extend the InnoDB redo log format that I introduced in MySQL 5.7.9 by writing the start LSN of the MLOG_CHECKPOINT mini-transaction to the end of the log checkpoint page, so that recovery can jump straight to it without scanning all the preceding redo log. LOG_CHECKPOINT_END_LSN: At the end of the checkpoint page, the start LSN of the MLOG_CHECKPOINT mini-transaction. Previously, these bytes were written as 0. log_write_checkpoint_info(), log_group_checkpoint(): Add the parameter end_lsn for writing LOG_CHECKPOINT_END_LSN. log_checkpoint(): Remember the LSN at which the MLOG_CHECKPOINT mini-transaction is starting (or at which the redo log ends on shutdown). recv_init_crash_recovery(): Remove. recv_group_scan_log_recs(): Add the parameter checkpoint_lsn. recv_recovery_from_checkpoint_start(): Read LOG_CHECKPOINT_END_LSN and if it is set, start the first scan from it instead of the checkpoint LSN. Improve some messages and remove bogus assertions. recv_parse_log_recs(): Do not skip DBUG_PRINT("ib_log") for some file-level redo log records. recv_parse_or_apply_log_rec_body(): If we have not parsed all redo log between the checkpoint and the corresponding MLOG_CHECKPOINT record, defer the check for MLOG_FILE_DELETE or MLOG_FILE_NAME records to recv_init_crash_recovery_spaces(). recv_init_crash_recovery_spaces(): Refuse recovery if MLOG_FILE_NAME or MLOG_FILE_DELETE records are missing.
9 years ago
9 years ago
MDEV-12103 Reduce the time of looking for MLOG_CHECKPOINT during crash recovery This fixes MySQL Bug#80788 in MariaDB 10.2.5. When I made the InnoDB crash recovery more robust by implementing WL#7142, I also introduced an extra redo log scan pass that can be shortened. This fix will slightly extend the InnoDB redo log format that I introduced in MySQL 5.7.9 by writing the start LSN of the MLOG_CHECKPOINT mini-transaction to the end of the log checkpoint page, so that recovery can jump straight to it without scanning all the preceding redo log. LOG_CHECKPOINT_END_LSN: At the end of the checkpoint page, the start LSN of the MLOG_CHECKPOINT mini-transaction. Previously, these bytes were written as 0. log_write_checkpoint_info(), log_group_checkpoint(): Add the parameter end_lsn for writing LOG_CHECKPOINT_END_LSN. log_checkpoint(): Remember the LSN at which the MLOG_CHECKPOINT mini-transaction is starting (or at which the redo log ends on shutdown). recv_init_crash_recovery(): Remove. recv_group_scan_log_recs(): Add the parameter checkpoint_lsn. recv_recovery_from_checkpoint_start(): Read LOG_CHECKPOINT_END_LSN and if it is set, start the first scan from it instead of the checkpoint LSN. Improve some messages and remove bogus assertions. recv_parse_log_recs(): Do not skip DBUG_PRINT("ib_log") for some file-level redo log records. recv_parse_or_apply_log_rec_body(): If we have not parsed all redo log between the checkpoint and the corresponding MLOG_CHECKPOINT record, defer the check for MLOG_FILE_DELETE or MLOG_FILE_NAME records to recv_init_crash_recovery_spaces(). recv_init_crash_recovery_spaces(): Refuse recovery if MLOG_FILE_NAME or MLOG_FILE_DELETE records are missing.
9 years ago
MDEV-12103 Reduce the time of looking for MLOG_CHECKPOINT during crash recovery This fixes MySQL Bug#80788 in MariaDB 10.2.5. When I made the InnoDB crash recovery more robust by implementing WL#7142, I also introduced an extra redo log scan pass that can be shortened. This fix will slightly extend the InnoDB redo log format that I introduced in MySQL 5.7.9 by writing the start LSN of the MLOG_CHECKPOINT mini-transaction to the end of the log checkpoint page, so that recovery can jump straight to it without scanning all the preceding redo log. LOG_CHECKPOINT_END_LSN: At the end of the checkpoint page, the start LSN of the MLOG_CHECKPOINT mini-transaction. Previously, these bytes were written as 0. log_write_checkpoint_info(), log_group_checkpoint(): Add the parameter end_lsn for writing LOG_CHECKPOINT_END_LSN. log_checkpoint(): Remember the LSN at which the MLOG_CHECKPOINT mini-transaction is starting (or at which the redo log ends on shutdown). recv_init_crash_recovery(): Remove. recv_group_scan_log_recs(): Add the parameter checkpoint_lsn. recv_recovery_from_checkpoint_start(): Read LOG_CHECKPOINT_END_LSN and if it is set, start the first scan from it instead of the checkpoint LSN. Improve some messages and remove bogus assertions. recv_parse_log_recs(): Do not skip DBUG_PRINT("ib_log") for some file-level redo log records. recv_parse_or_apply_log_rec_body(): If we have not parsed all redo log between the checkpoint and the corresponding MLOG_CHECKPOINT record, defer the check for MLOG_FILE_DELETE or MLOG_FILE_NAME records to recv_init_crash_recovery_spaces(). recv_init_crash_recovery_spaces(): Refuse recovery if MLOG_FILE_NAME or MLOG_FILE_DELETE records are missing.
9 years ago
MDEV-11782: Redefine the innodb_encrypt_log format Write only one encryption key to the checkpoint page. Use 4 bytes of nonce. Encrypt more of each redo log block, only skipping the 4-byte field LOG_BLOCK_HDR_NO which the initialization vector is derived from. Issue notes, not warning messages for rewriting the redo log files. recv_recovery_from_checkpoint_finish(): Do not generate any redo log, because we must avoid that before rewriting the redo log files, or otherwise a crash during a redo log rewrite (removing or adding encryption) may end up making the database unrecoverable. Instead, do these tasks in innobase_start_or_create_for_mysql(). Issue a firm "Missing MLOG_CHECKPOINT" error message. Remove some unreachable code and duplicated error messages for log corruption. LOG_HEADER_FORMAT_ENCRYPTED: A flag for identifying an encrypted redo log format. log_group_t::is_encrypted(), log_t::is_encrypted(): Determine if the redo log is in encrypted format. recv_find_max_checkpoint(): Interpret LOG_HEADER_FORMAT_ENCRYPTED. srv_prepare_to_delete_redo_log_files(): Display NOTE messages about adding or removing encryption. Do not issue warnings for redo log resizing any more. innobase_start_or_create_for_mysql(): Rebuild the redo logs also when the encryption changes. innodb_log_checksums_func_update(): Always use the CRC-32C checksum if innodb_encrypt_log. If needed, issue a warning that innodb_encrypt_log implies innodb_log_checksums. log_group_write_buf(): Compute the checksum on the encrypted block contents, so that transmission errors or incomplete blocks can be detected without decrypting. Rewrite most of the redo log encryption code. Only remember one encryption key at a time (but remember up to 5 when upgrading from the MariaDB 10.1 format.)
9 years ago
11 years ago
MDEV-11782: Redefine the innodb_encrypt_log format Write only one encryption key to the checkpoint page. Use 4 bytes of nonce. Encrypt more of each redo log block, only skipping the 4-byte field LOG_BLOCK_HDR_NO which the initialization vector is derived from. Issue notes, not warning messages for rewriting the redo log files. recv_recovery_from_checkpoint_finish(): Do not generate any redo log, because we must avoid that before rewriting the redo log files, or otherwise a crash during a redo log rewrite (removing or adding encryption) may end up making the database unrecoverable. Instead, do these tasks in innobase_start_or_create_for_mysql(). Issue a firm "Missing MLOG_CHECKPOINT" error message. Remove some unreachable code and duplicated error messages for log corruption. LOG_HEADER_FORMAT_ENCRYPTED: A flag for identifying an encrypted redo log format. log_group_t::is_encrypted(), log_t::is_encrypted(): Determine if the redo log is in encrypted format. recv_find_max_checkpoint(): Interpret LOG_HEADER_FORMAT_ENCRYPTED. srv_prepare_to_delete_redo_log_files(): Display NOTE messages about adding or removing encryption. Do not issue warnings for redo log resizing any more. innobase_start_or_create_for_mysql(): Rebuild the redo logs also when the encryption changes. innodb_log_checksums_func_update(): Always use the CRC-32C checksum if innodb_encrypt_log. If needed, issue a warning that innodb_encrypt_log implies innodb_log_checksums. log_group_write_buf(): Compute the checksum on the encrypted block contents, so that transmission errors or incomplete blocks can be detected without decrypting. Rewrite most of the redo log encryption code. Only remember one encryption key at a time (but remember up to 5 when upgrading from the MariaDB 10.1 format.)
9 years ago
12 years ago
12 years ago
12 years ago
12 years ago
12 years ago
12 years ago
12 years ago
12 years ago
12 years ago
12 years ago
12 years ago
12 years ago
  1. /*****************************************************************************
  2. Copyright (c) 1997, 2017, Oracle and/or its affiliates. All Rights Reserved.
  3. Copyright (c) 2012, Facebook Inc.
  4. Copyright (c) 2013, 2017, MariaDB Corporation.
  5. This program is free software; you can redistribute it and/or modify it under
  6. the terms of the GNU General Public License as published by the Free Software
  7. Foundation; version 2 of the License.
  8. This program is distributed in the hope that it will be useful, but WITHOUT
  9. ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
  10. FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
  11. You should have received a copy of the GNU General Public License along with
  12. this program; if not, write to the Free Software Foundation, Inc.,
  13. 51 Franklin Street, Suite 500, Boston, MA 02110-1335 USA
  14. *****************************************************************************/
  15. /**************************************************//**
  16. @file log/log0recv.cc
  17. Recovery
  18. Created 9/20/1997 Heikki Tuuri
  19. *******************************************************/
  20. #include "ha_prototypes.h"
  21. #include <vector>
  22. #include <map>
  23. #include <string>
  24. #include <my_systemd.h>
  25. #include "log0recv.h"
  26. #ifdef HAVE_MY_AES_H
  27. #include <my_aes.h>
  28. #endif
  29. #include "log0crypt.h"
  30. #include "mem0mem.h"
  31. #include "buf0buf.h"
  32. #include "buf0flu.h"
  33. #include "mtr0mtr.h"
  34. #include "mtr0log.h"
  35. #include "page0cur.h"
  36. #include "page0zip.h"
  37. #include "btr0btr.h"
  38. #include "btr0cur.h"
  39. #include "ibuf0ibuf.h"
  40. #include "trx0undo.h"
  41. #include "trx0rec.h"
  42. #include "fil0fil.h"
  43. #include "fsp0sysspace.h"
  44. #include "ut0new.h"
  45. #include "row0trunc.h"
  46. #include "buf0rea.h"
  47. #include "srv0srv.h"
  48. #include "srv0start.h"
  49. #include "trx0roll.h"
  50. #include "row0merge.h"
  51. /** Log records are stored in the hash table in chunks of at most this size;
  52. this must be less than UNIV_PAGE_SIZE as it is stored in the buffer pool */
  53. #define RECV_DATA_BLOCK_SIZE (MEM_MAX_ALLOC_IN_BUF - sizeof(recv_data_t))
  54. /** Read-ahead area in applying log records to file pages */
  55. #define RECV_READ_AHEAD_AREA 32
  56. /** The recovery system */
  57. recv_sys_t* recv_sys;
  58. /** TRUE when applying redo log records during crash recovery; FALSE
  59. otherwise. Note that this is FALSE while a background thread is
  60. rolling back incomplete transactions. */
  61. volatile bool recv_recovery_on;
  62. /** TRUE when the redo log scan has determined that crash recovery is needed. */
  63. bool recv_needed_recovery;
  64. #ifdef UNIV_DEBUG
  65. /** TRUE if writing to the redo log (mtr_commit) is forbidden.
  66. Protected by log_sys->mutex. */
  67. bool recv_no_log_write = false;
  68. #endif /* UNIV_DEBUG */
  69. /** TRUE if buf_page_is_corrupted() should check if the log sequence
  70. number (FIL_PAGE_LSN) is in the future. Initially FALSE, and set by
  71. recv_recovery_from_checkpoint_start(). */
  72. bool recv_lsn_checks_on;
  73. /** If the following is TRUE, the buffer pool file pages must be invalidated
  74. after recovery and no ibuf operations are allowed; this becomes TRUE if
  75. the log record hash table becomes too full, and log records must be merged
  76. to file pages already before the recovery is finished: in this case no
  77. ibuf operations are allowed, as they could modify the pages read in the
  78. buffer pool before the pages have been recovered to the up-to-date state.
  79. TRUE means that recovery is running and no operations on the log files
  80. are allowed yet: the variable name is misleading. */
  81. bool recv_no_ibuf_operations;
  82. /** The type of the previous parsed redo log record */
  83. static mlog_id_t recv_previous_parsed_rec_type;
  84. /** The offset of the previous parsed redo log record */
  85. static ulint recv_previous_parsed_rec_offset;
  86. /** The 'multi' flag of the previous parsed redo log record */
  87. static ulint recv_previous_parsed_rec_is_multi;
  88. /** This many frames must be left free in the buffer pool when we scan
  89. the log and store the scanned log records in the buffer pool: we will
  90. use these free frames to read in pages when we start applying the
  91. log records to the database.
  92. The default value is 256, set in recv_sys_var_init(). If the buffer pool
  93. is at least 10 MB, recv_sys_init() sets this value to 512. */
  94. ulint recv_n_pool_free_frames;
  95. /** The maximum lsn we see for a page during the recovery process. If this
  96. is bigger than the lsn we are able to scan up to, that is an indication that
  97. the recovery failed and the database may be corrupt. */
  98. static lsn_t recv_max_page_lsn;
  99. #ifdef UNIV_PFS_THREAD
  100. mysql_pfs_key_t trx_rollback_clean_thread_key;
  101. mysql_pfs_key_t recv_writer_thread_key;
  102. #endif /* UNIV_PFS_THREAD */
  103. /** Flag indicating if recv_writer thread is active. */
  104. static volatile bool recv_writer_thread_active;
  105. #ifndef DBUG_OFF
  106. /** Return string name of the redo log record type.
  107. @param[in] type redo log record enum
  108. @return string name of redo log record */
  109. const char*
  110. get_mlog_string(mlog_id_t type);
  111. #endif /* !DBUG_OFF */
  112. /** Tablespace item during recovery */
  113. struct file_name_t {
  114. /** Tablespace file name (MLOG_FILE_NAME) */
  115. std::string name;
  116. /** Tablespace object (NULL if not valid or not found) */
  117. fil_space_t* space;
  118. /** Whether the tablespace has been deleted */
  119. bool deleted;
  120. /** Constructor */
  121. file_name_t(std::string name_, bool deleted_) :
  122. name(name_), space(NULL), deleted (deleted_) {}
  123. };
  124. /** Map of dirty tablespaces during recovery */
  125. typedef std::map<
  126. ulint,
  127. file_name_t,
  128. std::less<ulint>,
  129. ut_allocator<std::pair<const ulint, file_name_t> > > recv_spaces_t;
  130. static recv_spaces_t recv_spaces;
  131. /** Process a file name from a MLOG_FILE_* record.
  132. @param[in,out] name file name
  133. @param[in] len length of the file name
  134. @param[in] space_id the tablespace ID
  135. @param[in] deleted whether this is a MLOG_FILE_DELETE record
  136. @retval true if able to process file successfully.
  137. @retval false if unable to process the file */
  138. static
  139. bool
  140. fil_name_process(
  141. char* name,
  142. ulint len,
  143. ulint space_id,
  144. bool deleted)
  145. {
  146. bool processed = true;
  147. /* We will also insert space=NULL into the map, so that
  148. further checks can ensure that a MLOG_FILE_NAME record was
  149. scanned before applying any page records for the space_id. */
  150. os_normalize_path(name);
  151. file_name_t fname(std::string(name, len - 1), deleted);
  152. std::pair<recv_spaces_t::iterator,bool> p = recv_spaces.insert(
  153. std::make_pair(space_id, fname));
  154. ut_ad(p.first->first == space_id);
  155. file_name_t& f = p.first->second;
  156. if (deleted) {
  157. /* Got MLOG_FILE_DELETE */
  158. if (!p.second && !f.deleted) {
  159. f.deleted = true;
  160. if (f.space != NULL) {
  161. fil_space_free(space_id, false);
  162. f.space = NULL;
  163. }
  164. }
  165. ut_ad(f.space == NULL);
  166. } else if (p.second // the first MLOG_FILE_NAME or MLOG_FILE_RENAME2
  167. || f.name != fname.name) {
  168. fil_space_t* space;
  169. /* Check if the tablespace file exists and contains
  170. the space_id. If not, ignore the file after displaying
  171. a note. Abort if there are multiple files with the
  172. same space_id. */
  173. switch (fil_ibd_load(space_id, name, space)) {
  174. case FIL_LOAD_OK:
  175. ut_ad(space != NULL);
  176. if (f.space == NULL || f.space == space) {
  177. f.name = fname.name;
  178. f.space = space;
  179. f.deleted = false;
  180. } else {
  181. ib::error() << "Tablespace " << space_id
  182. << " has been found in two places: '"
  183. << f.name << "' and '" << name << "'."
  184. " You must delete one of them.";
  185. recv_sys->found_corrupt_fs = true;
  186. processed = false;
  187. }
  188. break;
  189. case FIL_LOAD_ID_CHANGED:
  190. ut_ad(space == NULL);
  191. break;
  192. case FIL_LOAD_NOT_FOUND:
  193. /* No matching tablespace was found; maybe it
  194. was renamed, and we will find a subsequent
  195. MLOG_FILE_* record. */
  196. ut_ad(space == NULL);
  197. if (srv_force_recovery) {
  198. /* Without innodb_force_recovery,
  199. missing tablespaces will only be
  200. reported in
  201. recv_init_crash_recovery_spaces().
  202. Enable some more diagnostics when
  203. forcing recovery. */
  204. ib::info()
  205. << "At LSN: " << recv_sys->recovered_lsn
  206. << ": unable to open file " << name
  207. << " for tablespace " << space_id;
  208. }
  209. break;
  210. case FIL_LOAD_INVALID:
  211. ut_ad(space == NULL);
  212. if (srv_force_recovery == 0) {
  213. ib::warn() << "We do not continue the crash"
  214. " recovery, because the table may"
  215. " become corrupt if we cannot apply"
  216. " the log records in the InnoDB log to"
  217. " it. To fix the problem and start"
  218. " mysqld:";
  219. ib::info() << "1) If there is a permission"
  220. " problem in the file and mysqld"
  221. " cannot open the file, you should"
  222. " modify the permissions.";
  223. ib::info() << "2) If the tablespace is not"
  224. " needed, or you can restore an older"
  225. " version from a backup, then you can"
  226. " remove the .ibd file, and use"
  227. " --innodb_force_recovery=1 to force"
  228. " startup without this file.";
  229. ib::info() << "3) If the file system or the"
  230. " disk is broken, and you cannot"
  231. " remove the .ibd file, you can set"
  232. " --innodb_force_recovery.";
  233. recv_sys->found_corrupt_fs = true;
  234. processed = false;
  235. break;
  236. }
  237. ib::info() << "innodb_force_recovery was set to "
  238. << srv_force_recovery << ". Continuing crash"
  239. " recovery even though we cannot access the"
  240. " files for tablespace " << space_id << ".";
  241. break;
  242. }
  243. }
  244. return(processed);
  245. }
  246. /** Parse or process a MLOG_FILE_* record.
  247. @param[in] ptr redo log record
  248. @param[in] end end of the redo log buffer
  249. @param[in] space_id the tablespace ID
  250. @param[in] first_page_no first page number in the file
  251. @param[in] type MLOG_FILE_NAME or MLOG_FILE_DELETE
  252. or MLOG_FILE_CREATE2 or MLOG_FILE_RENAME2
  253. @param[in] apply whether to apply the record
  254. @return pointer to next redo log record
  255. @retval NULL if this log record was truncated */
  256. static
  257. byte*
  258. fil_name_parse(
  259. byte* ptr,
  260. const byte* end,
  261. ulint space_id,
  262. ulint first_page_no,
  263. mlog_id_t type,
  264. bool apply)
  265. {
  266. if (type == MLOG_FILE_CREATE2) {
  267. if (end < ptr + 4) {
  268. return(NULL);
  269. }
  270. ptr += 4;
  271. }
  272. if (end < ptr + 2) {
  273. return(NULL);
  274. }
  275. ulint len = mach_read_from_2(ptr);
  276. ptr += 2;
  277. if (end < ptr + len) {
  278. return(NULL);
  279. }
  280. /* MLOG_FILE_* records should only be written for
  281. user-created tablespaces. The name must be long enough
  282. and end in .ibd. */
  283. bool corrupt = is_predefined_tablespace(space_id)
  284. || first_page_no != 0 // TODO: multi-file user tablespaces
  285. || len < sizeof "/a.ibd\0"
  286. || memcmp(ptr + len - 5, DOT_IBD, 5) != 0
  287. || memchr(ptr, OS_PATH_SEPARATOR, len) == NULL;
  288. byte* end_ptr = ptr + len;
  289. switch (type) {
  290. default:
  291. ut_ad(0); // the caller checked this; fall through
  292. case MLOG_FILE_NAME:
  293. if (corrupt) {
  294. ib::error() << "MLOG_FILE_NAME incorrect:" << ptr;
  295. recv_sys->found_corrupt_log = true;
  296. break;
  297. }
  298. fil_name_process(
  299. reinterpret_cast<char*>(ptr), len, space_id, false);
  300. break;
  301. case MLOG_FILE_DELETE:
  302. if (corrupt) {
  303. ib::error() << "MLOG_FILE_DELETE incorrect:" << ptr;
  304. recv_sys->found_corrupt_log = true;
  305. break;
  306. }
  307. fil_name_process(
  308. reinterpret_cast<char*>(ptr), len, space_id, true);
  309. break;
  310. case MLOG_FILE_CREATE2:
  311. break;
  312. case MLOG_FILE_RENAME2:
  313. if (corrupt) {
  314. ib::error() << "MLOG_FILE_RENAME2 incorrect:" << ptr;
  315. recv_sys->found_corrupt_log = true;
  316. }
  317. /* The new name follows the old name. */
  318. byte* new_name = end_ptr + 2;
  319. if (end < new_name) {
  320. return(NULL);
  321. }
  322. ulint new_len = mach_read_from_2(end_ptr);
  323. if (end < end_ptr + 2 + new_len) {
  324. return(NULL);
  325. }
  326. end_ptr += 2 + new_len;
  327. corrupt = corrupt
  328. || new_len < sizeof "/a.ibd\0"
  329. || memcmp(new_name + new_len - 5, DOT_IBD, 5) != 0
  330. || !memchr(new_name, OS_PATH_SEPARATOR, new_len);
  331. if (corrupt) {
  332. ib::error() << "MLOG_FILE_RENAME2 new_name incorrect:" << ptr
  333. << " new_name: " << new_name;
  334. recv_sys->found_corrupt_log = true;
  335. break;
  336. }
  337. fil_name_process(
  338. reinterpret_cast<char*>(ptr), len,
  339. space_id, false);
  340. fil_name_process(
  341. reinterpret_cast<char*>(new_name), new_len,
  342. space_id, false);
  343. if (!apply) {
  344. break;
  345. }
  346. if (!fil_op_replay_rename(
  347. space_id, first_page_no,
  348. reinterpret_cast<const char*>(ptr),
  349. reinterpret_cast<const char*>(new_name))) {
  350. recv_sys->found_corrupt_fs = true;
  351. }
  352. }
  353. return(end_ptr);
  354. }
  355. /********************************************************//**
  356. Creates the recovery system. */
  357. void
  358. recv_sys_create(void)
  359. /*=================*/
  360. {
  361. if (recv_sys != NULL) {
  362. return;
  363. }
  364. recv_sys = static_cast<recv_sys_t*>(ut_zalloc_nokey(sizeof(*recv_sys)));
  365. mutex_create(LATCH_ID_RECV_SYS, &recv_sys->mutex);
  366. mutex_create(LATCH_ID_RECV_WRITER, &recv_sys->writer_mutex);
  367. recv_sys->heap = NULL;
  368. recv_sys->addr_hash = NULL;
  369. }
  370. /********************************************************//**
  371. Frees the recovery system, including its mutexes. */
  372. void
  373. recv_sys_close(void)
  374. /*================*/
  375. {
  376. if (recv_sys != NULL) {
  377. recv_sys->dblwr.pages.clear();
  378. if (recv_sys->addr_hash != NULL) {
  379. hash_table_free(recv_sys->addr_hash);
  380. }
  381. if (recv_sys->heap != NULL) {
  382. mem_heap_free(recv_sys->heap);
  383. }
  384. if (recv_sys->flush_start != NULL) {
  385. os_event_destroy(recv_sys->flush_start);
  386. }
  387. if (recv_sys->flush_end != NULL) {
  388. os_event_destroy(recv_sys->flush_end);
  389. }
  390. ut_free(recv_sys->buf);
  391. ut_ad(!recv_writer_thread_active);
  392. mutex_free(&recv_sys->writer_mutex);
  393. mutex_free(&recv_sys->mutex);
  394. ut_free(recv_sys);
  395. recv_sys = NULL;
  396. }
  397. recv_spaces.clear();
  398. }
  399. /********************************************************//**
  400. Frees the recovery system memory. */
  401. void
  402. recv_sys_mem_free(void)
  403. /*===================*/
  404. {
  405. if (recv_sys != NULL) {
  406. if (recv_sys->addr_hash != NULL) {
  407. hash_table_free(recv_sys->addr_hash);
  408. }
  409. if (recv_sys->heap != NULL) {
  410. mem_heap_free(recv_sys->heap);
  411. }
  412. if (recv_sys->flush_start != NULL) {
  413. os_event_destroy(recv_sys->flush_start);
  414. }
  415. if (recv_sys->flush_end != NULL) {
  416. os_event_destroy(recv_sys->flush_end);
  417. }
  418. ut_free(recv_sys->buf);
  419. ut_free(recv_sys);
  420. recv_sys = NULL;
  421. }
  422. }
  423. /************************************************************
  424. Reset the state of the recovery system variables. */
  425. void
  426. recv_sys_var_init(void)
  427. /*===================*/
  428. {
  429. recv_recovery_on = false;
  430. recv_needed_recovery = false;
  431. recv_lsn_checks_on = false;
  432. recv_no_ibuf_operations = false;
  433. recv_previous_parsed_rec_type = MLOG_SINGLE_REC_FLAG;
  434. recv_previous_parsed_rec_offset = 0;
  435. recv_previous_parsed_rec_is_multi = 0;
  436. recv_n_pool_free_frames = 256;
  437. recv_max_page_lsn = 0;
  438. }
  439. /******************************************************************//**
  440. recv_writer thread tasked with flushing dirty pages from the buffer
  441. pools.
  442. @return a dummy return value */
  443. extern "C"
  444. os_thread_ret_t
  445. DECLARE_THREAD(recv_writer_thread)(
  446. /*===============================*/
  447. void* arg MY_ATTRIBUTE((unused)))
  448. /*!< in: a dummy parameter required by
  449. os_thread_create */
  450. {
  451. my_thread_init();
  452. ut_ad(!srv_read_only_mode);
  453. #ifdef UNIV_PFS_THREAD
  454. pfs_register_thread(recv_writer_thread_key);
  455. #endif /* UNIV_PFS_THREAD */
  456. #ifdef UNIV_DEBUG_THREAD_CREATION
  457. ib::info() << "recv_writer thread running, id "
  458. << os_thread_pf(os_thread_get_curr_id());
  459. #endif /* UNIV_DEBUG_THREAD_CREATION */
  460. while (srv_shutdown_state == SRV_SHUTDOWN_NONE) {
  461. /* Wait till we get a signal to clean the LRU list.
  462. Bounded by max wait time of 100ms. */
  463. ib_uint64_t sig_count = os_event_reset(buf_flush_event);
  464. os_event_wait_time_low(buf_flush_event, 100000, sig_count);
  465. mutex_enter(&recv_sys->writer_mutex);
  466. if (!recv_recovery_on) {
  467. mutex_exit(&recv_sys->writer_mutex);
  468. break;
  469. }
  470. /* Flush pages from end of LRU if required */
  471. os_event_reset(recv_sys->flush_end);
  472. recv_sys->flush_type = BUF_FLUSH_LRU;
  473. os_event_set(recv_sys->flush_start);
  474. os_event_wait(recv_sys->flush_end);
  475. mutex_exit(&recv_sys->writer_mutex);
  476. }
  477. recv_writer_thread_active = false;
  478. my_thread_end();
  479. /* We count the number of threads in os_thread_exit().
  480. A created thread should always use that to exit and not
  481. use return() to exit. */
  482. os_thread_exit();
  483. OS_THREAD_DUMMY_RETURN;
  484. }
  485. /************************************************************
  486. Inits the recovery system for a recovery operation. */
  487. void
  488. recv_sys_init(
  489. /*==========*/
  490. ulint available_memory) /*!< in: available memory in bytes */
  491. {
  492. if (recv_sys->heap != NULL) {
  493. return;
  494. }
  495. mutex_enter(&(recv_sys->mutex));
  496. recv_sys->heap = mem_heap_create_typed(256,
  497. MEM_HEAP_FOR_RECV_SYS);
  498. if (!srv_read_only_mode) {
  499. recv_sys->flush_start = os_event_create(0);
  500. recv_sys->flush_end = os_event_create(0);
  501. }
  502. /* Set appropriate value of recv_n_pool_free_frames. */
  503. if (buf_pool_get_curr_size() >= (10 * 1024 * 1024)) {
  504. /* Buffer pool of size at least 10 MB. */
  505. recv_n_pool_free_frames = 512;
  506. }
  507. recv_sys->buf = static_cast<byte*>(
  508. ut_malloc_nokey(RECV_PARSING_BUF_SIZE));
  509. recv_sys->len = 0;
  510. recv_sys->recovered_offset = 0;
  511. recv_sys->addr_hash = hash_create(available_memory / 512);
  512. recv_sys->n_addrs = 0;
  513. recv_sys->apply_log_recs = FALSE;
  514. recv_sys->apply_batch_on = FALSE;
  515. recv_sys->found_corrupt_log = false;
  516. recv_sys->found_corrupt_fs = false;
  517. recv_sys->mlog_checkpoint_lsn = 0;
  518. recv_sys->progress_time = ut_time();
  519. recv_max_page_lsn = 0;
  520. /* Call the constructor for recv_sys_t::dblwr member */
  521. new (&recv_sys->dblwr) recv_dblwr_t();
  522. mutex_exit(&(recv_sys->mutex));
  523. }
  524. /** Empty a fully processed hash table. */
  525. static
  526. void
  527. recv_sys_empty_hash()
  528. {
  529. ut_ad(mutex_own(&(recv_sys->mutex)));
  530. ut_a(recv_sys->n_addrs == 0);
  531. hash_table_free(recv_sys->addr_hash);
  532. mem_heap_empty(recv_sys->heap);
  533. recv_sys->addr_hash = hash_create(buf_pool_get_curr_size() / 512);
  534. }
  535. /********************************************************//**
  536. Frees the hash table, heap and parsing buffer of the recovery system. */
  537. void
  538. recv_sys_debug_free(void)
  539. /*=====================*/
  540. {
  541. mutex_enter(&(recv_sys->mutex));
  542. hash_table_free(recv_sys->addr_hash);
  543. mem_heap_free(recv_sys->heap);
  544. ut_free(recv_sys->buf);
  545. recv_sys->buf = NULL;
  546. recv_sys->heap = NULL;
  547. recv_sys->addr_hash = NULL;
  548. /* wake page cleaner up to progress */
  549. if (!srv_read_only_mode) {
  550. ut_ad(!recv_recovery_on);
  551. ut_ad(!recv_writer_thread_active);
  552. os_event_reset(buf_flush_event);
  553. os_event_set(recv_sys->flush_start);
  554. }
  555. mutex_exit(&(recv_sys->mutex));
  556. }
  557. /** Read a log segment to a buffer.
  558. @param[out] buf buffer
  559. @param[in] group redo log files
  560. @param[in] start_lsn read area start
  561. @param[in] end_lsn read area end
  562. @return valid end_lsn */
  563. static
  564. lsn_t
  565. log_group_read_log_seg(
  566. byte* buf,
  567. const log_group_t* group,
  568. lsn_t start_lsn,
  569. lsn_t end_lsn)
  570. {
  571. ulint len;
  572. lsn_t source_offset;
  573. ut_ad(log_mutex_own());
  574. loop:
  575. source_offset = log_group_calc_lsn_offset(start_lsn, group);
  576. ut_a(end_lsn - start_lsn <= ULINT_MAX);
  577. len = (ulint) (end_lsn - start_lsn);
  578. ut_ad(len != 0);
  579. const bool at_eof = (source_offset % group->file_size) + len
  580. > group->file_size;
  581. if (at_eof) {
  582. /* The read would cross the end of the file. In that case len
  583. (a ulint) exceeds the expression below, so the typecast is safe. */
  584. len = (ulint) (group->file_size -
  585. (source_offset % group->file_size));
  586. }
  587. log_sys->n_log_ios++;
  588. MONITOR_INC(MONITOR_LOG_IO);
  589. ut_a(source_offset / UNIV_PAGE_SIZE <= ULINT_MAX);
  590. const ulint page_no
  591. = (ulint) (source_offset / univ_page_size.physical());
  592. fil_io(IORequestLogRead, true,
  593. page_id_t(group->space_id, page_no),
  594. univ_page_size,
  595. (ulint) (source_offset % univ_page_size.physical()),
  596. len, buf, NULL);
  597. for (ulint l = 0; l < len; l += OS_FILE_LOG_BLOCK_SIZE,
  598. buf += OS_FILE_LOG_BLOCK_SIZE,
  599. start_lsn += OS_FILE_LOG_BLOCK_SIZE) {
  600. const ulint block_number = log_block_get_hdr_no(buf);
  601. if (block_number != log_block_convert_lsn_to_no(start_lsn)) {
  602. /* Garbage or an incompletely written log block.
  603. We will not report any error, because this can
  604. happen when InnoDB was killed while it was
  605. writing redo log. We simply treat this as an
  606. abrupt end of the redo log. */
  607. end_lsn = start_lsn;
  608. break;
  609. }
  610. if (innodb_log_checksums || group->is_encrypted()) {
  611. ulint crc = log_block_calc_checksum_crc32(buf);
  612. ulint cksum = log_block_get_checksum(buf);
  613. if (crc != cksum) {
  614. ib::error() << "Invalid log block checksum."
  615. << " block: " << block_number
  616. << " checkpoint no: "
  617. << log_block_get_checkpoint_no(buf)
  618. << " expected: " << crc
  619. << " found: " << cksum;
  620. end_lsn = start_lsn;
  621. break;
  622. }
  623. if (group->is_encrypted()) {
  624. log_crypt(buf, OS_FILE_LOG_BLOCK_SIZE, true);
  625. }
  626. }
  627. }
  628. if (recv_sys->report(ut_time())) {
  629. ib::info() << "Read redo log up to LSN=" << start_lsn;
  630. sd_notifyf(0, "STATUS=Read redo log up to LSN=" LSN_PF,
  631. start_lsn);
  632. }
  633. if (start_lsn != end_lsn) {
  634. goto loop;
  635. }
  636. return(start_lsn);
  637. }
  638. /********************************************************//**
  639. Copies a log segment from the most up-to-date log group to the other log
  640. groups, so that they all contain the latest log data. Also writes the info
  641. about the latest checkpoint to the groups, and inits the fields in the group
  642. memory structs to up-to-date values. */
  643. static
  644. void
  645. recv_synchronize_groups()
  646. {
  647. const lsn_t recovered_lsn = recv_sys->recovered_lsn;
  648. /* Read the last recovered log block into the log buffer
  649. (log_sys->buf); the block is always incomplete */
  650. const lsn_t start_lsn = ut_uint64_align_down(recovered_lsn,
  651. OS_FILE_LOG_BLOCK_SIZE);
  652. log_group_read_log_seg(log_sys->buf,
  653. UT_LIST_GET_FIRST(log_sys->log_groups),
  654. start_lsn, start_lsn + OS_FILE_LOG_BLOCK_SIZE);
  655. ut_ad(UT_LIST_GET_LEN(log_sys->log_groups) == 1);
  656. for (log_group_t* group = UT_LIST_GET_FIRST(log_sys->log_groups);
  657. group;
  658. group = UT_LIST_GET_NEXT(log_groups, group)) {
  659. /* Update the fields in the group struct to correspond to
  660. recovered_lsn */
  661. log_group_set_fields(group, recovered_lsn);
  662. }
  663. /* Copy the checkpoint info to the log. Because checkpoint_no has
  664. already been incremented by one, the info is not written over the
  665. max checkpoint info, which guarantees that the max checkpoint info
  666. is preserved on disk. */
  667. if (!srv_read_only_mode) {
  668. log_write_checkpoint_info(true, 0);
  669. log_mutex_enter();
  670. }
  671. }
  672. /** Check the consistency of a log header block.
  673. @param[in] buf log header block
  674. @return true if ok */
  675. static
  676. bool
  677. recv_check_log_header_checksum(
  678. const byte* buf)
  679. {
  680. return(log_block_get_checksum(buf)
  681. == log_block_calc_checksum_crc32(buf));
  682. }
  683. /** Find the latest checkpoint in the format-0 log header.
  684. @param[out] max_group log group, or NULL
  685. @param[out] max_field LOG_CHECKPOINT_1 or LOG_CHECKPOINT_2
  686. @return error code or DB_SUCCESS */
  687. static MY_ATTRIBUTE((warn_unused_result))
  688. dberr_t
  689. recv_find_max_checkpoint_0(
  690. log_group_t** max_group,
  691. ulint* max_field)
  692. {
  693. log_group_t* group = UT_LIST_GET_FIRST(log_sys->log_groups);
  694. ib_uint64_t max_no = 0;
  695. ib_uint64_t checkpoint_no;
  696. byte* buf = log_sys->checkpoint_buf;
  697. ut_ad(group->format == 0);
  698. ut_ad(UT_LIST_GET_NEXT(log_groups, group) == NULL);
  699. /** Offset of the first checkpoint checksum */
  700. static const uint CHECKSUM_1 = 288;
  701. /** Offset of the second checkpoint checksum */
  702. static const uint CHECKSUM_2 = CHECKSUM_1 + 4;
  703. /** Most significant bits of the checkpoint offset */
  704. static const uint OFFSET_HIGH32 = CHECKSUM_2 + 12;
  705. /** Least significant bits of the checkpoint offset */
  706. static const uint OFFSET_LOW32 = 16;
  707. for (ulint field = LOG_CHECKPOINT_1; field <= LOG_CHECKPOINT_2;
  708. field += LOG_CHECKPOINT_2 - LOG_CHECKPOINT_1) {
  709. log_group_header_read(group, field);
  710. if (static_cast<uint32_t>(ut_fold_binary(buf, CHECKSUM_1))
  711. != mach_read_from_4(buf + CHECKSUM_1)
  712. || static_cast<uint32_t>(
  713. ut_fold_binary(buf + LOG_CHECKPOINT_LSN,
  714. CHECKSUM_2 - LOG_CHECKPOINT_LSN))
  715. != mach_read_from_4(buf + CHECKSUM_2)) {
  716. DBUG_LOG("ib_log",
  717. "invalid pre-10.2.2 checkpoint " << field);
  718. continue;
  719. }
  720. group->state = LOG_GROUP_OK;
  721. group->lsn = mach_read_from_8(
  722. buf + LOG_CHECKPOINT_LSN);
  723. group->lsn_offset = static_cast<ib_uint64_t>(
  724. mach_read_from_4(buf + OFFSET_HIGH32)) << 32
  725. | mach_read_from_4(buf + OFFSET_LOW32);
  726. checkpoint_no = mach_read_from_8(
  727. buf + LOG_CHECKPOINT_NO);
  728. if (!log_crypt_101_read_checkpoint(buf)) {
  729. ib::error() << "Decrypting checkpoint failed";
  730. continue;
  731. }
  732. DBUG_PRINT("ib_log",
  733. ("checkpoint " UINT64PF " at " LSN_PF
  734. " found in group " ULINTPF,
  735. checkpoint_no, group->lsn, group->id));
  736. if (checkpoint_no >= max_no) {
  737. *max_group = group;
  738. *max_field = field;
  739. max_no = checkpoint_no;
  740. }
  741. }
  742. if (*max_group != NULL) {
  743. return(DB_SUCCESS);
  744. }
  745. ib::error() << "Upgrade after a crash is not supported."
  746. " This redo log was created before MariaDB 10.2.2,"
  747. " and we did not find a valid checkpoint."
  748. " Please follow the instructions at"
  749. " " REFMAN "upgrading.html";
  750. return(DB_ERROR);
  751. }
  752. /** Determine if a pre-MySQL 5.7.9/MariaDB 10.2.2 redo log is clean.
  753. @param[in] lsn checkpoint LSN
  754. @return error code
  755. @retval DB_SUCCESS if the redo log is clean
  756. @retval DB_CORRUPTION if the log block cannot be validated, or DB_ERROR if the redo log is dirty */
  757. static
  758. dberr_t
  759. recv_log_format_0_recover(lsn_t lsn)
  760. {
  761. log_mutex_enter();
  762. log_group_t* group = UT_LIST_GET_FIRST(log_sys->log_groups);
  763. const lsn_t source_offset
  764. = log_group_calc_lsn_offset(lsn, group);
  765. log_mutex_exit();
  766. const ulint page_no
  767. = (ulint) (source_offset / univ_page_size.physical());
  768. byte* buf = log_sys->buf;
  769. static const char* NO_UPGRADE_RECOVERY_MSG =
  770. "Upgrade after a crash is not supported."
  771. " This redo log was created before MariaDB 10.2.2";
  772. static const char* NO_UPGRADE_RTFM_MSG =
  773. ". Please follow the instructions at "
  774. REFMAN "upgrading.html";
  775. fil_io(IORequestLogRead, true,
  776. page_id_t(group->space_id, page_no),
  777. univ_page_size,
  778. (ulint) ((source_offset & ~(OS_FILE_LOG_BLOCK_SIZE - 1))
  779. % univ_page_size.physical()),
  780. OS_FILE_LOG_BLOCK_SIZE, buf, NULL);
  781. if (log_block_calc_checksum_format_0(buf)
  782. != log_block_get_checksum(buf)
  783. && !log_crypt_101_read_block(buf)) {
  784. ib::error() << NO_UPGRADE_RECOVERY_MSG
  785. << ", and it appears corrupted"
  786. << NO_UPGRADE_RTFM_MSG;
  787. return(DB_CORRUPTION);
  788. }
  789. if (log_block_get_data_len(buf)
  790. != (source_offset & (OS_FILE_LOG_BLOCK_SIZE - 1))) {
  791. ib::error() << NO_UPGRADE_RECOVERY_MSG
  792. << NO_UPGRADE_RTFM_MSG;
  793. return(DB_ERROR);
  794. }
  795. /* Mark the redo log for upgrading. */
  796. srv_log_file_size = 0;
  797. recv_sys->parse_start_lsn = recv_sys->recovered_lsn
  798. = recv_sys->scanned_lsn
  799. = recv_sys->mlog_checkpoint_lsn = lsn;
  800. log_sys->last_checkpoint_lsn = log_sys->next_checkpoint_lsn
  801. = log_sys->lsn = log_sys->write_lsn
  802. = log_sys->current_flush_lsn = log_sys->flushed_to_disk_lsn
  803. = lsn;
  804. log_sys->next_checkpoint_no = 0;
  805. return(DB_SUCCESS);
  806. }
  807. /** Find the latest checkpoint in the log header.
  808. @param[out] max_group log group, or NULL
  809. @param[out] max_field LOG_CHECKPOINT_1 or LOG_CHECKPOINT_2
  810. @return error code or DB_SUCCESS */
  811. static MY_ATTRIBUTE((warn_unused_result))
  812. dberr_t
  813. recv_find_max_checkpoint(
  814. log_group_t** max_group,
  815. ulint* max_field)
  816. {
  817. log_group_t* group;
  818. ib_uint64_t max_no;
  819. ib_uint64_t checkpoint_no;
  820. ulint field;
  821. byte* buf;
  822. group = UT_LIST_GET_FIRST(log_sys->log_groups);
  823. max_no = 0;
  824. *max_group = NULL;
  825. *max_field = 0;
  826. buf = log_sys->checkpoint_buf;
  827. while (group) {
  828. group->state = LOG_GROUP_CORRUPTED;
  829. log_group_header_read(group, 0);
  830. /* Check the header page checksum. There was no
  831. checksum in the first redo log format (version 0). */
  832. group->format = mach_read_from_4(buf + LOG_HEADER_FORMAT);
  833. if (group->format != 0
  834. && !recv_check_log_header_checksum(buf)) {
  835. ib::error() << "Invalid redo log header checksum.";
  836. return(DB_CORRUPTION);
  837. }
  838. switch (group->format) {
  839. case 0:
  840. return(recv_find_max_checkpoint_0(
  841. max_group, max_field));
  842. case LOG_HEADER_FORMAT_CURRENT:
  843. case LOG_HEADER_FORMAT_CURRENT | LOG_HEADER_FORMAT_ENCRYPTED:
  844. break;
  845. default:
  846. /* Ensure that the string is NUL-terminated. */
  847. buf[LOG_HEADER_CREATOR_END] = 0;
  848. ib::error() << "Unsupported redo log format."
  849. " The redo log was created"
  850. " with " << buf + LOG_HEADER_CREATOR <<
  851. ". Please follow the instructions at "
  852. REFMAN "upgrading-downgrading.html";
  853. /* Do not issue a message about a possibility
  854. to cleanly shut down the newer server version
  855. and to remove the redo logs, because the
  856. format of the system data structures may
  857. radically change after MySQL 5.7. */
  858. return(DB_ERROR);
  859. }
  860. for (field = LOG_CHECKPOINT_1; field <= LOG_CHECKPOINT_2;
  861. field += LOG_CHECKPOINT_2 - LOG_CHECKPOINT_1) {
  862. log_group_header_read(group, field);
  863. const ulint crc32 = log_block_calc_checksum_crc32(buf);
  864. const ulint cksum = log_block_get_checksum(buf);
  865. if (crc32 != cksum) {
  866. DBUG_PRINT("ib_log",
  867. ("invalid checkpoint,"
  868. " group " ULINTPF " at " ULINTPF
  869. ", checksum %x expected %x",
  870. group->id, field,
  871. (unsigned) cksum,
  872. (unsigned) crc32));
  873. continue;
  874. }
  875. if (group->is_encrypted()
  876. && !log_crypt_read_checkpoint_buf(buf)) {
  877. ib::error() << "Reading checkpoint"
  878. " encryption info failed.";
  879. continue;
  880. }
  881. group->state = LOG_GROUP_OK;
  882. group->lsn = mach_read_from_8(
  883. buf + LOG_CHECKPOINT_LSN);
  884. group->lsn_offset = mach_read_from_8(
  885. buf + LOG_CHECKPOINT_OFFSET);
  886. checkpoint_no = mach_read_from_8(
  887. buf + LOG_CHECKPOINT_NO);
  888. DBUG_PRINT("ib_log",
  889. ("checkpoint " UINT64PF " at " LSN_PF
  890. " found in group " ULINTPF,
  891. checkpoint_no, group->lsn, group->id));
  892. if (checkpoint_no >= max_no) {
  893. *max_group = group;
  894. *max_field = field;
  895. max_no = checkpoint_no;
  896. }
  897. }
  898. group = UT_LIST_GET_NEXT(log_groups, group);
  899. }
  900. if (*max_group == NULL) {
  901. /* Before 5.7.9, we could get here during database
  902. initialization if we created an ib_logfile0 file that
  903. was filled with zeroes, and were killed. After
  904. 5.7.9, we would reject such a file already earlier,
  905. when checking the file header. */
  906. ib::error() << "No valid checkpoint found"
  907. " (corrupted redo log)."
  908. " You can try --innodb-force-recovery=6"
  909. " as a last resort.";
  910. return(DB_ERROR);
  911. }
  912. return(DB_SUCCESS);
  913. }
  914. /** Try to parse a single log record body and also apply it if
  915. specified.
  916. @param[in] type redo log entry type
  917. @param[in] ptr redo log record body
  918. @param[in] end_ptr end of buffer
  919. @param[in] space_id tablespace identifier
  920. @param[in] page_no page number
  921. @param[in] apply whether to apply the record
  922. @param[in,out] block buffer block, or NULL if
  923. a page log record should not be applied
  924. or if it is a MLOG_FILE_ operation
  925. @param[in,out] mtr mini-transaction, or NULL if
  926. a page log record should not be applied
  927. @return log record end, NULL if not a complete record */
  928. static
  929. byte*
  930. recv_parse_or_apply_log_rec_body(
  931. mlog_id_t type,
  932. byte* ptr,
  933. byte* end_ptr,
  934. ulint space_id,
  935. ulint page_no,
  936. bool apply,
  937. buf_block_t* block,
  938. mtr_t* mtr)
  939. {
  940. ut_ad(!block == !mtr);
  941. ut_ad(!apply || recv_sys->mlog_checkpoint_lsn != 0);
  942. switch (type) {
  943. case MLOG_FILE_NAME:
  944. case MLOG_FILE_DELETE:
  945. case MLOG_FILE_CREATE2:
  946. case MLOG_FILE_RENAME2:
  947. ut_ad(block == NULL);
  948. /* Collect the file names when parsing the log,
  949. before applying any log records. */
  950. return(fil_name_parse(ptr, end_ptr, space_id, page_no, type,
  951. apply));
  952. case MLOG_INDEX_LOAD:
  953. if (end_ptr < ptr + 8) {
  954. return(NULL);
  955. }
  956. return(ptr + 8);
  957. case MLOG_TRUNCATE:
  958. return(truncate_t::parse_redo_entry(ptr, end_ptr, space_id));
  959. default:
  960. break;
  961. }
  962. dict_index_t* index = NULL;
  963. page_t* page;
  964. page_zip_des_t* page_zip;
  965. #ifdef UNIV_DEBUG
  966. ulint page_type;
  967. #endif /* UNIV_DEBUG */
  968. if (block) {
  969. /* Applying a page log record. */
  970. ut_ad(apply);
  971. page = block->frame;
  972. page_zip = buf_block_get_page_zip(block);
  973. ut_d(page_type = fil_page_get_type(page));
  974. } else if (apply
  975. && !is_predefined_tablespace(space_id)
  976. && recv_spaces.find(space_id) == recv_spaces.end()) {
  977. if (recv_sys->recovered_lsn < recv_sys->mlog_checkpoint_lsn) {
  978. /* We have not seen all records between the
  979. checkpoint and MLOG_CHECKPOINT. There should be
  980. a MLOG_FILE_DELETE for this tablespace later. */
  981. recv_spaces.insert(
  982. std::make_pair(space_id,
  983. file_name_t("", false)));
  984. goto parse_log;
  985. }
  986. ib::error() << "Missing MLOG_FILE_NAME or MLOG_FILE_DELETE"
  987. " for redo log record " << type << " (page "
  988. << space_id << ":" << page_no << ") at "
  989. << recv_sys->recovered_lsn << ".";
  990. recv_sys->found_corrupt_log = true;
  991. return(NULL);
  992. } else {
  993. parse_log:
  994. /* Parsing a page log record. */
  995. page = NULL;
  996. page_zip = NULL;
  997. ut_d(page_type = FIL_PAGE_TYPE_ALLOCATED);
  998. }
  999. const byte* old_ptr = ptr;
  1000. switch (type) {
  1001. #ifdef UNIV_LOG_LSN_DEBUG
  1002. case MLOG_LSN:
  1003. /* The LSN is checked in recv_parse_log_rec(). */
  1004. break;
  1005. #endif /* UNIV_LOG_LSN_DEBUG */
  1006. case MLOG_1BYTE: case MLOG_2BYTES: case MLOG_4BYTES: case MLOG_8BYTES:
  1007. #ifdef UNIV_DEBUG
  1008. if (page && page_type == FIL_PAGE_TYPE_ALLOCATED
  1009. && end_ptr >= ptr + 2) {
  1010. /* It is OK to set FIL_PAGE_TYPE and certain
  1011. list node fields on an empty page. Any other
  1012. write is not OK. */
  1013. /* NOTE: There may be bogus assertion failures for
  1014. dict_hdr_create(), trx_rseg_header_create(),
  1015. trx_sys_create_doublewrite_buf(), and
  1016. trx_sysf_create().
  1017. These are only called during database creation. */
  1018. ulint offs = mach_read_from_2(ptr);
  1019. switch (type) {
  1020. default:
  1021. ut_error;
  1022. case MLOG_2BYTES:
  1023. /* Note that this can fail when the
  1024. redo log has been written with something
  1025. older than InnoDB Plugin 1.0.4. */
  1026. ut_ad(offs == FIL_PAGE_TYPE
  1027. || offs == IBUF_TREE_SEG_HEADER
  1028. + IBUF_HEADER + FSEG_HDR_OFFSET
  1029. || offs == PAGE_BTR_IBUF_FREE_LIST
  1030. + PAGE_HEADER + FIL_ADDR_BYTE
  1031. || offs == PAGE_BTR_IBUF_FREE_LIST
  1032. + PAGE_HEADER + FIL_ADDR_BYTE
  1033. + FIL_ADDR_SIZE
  1034. || offs == PAGE_BTR_SEG_LEAF
  1035. + PAGE_HEADER + FSEG_HDR_OFFSET
  1036. || offs == PAGE_BTR_SEG_TOP
  1037. + PAGE_HEADER + FSEG_HDR_OFFSET
  1038. || offs == PAGE_BTR_IBUF_FREE_LIST_NODE
  1039. + PAGE_HEADER + FIL_ADDR_BYTE
  1040. + 0 /*FLST_PREV*/
  1041. || offs == PAGE_BTR_IBUF_FREE_LIST_NODE
  1042. + PAGE_HEADER + FIL_ADDR_BYTE
  1043. + FIL_ADDR_SIZE /*FLST_NEXT*/);
  1044. break;
  1045. case MLOG_4BYTES:
  1046. /* Note that this can fail when the
  1047. redo log has been written with something
  1048. older than InnoDB Plugin 1.0.4. */
  1049. ut_ad(0
  1050. || offs == IBUF_TREE_SEG_HEADER
  1051. + IBUF_HEADER + FSEG_HDR_SPACE
  1052. || offs == IBUF_TREE_SEG_HEADER
  1053. + IBUF_HEADER + FSEG_HDR_PAGE_NO
  1054. || offs == PAGE_BTR_IBUF_FREE_LIST
  1055. + PAGE_HEADER/* flst_init */
  1056. || offs == PAGE_BTR_IBUF_FREE_LIST
  1057. + PAGE_HEADER + FIL_ADDR_PAGE
  1058. || offs == PAGE_BTR_IBUF_FREE_LIST
  1059. + PAGE_HEADER + FIL_ADDR_PAGE
  1060. + FIL_ADDR_SIZE
  1061. || offs == PAGE_BTR_SEG_LEAF
  1062. + PAGE_HEADER + FSEG_HDR_PAGE_NO
  1063. || offs == PAGE_BTR_SEG_LEAF
  1064. + PAGE_HEADER + FSEG_HDR_SPACE
  1065. || offs == PAGE_BTR_SEG_TOP
  1066. + PAGE_HEADER + FSEG_HDR_PAGE_NO
  1067. || offs == PAGE_BTR_SEG_TOP
  1068. + PAGE_HEADER + FSEG_HDR_SPACE
  1069. || offs == PAGE_BTR_IBUF_FREE_LIST_NODE
  1070. + PAGE_HEADER + FIL_ADDR_PAGE
  1071. + 0 /*FLST_PREV*/
  1072. || offs == PAGE_BTR_IBUF_FREE_LIST_NODE
  1073. + PAGE_HEADER + FIL_ADDR_PAGE
  1074. + FIL_ADDR_SIZE /*FLST_NEXT*/);
  1075. break;
  1076. }
  1077. }
  1078. #endif /* UNIV_DEBUG */
  1079. ptr = mlog_parse_nbytes(type, ptr, end_ptr, page, page_zip);
  1080. if (ptr != NULL && page != NULL
  1081. && page_no == 0 && type == MLOG_4BYTES) {
  1082. ulint offs = mach_read_from_2(old_ptr);
  1083. switch (offs) {
  1084. fil_space_t* space;
  1085. ulint val;
  1086. default:
  1087. break;
  1088. case FSP_HEADER_OFFSET + FSP_SPACE_FLAGS:
  1089. case FSP_HEADER_OFFSET + FSP_SIZE:
  1090. case FSP_HEADER_OFFSET + FSP_FREE_LIMIT:
  1091. case FSP_HEADER_OFFSET + FSP_FREE + FLST_LEN:
  1092. space = fil_space_get(space_id);
  1093. ut_a(space != NULL);
  1094. val = mach_read_from_4(page + offs);
  1095. switch (offs) {
  1096. case FSP_HEADER_OFFSET + FSP_SPACE_FLAGS:
  1097. space->flags = val;
  1098. break;
  1099. case FSP_HEADER_OFFSET + FSP_SIZE:
  1100. space->size_in_header = val;
  1101. break;
  1102. case FSP_HEADER_OFFSET + FSP_FREE_LIMIT:
  1103. space->free_limit = val;
  1104. break;
  1105. case FSP_HEADER_OFFSET + FSP_FREE + FLST_LEN:
  1106. space->free_len = val;
  1107. ut_ad(val == flst_get_len(
  1108. page + offs));
  1109. break;
  1110. }
  1111. }
  1112. }
  1113. break;
  1114. case MLOG_REC_INSERT: case MLOG_COMP_REC_INSERT:
  1115. ut_ad(!page || fil_page_type_is_index(page_type));
  1116. if (NULL != (ptr = mlog_parse_index(
  1117. ptr, end_ptr,
  1118. type == MLOG_COMP_REC_INSERT,
  1119. &index))) {
  1120. ut_a(!page
  1121. || (ibool)!!page_is_comp(page)
  1122. == dict_table_is_comp(index->table));
  1123. ptr = page_cur_parse_insert_rec(FALSE, ptr, end_ptr,
  1124. block, index, mtr);
  1125. }
  1126. break;
  1127. case MLOG_REC_CLUST_DELETE_MARK: case MLOG_COMP_REC_CLUST_DELETE_MARK:
  1128. ut_ad(!page || fil_page_type_is_index(page_type));
  1129. if (NULL != (ptr = mlog_parse_index(
  1130. ptr, end_ptr,
  1131. type == MLOG_COMP_REC_CLUST_DELETE_MARK,
  1132. &index))) {
  1133. ut_a(!page
  1134. || (ibool)!!page_is_comp(page)
  1135. == dict_table_is_comp(index->table));
  1136. ptr = btr_cur_parse_del_mark_set_clust_rec(
  1137. ptr, end_ptr, page, page_zip, index);
  1138. }
  1139. break;
  1140. case MLOG_REC_SEC_DELETE_MARK:
  1141. ut_ad(!page || fil_page_type_is_index(page_type));
  1142. ptr = btr_cur_parse_del_mark_set_sec_rec(ptr, end_ptr,
  1143. page, page_zip);
  1144. break;
  1145. case MLOG_REC_UPDATE_IN_PLACE: case MLOG_COMP_REC_UPDATE_IN_PLACE:
  1146. ut_ad(!page || fil_page_type_is_index(page_type));
  1147. if (NULL != (ptr = mlog_parse_index(
  1148. ptr, end_ptr,
  1149. type == MLOG_COMP_REC_UPDATE_IN_PLACE,
  1150. &index))) {
  1151. ut_a(!page
  1152. || (ibool)!!page_is_comp(page)
  1153. == dict_table_is_comp(index->table));
  1154. ptr = btr_cur_parse_update_in_place(ptr, end_ptr, page,
  1155. page_zip, index);
  1156. }
  1157. break;
  1158. case MLOG_LIST_END_DELETE: case MLOG_COMP_LIST_END_DELETE:
  1159. case MLOG_LIST_START_DELETE: case MLOG_COMP_LIST_START_DELETE:
  1160. ut_ad(!page || fil_page_type_is_index(page_type));
  1161. if (NULL != (ptr = mlog_parse_index(
  1162. ptr, end_ptr,
  1163. type == MLOG_COMP_LIST_END_DELETE
  1164. || type == MLOG_COMP_LIST_START_DELETE,
  1165. &index))) {
  1166. ut_a(!page
  1167. || (ibool)!!page_is_comp(page)
  1168. == dict_table_is_comp(index->table));
  1169. ptr = page_parse_delete_rec_list(type, ptr, end_ptr,
  1170. block, index, mtr);
  1171. }
  1172. break;
  1173. case MLOG_LIST_END_COPY_CREATED: case MLOG_COMP_LIST_END_COPY_CREATED:
  1174. ut_ad(!page || fil_page_type_is_index(page_type));
  1175. if (NULL != (ptr = mlog_parse_index(
  1176. ptr, end_ptr,
  1177. type == MLOG_COMP_LIST_END_COPY_CREATED,
  1178. &index))) {
  1179. ut_a(!page
  1180. || (ibool)!!page_is_comp(page)
  1181. == dict_table_is_comp(index->table));
  1182. ptr = page_parse_copy_rec_list_to_created_page(
  1183. ptr, end_ptr, block, index, mtr);
  1184. }
  1185. break;
  1186. case MLOG_PAGE_REORGANIZE:
  1187. case MLOG_COMP_PAGE_REORGANIZE:
  1188. case MLOG_ZIP_PAGE_REORGANIZE:
  1189. ut_ad(!page || fil_page_type_is_index(page_type));
  1190. if (NULL != (ptr = mlog_parse_index(
  1191. ptr, end_ptr,
  1192. type != MLOG_PAGE_REORGANIZE,
  1193. &index))) {
  1194. ut_a(!page
  1195. || (ibool)!!page_is_comp(page)
  1196. == dict_table_is_comp(index->table));
  1197. ptr = btr_parse_page_reorganize(
  1198. ptr, end_ptr, index,
  1199. type == MLOG_ZIP_PAGE_REORGANIZE,
  1200. block, mtr);
  1201. }
  1202. break;
  1203. case MLOG_PAGE_CREATE: case MLOG_COMP_PAGE_CREATE:
  1204. /* Allow anything in page_type when creating a page. */
  1205. ut_a(!page_zip);
  1206. page_parse_create(block, type == MLOG_COMP_PAGE_CREATE, false);
  1207. break;
  1208. case MLOG_PAGE_CREATE_RTREE: case MLOG_COMP_PAGE_CREATE_RTREE:
  1209. page_parse_create(block, type == MLOG_COMP_PAGE_CREATE_RTREE,
  1210. true);
  1211. break;
  1212. case MLOG_UNDO_INSERT:
  1213. ut_ad(!page || page_type == FIL_PAGE_UNDO_LOG);
  1214. ptr = trx_undo_parse_add_undo_rec(ptr, end_ptr, page);
  1215. break;
  1216. case MLOG_UNDO_ERASE_END:
  1217. ut_ad(!page || page_type == FIL_PAGE_UNDO_LOG);
  1218. ptr = trx_undo_parse_erase_page_end(ptr, end_ptr, page, mtr);
  1219. break;
  1220. case MLOG_UNDO_INIT:
  1221. /* Allow anything in page_type when creating a page. */
  1222. ptr = trx_undo_parse_page_init(ptr, end_ptr, page, mtr);
  1223. break;
  1224. case MLOG_UNDO_HDR_DISCARD:
  1225. ut_ad(!page || page_type == FIL_PAGE_UNDO_LOG);
  1226. ptr = trx_undo_parse_discard_latest(ptr, end_ptr, page, mtr);
  1227. break;
  1228. case MLOG_UNDO_HDR_CREATE:
  1229. case MLOG_UNDO_HDR_REUSE:
  1230. ut_ad(!page || page_type == FIL_PAGE_UNDO_LOG);
  1231. ptr = trx_undo_parse_page_header(type, ptr, end_ptr,
  1232. page, mtr);
  1233. break;
  1234. case MLOG_REC_MIN_MARK: case MLOG_COMP_REC_MIN_MARK:
  1235. ut_ad(!page || fil_page_type_is_index(page_type));
  1236. /* On a compressed page, MLOG_COMP_REC_MIN_MARK
  1237. will be followed by MLOG_COMP_REC_DELETE
  1238. or MLOG_ZIP_WRITE_HEADER(FIL_PAGE_PREV, FIL_NULL)
  1239. in the same mini-transaction. */
  1240. ut_a(type == MLOG_COMP_REC_MIN_MARK || !page_zip);
  1241. ptr = btr_parse_set_min_rec_mark(
  1242. ptr, end_ptr, type == MLOG_COMP_REC_MIN_MARK,
  1243. page, mtr);
  1244. break;
  1245. case MLOG_REC_DELETE: case MLOG_COMP_REC_DELETE:
  1246. ut_ad(!page || fil_page_type_is_index(page_type));
  1247. if (NULL != (ptr = mlog_parse_index(
  1248. ptr, end_ptr,
  1249. type == MLOG_COMP_REC_DELETE,
  1250. &index))) {
  1251. ut_a(!page
  1252. || (ibool)!!page_is_comp(page)
  1253. == dict_table_is_comp(index->table));
  1254. ptr = page_cur_parse_delete_rec(ptr, end_ptr,
  1255. block, index, mtr);
  1256. }
  1257. break;
  1258. case MLOG_IBUF_BITMAP_INIT:
  1259. /* Allow anything in page_type when creating a page. */
  1260. ptr = ibuf_parse_bitmap_init(ptr, end_ptr, block, mtr);
  1261. break;
  1262. case MLOG_INIT_FILE_PAGE:
  1263. case MLOG_INIT_FILE_PAGE2:
  1264. /* Allow anything in page_type when creating a page. */
  1265. ptr = fsp_parse_init_file_page(ptr, end_ptr, block);
  1266. break;
  1267. case MLOG_WRITE_STRING:
  1268. ptr = mlog_parse_string(ptr, end_ptr, page, page_zip);
  1269. break;
  1270. case MLOG_ZIP_WRITE_NODE_PTR:
  1271. ut_ad(!page || fil_page_type_is_index(page_type));
  1272. ptr = page_zip_parse_write_node_ptr(ptr, end_ptr,
  1273. page, page_zip);
  1274. break;
  1275. case MLOG_ZIP_WRITE_BLOB_PTR:
  1276. ut_ad(!page || fil_page_type_is_index(page_type));
  1277. ptr = page_zip_parse_write_blob_ptr(ptr, end_ptr,
  1278. page, page_zip);
  1279. break;
  1280. case MLOG_ZIP_WRITE_HEADER:
  1281. ut_ad(!page || fil_page_type_is_index(page_type));
  1282. ptr = page_zip_parse_write_header(ptr, end_ptr,
  1283. page, page_zip);
  1284. break;
  1285. case MLOG_ZIP_PAGE_COMPRESS:
  1286. /* Allow anything in page_type when creating a page. */
  1287. ptr = page_zip_parse_compress(ptr, end_ptr,
  1288. page, page_zip);
  1289. break;
  1290. case MLOG_ZIP_PAGE_COMPRESS_NO_DATA:
  1291. if (NULL != (ptr = mlog_parse_index(
  1292. ptr, end_ptr, TRUE, &index))) {
  1293. ut_a(!page || ((ibool)!!page_is_comp(page)
  1294. == dict_table_is_comp(index->table)));
  1295. ptr = page_zip_parse_compress_no_data(
  1296. ptr, end_ptr, page, page_zip, index);
  1297. }
  1298. break;
  1299. case MLOG_FILE_WRITE_CRYPT_DATA:
  1300. ptr = const_cast<byte*>(fil_parse_write_crypt_data(ptr, end_ptr, block));
  1301. break;
  1302. default:
  1303. ptr = NULL;
  1304. ib::error() << "Incorrect log record type: " << type;
  1305. recv_sys->found_corrupt_log = true;
  1306. }
  1307. if (index) {
  1308. dict_table_t* table = index->table;
  1309. dict_mem_index_free(index);
  1310. dict_mem_table_free(table);
  1311. }
  1312. return(ptr);
  1313. }
  1314. /*********************************************************************//**
  1315. Calculates the fold value of a page file address: used in inserting or
  1316. searching for a log record in the hash table.
  1317. @return folded value */
  1318. UNIV_INLINE
  1319. ulint
  1320. recv_fold(
  1321. /*======*/
  1322. ulint space, /*!< in: space */
  1323. ulint page_no)/*!< in: page number */
  1324. {
  1325. return(ut_fold_ulint_pair(space, page_no));
  1326. }
  1327. /*********************************************************************//**
  1328. Calculates the hash value of a page file address: used in inserting or
  1329. searching for a log record in the hash table.
  1330. @return hash value */
  1331. UNIV_INLINE
  1332. ulint
  1333. recv_hash(
  1334. /*======*/
  1335. ulint space, /*!< in: space */
  1336. ulint page_no)/*!< in: page number */
  1337. {
  1338. return(hash_calc_hash(recv_fold(space, page_no), recv_sys->addr_hash));
  1339. }
  1340. /*********************************************************************//**
  1341. Gets the hashed file address struct for a page.
  1342. @return file address struct, NULL if not found in the hash table */
  1343. static
  1344. recv_addr_t*
  1345. recv_get_fil_addr_struct(
  1346. /*=====================*/
  1347. ulint space, /*!< in: space id */
  1348. ulint page_no)/*!< in: page number */
  1349. {
  1350. recv_addr_t* recv_addr;
  1351. for (recv_addr = static_cast<recv_addr_t*>(
  1352. HASH_GET_FIRST(recv_sys->addr_hash,
  1353. recv_hash(space, page_no)));
  1354. recv_addr != 0;
  1355. recv_addr = static_cast<recv_addr_t*>(
  1356. HASH_GET_NEXT(addr_hash, recv_addr))) {
  1357. if (recv_addr->space == space
  1358. && recv_addr->page_no == page_no) {
  1359. return(recv_addr);
  1360. }
  1361. }
  1362. return(NULL);
  1363. }
  1364. /*******************************************************************//**
  1365. Adds a new log record to the hash table of log records. */
  1366. static
  1367. void
  1368. recv_add_to_hash_table(
  1369. /*===================*/
  1370. mlog_id_t type, /*!< in: log record type */
  1371. ulint space, /*!< in: space id */
  1372. ulint page_no, /*!< in: page number */
  1373. byte* body, /*!< in: log record body */
  1374. byte* rec_end, /*!< in: log record end */
  1375. lsn_t start_lsn, /*!< in: start lsn of the mtr */
  1376. lsn_t end_lsn) /*!< in: end lsn of the mtr */
  1377. {
  1378. recv_t* recv;
  1379. ulint len;
  1380. recv_data_t* recv_data;
  1381. recv_data_t** prev_field;
  1382. recv_addr_t* recv_addr;
  1383. ut_ad(type != MLOG_FILE_DELETE);
  1384. ut_ad(type != MLOG_FILE_CREATE2);
  1385. ut_ad(type != MLOG_FILE_RENAME2);
  1386. ut_ad(type != MLOG_FILE_NAME);
  1387. ut_ad(type != MLOG_DUMMY_RECORD);
  1388. ut_ad(type != MLOG_CHECKPOINT);
  1389. ut_ad(type != MLOG_INDEX_LOAD);
  1390. ut_ad(type != MLOG_TRUNCATE);
  1391. len = rec_end - body;
  1392. recv = static_cast<recv_t*>(
  1393. mem_heap_alloc(recv_sys->heap, sizeof(recv_t)));
  1394. recv->type = type;
  1395. recv->len = rec_end - body;
  1396. recv->start_lsn = start_lsn;
  1397. recv->end_lsn = end_lsn;
  1398. recv_addr = recv_get_fil_addr_struct(space, page_no);
  1399. if (recv_addr == NULL) {
  1400. recv_addr = static_cast<recv_addr_t*>(
  1401. mem_heap_alloc(recv_sys->heap, sizeof(recv_addr_t)));
  1402. recv_addr->space = space;
  1403. recv_addr->page_no = page_no;
  1404. recv_addr->state = RECV_NOT_PROCESSED;
  1405. UT_LIST_INIT(recv_addr->rec_list, &recv_t::rec_list);
  1406. HASH_INSERT(recv_addr_t, addr_hash, recv_sys->addr_hash,
  1407. recv_fold(space, page_no), recv_addr);
  1408. recv_sys->n_addrs++;
  1409. #if 0
  1410. fprintf(stderr, "Inserting log rec for space %lu, page %lu\n",
  1411. space, page_no);
  1412. #endif
  1413. }
  1414. UT_LIST_ADD_LAST(recv_addr->rec_list, recv);
  1415. prev_field = &(recv->data);
  1416. /* Store the log record body in chunks of less than UNIV_PAGE_SIZE:
  1417. recv_sys->heap grows into the buffer pool, and bigger chunks could not
  1418. be allocated */
  1419. while (rec_end > body) {
  1420. len = rec_end - body;
  1421. if (len > RECV_DATA_BLOCK_SIZE) {
  1422. len = RECV_DATA_BLOCK_SIZE;
  1423. }
  1424. recv_data = static_cast<recv_data_t*>(
  1425. mem_heap_alloc(recv_sys->heap,
  1426. sizeof(recv_data_t) + len));
  1427. *prev_field = recv_data;
  1428. memcpy(recv_data + 1, body, len);
  1429. prev_field = &(recv_data->next);
  1430. body += len;
  1431. }
  1432. *prev_field = NULL;
  1433. }
  1434. /*********************************************************************//**
  1435. Copies the log record body from recv to buf. */
  1436. static
  1437. void
  1438. recv_data_copy_to_buf(
  1439. /*==================*/
  1440. byte* buf, /*!< in: buffer of length at least recv->len */
  1441. recv_t* recv) /*!< in: log record */
  1442. {
  1443. recv_data_t* recv_data;
  1444. ulint part_len;
  1445. ulint len;
  1446. len = recv->len;
  1447. recv_data = recv->data;
  1448. while (len > 0) {
  1449. if (len > RECV_DATA_BLOCK_SIZE) {
  1450. part_len = RECV_DATA_BLOCK_SIZE;
  1451. } else {
  1452. part_len = len;
  1453. }
  1454. ut_memcpy(buf, ((byte*) recv_data) + sizeof(recv_data_t),
  1455. part_len);
  1456. buf += part_len;
  1457. len -= part_len;
  1458. recv_data = recv_data->next;
  1459. }
  1460. }
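/* Illustrative sketch, not part of the recovery code: how a record
body can be split into bounded chunks and reassembled again, in the
spirit of recv_add_to_hash_table() and recv_data_copy_to_buf() above.
It uses std::vector instead of recv_sys->heap. SKETCH_CHUNK_SIZE and
all sketch_* names are hypothetical; the real limit is
RECV_DATA_BLOCK_SIZE, and 8 is chosen only to keep the example small. */
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

static const std::size_t SKETCH_CHUNK_SIZE = 8;	/* assumed, for demonstration */

typedef std::vector<std::vector<unsigned char> > sketch_chunks_t;

/* Split [body, rec_end) into chunks of at most SKETCH_CHUNK_SIZE bytes. */
static inline
sketch_chunks_t
sketch_split_body(const unsigned char* body, const unsigned char* rec_end)
{
	assert(rec_end >= body);
	sketch_chunks_t	chunks;
	while (rec_end > body) {
		const std::size_t len = std::min<std::size_t>(
			rec_end - body, SKETCH_CHUNK_SIZE);
		chunks.push_back(std::vector<unsigned char>(body, body + len));
		body += len;
	}
	return chunks;
}

/* Concatenate the chunks back into one contiguous buffer. */
static inline
std::vector<unsigned char>
sketch_copy_to_buf(const sketch_chunks_t& chunks)
{
	std::vector<unsigned char>	buf;
	for (std::size_t i = 0; i < chunks.size(); i++) {
		buf.insert(buf.end(), chunks[i].begin(), chunks[i].end());
	}
	return buf;
}
/* With the assumed chunk size of 8, a 20-byte body is stored as chunks
of 8, 8 and 4 bytes, and sketch_copy_to_buf() returns the original
20 bytes in order. */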
  1461. /** Apply the hashed log records to the page, if the page lsn is less than the
  1462. lsn of a log record.
  1463. @param just_read_in whether the page recently arrived to the I/O handler
  1464. @param block the page in the buffer pool */
  1465. void
  1466. recv_recover_page(bool just_read_in, buf_block_t* block)
  1467. {
  1468. page_t* page;
  1469. page_zip_des_t* page_zip;
  1470. recv_addr_t* recv_addr;
  1471. recv_t* recv;
  1472. byte* buf;
  1473. lsn_t start_lsn;
  1474. lsn_t end_lsn;
  1475. lsn_t page_lsn;
  1476. lsn_t page_newest_lsn;
  1477. ibool modification_to_page;
  1478. mtr_t mtr;
  1479. mutex_enter(&(recv_sys->mutex));
  1480. if (recv_sys->apply_log_recs == FALSE) {
  1481. /* Log records should not be applied now */
  1482. mutex_exit(&(recv_sys->mutex));
  1483. return;
  1484. }
  1485. recv_addr = recv_get_fil_addr_struct(block->page.id.space(),
  1486. block->page.id.page_no());
  1487. if ((recv_addr == NULL)
  1488. || (recv_addr->state == RECV_BEING_PROCESSED)
  1489. || (recv_addr->state == RECV_PROCESSED)) {
  1490. ut_ad(recv_addr == NULL || recv_needed_recovery);
  1491. mutex_exit(&(recv_sys->mutex));
  1492. return;
  1493. }
  1494. ut_ad(recv_needed_recovery);
  1495. DBUG_PRINT("ib_log",
  1496. ("Applying log to page %u:%u",
  1497. recv_addr->space, recv_addr->page_no));
  1498. recv_addr->state = RECV_BEING_PROCESSED;
  1499. mutex_exit(&(recv_sys->mutex));
  1500. mtr_start(&mtr);
  1501. mtr_set_log_mode(&mtr, MTR_LOG_NONE);
  1502. page = block->frame;
  1503. page_zip = buf_block_get_page_zip(block);
  1504. if (just_read_in) {
  1505. /* Move the ownership of the x-latch on the page to
  1506. this OS thread, so that we can acquire a second
  1507. x-latch on it. This is needed for the operations to
  1508. the page to pass the debug checks. */
  1509. rw_lock_x_lock_move_ownership(&block->lock);
  1510. }
  1511. ibool success = buf_page_get_known_nowait(
  1512. RW_X_LATCH, block, BUF_KEEP_OLD,
  1513. __FILE__, __LINE__, &mtr);
  1514. ut_a(success);
  1515. buf_block_dbg_add_level(block, SYNC_NO_ORDER_CHECK);
  1516. /* Read the newest modification lsn from the page */
  1517. page_lsn = mach_read_from_8(page + FIL_PAGE_LSN);
  1518. /* It may be that the page has been modified in the buffer
  1519. pool: read the newest modification lsn there */
  1520. page_newest_lsn = buf_page_get_newest_modification(&block->page);
  1521. if (page_newest_lsn) {
  1522. page_lsn = page_newest_lsn;
  1523. }
  1524. modification_to_page = FALSE;
  1525. start_lsn = end_lsn = 0;
  1526. recv = UT_LIST_GET_FIRST(recv_addr->rec_list);
  1527. while (recv) {
  1528. end_lsn = recv->end_lsn;
  1529. ut_ad(end_lsn
  1530. <= UT_LIST_GET_FIRST(log_sys->log_groups)->scanned_lsn);
  1531. if (recv->len > RECV_DATA_BLOCK_SIZE) {
  1532. /* We have to copy the record body to a separate
  1533. buffer */
  1534. buf = static_cast<byte*>(ut_malloc_nokey(recv->len));
  1535. recv_data_copy_to_buf(buf, recv);
  1536. } else {
  1537. buf = ((byte*)(recv->data)) + sizeof(recv_data_t);
  1538. }
  1539. if (recv->type == MLOG_INIT_FILE_PAGE) {
  1540. page_lsn = page_newest_lsn;
  1541. memset(FIL_PAGE_LSN + page, 0, 8);
  1542. memset(UNIV_PAGE_SIZE - FIL_PAGE_END_LSN_OLD_CHKSUM
  1543. + page, 0, 8);
  1544. if (page_zip) {
  1545. memset(FIL_PAGE_LSN + page_zip->data, 0, 8);
  1546. }
  1547. }
/* If a per-table tablespace was truncated and there are redo
records from before the truncate that would be applied during
recovery (no checkpoint happened since the truncate), skip
those records by comparing LSNs, because they may no longer be
valid after the truncate.
The LSN at the start of the truncate is recorded, and any redo
record with a smaller start LSN is skipped.
Note: We cannot skip the whole recv_addr, because the same page
may have valid post-truncate redo records that must be applied. */
  1557. bool skip_recv = false;
  1558. if (srv_was_tablespace_truncated(fil_space_get(recv_addr->space))) {
  1559. lsn_t init_lsn =
  1560. truncate_t::get_truncated_tablespace_init_lsn(
  1561. recv_addr->space);
  1562. skip_recv = (recv->start_lsn < init_lsn);
  1563. }
/* Do not apply redo log records to a tablespace that is
truncated. After recovery, a fix-up action will restore the
tablespace to its normal state.
Applying redo at this stage could fail, because the log
describes operations on pages as they were before the
tablespace was reinitialized. */
  1571. if (recv->start_lsn >= page_lsn
  1572. && !srv_is_tablespace_truncated(recv_addr->space)
  1573. && !skip_recv) {
  1574. lsn_t end_lsn;
  1575. if (!modification_to_page) {
  1576. modification_to_page = TRUE;
  1577. start_lsn = recv->start_lsn;
  1578. }
  1579. DBUG_PRINT("ib_log",
  1580. ("apply " LSN_PF ":"
  1581. " %s len " ULINTPF " page %u:%u",
  1582. recv->start_lsn,
  1583. get_mlog_string(recv->type), recv->len,
  1584. recv_addr->space,
  1585. recv_addr->page_no));
  1586. recv_parse_or_apply_log_rec_body(
  1587. recv->type, buf, buf + recv->len,
  1588. recv_addr->space, recv_addr->page_no,
  1589. true, block, &mtr);
  1590. end_lsn = recv->start_lsn + recv->len;
  1591. mach_write_to_8(FIL_PAGE_LSN + page, end_lsn);
  1592. mach_write_to_8(UNIV_PAGE_SIZE
  1593. - FIL_PAGE_END_LSN_OLD_CHKSUM
  1594. + page, end_lsn);
  1595. if (page_zip) {
  1596. mach_write_to_8(FIL_PAGE_LSN
  1597. + page_zip->data, end_lsn);
  1598. }
  1599. }
  1600. if (recv->len > RECV_DATA_BLOCK_SIZE) {
  1601. ut_free(buf);
  1602. }
  1603. recv = UT_LIST_GET_NEXT(rec_list, recv);
  1604. }
  1605. #ifdef UNIV_ZIP_DEBUG
  1606. if (fil_page_index_page_check(page)) {
  1607. page_zip_des_t* page_zip = buf_block_get_page_zip(block);
  1608. ut_a(!page_zip
  1609. || page_zip_validate_low(page_zip, page, NULL, FALSE));
  1610. }
  1611. #endif /* UNIV_ZIP_DEBUG */
  1612. if (modification_to_page) {
  1613. ut_a(block);
  1614. log_flush_order_mutex_enter();
  1615. buf_flush_recv_note_modification(block, start_lsn, end_lsn);
  1616. log_flush_order_mutex_exit();
  1617. }
  1618. /* Make sure that committing mtr does not change the modification
  1619. lsn values of page */
  1620. mtr.discard_modifications();
  1621. mtr_commit(&mtr);
  1622. ib_time_t time = ut_time();
  1623. mutex_enter(&recv_sys->mutex);
  1624. if (recv_max_page_lsn < page_lsn) {
  1625. recv_max_page_lsn = page_lsn;
  1626. }
  1627. recv_addr->state = RECV_PROCESSED;
  1628. ut_a(recv_sys->n_addrs > 0);
  1629. if (ulint n = --recv_sys->n_addrs) {
  1630. if (recv_sys->report(time)) {
  1631. ib::info() << "To recover: " << n << " pages from log";
  1632. sd_notifyf(0, "STATUS=To recover: " ULINTPF
  1633. " pages from log", n);
  1634. }
  1635. }
  1636. mutex_exit(&recv_sys->mutex);
  1637. }
  1638. /** Reads in pages which have hashed log records, from an area around a given
  1639. page number.
  1640. @param[in] page_id page id
  1641. @return number of pages found */
  1642. static
  1643. ulint
  1644. recv_read_in_area(
  1645. const page_id_t& page_id)
  1646. {
  1647. recv_addr_t* recv_addr;
  1648. ulint page_nos[RECV_READ_AHEAD_AREA];
  1649. ulint low_limit;
  1650. ulint n;
  1651. low_limit = page_id.page_no()
  1652. - (page_id.page_no() % RECV_READ_AHEAD_AREA);
  1653. n = 0;
  1654. for (ulint page_no = low_limit;
  1655. page_no < low_limit + RECV_READ_AHEAD_AREA;
  1656. page_no++) {
  1657. recv_addr = recv_get_fil_addr_struct(page_id.space(), page_no);
  1658. const page_id_t cur_page_id(page_id.space(), page_no);
  1659. if (recv_addr && !buf_page_peek(cur_page_id)) {
  1660. mutex_enter(&(recv_sys->mutex));
  1661. if (recv_addr->state == RECV_NOT_PROCESSED) {
  1662. recv_addr->state = RECV_BEING_READ;
  1663. page_nos[n] = page_no;
  1664. n++;
  1665. }
  1666. mutex_exit(&(recv_sys->mutex));
  1667. }
  1668. }
  1669. buf_read_recv_pages(FALSE, page_id.space(), page_nos, n);
  1670. return(n);
  1671. }
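/* Illustrative sketch, not part of the recovery code: the page range
that recv_read_in_area() above inspects for a given page number.
SKETCH_READ_AHEAD_AREA is an assumed value standing in for
RECV_READ_AHEAD_AREA, and the sketch_* names are hypothetical. */
#include <cassert>

static const unsigned long SKETCH_READ_AHEAD_AREA = 32;	/* assumed */

/* Round page_no down to the start of its read-ahead area. The area
then covers [low_limit, low_limit + SKETCH_READ_AHEAD_AREA). */
static inline
unsigned long
sketch_area_low_limit(unsigned long page_no)
{
	const unsigned long low_limit
		= page_no - (page_no % SKETCH_READ_AHEAD_AREA);
	assert(low_limit <= page_no);
	assert(page_no < low_limit + SKETCH_READ_AHEAD_AREA);
	return low_limit;
}
/* With the assumed area size of 32, page 70 belongs to the area that
starts at page 64, so pages 64 to 95 are candidates for one batch. */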
  1672. /** Apply the hash table of stored log records to persistent data pages.
  1673. @param[in] last_batch whether the change buffer merge will be
  1674. performed as part of the operation */
  1675. void
  1676. recv_apply_hashed_log_recs(bool last_batch)
  1677. {
  1678. for (;;) {
  1679. mutex_enter(&recv_sys->mutex);
  1680. if (!recv_sys->apply_batch_on) {
  1681. break;
  1682. }
  1683. mutex_exit(&recv_sys->mutex);
  1684. os_thread_sleep(500000);
  1685. }
  1686. ut_ad(!last_batch == log_mutex_own());
  1687. if (!last_batch) {
  1688. recv_no_ibuf_operations = true;
  1689. }
  1690. if (ulint n = recv_sys->n_addrs) {
  1691. const char* msg = last_batch
  1692. ? "Starting final batch to recover "
  1693. : "Starting a batch to recover ";
  1694. ib::info() << msg << n << " pages from redo log.";
  1695. sd_notifyf(0, "STATUS=%s" ULINTPF " pages from redo log",
  1696. msg, n);
  1697. }
  1698. recv_sys->apply_log_recs = TRUE;
  1699. recv_sys->apply_batch_on = TRUE;
  1700. for (ulint i = 0; i < hash_get_n_cells(recv_sys->addr_hash); i++) {
  1701. for (recv_addr_t* recv_addr = static_cast<recv_addr_t*>(
  1702. HASH_GET_FIRST(recv_sys->addr_hash, i));
  1703. recv_addr;
  1704. recv_addr = static_cast<recv_addr_t*>(
  1705. HASH_GET_NEXT(addr_hash, recv_addr))) {
  1706. if (srv_is_tablespace_truncated(recv_addr->space)) {
/* Avoid applying REDO log for the tablespace
that is scheduled for TRUNCATE. */
  1709. ut_a(recv_sys->n_addrs);
  1710. recv_addr->state = RECV_DISCARDED;
  1711. recv_sys->n_addrs--;
  1712. continue;
  1713. }
  1714. if (recv_addr->state == RECV_DISCARDED) {
  1715. ut_a(recv_sys->n_addrs);
  1716. recv_sys->n_addrs--;
  1717. continue;
  1718. }
  1719. const page_id_t page_id(recv_addr->space,
  1720. recv_addr->page_no);
  1721. bool found;
  1722. const page_size_t& page_size
  1723. = fil_space_get_page_size(recv_addr->space,
  1724. &found);
  1725. ut_ad(found);
  1726. if (recv_addr->state == RECV_NOT_PROCESSED) {
  1727. mutex_exit(&recv_sys->mutex);
  1728. if (buf_page_peek(page_id)) {
  1729. mtr_t mtr;
  1730. mtr.start();
  1731. buf_block_t* block = buf_page_get(
  1732. page_id, page_size,
  1733. RW_X_LATCH, &mtr);
  1734. buf_block_dbg_add_level(
  1735. block, SYNC_NO_ORDER_CHECK);
  1736. recv_recover_page(FALSE, block);
  1737. mtr.commit();
  1738. } else {
  1739. recv_read_in_area(page_id);
  1740. }
  1741. mutex_enter(&recv_sys->mutex);
  1742. }
  1743. }
  1744. }
  1745. /* Wait until all the pages have been processed */
  1746. while (recv_sys->n_addrs != 0) {
  1747. mutex_exit(&(recv_sys->mutex));
  1748. os_thread_sleep(500000);
  1749. mutex_enter(&(recv_sys->mutex));
  1750. }
  1751. if (!last_batch) {
  1752. /* Flush all the file pages to disk and invalidate them in
  1753. the buffer pool */
  1754. ut_d(recv_no_log_write = true);
  1755. mutex_exit(&(recv_sys->mutex));
  1756. log_mutex_exit();
  1757. /* Stop the recv_writer thread from issuing any LRU
  1758. flush batches. */
  1759. mutex_enter(&recv_sys->writer_mutex);
  1760. /* Wait for any currently run batch to end. */
  1761. buf_flush_wait_LRU_batch_end();
  1762. os_event_reset(recv_sys->flush_end);
  1763. recv_sys->flush_type = BUF_FLUSH_LIST;
  1764. os_event_set(recv_sys->flush_start);
  1765. os_event_wait(recv_sys->flush_end);
  1766. buf_pool_invalidate();
  1767. /* Allow batches from recv_writer thread. */
  1768. mutex_exit(&recv_sys->writer_mutex);
  1769. log_mutex_enter();
  1770. mutex_enter(&(recv_sys->mutex));
  1771. ut_d(recv_no_log_write = false);
  1772. recv_no_ibuf_operations = false;
  1773. }
  1774. recv_sys->apply_log_recs = FALSE;
  1775. recv_sys->apply_batch_on = FALSE;
  1776. recv_sys_empty_hash();
  1777. mutex_exit(&recv_sys->mutex);
  1778. }
  1779. /** Tries to parse a single log record.
  1780. @param[out] type log record type
  1781. @param[in] ptr pointer to a buffer
  1782. @param[in] end_ptr end of the buffer
@param[out] space tablespace identifier
  1784. @param[out] page_no page number
  1785. @param[in] apply whether to apply MLOG_FILE_* records
  1786. @param[out] body start of log record body
  1787. @return length of the record, or 0 if the record was not complete */
  1788. static
  1789. ulint
  1790. recv_parse_log_rec(
  1791. mlog_id_t* type,
  1792. byte* ptr,
  1793. byte* end_ptr,
  1794. ulint* space,
  1795. ulint* page_no,
  1796. bool apply,
  1797. byte** body)
  1798. {
  1799. byte* new_ptr;
  1800. *body = NULL;
  1801. UNIV_MEM_INVALID(type, sizeof *type);
  1802. UNIV_MEM_INVALID(space, sizeof *space);
  1803. UNIV_MEM_INVALID(page_no, sizeof *page_no);
  1804. UNIV_MEM_INVALID(body, sizeof *body);
  1805. if (ptr == end_ptr) {
  1806. return(0);
  1807. }
  1808. switch (*ptr) {
  1809. #ifdef UNIV_LOG_LSN_DEBUG
  1810. case MLOG_LSN | MLOG_SINGLE_REC_FLAG:
  1811. case MLOG_LSN:
  1812. new_ptr = mlog_parse_initial_log_record(
  1813. ptr, end_ptr, type, space, page_no);
  1814. if (new_ptr != NULL) {
  1815. const lsn_t lsn = static_cast<lsn_t>(
  1816. *space) << 32 | *page_no;
  1817. ut_a(lsn == recv_sys->recovered_lsn);
  1818. }
  1819. *type = MLOG_LSN;
  1820. return(new_ptr - ptr);
  1821. #endif /* UNIV_LOG_LSN_DEBUG */
  1822. case MLOG_MULTI_REC_END:
  1823. case MLOG_DUMMY_RECORD:
  1824. *type = static_cast<mlog_id_t>(*ptr);
  1825. return(1);
  1826. case MLOG_CHECKPOINT:
  1827. if (end_ptr < ptr + SIZE_OF_MLOG_CHECKPOINT) {
  1828. return(0);
  1829. }
  1830. *type = static_cast<mlog_id_t>(*ptr);
  1831. return(SIZE_OF_MLOG_CHECKPOINT);
  1832. case MLOG_MULTI_REC_END | MLOG_SINGLE_REC_FLAG:
  1833. case MLOG_DUMMY_RECORD | MLOG_SINGLE_REC_FLAG:
  1834. case MLOG_CHECKPOINT | MLOG_SINGLE_REC_FLAG:
ib::error() << "Incorrect log record type: " << unsigned(*ptr);
  1836. recv_sys->found_corrupt_log = true;
  1837. return(0);
  1838. }
  1839. new_ptr = mlog_parse_initial_log_record(ptr, end_ptr, type, space,
  1840. page_no);
  1841. *body = new_ptr;
  1842. if (UNIV_UNLIKELY(!new_ptr)) {
  1843. return(0);
  1844. }
  1845. const byte* old_ptr = new_ptr;
  1846. new_ptr = recv_parse_or_apply_log_rec_body(
  1847. *type, new_ptr, end_ptr, *space, *page_no, apply, NULL, NULL);
  1848. if (UNIV_UNLIKELY(new_ptr == NULL)) {
  1849. return(0);
  1850. }
  1851. if (*page_no == 0 && *type == MLOG_4BYTES
  1852. && mach_read_from_2(old_ptr) == FSP_HEADER_OFFSET + FSP_SIZE) {
  1853. old_ptr += 2;
  1854. fil_space_set_recv_size(*space,
  1855. mach_parse_compressed(&old_ptr,
  1856. end_ptr));
  1857. }
  1858. return(new_ptr - ptr);
  1859. }
  1860. /*******************************************************//**
  1861. Calculates the new value for lsn when more data is added to the log. */
  1862. static
  1863. lsn_t
  1864. recv_calc_lsn_on_data_add(
  1865. /*======================*/
  1866. lsn_t lsn, /*!< in: old lsn */
  1867. ib_uint64_t len) /*!< in: this many bytes of data is
  1868. added, log block headers not included */
  1869. {
  1870. ulint frag_len;
  1871. ib_uint64_t lsn_len;
  1872. frag_len = (lsn % OS_FILE_LOG_BLOCK_SIZE) - LOG_BLOCK_HDR_SIZE;
  1873. ut_ad(frag_len < OS_FILE_LOG_BLOCK_SIZE - LOG_BLOCK_HDR_SIZE
  1874. - LOG_BLOCK_TRL_SIZE);
  1875. lsn_len = len;
  1876. lsn_len += (lsn_len + frag_len)
  1877. / (OS_FILE_LOG_BLOCK_SIZE - LOG_BLOCK_HDR_SIZE
  1878. - LOG_BLOCK_TRL_SIZE)
  1879. * (LOG_BLOCK_HDR_SIZE + LOG_BLOCK_TRL_SIZE);
  1880. return(lsn + lsn_len);
  1881. }
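/* Illustrative sketch, not part of the recovery code: a standalone
restatement of the arithmetic in recv_calc_lsn_on_data_add() above.
The constants are assumptions that mirror the usual redo log layout
(512-byte blocks, 12-byte header, 4-byte trailer), and the sketch_*
names are hypothetical. Each time the payload crosses a block
boundary, the LSN also advances over one trailer and one header. */
#include <cassert>
#include <cstdint>

static const uint64_t SKETCH_BLOCK_SIZE = 512;	/* assumed block size */
static const uint64_t SKETCH_HDR_SIZE = 12;	/* assumed header size */
static const uint64_t SKETCH_TRL_SIZE = 4;	/* assumed trailer size */

static inline
uint64_t
sketch_calc_lsn_on_data_add(uint64_t lsn, uint64_t len)
{
	const uint64_t payload = SKETCH_BLOCK_SIZE
		- SKETCH_HDR_SIZE - SKETCH_TRL_SIZE;
	/* The LSN must point into the payload area of a block. */
	assert(lsn % SKETCH_BLOCK_SIZE >= SKETCH_HDR_SIZE);
	const uint64_t frag_len = lsn % SKETCH_BLOCK_SIZE - SKETCH_HDR_SIZE;
	assert(frag_len < payload);
	/* Number of block boundaries crossed, times the per-block overhead. */
	return lsn + len + (len + frag_len) / payload
		* (SKETCH_HDR_SIZE + SKETCH_TRL_SIZE);
}
/* Under these assumptions, adding 100 bytes at LSN 2060 (the first
payload byte of a block) stays within the block and gives 2160, while
adding 496 bytes fills the block and gives 2572, the first payload
byte of the next block. */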
  1882. /** Prints diagnostic info of corrupt log.
  1883. @param[in] ptr pointer to corrupt log record
  1884. @param[in] type type of the log record (could be garbage)
  1885. @param[in] space tablespace ID (could be garbage)
  1886. @param[in] page_no page number (could be garbage)
  1887. @return whether processing should continue */
  1888. static
  1889. bool
  1890. recv_report_corrupt_log(
  1891. const byte* ptr,
  1892. int type,
  1893. ulint space,
  1894. ulint page_no)
  1895. {
  1896. ib::error() <<
  1897. "############### CORRUPT LOG RECORD FOUND ##################";
  1898. ib::info() << "Log record type " << type << ", page " << space << ":"
  1899. << page_no << ". Log parsing proceeded successfully up to "
  1900. << recv_sys->recovered_lsn << ". Previous log record type "
  1901. << recv_previous_parsed_rec_type << ", is multi "
  1902. << recv_previous_parsed_rec_is_multi << " Recv offset "
  1903. << (ptr - recv_sys->buf) << ", prev "
  1904. << recv_previous_parsed_rec_offset;
  1905. ut_ad(ptr <= recv_sys->buf + recv_sys->len);
  1906. const ulint limit = 100;
  1907. const ulint before
  1908. = std::min(recv_previous_parsed_rec_offset, limit);
  1909. const ulint after
  1910. = std::min(recv_sys->len - (ptr - recv_sys->buf), limit);
  1911. ib::info() << "Hex dump starting " << before << " bytes before and"
  1912. " ending " << after << " bytes after the corrupted record:";
  1913. ut_print_buf(stderr,
  1914. recv_sys->buf
  1915. + recv_previous_parsed_rec_offset - before,
  1916. ptr - recv_sys->buf + before + after
  1917. - recv_previous_parsed_rec_offset);
  1918. putc('\n', stderr);
  1919. if (!srv_force_recovery) {
  1920. ib::info() << "Set innodb_force_recovery to ignore this error.";
  1921. return(false);
  1922. }
  1923. ib::warn() << "The log file may have been corrupt and it is possible"
  1924. " that the log scan did not proceed far enough in recovery!"
  1925. " Please run CHECK TABLE on your InnoDB tables to check"
  1926. " that they are ok! If mysqld crashes after this recovery; "
  1927. << FORCE_RECOVERY_MSG;
  1928. return(true);
  1929. }
  1930. /** Whether to store redo log records to the hash table */
  1931. enum store_t {
  1932. /** Do not store redo log records. */
  1933. STORE_NO,
  1934. /** Store redo log records. */
  1935. STORE_YES,
  1936. /** Store redo log records if the tablespace exists. */
  1937. STORE_IF_EXISTS
  1938. };
/** Parse log records from a buffer and optionally store them in a
hash table until they are merged to file pages.
@param[in] checkpoint_lsn the LSN of the latest checkpoint
@param[in] store whether to store page operations
@param[in] apply whether to apply the records
@return whether MLOG_CHECKPOINT record was seen the first time,
or corruption was noticed */
  1947. static MY_ATTRIBUTE((warn_unused_result))
  1948. bool
  1949. recv_parse_log_recs(
  1950. lsn_t checkpoint_lsn,
  1951. store_t store,
  1952. bool apply)
  1953. {
  1954. byte* ptr;
  1955. byte* end_ptr;
  1956. bool single_rec;
  1957. ulint len;
  1958. lsn_t new_recovered_lsn;
  1959. lsn_t old_lsn;
  1960. mlog_id_t type;
  1961. ulint space;
  1962. ulint page_no;
  1963. byte* body;
  1964. ut_ad(log_mutex_own());
  1965. ut_ad(recv_sys->parse_start_lsn != 0);
  1966. loop:
  1967. ptr = recv_sys->buf + recv_sys->recovered_offset;
  1968. end_ptr = recv_sys->buf + recv_sys->len;
  1969. if (ptr == end_ptr) {
  1970. return(false);
  1971. }
  1972. switch (*ptr) {
  1973. case MLOG_CHECKPOINT:
  1974. #ifdef UNIV_LOG_LSN_DEBUG
  1975. case MLOG_LSN:
  1976. #endif /* UNIV_LOG_LSN_DEBUG */
  1977. case MLOG_DUMMY_RECORD:
  1978. single_rec = true;
  1979. break;
  1980. default:
  1981. single_rec = !!(*ptr & MLOG_SINGLE_REC_FLAG);
  1982. }
  1983. if (single_rec) {
  1984. /* The mtr did not modify multiple pages */
  1985. old_lsn = recv_sys->recovered_lsn;
  1986. /* Try to parse a log record, fetching its type, space id,
  1987. page no, and a pointer to the body of the log record */
  1988. len = recv_parse_log_rec(&type, ptr, end_ptr, &space,
  1989. &page_no, apply, &body);
  1990. if (len == 0) {
  1991. return(false);
  1992. }
  1993. if (recv_sys->found_corrupt_log) {
  1994. recv_report_corrupt_log(
  1995. ptr, type, space, page_no);
  1996. return(true);
  1997. }
  1998. if (recv_sys->found_corrupt_fs) {
  1999. return(true);
  2000. }
  2001. new_recovered_lsn = recv_calc_lsn_on_data_add(old_lsn, len);
  2002. if (new_recovered_lsn > recv_sys->scanned_lsn) {
  2003. /* The log record filled a log block, and we require
  2004. that also the next log block should have been scanned
  2005. in */
  2006. return(false);
  2007. }
  2008. recv_previous_parsed_rec_type = type;
  2009. recv_previous_parsed_rec_offset = recv_sys->recovered_offset;
  2010. recv_previous_parsed_rec_is_multi = 0;
  2011. recv_sys->recovered_offset += len;
  2012. recv_sys->recovered_lsn = new_recovered_lsn;
  2013. switch (type) {
  2014. lsn_t lsn;
  2015. case MLOG_DUMMY_RECORD:
  2016. /* Do nothing */
  2017. break;
  2018. case MLOG_CHECKPOINT:
  2019. #if SIZE_OF_MLOG_CHECKPOINT != 1 + 8
  2020. # error SIZE_OF_MLOG_CHECKPOINT != 1 + 8
  2021. #endif
  2022. lsn = mach_read_from_8(ptr + 1);
  2023. DBUG_PRINT("ib_log",
  2024. ("MLOG_CHECKPOINT(" LSN_PF ") %s at "
  2025. LSN_PF,
  2026. lsn,
  2027. lsn != checkpoint_lsn ? "ignored"
  2028. : recv_sys->mlog_checkpoint_lsn
  2029. ? "reread" : "read",
  2030. recv_sys->recovered_lsn));
  2031. if (lsn == checkpoint_lsn) {
  2032. if (recv_sys->mlog_checkpoint_lsn) {
  2033. /* At recv_reset_logs() we may
  2034. write a duplicate MLOG_CHECKPOINT
  2035. for the same checkpoint LSN. Thus
  2036. recv_sys->mlog_checkpoint_lsn
  2037. can differ from the current LSN. */
  2038. ut_ad(recv_sys->mlog_checkpoint_lsn
  2039. <= recv_sys->recovered_lsn);
  2040. break;
  2041. }
  2042. recv_sys->mlog_checkpoint_lsn
  2043. = recv_sys->recovered_lsn;
  2044. return(true);
  2045. }
  2046. break;
  2047. #ifdef UNIV_LOG_LSN_DEBUG
  2048. case MLOG_LSN:
  2049. /* Do not add these records to the hash table.
  2050. The page number and space id fields are misused
  2051. for something else. */
  2052. break;
  2053. #endif /* UNIV_LOG_LSN_DEBUG */
  2054. default:
  2055. switch (store) {
  2056. case STORE_NO:
  2057. break;
  2058. case STORE_IF_EXISTS:
  2059. if (fil_space_get_flags(space)
  2060. == ULINT_UNDEFINED) {
  2061. break;
  2062. }
  2063. /* fall through */
  2064. case STORE_YES:
  2065. recv_add_to_hash_table(
  2066. type, space, page_no, body,
  2067. ptr + len, old_lsn,
  2068. recv_sys->recovered_lsn);
  2069. }
  2070. /* fall through */
  2071. case MLOG_FILE_NAME:
  2072. case MLOG_FILE_DELETE:
  2073. case MLOG_FILE_CREATE2:
  2074. case MLOG_FILE_RENAME2:
  2075. case MLOG_TRUNCATE:
  2076. /* These were already handled by
  2077. recv_parse_log_rec() and
  2078. recv_parse_or_apply_log_rec_body(). */
  2079. case MLOG_INDEX_LOAD:
  2080. DBUG_PRINT("ib_log",
  2081. ("scan " LSN_PF ": log rec %s"
  2082. " len " ULINTPF
  2083. " page " ULINTPF ":" ULINTPF,
  2084. old_lsn, get_mlog_string(type),
  2085. len, space, page_no));
  2086. }
  2087. } else {
  2088. /* Check that all the records associated with the single mtr
  2089. are included within the buffer */
  2090. ulint total_len = 0;
  2091. ulint n_recs = 0;
  2092. bool only_mlog_file = true;
  2093. ulint mlog_rec_len = 0;
  2094. for (;;) {
  2095. len = recv_parse_log_rec(
  2096. &type, ptr, end_ptr, &space, &page_no,
  2097. false, &body);
  2098. if (len == 0) {
  2099. return(false);
  2100. }
  2101. if (recv_sys->found_corrupt_log
  2102. || type == MLOG_CHECKPOINT
  2103. || (*ptr & MLOG_SINGLE_REC_FLAG)) {
  2104. recv_sys->found_corrupt_log = true;
  2105. recv_report_corrupt_log(
  2106. ptr, type, space, page_no);
  2107. return(true);
  2108. }
  2109. if (recv_sys->found_corrupt_fs) {
  2110. return(true);
  2111. }
  2112. recv_previous_parsed_rec_type = type;
  2113. recv_previous_parsed_rec_offset
  2114. = recv_sys->recovered_offset + total_len;
  2115. recv_previous_parsed_rec_is_multi = 1;
/* MLOG_FILE_NAME redo log records do not make changes
to persistent data. As long as only MLOG_FILE_NAME redo
log records have been seen, advance the parsing buffer
position by updating recovered_lsn and recovered_offset. */
if (type != MLOG_FILE_NAME) {
only_mlog_file = false;
}
  2123. if (only_mlog_file) {
  2124. new_recovered_lsn = recv_calc_lsn_on_data_add(
  2125. recv_sys->recovered_lsn, len);
  2126. mlog_rec_len += len;
  2127. recv_sys->recovered_offset += len;
  2128. recv_sys->recovered_lsn = new_recovered_lsn;
  2129. }
  2130. total_len += len;
  2131. n_recs++;
  2132. ptr += len;
  2133. if (type == MLOG_MULTI_REC_END) {
  2134. DBUG_PRINT("ib_log",
  2135. ("scan " LSN_PF
  2136. ": multi-log end"
  2137. " total_len " ULINTPF
  2138. " n=" ULINTPF,
  2139. recv_sys->recovered_lsn,
  2140. total_len, n_recs));
  2141. total_len -= mlog_rec_len;
  2142. break;
  2143. }
  2144. DBUG_PRINT("ib_log",
  2145. ("scan " LSN_PF ": multi-log rec %s"
  2146. " len " ULINTPF
  2147. " page " ULINTPF ":" ULINTPF,
  2148. recv_sys->recovered_lsn,
  2149. get_mlog_string(type), len, space, page_no));
  2150. }
  2151. new_recovered_lsn = recv_calc_lsn_on_data_add(
  2152. recv_sys->recovered_lsn, total_len);
  2153. if (new_recovered_lsn > recv_sys->scanned_lsn) {
  2154. /* The log record filled a log block, and we require
  2155. that also the next log block should have been scanned
  2156. in */
  2157. return(false);
  2158. }
  2159. /* Add all the records to the hash table */
  2160. ptr = recv_sys->buf + recv_sys->recovered_offset;
  2161. for (;;) {
  2162. old_lsn = recv_sys->recovered_lsn;
  2163. /* This will apply MLOG_FILE_ records. We
  2164. had to skip them in the first scan, because we
  2165. did not know if the mini-transaction was
  2166. completely recovered (until MLOG_MULTI_REC_END). */
  2167. len = recv_parse_log_rec(
  2168. &type, ptr, end_ptr, &space, &page_no,
  2169. apply, &body);
  2170. if (recv_sys->found_corrupt_log
  2171. && !recv_report_corrupt_log(
  2172. ptr, type, space, page_no)) {
  2173. return(true);
  2174. }
  2175. if (recv_sys->found_corrupt_fs) {
  2176. return(true);
  2177. }
  2178. ut_a(len != 0);
  2179. ut_a(!(*ptr & MLOG_SINGLE_REC_FLAG));
  2180. recv_sys->recovered_offset += len;
  2181. recv_sys->recovered_lsn
  2182. = recv_calc_lsn_on_data_add(old_lsn, len);
  2183. switch (type) {
  2184. case MLOG_MULTI_REC_END:
  2185. /* Found the end mark for the records */
  2186. goto loop;
  2187. #ifdef UNIV_LOG_LSN_DEBUG
  2188. case MLOG_LSN:
  2189. /* Do not add these records to the hash table.
  2190. The page number and space id fields are misused
  2191. for something else. */
  2192. break;
  2193. #endif /* UNIV_LOG_LSN_DEBUG */
  2194. case MLOG_FILE_NAME:
  2195. case MLOG_FILE_DELETE:
  2196. case MLOG_FILE_CREATE2:
  2197. case MLOG_FILE_RENAME2:
  2198. case MLOG_INDEX_LOAD:
  2199. case MLOG_TRUNCATE:
  2200. /* These were already handled by
  2201. recv_parse_log_rec() and
  2202. recv_parse_or_apply_log_rec_body(). */
  2203. break;
  2204. default:
  2205. switch (store) {
  2206. case STORE_NO:
  2207. break;
  2208. case STORE_IF_EXISTS:
  2209. if (fil_space_get_flags(space)
  2210. == ULINT_UNDEFINED) {
  2211. break;
  2212. }
  2213. /* fall through */
  2214. case STORE_YES:
  2215. recv_add_to_hash_table(
  2216. type, space, page_no,
  2217. body, ptr + len,
  2218. old_lsn,
  2219. new_recovered_lsn);
  2220. }
  2221. }
  2222. ptr += len;
  2223. }
  2224. }
  2225. goto loop;
  2226. }
  2227. /*******************************************************//**
  2228. Adds data from a new log block to the parsing buffer of recv_sys if
  2229. recv_sys->parse_start_lsn is non-zero.
  2230. @return true if more data added */
  2231. static
  2232. bool
  2233. recv_sys_add_to_parsing_buf(
  2234. /*========================*/
  2235. const byte* log_block, /*!< in: log block */
  2236. lsn_t scanned_lsn) /*!< in: lsn of how far we were able
  2237. to find data in this log block */
  2238. {
  2239. ulint more_len;
  2240. ulint data_len;
  2241. ulint start_offset;
  2242. ulint end_offset;
  2243. ut_ad(scanned_lsn >= recv_sys->scanned_lsn);
  2244. if (!recv_sys->parse_start_lsn) {
/* Cannot start parsing yet, because no start point for
it has been found */
  2247. return(false);
  2248. }
  2249. data_len = log_block_get_data_len(log_block);
  2250. if (recv_sys->parse_start_lsn >= scanned_lsn) {
  2251. return(false);
  2252. } else if (recv_sys->scanned_lsn >= scanned_lsn) {
  2253. return(false);
  2254. } else if (recv_sys->parse_start_lsn > recv_sys->scanned_lsn) {
  2255. more_len = (ulint) (scanned_lsn - recv_sys->parse_start_lsn);
  2256. } else {
  2257. more_len = (ulint) (scanned_lsn - recv_sys->scanned_lsn);
  2258. }
  2259. if (more_len == 0) {
  2260. return(false);
  2261. }
  2262. ut_ad(data_len >= more_len);
  2263. start_offset = data_len - more_len;
  2264. if (start_offset < LOG_BLOCK_HDR_SIZE) {
  2265. start_offset = LOG_BLOCK_HDR_SIZE;
  2266. }
  2267. end_offset = data_len;
  2268. if (end_offset > OS_FILE_LOG_BLOCK_SIZE - LOG_BLOCK_TRL_SIZE) {
  2269. end_offset = OS_FILE_LOG_BLOCK_SIZE - LOG_BLOCK_TRL_SIZE;
  2270. }
  2271. ut_ad(start_offset <= end_offset);
  2272. if (start_offset < end_offset) {
  2273. ut_memcpy(recv_sys->buf + recv_sys->len,
  2274. log_block + start_offset, end_offset - start_offset);
  2275. recv_sys->len += end_offset - start_offset;
  2276. ut_a(recv_sys->len <= RECV_PARSING_BUF_SIZE);
  2277. }
  2278. return(true);
  2279. }
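/* Illustrative sketch, not part of the recovery code: the offset
clamping performed by recv_sys_add_to_parsing_buf() above, which
copies only payload bytes and never the block header or trailer.
The constants are assumptions (512-byte blocks, 12-byte header,
4-byte trailer), and the sketch_* names are hypothetical. */
#include <algorithm>
#include <cassert>
#include <cstddef>

static const std::size_t SKETCH_PB_BLOCK_SIZE = 512;	/* assumed */
static const std::size_t SKETCH_PB_HDR_SIZE = 12;	/* assumed */
static const std::size_t SKETCH_PB_TRL_SIZE = 4;	/* assumed */

struct sketch_range_t {
	std::size_t	start;	/* first byte of the block to copy */
	std::size_t	end;	/* one past the last byte to copy */
};

/* data_len: bytes used in the block, including the header;
more_len: how many of those bytes are new to the parser. */
static inline
sketch_range_t
sketch_block_copy_range(std::size_t data_len, std::size_t more_len)
{
	assert(data_len >= more_len);
	sketch_range_t	r;
	r.start = std::max<std::size_t>(data_len - more_len,
					SKETCH_PB_HDR_SIZE);
	r.end = std::min<std::size_t>(data_len,
				      SKETCH_PB_BLOCK_SIZE - SKETCH_PB_TRL_SIZE);
	return r;
}
/* For a completely new full block (data_len = more_len = 512) the
range is [12, 508), that is, 496 payload bytes. For a partially
filled block with data_len = 100 of which 40 bytes are new, the
range is [60, 100). */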
  2280. /*******************************************************//**
  2281. Moves the parsing buffer data left to the buffer start. */
  2282. static
  2283. void
  2284. recv_sys_justify_left_parsing_buf(void)
  2285. /*===================================*/
  2286. {
  2287. ut_memmove(recv_sys->buf, recv_sys->buf + recv_sys->recovered_offset,
  2288. recv_sys->len - recv_sys->recovered_offset);
  2289. recv_sys->len -= recv_sys->recovered_offset;
  2290. recv_sys->recovered_offset = 0;
  2291. }
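/* Illustrative sketch, not part of the recovery code: the compaction
done by recv_sys_justify_left_parsing_buf() above, shown on a small
self-contained struct. sketch_parse_buf_t is a hypothetical stand-in
for the buf, len and recovered_offset fields of recv_sys. */
#include <cassert>
#include <cstddef>
#include <cstring>

struct sketch_parse_buf_t {
	unsigned char	buf[16];
	std::size_t	len;			/* bytes of valid data in buf */
	std::size_t	recovered_offset;	/* bytes already parsed */
};

static inline
void
sketch_justify_left(sketch_parse_buf_t* b)
{
	assert(b->recovered_offset <= b->len);
	/* memmove() is used because source and destination may overlap. */
	std::memmove(b->buf, b->buf + b->recovered_offset,
		     b->len - b->recovered_offset);
	b->len -= b->recovered_offset;
	b->recovered_offset = 0;
}
/* If buf holds "abcdefgh" with len = 8 and recovered_offset = 5, the
call leaves "fgh" at the start of buf with len = 3 and
recovered_offset = 0, which frees room at the end for new log data. */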
/** Scan the redo log from a buffer and store new log data in the parsing buffer.
Parse and hash the log records if new data is found.
Apply log records automatically when the hash table becomes full.
@return true if not able to scan any more in this log group */
  2296. static
  2297. bool
  2298. recv_scan_log_recs(
  2299. /*===============*/
ulint available_memory,/*!< in: maximum size to which the hash
table of log records may grow */
  2302. store_t* store_to_hash, /*!< in,out: whether the records should be
  2303. stored to the hash table; this is reset
  2304. if just debug checking is needed, or
  2305. when the available_memory runs out */
  2306. const byte* log_block, /*!< in: log segment */
  2307. lsn_t checkpoint_lsn, /*!< in: latest checkpoint LSN */
  2308. lsn_t start_lsn, /*!< in: buffer start LSN */
  2309. lsn_t end_lsn, /*!< in: buffer end LSN */
  2310. lsn_t* contiguous_lsn, /*!< in/out: it is known that all log
  2311. groups contain contiguous log data up
  2312. to this lsn */
  2313. lsn_t* group_scanned_lsn)/*!< out: scanning succeeded up to
  2314. this lsn */
  2315. {
  2316. lsn_t scanned_lsn = start_lsn;
  2317. bool finished = false;
  2318. ulint data_len;
  2319. bool more_data = false;
  2320. bool apply = recv_sys->mlog_checkpoint_lsn != 0;
  2321. ulint recv_parsing_buf_size = RECV_PARSING_BUF_SIZE;
  2322. ut_ad(start_lsn % OS_FILE_LOG_BLOCK_SIZE == 0);
  2323. ut_ad(end_lsn % OS_FILE_LOG_BLOCK_SIZE == 0);
  2324. ut_ad(end_lsn >= start_lsn + OS_FILE_LOG_BLOCK_SIZE);
  2325. const byte* const log_end = log_block
  2326. + ulint(end_lsn - start_lsn);
  2327. do {
  2328. ut_ad(!finished);
  2329. if (log_block_get_flush_bit(log_block)) {
  2330. /* This block was a start of a log flush operation:
  2331. we know that the previous flush operation must have
  2332. been completed for all log groups before this block
  2333. can have been flushed to any of the groups. Therefore,
  2334. we know that log data is contiguous up to scanned_lsn
  2335. in all non-corrupt log groups. */
  2336. if (scanned_lsn > *contiguous_lsn) {
  2337. *contiguous_lsn = scanned_lsn;
  2338. }
  2339. }
  2340. data_len = log_block_get_data_len(log_block);
  2341. if (scanned_lsn + data_len > recv_sys->scanned_lsn
  2342. && log_block_get_checkpoint_no(log_block)
  2343. < recv_sys->scanned_checkpoint_no
  2344. && (recv_sys->scanned_checkpoint_no
  2345. - log_block_get_checkpoint_no(log_block)
  2346. > 0x80000000UL)) {
  2347. /* Garbage from a log buffer flush which was made
  2348. before the most recent database recovery */
  2349. finished = true;
  2350. break;
  2351. }
  2352. if (!recv_sys->parse_start_lsn
  2353. && (log_block_get_first_rec_group(log_block) > 0)) {
  2354. /* We found a point from which to start the parsing
  2355. of log records */
  2356. recv_sys->parse_start_lsn = scanned_lsn
  2357. + log_block_get_first_rec_group(log_block);
  2358. recv_sys->scanned_lsn = recv_sys->parse_start_lsn;
  2359. recv_sys->recovered_lsn = recv_sys->parse_start_lsn;
  2360. }
  2361. scanned_lsn += data_len;
  2362. if (scanned_lsn > recv_sys->scanned_lsn) {
  2363. if (!recv_needed_recovery) {
  2364. recv_needed_recovery = true;
  2365. if (srv_read_only_mode) {
  2366. ib::warn() << "innodb_read_only"
  2367. " prevents crash recovery";
  2368. return(true);
  2369. }
  2370. ib::info() << "Starting crash recovery from"
  2371. " checkpoint LSN="
  2372. << recv_sys->scanned_lsn;
  2373. }
  2374. /* We were able to find more log data: add it to the
  2375. parsing buffer if parse_start_lsn is already
  2376. non-zero */
  2377. DBUG_EXECUTE_IF(
  2378. "reduce_recv_parsing_buf",
  2379. recv_parsing_buf_size
  2380. = (70 * 1024);
  2381. );
  2382. if (recv_sys->len + 4 * OS_FILE_LOG_BLOCK_SIZE
  2383. >= recv_parsing_buf_size) {
  2384. ib::error() << "Log parsing buffer overflow."
  2385. " Recovery may have failed!";
  2386. recv_sys->found_corrupt_log = true;
  2387. if (!srv_force_recovery) {
  2388. ib::error()
  2389. << "Set innodb_force_recovery"
  2390. " to ignore this error.";
  2391. return(true);
  2392. }
  2393. } else if (!recv_sys->found_corrupt_log) {
  2394. more_data = recv_sys_add_to_parsing_buf(
  2395. log_block, scanned_lsn);
  2396. }
  2397. recv_sys->scanned_lsn = scanned_lsn;
  2398. recv_sys->scanned_checkpoint_no
  2399. = log_block_get_checkpoint_no(log_block);
  2400. }
  2401. if (data_len < OS_FILE_LOG_BLOCK_SIZE) {
  2402. /* Log data for this group ends here */
  2403. finished = true;
  2404. break;
  2405. } else {
  2406. log_block += OS_FILE_LOG_BLOCK_SIZE;
  2407. }
  2408. } while (log_block < log_end);
  2409. *group_scanned_lsn = scanned_lsn;
  2410. if (more_data && !recv_sys->found_corrupt_log) {
  2411. /* Try to parse more log records */
  2412. if (recv_parse_log_recs(checkpoint_lsn,
  2413. *store_to_hash, apply)) {
  2414. ut_ad(recv_sys->found_corrupt_log
  2415. || recv_sys->found_corrupt_fs
  2416. || recv_sys->mlog_checkpoint_lsn
  2417. == recv_sys->recovered_lsn);
  2418. return(true);
  2419. }
  2420. if (*store_to_hash != STORE_NO
  2421. && mem_heap_get_size(recv_sys->heap) > available_memory) {
  2422. *store_to_hash = STORE_NO;
  2423. }
  2424. if (recv_sys->recovered_offset > recv_parsing_buf_size / 4) {
  2425. /* Move parsing buffer data to the buffer start */
  2426. recv_sys_justify_left_parsing_buf();
  2427. }
  2428. }
  2429. return(finished);
  2430. }
/** Scan the log from a buffer and store new log data in the parsing buffer.
Parse and hash the log records if new data is found.
@param[in,out]	group		log group
@param[in]	checkpoint_lsn	latest checkpoint log sequence number
@param[in,out]	contiguous_lsn	log sequence number
until which all redo log has been scanned
@param[in]	last_phase	whether changes
can be applied to the tablespaces
@return whether rescan is needed (not everything was stored) */
static
bool
recv_group_scan_log_recs(
	log_group_t*	group,
	lsn_t		checkpoint_lsn,
	lsn_t*		contiguous_lsn,
	bool		last_phase)
{
	DBUG_ENTER("recv_group_scan_log_recs");
	DBUG_ASSERT(!last_phase || recv_sys->mlog_checkpoint_lsn > 0);

	mutex_enter(&recv_sys->mutex);
	recv_sys->len = 0;
	recv_sys->recovered_offset = 0;
	recv_sys->n_addrs = 0;
	recv_sys_empty_hash();
	srv_start_lsn = *contiguous_lsn;
	recv_sys->parse_start_lsn = *contiguous_lsn;
	recv_sys->scanned_lsn = *contiguous_lsn;
	recv_sys->recovered_lsn = *contiguous_lsn;
	recv_sys->scanned_checkpoint_no = 0;
	recv_previous_parsed_rec_type = MLOG_SINGLE_REC_FLAG;
	recv_previous_parsed_rec_offset	= 0;
	recv_previous_parsed_rec_is_multi = 0;
	ut_ad(recv_max_page_lsn == 0);
	ut_ad(last_phase || !recv_writer_thread_active);
	mutex_exit(&recv_sys->mutex);

	lsn_t start_lsn;
	lsn_t end_lsn;
	store_t store_to_hash = recv_sys->mlog_checkpoint_lsn == 0
		? STORE_NO : (last_phase ? STORE_IF_EXISTS : STORE_YES);
	ulint available_mem = UNIV_PAGE_SIZE
		* (buf_pool_get_n_pages()
		   - (recv_n_pool_free_frames * srv_buf_pool_instances));

	group->scanned_lsn = end_lsn = *contiguous_lsn = ut_uint64_align_down(
		*contiguous_lsn, OS_FILE_LOG_BLOCK_SIZE);

	do {
		if (last_phase && store_to_hash == STORE_NO) {
			store_to_hash = STORE_IF_EXISTS;
			/* We must not allow change buffer
			merge here, because it would generate
			redo log records before we have
			finished the redo log scan. */
			recv_apply_hashed_log_recs(false);
		}

		start_lsn = end_lsn;
		end_lsn = log_group_read_log_seg(
			log_sys->buf, group, start_lsn,
			start_lsn + RECV_SCAN_SIZE);
	} while (end_lsn != start_lsn
		 && !recv_scan_log_recs(
			 available_mem, &store_to_hash, log_sys->buf,
			 checkpoint_lsn,
			 start_lsn, end_lsn,
			 contiguous_lsn, &group->scanned_lsn));

	if (recv_sys->found_corrupt_log || recv_sys->found_corrupt_fs) {
		DBUG_RETURN(false);
	}

	DBUG_PRINT("ib_log", ("%s " LSN_PF
			      " completed for log group " ULINTPF,
			      last_phase ? "rescan" : "scan",
			      group->scanned_lsn, group->id));

	DBUG_RETURN(store_to_hash == STORE_NO);
}

/** Report a missing tablespace for which page-redo log exists.
@param[in]	err	previous error code
@param[in]	i	tablespace descriptor
@return new error code */
static
dberr_t
recv_init_missing_space(dberr_t err, const recv_spaces_t::const_iterator& i)
{
	if (srv_force_recovery == 0) {
		ib::error() << "Tablespace " << i->first << " was not"
			" found at " << i->second.name << ".";

		if (err == DB_SUCCESS) {
			ib::error() << "Set innodb_force_recovery=1 to"
				" ignore this and to permanently lose"
				" all changes to the tablespace.";
			err = DB_TABLESPACE_NOT_FOUND;
		}
	} else {
		ib::warn() << "Tablespace " << i->first << " was not"
			" found at " << i->second.name << ", and"
			" innodb_force_recovery was set. All redo log"
			" for this tablespace will be ignored!";
	}

	return(err);
}

/** Check if all tablespaces were found for crash recovery.
@return error code or DB_SUCCESS */
static MY_ATTRIBUTE((warn_unused_result))
dberr_t
recv_init_crash_recovery_spaces()
{
	typedef std::set<ulint> space_set_t;
	bool flag_deleted = false;
	space_set_t missing_spaces;

	ut_ad(!srv_read_only_mode);
	ut_ad(recv_needed_recovery);

	for (recv_spaces_t::iterator i = recv_spaces.begin();
	     i != recv_spaces.end(); i++) {
		ut_ad(!is_predefined_tablespace(i->first));
		ut_ad(!i->second.deleted || !i->second.space);

		if (i->second.deleted) {
			/* The tablespace was deleted,
			so we can ignore any redo log for it. */
			flag_deleted = true;
		} else if (i->second.space != NULL) {
			/* The tablespace was found, and there
			are some redo log records for it. */
			fil_names_dirty(i->second.space);
		} else if (i->second.name == "") {
			ib::error() << "Missing MLOG_FILE_NAME"
				" or MLOG_FILE_DELETE"
				" before MLOG_CHECKPOINT for tablespace "
				<< i->first;
			recv_sys->found_corrupt_log = true;
			return(DB_CORRUPTION);
		} else {
			missing_spaces.insert(i->first);
			flag_deleted = true;
		}

		ut_ad(i->second.deleted || i->second.name != "");
	}

	if (flag_deleted) {
		dberr_t err = DB_SUCCESS;

		for (ulint h = 0;
		     h < hash_get_n_cells(recv_sys->addr_hash);
		     h++) {
			for (recv_addr_t* recv_addr
				     = static_cast<recv_addr_t*>(
					     HASH_GET_FIRST(
						     recv_sys->addr_hash, h));
			     recv_addr != 0;
			     recv_addr = static_cast<recv_addr_t*>(
				     HASH_GET_NEXT(addr_hash, recv_addr))) {
				const ulint space = recv_addr->space;

				if (is_predefined_tablespace(space)) {
					continue;
				}

				recv_spaces_t::iterator i
					= recv_spaces.find(space);
				ut_ad(i != recv_spaces.end());

				if (i->second.deleted) {
					ut_ad(missing_spaces.find(space)
					      == missing_spaces.end());
					recv_addr->state = RECV_DISCARDED;
					continue;
				}

				space_set_t::iterator m = missing_spaces.find(
					space);

				if (m != missing_spaces.end()) {
					missing_spaces.erase(m);
					err = recv_init_missing_space(err, i);
					recv_addr->state = RECV_DISCARDED;
					/* All further redo log for this
					tablespace should be removed. */
					i->second.deleted = true;
				}
			}
		}

		if (err != DB_SUCCESS) {
			return(err);
		}
	}

	for (space_set_t::const_iterator m = missing_spaces.begin();
	     m != missing_spaces.end(); m++) {
		recv_spaces_t::iterator i = recv_spaces.find(*m);
		ut_ad(i != recv_spaces.end());

		ib::info() << "Tablespace " << i->first
			<< " was not found at '" << i->second.name
			<< "', but there were no modifications either.";
	}

	buf_dblwr_process();

	if (srv_force_recovery < SRV_FORCE_NO_LOG_REDO) {
		/* Spawn the background thread to flush dirty pages
		from the buffer pools. */
		recv_writer_thread_active = true;
		os_thread_create(recv_writer_thread, 0, 0);
	}

	return(DB_SUCCESS);
}

/** Start recovering from a redo log checkpoint.
@see recv_recovery_from_checkpoint_finish
@param[in]	flush_lsn	FIL_PAGE_FILE_FLUSH_LSN
of first system tablespace page
@return error code or DB_SUCCESS */
dberr_t
recv_recovery_from_checkpoint_start(
	lsn_t	flush_lsn)
{
	log_group_t* group;
	log_group_t* max_cp_group;
	ulint max_cp_field;
	lsn_t checkpoint_lsn;
	bool rescan;
	ib_uint64_t checkpoint_no;
	lsn_t contiguous_lsn;
	byte* buf;
	dberr_t err = DB_SUCCESS;

	/* Initialize red-black tree for fast insertions into the
	flush_list during recovery process. */
	buf_flush_init_flush_rbt();

	if (srv_force_recovery >= SRV_FORCE_NO_LOG_REDO) {

		ib::info() << "innodb_force_recovery=6 skips redo log apply";

		return(DB_SUCCESS);
	}

	recv_recovery_on = true;

	log_mutex_enter();

	/* Look for the latest checkpoint from any of the log groups */

	err = recv_find_max_checkpoint(&max_cp_group, &max_cp_field);

	if (err != DB_SUCCESS) {

		log_mutex_exit();

		return(err);
	}

	log_group_header_read(max_cp_group, max_cp_field);

	buf = log_sys->checkpoint_buf;

	checkpoint_lsn = mach_read_from_8(buf + LOG_CHECKPOINT_LSN);
	checkpoint_no = mach_read_from_8(buf + LOG_CHECKPOINT_NO);

	/* Start reading the log groups from the checkpoint lsn up. The
	variable contiguous_lsn contains an lsn up to which the log is
	known to be contiguously written to all log groups. */

	recv_sys->mlog_checkpoint_lsn = 0;

	ut_ad(RECV_SCAN_SIZE <= log_sys->buf_size);

	ut_ad(UT_LIST_GET_LEN(log_sys->log_groups) == 1);
	group = UT_LIST_GET_FIRST(log_sys->log_groups);

	const lsn_t end_lsn = mach_read_from_8(
		buf + LOG_CHECKPOINT_END_LSN);

	ut_ad(recv_sys->n_addrs == 0);
	contiguous_lsn = checkpoint_lsn;
	switch (group->format) {
	case 0:
		log_mutex_exit();
		return(recv_log_format_0_recover(checkpoint_lsn));
	case LOG_HEADER_FORMAT_CURRENT:
	case LOG_HEADER_FORMAT_CURRENT | LOG_HEADER_FORMAT_ENCRYPTED:
		if (end_lsn == 0) {
			break;
		}
		if (end_lsn >= checkpoint_lsn) {
			contiguous_lsn = end_lsn;
			break;
		}
		/* fall through */
	default:
		recv_sys->found_corrupt_log = true;
		log_mutex_exit();
		return(DB_ERROR);
	}

	/* Look for MLOG_CHECKPOINT. */
	recv_group_scan_log_recs(group, checkpoint_lsn, &contiguous_lsn,
				 false);
	/* The first scan should not have stored or applied any records. */
	ut_ad(recv_sys->n_addrs == 0);
	ut_ad(!recv_sys->found_corrupt_fs);

	if (srv_read_only_mode && recv_needed_recovery) {
		log_mutex_exit();
		return(DB_READ_ONLY);
	}

	if (recv_sys->found_corrupt_log && !srv_force_recovery) {
		log_mutex_exit();
		ib::warn() << "Log scan aborted at LSN " << contiguous_lsn;
		return(DB_ERROR);
	}

	if (recv_sys->mlog_checkpoint_lsn == 0) {
		lsn_t scan_lsn = group->scanned_lsn;
		if (!srv_read_only_mode && scan_lsn != checkpoint_lsn) {
			log_mutex_exit();
			ib::error err;
			err << "Missing MLOG_CHECKPOINT";
			if (end_lsn) {
				err << " at " << end_lsn;
			}
			err << " between the checkpoint " << checkpoint_lsn
			    << " and the end " << scan_lsn << ".";
			return(DB_ERROR);
		}

		group->scanned_lsn = checkpoint_lsn;
		rescan = false;
	} else {
		contiguous_lsn = checkpoint_lsn;
		rescan = recv_group_scan_log_recs(
			group, checkpoint_lsn, &contiguous_lsn, false);

		if ((recv_sys->found_corrupt_log && !srv_force_recovery)
		    || recv_sys->found_corrupt_fs) {
			log_mutex_exit();
			return(DB_ERROR);
		}
	}

	/* NOTE: we always do a 'recovery' at startup, but we print a
	message to the user about recovery only if something is wrong: */

	if (checkpoint_lsn != flush_lsn) {

		if (checkpoint_lsn + SIZE_OF_MLOG_CHECKPOINT < flush_lsn) {
			ib::warn() << " Are you sure you are using the"
				" right ib_logfiles to start up the database?"
				" Log sequence number in the ib_logfiles is "
				<< checkpoint_lsn << ", less than the"
				" log sequence number in the first system"
				" tablespace file header, " << flush_lsn << ".";
		}

		if (!recv_needed_recovery) {

			ib::info() << "The log sequence number " << flush_lsn
				<< " in the system tablespace does not match"
				" the log sequence number " << checkpoint_lsn
				<< " in the ib_logfiles!";

			if (srv_read_only_mode) {
				ib::error() << "innodb_read_only"
					" prevents crash recovery";
				log_mutex_exit();
				return(DB_READ_ONLY);
			}

			recv_needed_recovery = true;
		}
	}

	log_sys->lsn = recv_sys->recovered_lsn;

	if (recv_needed_recovery) {
		err = recv_init_crash_recovery_spaces();

		if (err != DB_SUCCESS) {
			log_mutex_exit();
			return(err);
		}

		if (rescan) {
			contiguous_lsn = checkpoint_lsn;

			recv_group_scan_log_recs(group, checkpoint_lsn,
						 &contiguous_lsn, true);

			if ((recv_sys->found_corrupt_log
			     && !srv_force_recovery)
			    || recv_sys->found_corrupt_fs) {
				log_mutex_exit();
				return(DB_ERROR);
			}
		}
	} else {
		ut_ad(!rescan || recv_sys->n_addrs == 0);
	}

	/* We currently have only one log group */

	if (group->scanned_lsn < checkpoint_lsn
	    || group->scanned_lsn < recv_max_page_lsn) {

		ib::error() << "We scanned the log up to " << group->scanned_lsn
			<< ". A checkpoint was at " << checkpoint_lsn << " and"
			" the maximum LSN on a database page was "
			<< recv_max_page_lsn << ". It is possible that the"
			" database is now corrupt!";
	}

	if (recv_sys->recovered_lsn < checkpoint_lsn) {
		log_mutex_exit();

		ib::error() << "Recovered only to lsn:"
			<< recv_sys->recovered_lsn << " checkpoint_lsn: " << checkpoint_lsn;

		return(DB_ERROR);
	}

	/* Synchronize the uncorrupted log groups to the most up-to-date log
	group; we also copy checkpoint info to groups */

	log_sys->next_checkpoint_lsn = checkpoint_lsn;
	log_sys->next_checkpoint_no = checkpoint_no + 1;

	recv_synchronize_groups();

	if (!recv_needed_recovery) {
		ut_a(checkpoint_lsn == recv_sys->recovered_lsn);
	} else {
		srv_start_lsn = recv_sys->recovered_lsn;
	}

	log_sys->buf_free = (ulint) log_sys->lsn % OS_FILE_LOG_BLOCK_SIZE;
	log_sys->buf_next_to_write = log_sys->buf_free;
	log_sys->write_lsn = log_sys->lsn;

	log_sys->last_checkpoint_lsn = checkpoint_lsn;

	if (!srv_read_only_mode) {
		/* Write a MLOG_CHECKPOINT marker as the first thing,
		before generating any other redo log. */
		fil_names_clear(log_sys->last_checkpoint_lsn, true);
	}

	MONITOR_SET(MONITOR_LSN_CHECKPOINT_AGE,
		    log_sys->lsn - log_sys->last_checkpoint_lsn);

	log_sys->next_checkpoint_no = ++checkpoint_no;

	mutex_enter(&recv_sys->mutex);

	recv_sys->apply_log_recs = TRUE;

	mutex_exit(&recv_sys->mutex);

	log_mutex_exit();

	recv_lsn_checks_on = true;

	/* The database is now ready to start almost normal processing of user
	transactions: transaction rollbacks and the application of the log
	records in the hash table can be run in background. */

	return(DB_SUCCESS);
}

/** Complete recovery from a checkpoint. */
void
recv_recovery_from_checkpoint_finish(void)
{
	/* Make sure that the recv_writer thread is done. This is
	required because it grabs various mutexes and we want to
	ensure that when we enable sync_order_checks there is no
	mutex currently held by any thread. */
	mutex_enter(&recv_sys->writer_mutex);

	/* Free the resources of the recovery system */
	recv_recovery_on = false;

	/* By acquiring the mutex we ensure that the recv_writer thread
	won't trigger any more LRU batches. Now wait for currently
	in progress batches to finish. */
	buf_flush_wait_LRU_batch_end();

	mutex_exit(&recv_sys->writer_mutex);

	ulint count = 0;
	while (recv_writer_thread_active) {
		++count;
		os_thread_sleep(100000);
		if (srv_print_verbose_log && count > 600) {
			ib::info() << "Waiting for recv_writer to"
				" finish flushing of buffer pool";
			count = 0;
		}
	}

	recv_sys_debug_free();

	/* Free up the flush_rbt. */
	buf_flush_free_flush_rbt();
}

/********************************************************//**
Initiates the rollback of active transactions. */
void
recv_recovery_rollback_active(void)
/*===============================*/
{
	ut_ad(!recv_writer_thread_active);

	/* Switch latching order checks on in sync0debug.cc, if
	--innodb-sync-debug=true (default) */
	ut_d(sync_check_enable());

	/* We can't start any (DDL) transactions if UNDO logging
	has been disabled. In that case, also disable ROLLBACK of
	recovered user transactions. */
	if (srv_force_recovery < SRV_FORCE_NO_TRX_UNDO
	    && !srv_read_only_mode) {

		/* Drop partially created indexes. */
		row_merge_drop_temp_indexes();
		/* Drop any auxiliary tables that were not dropped when the
		parent table was dropped. This can happen if the parent table
		was dropped but the server crashed before the auxiliary tables
		were dropped. */
		fts_drop_orphaned_tables();

		/* Rollback the uncommitted transactions which have no user
		session */

		trx_rollback_or_clean_is_active = true;
		os_thread_create(trx_rollback_or_clean_all_recovered, 0, 0);
	}
}

/******************************************************//**
Resets the logs. The contents of log files will be lost! */
void
recv_reset_logs(
/*============*/
	lsn_t		lsn)		/*!< in: reset to this lsn
					rounded up to be divisible by
					OS_FILE_LOG_BLOCK_SIZE, after
					which we add
					LOG_BLOCK_HDR_SIZE */
{
	ut_ad(log_mutex_own());

	log_sys->lsn = ut_uint64_align_up(lsn, OS_FILE_LOG_BLOCK_SIZE);

	for (log_group_t* group = UT_LIST_GET_FIRST(log_sys->log_groups);
	     group; group = UT_LIST_GET_NEXT(log_groups, group)) {
		group->lsn = log_sys->lsn;
		group->lsn_offset = LOG_FILE_HDR_SIZE;
	}

	log_sys->buf_next_to_write = 0;
	log_sys->write_lsn = log_sys->lsn;

	log_sys->next_checkpoint_no = 0;
	log_sys->last_checkpoint_lsn = 0;

	log_block_init(log_sys->buf, log_sys->lsn);
	log_block_set_first_rec_group(log_sys->buf, LOG_BLOCK_HDR_SIZE);

	log_sys->buf_free = LOG_BLOCK_HDR_SIZE;
	log_sys->lsn += LOG_BLOCK_HDR_SIZE;

	MONITOR_SET(MONITOR_LSN_CHECKPOINT_AGE,
		    (log_sys->lsn - log_sys->last_checkpoint_lsn));

	log_mutex_exit();

	/* Reset the checkpoint fields in logs */

	log_make_checkpoint_at(LSN_MAX, TRUE);

	log_mutex_enter();
}

/** Find a doublewrite copy of a page.
@param[in]	space_id	tablespace identifier
@param[in]	page_no		page number
@return	page frame
@retval NULL if no page was found */
const byte*
recv_dblwr_t::find_page(ulint space_id, ulint page_no)
{
	typedef std::vector<const byte*, ut_allocator<const byte*> >
		matches_t;

	matches_t matches;
	const byte* result = 0;

	for (list::iterator i = pages.begin(); i != pages.end(); ++i) {
		if (page_get_space_id(*i) == space_id
		    && page_get_page_no(*i) == page_no) {
			matches.push_back(*i);
		}
	}

	if (matches.size() == 1) {
		result = matches[0];
	} else if (matches.size() > 1) {

		lsn_t max_lsn = 0;
		lsn_t page_lsn = 0;

		for (matches_t::iterator i = matches.begin();
		     i != matches.end();
		     ++i) {

			page_lsn = mach_read_from_8(*i + FIL_PAGE_LSN);

			if (page_lsn > max_lsn) {
				max_lsn = page_lsn;
				result = *i;
			}
		}
	}

	return(result);
}

#ifndef DBUG_OFF
/** Return the string name of a redo log record type.
@param[in]	type	redo log record type
@return string name of the redo log record type */
const char*
get_mlog_string(mlog_id_t type)
{
	switch (type) {
	case MLOG_SINGLE_REC_FLAG:
		return("MLOG_SINGLE_REC_FLAG");

	case MLOG_1BYTE:
		return("MLOG_1BYTE");

	case MLOG_2BYTES:
		return("MLOG_2BYTES");

	case MLOG_4BYTES:
		return("MLOG_4BYTES");

	case MLOG_8BYTES:
		return("MLOG_8BYTES");

	case MLOG_REC_INSERT:
		return("MLOG_REC_INSERT");

	case MLOG_REC_CLUST_DELETE_MARK:
		return("MLOG_REC_CLUST_DELETE_MARK");

	case MLOG_REC_SEC_DELETE_MARK:
		return("MLOG_REC_SEC_DELETE_MARK");

	case MLOG_REC_UPDATE_IN_PLACE:
		return("MLOG_REC_UPDATE_IN_PLACE");

	case MLOG_REC_DELETE:
		return("MLOG_REC_DELETE");

	case MLOG_LIST_END_DELETE:
		return("MLOG_LIST_END_DELETE");

	case MLOG_LIST_START_DELETE:
		return("MLOG_LIST_START_DELETE");

	case MLOG_LIST_END_COPY_CREATED:
		return("MLOG_LIST_END_COPY_CREATED");

	case MLOG_PAGE_REORGANIZE:
		return("MLOG_PAGE_REORGANIZE");

	case MLOG_PAGE_CREATE:
		return("MLOG_PAGE_CREATE");

	case MLOG_UNDO_INSERT:
		return("MLOG_UNDO_INSERT");

	case MLOG_UNDO_ERASE_END:
		return("MLOG_UNDO_ERASE_END");

	case MLOG_UNDO_INIT:
		return("MLOG_UNDO_INIT");

	case MLOG_UNDO_HDR_DISCARD:
		return("MLOG_UNDO_HDR_DISCARD");

	case MLOG_UNDO_HDR_REUSE:
		return("MLOG_UNDO_HDR_REUSE");

	case MLOG_UNDO_HDR_CREATE:
		return("MLOG_UNDO_HDR_CREATE");

	case MLOG_REC_MIN_MARK:
		return("MLOG_REC_MIN_MARK");

	case MLOG_IBUF_BITMAP_INIT:
		return("MLOG_IBUF_BITMAP_INIT");

#ifdef UNIV_LOG_LSN_DEBUG
	case MLOG_LSN:
		return("MLOG_LSN");
#endif /* UNIV_LOG_LSN_DEBUG */

	case MLOG_INIT_FILE_PAGE:
		return("MLOG_INIT_FILE_PAGE");

	case MLOG_WRITE_STRING:
		return("MLOG_WRITE_STRING");

	case MLOG_MULTI_REC_END:
		return("MLOG_MULTI_REC_END");

	case MLOG_DUMMY_RECORD:
		return("MLOG_DUMMY_RECORD");

	case MLOG_FILE_DELETE:
		return("MLOG_FILE_DELETE");

	case MLOG_COMP_REC_MIN_MARK:
		return("MLOG_COMP_REC_MIN_MARK");

	case MLOG_COMP_PAGE_CREATE:
		return("MLOG_COMP_PAGE_CREATE");

	case MLOG_COMP_REC_INSERT:
		return("MLOG_COMP_REC_INSERT");

	case MLOG_COMP_REC_CLUST_DELETE_MARK:
		return("MLOG_COMP_REC_CLUST_DELETE_MARK");

	case MLOG_COMP_REC_UPDATE_IN_PLACE:
		return("MLOG_COMP_REC_UPDATE_IN_PLACE");

	case MLOG_COMP_REC_DELETE:
		return("MLOG_COMP_REC_DELETE");

	case MLOG_COMP_LIST_END_DELETE:
		return("MLOG_COMP_LIST_END_DELETE");

	case MLOG_COMP_LIST_START_DELETE:
		return("MLOG_COMP_LIST_START_DELETE");

	case MLOG_COMP_LIST_END_COPY_CREATED:
		return("MLOG_COMP_LIST_END_COPY_CREATED");

	case MLOG_COMP_PAGE_REORGANIZE:
		return("MLOG_COMP_PAGE_REORGANIZE");

	case MLOG_FILE_CREATE2:
		return("MLOG_FILE_CREATE2");

	case MLOG_ZIP_WRITE_NODE_PTR:
		return("MLOG_ZIP_WRITE_NODE_PTR");

	case MLOG_ZIP_WRITE_BLOB_PTR:
		return("MLOG_ZIP_WRITE_BLOB_PTR");

	case MLOG_ZIP_WRITE_HEADER:
		return("MLOG_ZIP_WRITE_HEADER");

	case MLOG_ZIP_PAGE_COMPRESS:
		return("MLOG_ZIP_PAGE_COMPRESS");

	case MLOG_ZIP_PAGE_COMPRESS_NO_DATA:
		return("MLOG_ZIP_PAGE_COMPRESS_NO_DATA");

	case MLOG_ZIP_PAGE_REORGANIZE:
		return("MLOG_ZIP_PAGE_REORGANIZE");

	case MLOG_FILE_RENAME2:
		return("MLOG_FILE_RENAME2");

	case MLOG_FILE_NAME:
		return("MLOG_FILE_NAME");

	case MLOG_CHECKPOINT:
		return("MLOG_CHECKPOINT");

	case MLOG_PAGE_CREATE_RTREE:
		return("MLOG_PAGE_CREATE_RTREE");

	case MLOG_COMP_PAGE_CREATE_RTREE:
		return("MLOG_COMP_PAGE_CREATE_RTREE");

	case MLOG_INIT_FILE_PAGE2:
		return("MLOG_INIT_FILE_PAGE2");

	case MLOG_INDEX_LOAD:
		return("MLOG_INDEX_LOAD");

	case MLOG_TRUNCATE:
		return("MLOG_TRUNCATE");

	case MLOG_FILE_WRITE_CRYPT_DATA:
		return("MLOG_FILE_WRITE_CRYPT_DATA");
	}
	DBUG_ASSERT(0);
	return(NULL);
}
#endif /* !DBUG_OFF */