
1626 lines
53 KiB

MDEV-11254: innodb-use-trim has no effect in 10.2

The problem was that the implementation merged from 10.1 was incompatible with InnoDB 5.7.

buf0buf.cc: Add functions that return whether a hole should be punched and how big it should be.

buf0flu.cc: Add the written page to IORequest.

fil0fil.cc: Remove an unneeded status call and, when a tablespace is created, test whether the file system supports sparse files and punch hole. Add a call to get the file system block size. The file node in use is added to IORequest. Add functions to check whether punch hole is supported and to set punch hole.

ha_innodb.cc: Remove unneeded status variables (trim512-32768) and trim_op_saved. Deprecate innodb_use_trim and set it ON by default. Add a function to set innodb-use-trim dynamically.

dberr.h: Add error code DB_IO_NO_PUNCH_HOLE, returned if the punch hole operation fails.

fil0fil.h: Add punch_hole variable to fil_space_t and block size to fil_node_t.

os0api.h: Header for helper functions in buf0buf.cc and fil0fil.cc for os0file.h.

os0file.h: Remove unneeded m_block_size from IORequest. Add bpage to IORequest to know the actual size of the block, and m_fil_node to know the tablespace file system block size and whether it supports punch hole.

os0file.cc: Add function punch_hole() to IORequest to do the punch hole operation, get the file system block size, and determine whether the file system supports sparse files (for punch hole).

page0size.h: Remove the implicit copy disable and use this implicit copy to implement the copy_from() function.

buf0dblwr.cc, buf0flu.cc, buf0rea.cc, fil0fil.cc, fil0fil.h, os0file.h, os0file.cc, log0log.cc, log0recv.cc: Remove unneeded write_size parameter from fil_io calls.

srv0mon.h, srv0srv.h, srv0mon.cc: Remove unneeded trim512-trim32768 status variables. Removed these from monitor tests.
9 years ago
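The MDEV-11254 message mentions functions that decide whether to punch a hole and how big it should be, based on the actual size of the written block and the file system block size. A minimal sketch of that arithmetic follows. The function name `punch_hole_size` and its parameters are hypothetical and are not the InnoDB API; the sketch only assumes that the written payload is rounded up to whole file system blocks and the rest of the page slot can be released.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not an InnoDB function: returns how many trailing
// bytes of a page slot could be released after writing `write_len` bytes.
// Assumes `fs_block_size` is a nonzero power of two.
std::size_t punch_hole_size(std::size_t page_size, std::size_t write_len,
                            std::size_t fs_block_size) {
  assert(fs_block_size != 0 && (fs_block_size & (fs_block_size - 1)) == 0);
  assert(write_len <= page_size);
  // Round the payload up to a whole number of file system blocks.
  const std::size_t used =
      (write_len + fs_block_size - 1) & ~(fs_block_size - 1);
  // If the rounded payload fills the slot, there is nothing to punch.
  return used >= page_size ? 0 : page_size - used;
}
```

With a 16 KiB page, a 5000-byte compressed payload, and 4 KiB file system blocks, the payload occupies two blocks and the remaining 8192 bytes are candidates for a hole. A result of 0 means no punch hole call is worthwhile.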
11 years ago
Merge Google encryption

commit 195158e9889365dc3298f8c1f3bcaa745992f27f
Author: Minli Zhu <minliz@google.com>
Date: Mon Nov 25 11:05:55 2013 -0800

    Innodb redo log encryption/decryption.

    Use start lsn of a log block as part of AES CTR counter. Record key version with each checkpoint. Internally key version 0 means no encryption.

    Tests done (see test_innodb_log_encryption.sh for detail):
    - Verify flag innodb_encrypt_log on or off, combined with various key versions passed through CLI, and dynamically set after startup, will not corrupt database. This includes tests from being unencrypted to encrypted, and encrypted to unencrypted.
    - Verify start-up with no redo logs succeeds.
    - Verify fresh start-up succeeds.

    Change-Id: I4ce4c2afdf3076be2fce90ebbc2a7ce01184b612

commit c1b97273659f07866758c25f4a56f680a1fbad24
Author: Jonas Oreland <jonaso@google.com>
Date: Tue Dec 3 18:47:27 2013 +0100

    encryption of aria data&index files

    This patch implements encryption of aria data & index files. This is implemented as:
    1) add read/write hooks (renamed from callbacks) that do encrypt/decrypt (also add pre_read and post_write hooks)
    2) modify page headers for data/index to contain key version (making the data-page header size different with/without encryption)
    3) modify index page 0 to contain IV (and crypt header)
    4) AES CTR crypt functions
    5) counter block is implemented using a combination of page no, lsn and table specific id

    NOTE:
    1) log files are not encrypted; this is not needed if aria is only used for internal temporary tables and they are not transactional (i.e. not logged)
    2) all encrypted tables are using PAGE_CHECKSUM (crc); normal internal temporary tables are (currently) not CHECKSUM:ed
    3) This patch adds insert-order semantics to aria block_format. The default behaviour of aria block-format is best-fit, meaning that rows get allocated to pages trying to fill the pages as much as possible. However, certain sql constructs materialize temporary results in tmp-tables, and expect that a table scan will later return the rows in the same order they were inserted. This implementation of insert-order is only enabled when explicitly requested by the sql-layer.

    CHANGES:
    1) found a bug in ma_write that made the code try to abort a record that was never written; unsure why this is not exposed

    Change-Id: Ia82bbaa92e2c0629c08693c5add2f56b815c0509

commit 89dc1ab651fe0205d55b4eb588f62df550aa65fc
Author: Jonas Oreland <jonaso@google.com>
Date: Mon Feb 17 08:04:50 2014 -0800

    Implement encryption of innodb datafiles.

    Pages are encrypted before being written to disk and decrypted when read from disk. Each page except the first page (page 0) in a tablespace is encrypted. Page 0 is unencrypted and contains the IV for the tablespace.

    FIL_PAGE_FILE_FLUSH_LSN on each page (except page 0) is used to store a 32-bit key-version, so that multiple keys can be active in a tablespace simultaneously. The other 32 bits of the FIL_PAGE_FILE_FLUSH_LSN field contain a checksum that is computed after encryption. This checksum is used by innochecksum and when restoring from the double-write-buffer.

    The encryption is performed using AES CTR.

    Monitoring of encryption is enabled using the new IS-table INNODB_TABLESPACES_ENCRYPTION. In addition to that, new status variables innodb_encryption_rotation_{ pages_read_from_cache, pages_read_from_disk, pages_modified, pages_flushed } have been added.

    The following tunables are introduced:
    - innodb_encrypt_tables
    - innodb_encryption_threads
    - innodb_encryption_rotate_key_age
    - innodb_encryption_rotation_iops

    Change-Id: I8f651795a30b52e71b16d6bc9cb7559be349d0b2

commit a17eef2f6948e58219c9e26fc35633d6fd4de1de
Author: Andrew Ford <andrewford@google.com>
Date: Thu Jan 2 15:43:09 2014 -0800

    Key management skeleton with debug hooks.

    Change-Id: Ifd6aa3743d7ea291c70083f433a059c439aed866

commit 68a399838ad72264fd61b3dc67fecd29bbdb0af1
Author: Andrew Ford <andrewford@google.com>
Date: Mon Oct 28 16:27:44 2013 -0700

    Add AES-128 CTR and GCM encryption classes.

    Change-Id: I116305eced2a233db15306bc2ef5b9d398d1a3a2
11 years ago
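The datafile encryption commit describes the 8-byte FIL_PAGE_FILE_FLUSH_LSN field as carrying a 32-bit key version plus a 32-bit post-encryption checksum. The sketch below shows one way such a split could be packed and unpacked. The byte order (big-endian) and the ordering of the two halves (key version first) are assumptions made for illustration; the commit message does not state them, and all names here are hypothetical rather than taken from InnoDB.

```cpp
#include <array>
#include <cstdint>

// Hypothetical 8-byte on-page field; layout is an assumption of this sketch.
using Field8 = std::array<std::uint8_t, 8>;

struct KeyField {
  std::uint32_t key_version;  // which key encrypted this page
  std::uint32_t checksum;     // checksum computed after encryption
};

// Store a 32-bit value in big-endian byte order.
void store_be32(std::uint8_t* p, std::uint32_t v) {
  p[0] = static_cast<std::uint8_t>(v >> 24);
  p[1] = static_cast<std::uint8_t>(v >> 16);
  p[2] = static_cast<std::uint8_t>(v >> 8);
  p[3] = static_cast<std::uint8_t>(v);
}

// Load a 32-bit value stored in big-endian byte order.
std::uint32_t load_be32(const std::uint8_t* p) {
  return (static_cast<std::uint32_t>(p[0]) << 24) |
         (static_cast<std::uint32_t>(p[1]) << 16) |
         (static_cast<std::uint32_t>(p[2]) << 8) |
         static_cast<std::uint32_t>(p[3]);
}

// Assumed layout: bytes 0..3 hold the key version, bytes 4..7 the checksum.
Field8 pack_key_field(std::uint32_t key_version, std::uint32_t checksum) {
  Field8 f{};
  store_be32(f.data(), key_version);
  store_be32(f.data() + 4, checksum);
  return f;
}

KeyField unpack_key_field(const Field8& f) {
  return KeyField{load_be32(f.data()), load_be32(f.data() + 4)};
}
```

Keeping the key version per page is what lets several keys be active in one tablespace at once: a reader unpacks the field, looks up the matching key, and only then decrypts.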
12 years ago
MDEV-11738: Mariadb uses 100% of several of my 8 cpus doing nothing
MDEV-11581: Mariadb starts InnoDB encryption threads when key has not changed or data scrubbing turned off

Background: Key rotation is based on background threads (innodb-encryption-threads) that periodically go through all tablespaces in fil_system. For each tablespace, the key version currently in use is compared to the max key age (innodb-encryption-rotate-key-age). This process naturally takes CPU. At the same time, the need for scrubbing is also investigated. Currently, key rotation is fully supported only by the Amazon AWS key management plugin, but InnoDB has no knowledge of which key management plugin is used.

This patch re-purposes innodb-encryption-rotate-key-age=0 to disable key rotation and background data scrubbing. All new tables are added to a special list for key rotation, and key rotation is based on sending an event to the background encryption threads instead of using periodic checking (i.e. timeout).

fil0fil.cc: Added functions:
  fil_space_acquire_low() to acquire a tablespace when it could be dropped concurrently. This function is used from fil_space_acquire() or fil_space_acquire_silent(); the latter will not print any messages if we try to acquire a space that does not exist.
  fil_space_release() to release an acquired tablespace.
  fil_space_next() to iterate tablespaces in fil_system using fil_space_acquire() and fil_space_release().
  Similarly, fil_space_keyrotation_next() to iterate the new list fil_system->rotation_list, where new tables are added if key rotation is disabled.
Removed unnecessary functions fil_get_first_space_safe() and fil_get_next_space_safe().

fil_node_open_file(): After page 0 is read, also read crypt_info if it has not been read yet.

btr_scrub_lock_dict_func(), buf_page_check_corrupt(), buf_page_encrypt_before_write(), buf_merge_or_delete_for_page(), lock_print_info_all_transactions(), row_fts_psort_info_init(), row_truncate_table_for_mysql(), row_drop_table_for_mysql(): Use fil_space_acquire()/release() to access fil_space_t.

buf_page_decrypt_after_read(): Use fil_space_get_crypt_data() because at this point we might not yet have read page 0.

fil0crypt.cc/fil0fil.h: Lots of changes. Pass fil_space_t* directly to functions needing it and store fil_space_t* in the rotation state. Use fil_space_acquire()/release() when iterating tablespaces and remove the unnecessary is_closing from fil_crypt_t. Use fil_space_t::is_stopping() to detect when access to a tablespace should be stopped. Removed unnecessary fil_space_get_crypt_data().

fil_space_create(): Inform key rotation that there could be something to do if key rotation is disabled and a new table with encryption enabled is created. Remove unnecessary functions fil_get_first_space_safe() and fil_get_next_space_safe(); fil_space_acquire() and fil_space_release() are used instead. Moved fil_space_get_crypt_data() and fil_space_set_crypt_data() to fil0crypt.cc.

fsp_header_init(): Acquire fil_space_t*, write crypt_data and release the space.

check_table_options(): Renamed FIL_SPACE_ENCRYPTION_* to FIL_ENCRYPTION_*.

i_s.cc: Added ROTATING_OR_FLUSHING field to information_schema.innodb_tablespace_encryption to show the current status of key rotation.
9 years ago
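The acquire/release pattern described for fil_space_acquire(), fil_space_release() and fil_space_t::is_stopping() can be illustrated with a small reference-counting guard. This is a sketch only: `Space`, `space_acquire`, `space_release` and `SpaceRef` are hypothetical stand-ins, not InnoDB types, and the real code protects its state differently. The idea shown is that a reference is refused once the tablespace is marked as stopping, and that a granted reference is always given back.

```cpp
#include <mutex>

// Hypothetical stand-in for a tablespace object; not InnoDB's fil_space_t.
struct Space {
  std::mutex mutex;
  bool stopping = false;  // set when the tablespace is about to be dropped
  unsigned pending = 0;   // number of references currently held
};

// Take a reference unless the space is being stopped.
bool space_acquire(Space& s) {
  std::lock_guard<std::mutex> guard(s.mutex);
  if (s.stopping) return false;
  ++s.pending;
  return true;
}

// Give back a reference taken by space_acquire().
void space_release(Space& s) {
  std::lock_guard<std::mutex> guard(s.mutex);
  --s.pending;
}

// RAII guard: acquires in the constructor, releases in the destructor.
class SpaceRef {
 public:
  explicit SpaceRef(Space& s) : space_(space_acquire(s) ? &s : nullptr) {}
  ~SpaceRef() {
    if (space_ != nullptr) space_release(*space_);
  }
  SpaceRef(const SpaceRef&) = delete;
  SpaceRef& operator=(const SpaceRef&) = delete;

  // True if the reference was granted and the space may be used.
  bool ok() const { return space_ != nullptr; }

 private:
  Space* space_;
};
```

A caller would construct a `SpaceRef`, check `ok()`, and skip the tablespace if the check fails. A dropping thread would set `stopping` and wait for `pending` to reach zero before freeing the object.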
MDEV-11556 InnoDB redo log apply fails to adjust data file sizes

fil_space_t::recv_size: New member: recovered tablespace size in pages; 0 if no size change was read from the redo log, or if the size change was implemented.

fil_space_set_recv_size(): New function for setting space->recv_size.

innodb_data_file_size_debug: A debug parameter for setting the system tablespace size in recovery even when the redo log does not contain any size changes. It is hard to write a small test case that would cause the system tablespace to be extended at the critical moment.

recv_parse_log_rec(): Note those tablespaces whose size is being changed by the redo log, by invoking fil_space_set_recv_size().

innobase_init(): Correct an error message, and do not require a larger innodb_buffer_pool_size when starting up with a smaller innodb_page_size.

innobase_start_or_create_for_mysql(): Allow startup with any initial size of the ibdata1 file if the autoextend attribute is set. Require the minimum size of fixed-size system tablespaces to be 640 pages, not 10 megabytes. Implement innodb_data_file_size_debug.

open_or_create_data_files(): Round the system tablespace size down to pages, not to full megabytes. (Our test truncates the system tablespace to more than 800 pages with innodb_page_size=4k. InnoDB should not imagine that it was truncated to 768 pages and then overwrite good pages in the tablespace.)

fil_flush_low(): Refactored from fil_flush().

fil_space_extend_must_retry(): Refactored from fil_extend_space_to_desired_size().

fil_mutex_enter_and_prepare_for_io(): Extend the tablespace if fil_space_set_recv_size() was called.

The test case has been successfully run with all the innodb_page_size values 4k, 8k, 16k, 32k, 64k.
9 years ago
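The rounding issue in open_or_create_data_files() is easy to reproduce with plain arithmetic. The sketch below compares rounding a file size down to whole pages against rounding it down to whole megabytes first. The function names are hypothetical and only illustrate the calculation; they are not taken from the InnoDB source.

```cpp
#include <cstdint>

constexpr std::uint64_t kMiB = 1024 * 1024;

// Size in pages when the file size is rounded down to whole pages.
std::uint64_t size_in_pages(std::uint64_t file_bytes,
                            std::uint64_t page_size) {
  return file_bytes / page_size;
}

// Size in pages when the file size is first rounded down to whole
// megabytes, which can discard pages that actually exist in the file.
std::uint64_t size_in_pages_via_mib(std::uint64_t file_bytes,
                                    std::uint64_t page_size) {
  return (file_bytes / kMiB) * (kMiB / page_size);
}
```

For a file of 801 pages at innodb_page_size=4k (3,280,896 bytes), page rounding yields 801 pages, while megabyte rounding yields 3 MiB, that is 768 pages. The 33 pages in between are the "good pages" the commit message warns could be overwritten.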
MDEV-12253: Buffer pool blocks are accessed after they have been freed

The problem was that bpage was referenced after it had already been freed from the LRU. Fixed by adding a new variable, encrypted, that is passed down to buf_page_check_corrupt() and used in buf_page_get_gen() to stop processing the page read.

This patch should also address the following test failures and bugs:

MDEV-12419: IMPORT should not look up tablespace in PageConverter::validate(). This is now removed.
MDEV-10099: encryption.innodb_onlinealter_encryption fails sporadically in buildbot
MDEV-11420: encryption.innodb_encryption-page-compression failed in buildbot
MDEV-11222: encryption.encrypt_and_grep failed in buildbot on P8

Removed dict_table_t::is_encrypted and dict_table_t::ibd_file_missing and replaced these with dict_table_t::file_unreadable. The table ibd file is missing if fil_get_space(space_id) returns NULL, and encrypted otherwise. Removed the dict_table_t::is_corrupted field.

Ported the FilSpace class from 10.2 and use it in buf_page_check_corrupt(), buf_page_decrypt_after_read(), buf_page_encrypt_before_write(), buf_dblwr_process(), buf_read_page(), dict_stats_save_defrag_stats().

Added test cases where an encrypted page could be read while doing redo log crash recovery. Also added a test case for row compressed blobs.

btr_cur_open_at_index_side_func(), btr_cur_open_at_rnd_pos_func(): Avoid referencing a block that is NULL.

buf_page_get_zip(): Issue an error if the page read fails.

buf_page_get_gen(): Use dberr_t for error detection and do not reference bpage after we have freed it.

buf_mark_space_corrupt(): Remove bpage from the LRU also when it is encrypted.

buf_page_check_corrupt(): @return DB_SUCCESS if the page has been read and is not corrupted, DB_PAGE_CORRUPTED if the page is corrupted based on the checksum check, DB_DECRYPTION_FAILED if the post-encryption checksum matches but after decryption the normal page checksum does not match. In read case only DB_SUCCESS is possible.

buf_page_io_complete(): Use dberr_t for error handling.

buf_flush_write_block_low(), buf_read_ahead_random(), buf_read_page_async(), buf_read_ahead_linear(), buf_read_ibuf_merge_pages(), buf_read_recv_pages(), fil_aio_wait(): Issue an error if the page read fails.

btr_pcur_move_to_next_page(): Do not reference the page if it is NULL.

Introduced dict_table_t::is_readable() and dict_index_t::is_readable(), which return true if the tablespace exists and pages read from the tablespace are neither corrupted nor failed to decrypt.

Removed buf_page_t::key_version. After page decryption the key version is not removed from the page frame. For unencrypted pages, the old key_version is removed in buf_page_encrypt_before_write().

dict_stats_update_transient_for_index(), dict_stats_update_transient(): Do not continue if table decryption failed or the table is corrupted.

dict0stats.cc: Introduced a dict_stats_report_error function to avoid code duplication.

fil_parse_write_crypt_data(): Check that the key read from the redo log entry is found in the encryption plugin and, if it is not, refuse to start.

PageConverter::validate(): Removed access to fil_space_t as the tablespace is not available during import.

Fixed the error code in the innodb.innodb test.

Merged test cases innodb-bad-key-change5 and innodb-bad-key-shutdown into innodb-bad-key-change2. Removed the innodb-bad-key-change5 test. Decreased unnecessary complexity in some long lasting tests.

Removed fil_inc_pending_ops(), fil_decr_pending_ops(), fil_get_first_space(), fil_get_next_space(), fil_get_first_space_safe(), fil_get_next_space_safe() functions.

fil_space_verify_crypt_checksum(): Fixed a bug found using ASAN where the FIL_PAGE_END_LSN_OLD_CHECKSUM field was incorrectly accessed from row compressed tables. Fixed an out of page frame bug for row compressed tables in fil_space_verify_crypt_checksum() found using ASAN. An incorrect function was called for compressed tables.

Added new tests for discard, rename table and drop (we should allow them even when page decryption fails). Alter table rename is not allowed. Added a test for restart with innodb-force-recovery=1 when a page read on redo-recovery cannot be decrypted. Added a test for a corrupted table where both the page data and FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION are corrupted.

Adjusted the test case innodb_bug14147491 so that it no longer expects a crash. Instead, the table is just mostly not usable.

fil0fil.h: fil_space_acquire_low is not a visible function, and fil_space_acquire and fil_space_acquire_silent are inline functions. The FilSpace class uses fil_space_acquire_low directly.

recv_apply_hashed_log_recs() does not return anything.
9 years ago
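The three return values documented for buf_page_check_corrupt() amount to a small decision table over two checksum checks. The sketch below spells that table out. It is a simplification under stated assumptions: the enum and function are hypothetical, the checksum results are passed in as booleans rather than computed, and the handling of unencrypted pages (only the normal page checksum is consulted) is an assumption of this sketch rather than something the commit message states.

```cpp
// Hypothetical mirror of the three outcomes named in the commit message.
enum class PageStatus {
  Success,           // corresponds to DB_SUCCESS
  PageCorrupted,     // corresponds to DB_PAGE_CORRUPTED
  DecryptionFailed,  // corresponds to DB_DECRYPTION_FAILED
};

// Classify a page that has just been read.
//   encrypted:            the page was stored encrypted
//   post_encryption_ok:   checksum computed after encryption matches
//   page_checksum_ok:     normal page checksum matches (after decryption,
//                         if the page was encrypted)
PageStatus classify_page(bool encrypted, bool post_encryption_ok,
                         bool page_checksum_ok) {
  if (!encrypted) {
    // Assumption: unencrypted pages rely on the normal checksum only.
    return page_checksum_ok ? PageStatus::Success
                            : PageStatus::PageCorrupted;
  }
  if (!post_encryption_ok) {
    // The stored (encrypted) bytes themselves are damaged.
    return PageStatus::PageCorrupted;
  }
  // The stored bytes are intact, so a bad page checksum after decryption
  // points at a wrong or missing key rather than at disk corruption.
  return page_checksum_ok ? PageStatus::Success
                          : PageStatus::DecryptionFailed;
}
```

Separating the two failure modes matters for the caller: corruption is a storage problem, while a decryption failure usually means the key management side needs attention and the table should be reported as unreadable instead of crashing the server.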
MDEV-12253: Buffer pool blocks are accessed after they have been freed

Problem was that bpage was referenced after it was already freed from LRU. Fixed by adding a new variable encrypted that is passed down to buf_page_check_corrupt() and used in buf_page_get_gen() to stop processing page read.

This patch should also address the following test failures and bugs:

MDEV-12419: IMPORT should not look up tablespace in PageConverter::validate(). This is now removed.

MDEV-10099: encryption.innodb_onlinealter_encryption fails sporadically in buildbot

MDEV-11420: encryption.innodb_encryption-page-compression failed in buildbot

MDEV-11222: encryption.encrypt_and_grep failed in buildbot on P8

Removed dict_table_t::is_encrypted and dict_table_t::ibd_file_missing and replaced these with dict_table_t::file_unreadable. Table ibd file is missing if fil_get_space(space_id) returns NULL and encrypted if not. Removed dict_table_t::is_corrupted field.

Ported FilSpace class from 10.2 and used it in buf_page_check_corrupt(), buf_page_decrypt_after_read(), buf_page_encrypt_before_write(), buf_dblwr_process(), buf_read_page(), dict_stats_save_defrag_stats().

Added test cases for when an encrypted page could be read while doing redo log crash recovery. Also added a test case for row compressed blobs.

btr_cur_open_at_index_side_func(), btr_cur_open_at_rnd_pos_func(): Avoid referencing block that is NULL.

buf_page_get_zip(): Issue error if page read fails.

buf_page_get_gen(): Use dberr_t for error detection and do not reference bpage after we have freed it.

buf_mark_space_corrupt(): Remove bpage from LRU also when it is encrypted.

buf_page_check_corrupt(): @return DB_SUCCESS if page has been read and is not corrupted, DB_PAGE_CORRUPTED if page based on checksum check is corrupted, DB_DECRYPTION_FAILED if page post encryption checksum matches but after decryption normal page checksum does not match. In read case only DB_SUCCESS is possible.

buf_page_io_complete(): Use dberr_t for error handling.

buf_flush_write_block_low(), buf_read_ahead_random(), buf_read_page_async(), buf_read_ahead_linear(), buf_read_ibuf_merge_pages(), buf_read_recv_pages(), fil_aio_wait(): Issue error if page read fails.

btr_pcur_move_to_next_page(): Do not reference page if it is NULL.

Introduced dict_table_t::is_readable() and dict_index_t::is_readable() that will return true if the tablespace exists and pages read from the tablespace are not corrupted and page decryption has not failed.

Removed buf_page_t::key_version. After page decryption the key version is not removed from page frame. For unencrypted pages, old key_version is removed at buf_page_encrypt_before_write().

dict_stats_update_transient_for_index(), dict_stats_update_transient(): Do not continue if table decryption failed or table is corrupted.

dict0stats.cc: Introduced a dict_stats_report_error function to avoid code duplication.

fil_parse_write_crypt_data(): Check that key read from redo log entry is found from encryption plugin and if it is not, refuse to start.

PageConverter::validate(): Removed access to fil_space_t as tablespace is not available during import.

Fixed error code on innodb.innodb test.

Merged test cases innodb-bad-key-change5 and innodb-bad-key-shutdown to innodb-bad-key-change2. Removed innodb-bad-key-change5 test. Decreased unnecessary complexity on some long lasting tests.

Removed fil_inc_pending_ops(), fil_decr_pending_ops(), fil_get_first_space(), fil_get_next_space(), fil_get_first_space_safe(), fil_get_next_space_safe() functions.

fil_space_verify_crypt_checksum(): Fixed bug found using ASAN where FIL_PAGE_END_LSN_OLD_CHECKSUM field was incorrectly accessed from row compressed tables. Fixed out of page frame bug for row compressed tables in fil_space_verify_crypt_checksum() found using ASAN. Incorrect function was called for compressed table.

Added new tests for discard, rename table and drop (we should allow them even when page decryption fails). Alter table rename is not allowed.

Added test for restart with innodb-force-recovery=1 when page read on redo-recovery cannot be decrypted.

Added test for corrupted table where both page data and FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION are corrupted.

Adjusted the test case innodb_bug14147491 so that it no longer expects a crash. Instead the table is just mostly not usable.

fil0fil.h: fil_space_acquire_low is not a visible function, and fil_space_acquire and fil_space_acquire_silent are inline functions. FilSpace class uses fil_space_acquire_low directly.

recv_apply_hashed_log_recs() does not return anything.
9 years ago
Merge 10.1 into 10.2

This only merges MDEV-12253, adapting it to MDEV-12602 which is already present in 10.2 but not yet in the 10.1 revision that is being merged.

TODO: Error handling in crash recovery needs to be improved. If a page cannot be decrypted (or read), we should cleanly abort the startup. If innodb_force_recovery is specified, we should ignore the problematic page and apply redo log to other pages. Currently, the test encryption.innodb-redo-badkey randomly fails like this (the last messages are from cmake -DWITH_ASAN):

2017-05-05 10:19:40 140037071685504 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1635994
2017-05-05 10:19:40 140037071685504 [ERROR] InnoDB: Missing MLOG_FILE_NAME or MLOG_FILE_DELETE before MLOG_CHECKPOINT for tablespace 1
2017-05-05 10:19:40 140037071685504 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[2201] with error Data structure corruption
2017-05-05 10:19:41 140037071685504 [Note] InnoDB: Starting shutdown...
=================================================================
==5226==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x612000018588 in thread T0
    #0 0x736750 in operator delete(void*) (/mariadb/server/build/sql/mysqld+0x736750)
    #1 0x1e4833f in LatchCounter::~LatchCounter() /mariadb/server/storage/innobase/include/sync0types.h:599:4
    #2 0x1e480b8 in LatchMeta<LatchCounter>::~LatchMeta() /mariadb/server/storage/innobase/include/sync0types.h:786:17
    #3 0x1e35509 in sync_latch_meta_destroy() /mariadb/server/storage/innobase/sync/sync0debug.cc:1622:3
    #4 0x1e35314 in sync_check_close() /mariadb/server/storage/innobase/sync/sync0debug.cc:1839:2
    #5 0x1dfdc18 in innodb_shutdown() /mariadb/server/storage/innobase/srv/srv0start.cc:2888:2
    #6 0x197e5e6 in innobase_init(void*) /mariadb/server/storage/innobase/handler/ha_innodb.cc:4475:3
9 years ago
MDEV-12253: Buffer pool blocks are accessed after they have been freed Problem was that bpage was referenced after it was already freed from LRU. Fixed by adding a new variable encrypted that is passed down to buf_page_check_corrupt() and used in buf_page_get_gen() to stop processing page read. This patch should also address following test failures and bugs: MDEV-12419: IMPORT should not look up tablespace in PageConverter::validate(). This is now removed. MDEV-10099: encryption.innodb_onlinealter_encryption fails sporadically in buildbot MDEV-11420: encryption.innodb_encryption-page-compression failed in buildbot MDEV-11222: encryption.encrypt_and_grep failed in buildbot on P8 Removed dict_table_t::is_encrypted and dict_table_t::ibd_file_missing and replaced these with dict_table_t::file_unreadable. Table ibd file is missing if fil_get_space(space_id) returns NULL and encrypted if not. Removed dict_table_t::is_corrupted field. Ported FilSpace class from 10.2 and using that on buf_page_check_corrupt(), buf_page_decrypt_after_read(), buf_page_encrypt_before_write(), buf_dblwr_process(), buf_read_page(), dict_stats_save_defrag_stats(). Added test cases when enrypted page could be read while doing redo log crash recovery. Also added test case for row compressed blobs. btr_cur_open_at_index_side_func(), btr_cur_open_at_rnd_pos_func(): Avoid referencing block that is NULL. buf_page_get_zip(): Issue error if page read fails. buf_page_get_gen(): Use dberr_t for error detection and do not reference bpage after we hare freed it. buf_mark_space_corrupt(): remove bpage from LRU also when it is encrypted. buf_page_check_corrupt(): @return DB_SUCCESS if page has been read and is not corrupted, DB_PAGE_CORRUPTED if page based on checksum check is corrupted, DB_DECRYPTION_FAILED if page post encryption checksum matches but after decryption normal page checksum does not match. In read case only DB_SUCCESS is possible. buf_page_io_complete(): use dberr_t for error handling. 
buf_flush_write_block_low(), buf_read_ahead_random(), buf_read_page_async(), buf_read_ahead_linear(), buf_read_ibuf_merge_pages(), buf_read_recv_pages(), fil_aio_wait(): Issue error if page read fails. btr_pcur_move_to_next_page(): Do not reference page if it is NULL. Introduced dict_table_t::is_readable() and dict_index_t::is_readable() that will return true if tablespace exists and pages read from tablespace are not corrupted or page decryption failed. Removed buf_page_t::key_version. After page decryption the key version is not removed from page frame. For unencrypted pages, old key_version is removed at buf_page_encrypt_before_write() dict_stats_update_transient_for_index(), dict_stats_update_transient() Do not continue if table decryption failed or table is corrupted. dict0stats.cc: Introduced a dict_stats_report_error function to avoid code duplication. fil_parse_write_crypt_data(): Check that key read from redo log entry is found from encryption plugin and if it is not, refuse to start. PageConverter::validate(): Removed access to fil_space_t as tablespace is not available during import. Fixed error code on innodb.innodb test. Merged test cased innodb-bad-key-change5 and innodb-bad-key-shutdown to innodb-bad-key-change2. Removed innodb-bad-key-change5 test. Decreased unnecessary complexity on some long lasting tests. Removed fil_inc_pending_ops(), fil_decr_pending_ops(), fil_get_first_space(), fil_get_next_space(), fil_get_first_space_safe(), fil_get_next_space_safe() functions. fil_space_verify_crypt_checksum(): Fixed bug found using ASAN where FIL_PAGE_END_LSN_OLD_CHECKSUM field was incorrectly accessed from row compressed tables. Fixed out of page frame bug for row compressed tables in fil_space_verify_crypt_checksum() found using ASAN. Incorrect function was called for compressed table. Added new tests for discard, rename table and drop (we should allow them even when page decryption fails). Alter table rename is not allowed. 
Added a test for restart with innodb-force-recovery=1 when a page read during redo recovery cannot be decrypted. Added a test for a corrupted table where both the page data and FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION are corrupted. Adjusted the test case innodb_bug14147491 so that it no longer expects a crash; instead, the table is just mostly unusable. fil0fil.h: fil_space_acquire_low() is not a visible function, and fil_space_acquire() and fil_space_acquire_silent() are inline functions. The FilSpace class uses fil_space_acquire_low() directly. recv_apply_hashed_log_recs() does not return anything.
9 years ago
MDEV-11738: Mariadb uses 100% of several of my 8 cpus doing nothing MDEV-11581: Mariadb starts InnoDB encryption threads when key has not changed or data scrubbing turned off Background: Key rotation is based on background threads (innodb-encryption-threads) periodically going through all tablespaces on fil_system. For each tablespace current used key version is compared to max key age (innodb-encryption-rotate-key-age). This process naturally takes CPU. Similarly, in same time need for scrubbing is investigated. Currently, key rotation is fully supported on Amazon AWS key management plugin only but InnoDB does not have knowledge what key management plugin is used. This patch re-purposes innodb-encryption-rotate-key-age=0 to disable key rotation and background data scrubbing. All new tables are added to special list for key rotation and key rotation is based on sending a event to background encryption threads instead of using periodic checking (i.e. timeout). fil0fil.cc: Added functions fil_space_acquire_low() to acquire a tablespace when it could be dropped concurrently. This function is used from fil_space_acquire() or fil_space_acquire_silent() that will not print any messages if we try to acquire space that does not exist. fil_space_release() to release a acquired tablespace. fil_space_next() to iterate tablespaces in fil_system using fil_space_acquire() and fil_space_release(). Similarly, fil_space_keyrotation_next() to iterate new list fil_system->rotation_list where new tables. are added if key rotation is disabled. Removed unnecessary functions fil_get_first_space_safe() fil_get_next_space_safe() fil_node_open_file(): After page 0 is read read also crypt_info if it is not yet read. 
btr_scrub_lock_dict_func() buf_page_check_corrupt() buf_page_encrypt_before_write() buf_merge_or_delete_for_page() lock_print_info_all_transactions() row_fts_psort_info_init() row_truncate_table_for_mysql() row_drop_table_for_mysql() Use fil_space_acquire()/release() to access fil_space_t. buf_page_decrypt_after_read(): Use fil_space_get_crypt_data() because at this point we might not yet have read page 0. fil0crypt.cc/fil0fil.h: Lot of changes. Pass fil_space_t* directly to functions needing it and store fil_space_t* to rotation state. Use fil_space_acquire()/release() when iterating tablespaces and removed unnecessary is_closing from fil_crypt_t. Use fil_space_t::is_stopping() to detect when access to tablespace should be stopped. Removed unnecessary fil_space_get_crypt_data(). fil_space_create(): Inform key rotation that there could be something to do if key rotation is disabled and new table with encryption enabled is created. Remove unnecessary functions fil_get_first_space_safe() and fil_get_next_space_safe(). fil_space_acquire() and fil_space_release() are used instead. Moved fil_space_get_crypt_data() and fil_space_set_crypt_data() to fil0crypt.cc. fsp_header_init(): Acquire fil_space_t*, write crypt_data and release space. check_table_options() Renamed FIL_SPACE_ENCRYPTION_* TO FIL_ENCRYPTION_* i_s.cc: Added ROTATING_OR_FLUSHING field to information_schema.innodb_tablespace_encryption to show current status of key rotation.
9 years ago
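The acquire/release discipline described in the MDEV-11738 message above can be illustrated with a small stand-alone sketch. This is not the InnoDB implementation: ToySpace, toy_acquire(), toy_release() and ToySpaceGuard are hypothetical names, and a plain std::mutex stands in for fil_system->mutex. The sketch only shows the general idea that a reference is refused once a drop has started, and that a scope guard pairs every successful acquire with exactly one release.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for a tablespace object; this is NOT fil_space_t.
struct ToySpace {
	std::mutex	mutex;			// stands in for fil_system->mutex
	std::size_t	n_pending_ops = 0;	// references currently held
	bool		stop_new_ops = false;	// set when a drop begins
};

// Try to take a reference.
// Returns false if the space is being dropped and must not be used.
inline bool toy_acquire(ToySpace& space)
{
	std::lock_guard<std::mutex> lock(space.mutex);
	if (space.stop_new_ops) {
		return false;
	}
	++space.n_pending_ops;
	return true;
}

// Give back a reference obtained with toy_acquire().
inline void toy_release(ToySpace& space)
{
	std::lock_guard<std::mutex> lock(space.mutex);
	assert(space.n_pending_ops > 0);
	--space.n_pending_ops;
}

// Scope guard: releases in the destructor only if the acquire succeeded.
class ToySpaceGuard {
public:
	explicit ToySpaceGuard(ToySpace& space)
		: m_space(toy_acquire(space) ? &space : nullptr) {}
	~ToySpaceGuard() { if (m_space != nullptr) { toy_release(*m_space); } }
	ToySpaceGuard(const ToySpaceGuard&) = delete;
	ToySpaceGuard& operator=(const ToySpaceGuard&) = delete;

	// Whether a reference is held and the space may be accessed.
	bool acquired() const { return m_space != nullptr; }
private:
	ToySpace*	m_space;
};
```

A caller would construct a ToySpaceGuard, check acquired(), and touch the space only inside that scope. A dropping thread would set stop_new_ops and then wait for n_pending_ops to reach zero before freeing the object.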
MDEV-11623 MariaDB 10.1 fails to start datadir created with MariaDB 10.0/MySQL 5.6 using innodb-page-size!=16K The storage format of FSP_SPACE_FLAGS was accidentally broken already in MariaDB 10.1.0. This fix is bringing the format in line with other MySQL and MariaDB release series. Please refer to the comments that were added to fsp0fsp.h for details. This is an INCOMPATIBLE CHANGE that affects users of page_compression and non-default innodb_page_size. Upgrading to this release will correct the flags in the data files. If you want to downgrade to earlier MariaDB 10.1.x, please refer to the test innodb.101_compatibility how to reset the FSP_SPACE_FLAGS in the files. NOTE: MariaDB 10.1.0 to 10.1.20 can misinterpret uncompressed data files with innodb_page_size=4k or 64k as compressed innodb_page_size=16k files, and then probably fail when trying to access the pages. See the comments in the function fsp_flags_convert_from_101() for detailed analysis. Move PAGE_COMPRESSION to FSP_SPACE_FLAGS bit position 16. In this way, compressed innodb_page_size=16k tablespaces will not be mistaken for uncompressed ones by MariaDB 10.1.0 to 10.1.20. Derive PAGE_COMPRESSION_LEVEL, ATOMIC_WRITES and DATA_DIR from the dict_table_t::flags when the table is available, in fil_space_for_table_exists_in_mem() or fil_open_single_table_tablespace(). During crash recovery, fil_load_single_table_tablespace() will use innodb_compression_level for the PAGE_COMPRESSION_LEVEL. FSP_FLAGS_MEM_MASK: A bitmap of the memory-only fil_space_t::flags that are not to be written to FSP_SPACE_FLAGS. Currently, these will include PAGE_COMPRESSION_LEVEL, ATOMIC_WRITES and DATA_DIR. Introduce the macro FSP_FLAGS_PAGE_SSIZE(). We only support one innodb_page_size for the whole instance. When creating a dummy tablespace for the redo log, use fil_space_t::flags=0. The flags are never written to the redo log files. Remove many FSP_FLAGS_SET_ macros. dict_tf_verify_flags(): Remove. 
This is basically only duplicating the logic of dict_tf_to_fsp_flags(), used in a debug assertion. fil_space_t::mark: Remove. This flag was not used for anything. fil_space_for_table_exists_in_mem(): Remove the unnecessary parameter mark_space, and add a parameter for table flags. Check that fil_space_t::flags match the table flags, and adjust the (memory-only) flags based on the table flags. fil_node_open_file(): Remove some redundant or unreachable conditions, do not use stderr for output, and avoid unnecessary server aborts. fil_user_tablespace_restore_page(): Convert the flags, so that the correct page_size will be used when restoring a page from the doublewrite buffer. fil_space_get_page_compressed(), fsp_flags_is_page_compressed(): Remove. It suffices to have fil_space_is_page_compressed(). FSP_FLAGS_WIDTH_DATA_DIR, FSP_FLAGS_WIDTH_PAGE_COMPRESSION_LEVEL, FSP_FLAGS_WIDTH_ATOMIC_WRITES: Remove, because these flags do not exist in the FSP_SPACE_FLAGS but only in memory. fsp_flags_try_adjust(): New function, to adjust the FSP_SPACE_FLAGS in page 0. Called by fil_open_single_table_tablespace(), fil_space_for_table_exists_in_mem(), innobase_start_or_create_for_mysql() except if --innodb-read-only is active. fsp_flags_is_valid(ulint): Reimplement from the scratch, with accurate comments. Do not display any details of detected inconsistencies, because the output could be confusing when dealing with MariaDB 10.1.x data files. fsp_flags_convert_from_101(ulint): Convert flags from buggy MariaDB 10.1.x format, or return ULINT_UNDEFINED if the flags cannot be in MariaDB 10.1.x format. fsp_flags_match(): Check the flags when probing files. Implemented based on fsp_flags_is_valid() and fsp_flags_convert_from_101(). dict_check_tablespaces_and_store_max_id(): Do not access the page after committing the mini-transaction. IMPORT TABLESPACE fixes: AbstractCallback::init(): Convert the flags. FetchIndexRootPages::operator(): Check that the tablespace flags match the table flags. 
Do not attempt to convert tablespace flags to table flags, because the conversion would necessarily be lossy. PageConverter::update_header(): Write back the correct flags. This takes care of the flags in IMPORT TABLESPACE.
9 years ago
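The split between on-disk FSP_SPACE_FLAGS and memory-only flags described in the MDEV-11623 message above can be sketched as follows. Only the position of PAGE_COMPRESSION (bit 16) is taken from that message. The TOY_ names and the position where the memory-only bits start are assumptions made for illustration; the real definitions live in the InnoDB headers and may differ.

```cpp
#include <cassert>
#include <cstdint>

// Bit 16 for PAGE_COMPRESSION is stated in the commit message.
const std::uint32_t TOY_PAGE_COMPRESSION = 1U << 16;

// Assumption: memory-only flags occupy the bits from TOY_MEM_SHIFT upwards.
// This position is illustrative, not copied from the InnoDB headers.
const unsigned TOY_MEM_SHIFT = 25;
const std::uint32_t TOY_MEM_MASK = ~0U << TOY_MEM_SHIFT;

// Strip the memory-only bits before the flags are written to page 0.
inline std::uint32_t toy_flags_for_disk(std::uint32_t flags)
{
	const std::uint32_t on_disk = flags & ~TOY_MEM_MASK;
	assert((on_disk & TOY_MEM_MASK) == 0);
	return on_disk;
}

// Test the persistent PAGE_COMPRESSION bit.
inline bool toy_is_page_compressed(std::uint32_t flags)
{
	return (flags & TOY_PAGE_COMPRESSION) != 0;
}
```

With this layout, in-memory flags can carry extra per-table information without changing what is persisted. Stripping the mask leaves the persistent bits, including PAGE_COMPRESSION, untouched.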
/*****************************************************************************

Copyright (c) 1995, 2017, Oracle and/or its affiliates. All Rights Reserved.
Copyright (c) 2013, 2017, MariaDB Corporation.

This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; version 2 of the License.

This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Suite 500, Boston, MA 02110-1335 USA

*****************************************************************************/

/**************************************************//**
@file include/fil0fil.h
The low-level file system

Created 10/25/1995 Heikki Tuuri
*******************************************************/

#ifndef fil0fil_h
#define fil0fil_h
#include "univ.i"

struct fil_space_t;

#ifndef UNIV_INNOCHECKSUM

#include "log0recv.h"
#include "dict0types.h"
#include "page0size.h"
#include "ibuf0types.h"

#include <list>
#include <vector>

// Forward declaration
struct trx_t;
class page_id_t;
class truncate_t;

typedef std::list<char*, ut_allocator<char*> > space_name_list_t;

/** Structure containing encryption specification */
struct fil_space_crypt_t;

/** File types */
enum fil_type_t {
	/** temporary tablespace (temporary undo log or tables) */
	FIL_TYPE_TEMPORARY,
	/** a tablespace that is being imported (no logging until finished) */
	FIL_TYPE_IMPORT,
	/** persistent tablespace (for system, undo log or tables) */
	FIL_TYPE_TABLESPACE,
	/** redo log covering changes to files of FIL_TYPE_TABLESPACE */
	FIL_TYPE_LOG
};

/** Check if fil_type is any of FIL_TYPE_TEMPORARY, FIL_TYPE_IMPORT
or FIL_TYPE_TABLESPACE.
@param[in]	type	variable of type fil_type_t
@return true if any of FIL_TYPE_TEMPORARY, FIL_TYPE_IMPORT
or FIL_TYPE_TABLESPACE */
inline
bool
fil_type_is_data(
	fil_type_t	type)
{
	return(type == FIL_TYPE_TEMPORARY
	       || type == FIL_TYPE_IMPORT
	       || type == FIL_TYPE_TABLESPACE);
}

struct fil_node_t;

/** Tablespace or log data space */
struct fil_space_t {
	char*		name;	/*!< Tablespace name */
	ulint		id;	/*!< space id */
	lsn_t		max_lsn;
				/*!< LSN of the most recent
				fil_names_write_if_was_clean().
				Reset to 0 by fil_names_clear().
				Protected by log_sys->mutex.
				If and only if this is nonzero, the
				tablespace will be in named_spaces. */
	bool		stop_ios;/*!< true if we want to rename the
				.ibd file of tablespace and want to
				stop temporarily posting of new i/o
				requests on the file */
	bool		stop_new_ops;
				/*!< we set this true when we start
				deleting a single-table tablespace.
				When this is set following new ops
				are not allowed:
				* read IO request
				* ibuf merge
				* file flush
				Note that we can still possibly have
				new write operations because we don't
				check this flag when doing flush
				batches. */
	bool		is_being_truncated;
				/*!< this is set to true when we prepare to
				truncate a single-table tablespace and its
				.ibd file */
#ifdef UNIV_DEBUG
	ulint		redo_skipped_count;
				/*!< reference count for operations who want
				to skip redo log in the file space in order
				to make fsp_space_modify_check pass. */
#endif
	fil_type_t	purpose;/*!< purpose */
	UT_LIST_BASE_NODE_T(fil_node_t) chain;
				/*!< base node for the file chain */
	ulint		size;	/*!< tablespace file size in pages;
				0 if not known yet */
	ulint		size_in_header;
				/* FSP_SIZE in the tablespace header;
				0 if not known yet */
	ulint		free_len;
				/*!< length of the FSP_FREE list */
	ulint		free_limit;
				/*!< contents of FSP_FREE_LIMIT */
	ulint		recv_size;
				/*!< recovered tablespace size in pages;
				0 if no size change was read from the redo log,
				or if the size change was implemented */
	ulint		flags;	/*!< FSP_SPACE_FLAGS and FSP_FLAGS_MEM_ flags;
				see fsp0types.h,
				fsp_flags_is_valid(),
				page_size_t(ulint) (constructor) */
	ulint		n_reserved_extents;
				/*!< number of reserved free extents for
				ongoing operations like B-tree page split */
	ulint		n_pending_flushes; /*!< this is positive when flushing
				the tablespace to disk; dropping of the
				tablespace is forbidden if this is positive */
	/** Number of pending buffer pool operations accessing the tablespace
	without holding a table lock or dict_operation_lock S-latch
	that would prevent the table (and tablespace) from being
	dropped. An example is change buffer merge.
	The tablespace cannot be dropped while this is nonzero,
	or while fil_node_t::n_pending is nonzero.
	Protected by fil_system->mutex. */
	ulint		n_pending_ops;
	/** Number of pending block read or write operations
	(when a write is imminent or a read has recently completed).
	The tablespace object cannot be freed while this is nonzero,
	but it can be detached from fil_system.
	Note that fil_node_t::n_pending tracks actual pending I/O requests.
	Protected by fil_system->mutex. */
	ulint		n_pending_ios;
	hash_node_t	hash;	/*!< hash chain node */
	hash_node_t	name_hash;/*!< hash chain the name_hash table */
	rw_lock_t	latch;	/*!< latch protecting the file space storage
				allocation */
	UT_LIST_NODE_T(fil_space_t) unflushed_spaces;
				/*!< list of spaces with at least one unflushed
				file we have written to */
	UT_LIST_NODE_T(fil_space_t) named_spaces;
				/*!< list of spaces for which MLOG_FILE_NAME
				records have been issued */
	bool		is_in_unflushed_spaces;
				/*!< true if this space is currently in
				unflushed_spaces */
	UT_LIST_NODE_T(fil_space_t) space_list;
				/*!< list of all spaces */
	/** other tablespaces needing key rotation */
	UT_LIST_NODE_T(fil_space_t) rotation_list;
	/** whether this tablespace needs key rotation */
	bool		is_in_rotation_list;

	/** MariaDB encryption data */
	fil_space_crypt_t* crypt_data;

	/** True if we have already printed compression failure */
	bool		printed_compression_failure;

	/** True if the device this filespace is on supports atomic writes */
	bool		atomic_write_supported;

	/** Release the reserved free extents.
	@param[in]	n_reserved	number of reserved extents */
	void release_free_extents(ulint n_reserved);

	/** True if file system storing this tablespace supports
	punch hole */
	bool		punch_hole;

	ulint		magic_n;/*!< FIL_SPACE_MAGIC_N */

	/** @return whether the tablespace is about to be dropped or
	truncated */
	bool is_stopping() const
	{
		return stop_new_ops || is_being_truncated;
	}
};

/** Value of fil_space_t::magic_n */
#define	FIL_SPACE_MAGIC_N	89472
  182. /** File node of a tablespace or the log data space */
  183. struct fil_node_t {
  184. /** tablespace containing this file */
  185. fil_space_t* space;
  186. /** file name; protected by fil_system->mutex and log_sys->mutex. */
  187. char* name;
  188. /** file handle (valid if is_open) */
  189. pfs_os_file_t handle;
  190. /** event that groups and serializes calls to fsync;
  191. os_event_set() and os_event_reset() are protected by
  192. fil_system_t::mutex */
  193. os_event_t sync_event;
  194. /** whether the file actually is a raw device or disk partition */
  195. bool is_raw_disk;
  196. /** size of the file in database pages (0 if not known yet);
  197. the possible last incomplete megabyte may be ignored
  198. if space->id == 0 */
  199. ulint size;
  200. /** initial size of the file in database pages;
  201. FIL_IBD_FILE_INITIAL_SIZE by default */
  202. ulint init_size;
  203. /** maximum size of the file in database pages (0 if unlimited) */
  204. ulint max_size;
  205. /** count of pending i/o's; is_open must be true if nonzero */
  206. ulint n_pending;
  207. /** count of pending flushes; is_open must be true if nonzero */
  208. ulint n_pending_flushes;
  209. /** whether the file is currently being extended */
  210. bool being_extended;
  211. /** number of writes to the file since the system was started */
  212. int64_t modification_counter;
  213. /** the modification_counter of the latest flush to disk */
  214. int64_t flush_counter;
  215. /** link to other files in this tablespace */
  216. UT_LIST_NODE_T(fil_node_t) chain;
  217. /** link to the fil_system->LRU list (keeping track of open files) */
  218. UT_LIST_NODE_T(fil_node_t) LRU;
  219. /** whether this file could use atomic write (data file) */
  220. bool atomic_write;
  221. /** Filesystem block size */
  222. ulint block_size;
  223. /** FIL_NODE_MAGIC_N */
  224. ulint magic_n;
  225. /** @return whether this file is open */
  226. bool is_open() const
  227. {
  228. return(handle != OS_FILE_CLOSED);
  229. }
  230. };
  231. /** Value of fil_node_t::magic_n */
  232. #define FIL_NODE_MAGIC_N 89389
  233. /** Common InnoDB file extensions */
  234. enum ib_extention {
  235. NO_EXT = 0,
  236. IBD = 1,
  237. ISL = 2,
  238. CFG = 3
  239. };
  240. extern const char* dot_ext[];
  241. #define DOT_IBD dot_ext[IBD]
  242. #define DOT_ISL dot_ext[ISL]
  243. #define DOT_CFG dot_ext[CFG]
  244. /** When mysqld is run, the default directory "." is the mysqld datadir,
  245. but in the MySQL Embedded Server Library and mysqlbackup it is not the default
  246. directory, and we must set the base file path explicitly */
  247. extern const char* fil_path_to_mysql_datadir;
  248. /** Initial size of a single-table tablespace in pages */
  249. #define FIL_IBD_FILE_INITIAL_SIZE 4
  250. /** 'null' (undefined) page offset in the context of file spaces */
  251. #define FIL_NULL ULINT32_UNDEFINED
  252. /* Space address data type; this is intended to be used when
  253. addresses accurate to a byte are stored in file pages. If the page part
  254. of the address is FIL_NULL, the address is considered undefined. */
  255. typedef byte fil_faddr_t; /*!< 'type' definition in C: an address
  256. stored in a file page is a string of bytes */
  257. #define FIL_ADDR_PAGE 0 /* first in address is the page offset */
  258. #define FIL_ADDR_BYTE 4 /* then comes 2-byte byte offset within page*/
  259. #endif /* !UNIV_INNOCHECKSUM */
  260. #define FIL_ADDR_SIZE 6 /* address size is 6 bytes */
  261. #ifndef UNIV_INNOCHECKSUM
  262. /** File space address */
  263. struct fil_addr_t {
  264. ulint page; /*!< page number within a space */
  265. ulint boffset; /*!< byte offset within the page */
  266. };
  267. /** The null file address */
  268. extern fil_addr_t fil_addr_null;
  269. #endif /* !UNIV_INNOCHECKSUM */
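/* Illustrative sketch, not part of this header: how a 6-byte file
address (FIL_ADDR_SIZE) could be packed as a 4-byte page number at
FIL_ADDR_PAGE followed by a 2-byte byte offset at FIL_ADDR_BYTE.
The names addr_t, addr_write and addr_read are hypothetical, and the
most-significant-byte-first order is an assumption that the listing
itself does not state. */

```cpp
#include <assert.h>
#include <stdint.h>

/* Offsets copied from the listing; the ADDR_ prefix is local to this sketch. */
enum { ADDR_PAGE = 0, ADDR_BYTE = 4, ADDR_SIZE = 6 };

/* Hypothetical in-memory form of a file address. */
struct addr_t {
	uint32_t	page;		/* page number within a space */
	uint16_t	boffset;	/* byte offset within the page */
};

/* Pack an address into ADDR_SIZE bytes (assumed big-endian). */
static inline void addr_write(unsigned char* buf, const addr_t& a)
{
	buf[ADDR_PAGE + 0] = (unsigned char) (a.page >> 24);
	buf[ADDR_PAGE + 1] = (unsigned char) (a.page >> 16);
	buf[ADDR_PAGE + 2] = (unsigned char) (a.page >> 8);
	buf[ADDR_PAGE + 3] = (unsigned char) (a.page);
	buf[ADDR_BYTE + 0] = (unsigned char) (a.boffset >> 8);
	buf[ADDR_BYTE + 1] = (unsigned char) (a.boffset);
}

/* Unpack an address written by addr_write(). */
static inline addr_t addr_read(const unsigned char* buf)
{
	addr_t	a;
	a.page = ((uint32_t) buf[ADDR_PAGE + 0] << 24)
		| ((uint32_t) buf[ADDR_PAGE + 1] << 16)
		| ((uint32_t) buf[ADDR_PAGE + 2] << 8)
		| (uint32_t) buf[ADDR_PAGE + 3];
	a.boffset = (uint16_t) ((buf[ADDR_BYTE + 0] << 8)
				| buf[ADDR_BYTE + 1]);
	return a;
}
```

/* An address whose page part equals the 'null' page offset would be
treated as undefined, as described for fil_faddr_t above. */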
  270. /** The byte offsets on a file page for various variables @{ */
  271. #define FIL_PAGE_SPACE_OR_CHKSUM 0 /*!< in < MySQL-4.0.14 space id the
  272. page belongs to (== 0) but in later
  273. versions the 'new' checksum of the
  274. page */
  275. #define FIL_PAGE_OFFSET 4 /*!< page offset inside space */
  276. #define FIL_PAGE_PREV 8 /*!< if there is a 'natural'
  277. predecessor of the page, its
  278. offset. Otherwise FIL_NULL.
  279. This field is not set on BLOB
  280. pages, which are stored as a
  281. singly-linked list. See also
  282. FIL_PAGE_NEXT. */
  283. #define FIL_PAGE_NEXT 12 /*!< if there is a 'natural' successor
  284. of the page, its offset.
  285. Otherwise FIL_NULL.
  286. B-tree index pages
  287. (FIL_PAGE_TYPE contains FIL_PAGE_INDEX)
  288. on the same PAGE_LEVEL are maintained
  289. as a doubly linked list via
  290. FIL_PAGE_PREV and FIL_PAGE_NEXT
  291. in the collation order of the
  292. smallest user record on each page. */
  293. #define FIL_PAGE_LSN 16 /*!< lsn of the end of the newest
  294. modification log record to the page */
  295. #define FIL_PAGE_TYPE 24 /*!< file page type: FIL_PAGE_INDEX,...,
  296. 2 bytes.
  297. The contents of this field can only
  298. be trusted in the following case:
  299. if the page is an uncompressed
  300. B-tree index page, then it is
  301. guaranteed that the value is
  302. FIL_PAGE_INDEX.
  303. The opposite does not hold.
  304. In tablespaces created by
  305. MySQL/InnoDB 5.1.7 or later, the
  306. contents of this field are valid
  307. for all uncompressed pages. */
  308. #define FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION 26 /*!< for the first page
  309. in a system tablespace data file
  310. (ibdata*, not *.ibd): the file has
  311. been flushed to disk at least up
  312. to this lsn
  313. for other pages: a 32-bit key version
  314. used to encrypt the page + 32-bit checksum
  315. or 64 bits of zero if no encryption
  316. */
  317. /** This overloads FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION for RTREE Split Sequence Number */
  318. #define FIL_RTREE_SPLIT_SEQ_NUM FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION
  319. /** starting from 4.1.x this contains the space id of the page */
  320. #define FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID 34
  321. #define FIL_PAGE_SPACE_ID FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID
  322. #define FIL_PAGE_DATA 38U /*!< start of the data on the page */
  323. /* Following are used when page compression is used */
  324. #define FIL_PAGE_COMPRESSED_SIZE 2 /*!< Number of bytes used to store
  325. actual payload data size on
  326. compressed pages. */
  327. #define FIL_PAGE_COMPRESSION_METHOD_SIZE 2
  328. /*!< Number of bytes used to store
  329. actual compression method. */
  330. /* @} */
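/* Illustrative sketch, not part of this header: reading the fixed
page header fields at the byte offsets defined above. The struct
page_header_t and the helpers read_be() and parse_page_header() are
hypothetical names for this example only. Field widths follow the
gaps between the offsets; big-endian byte order is an assumption
that the listing does not state. */

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offsets copied from the listing; the HDR_ prefix is local. */
enum {
	HDR_SPACE_OR_CHKSUM = 0,
	HDR_OFFSET = 4,
	HDR_PREV = 8,
	HDR_NEXT = 12,
	HDR_LSN = 16,
	HDR_TYPE = 24,
	HDR_SPACE_ID = 34,
	HDR_DATA = 38
};

/* Hypothetical decoded view of the page header. */
struct page_header_t {
	uint32_t	checksum;	/* or space id in very old files */
	uint32_t	page_no;
	uint32_t	prev;
	uint32_t	next;
	uint64_t	lsn;
	uint16_t	type;
	uint32_t	space_id;
};

/* Read an n-byte unsigned integer, most significant byte first. */
static inline uint64_t read_be(const unsigned char* p, size_t n)
{
	uint64_t	v = 0;
	for (size_t i = 0; i < n; i++) {
		v = (v << 8) | p[i];
	}
	return v;
}

/* Decode the header from a buffer of at least HDR_DATA bytes. */
static inline page_header_t parse_page_header(const unsigned char* page)
{
	page_header_t	h;
	h.checksum = (uint32_t) read_be(page + HDR_SPACE_OR_CHKSUM, 4);
	h.page_no = (uint32_t) read_be(page + HDR_OFFSET, 4);
	h.prev = (uint32_t) read_be(page + HDR_PREV, 4);
	h.next = (uint32_t) read_be(page + HDR_NEXT, 4);
	h.lsn = read_be(page + HDR_LSN, 8);
	h.type = (uint16_t) read_be(page + HDR_TYPE, 2);
	h.space_id = (uint32_t) read_be(page + HDR_SPACE_ID, 4);
	return h;
}
```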
  331. /** File page trailer @{ */
  332. #define FIL_PAGE_END_LSN_OLD_CHKSUM 8 /*!< the low 4 bytes of this are used
  333. to store the page checksum, the
  334. last 4 bytes should be identical
  335. to the last 4 bytes of FIL_PAGE_LSN */
  336. #define FIL_PAGE_DATA_END 8 /*!< size of the page trailer */
  337. /* @} */
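/* Illustrative sketch, not part of this header: the trailer comment
above says the last 4 bytes of the page should equal the last 4 bytes
of FIL_PAGE_LSN. A consistency check along those lines might look as
follows. The function name page_lsn_fields_match() and the local
constants are hypothetical; the caller is assumed to pass the full
physical page and its size. */

```cpp
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Values copied from the listing; the TRL_ prefix is local. */
enum {
	TRL_PAGE_LSN = 16,		/* FIL_PAGE_LSN, an 8-byte field */
	TRL_END_LSN_OLD_CHKSUM = 8	/* trailer size, counted from the end */
};

/* @return whether the low 4 bytes of the header LSN equal the last
4 bytes of the page trailer */
static inline bool page_lsn_fields_match(
	const unsigned char*	page,
	size_t			page_size)
{
	const unsigned char*	header_low = page + TRL_PAGE_LSN + 4;
	const unsigned char*	trailer_low = page + page_size
		- TRL_END_LSN_OLD_CHKSUM + 4;

	return memcmp(header_low, trailer_low, 4) == 0;
}
```

/* A mismatch between the two copies suggests that the page was only
partially written; this sketch does not verify any checksum. */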
  338. /** File page types (values of FIL_PAGE_TYPE) @{ */
  339. #define FIL_PAGE_PAGE_COMPRESSED_ENCRYPTED 37401 /*!< Page is compressed and
  340. then encrypted */
  341. #define FIL_PAGE_PAGE_COMPRESSED 34354 /*!< Page-compressed page */
  342. #define FIL_PAGE_INDEX 17855 /*!< B-tree node */
  343. #define FIL_PAGE_RTREE 17854 /*!< R-tree node */
  344. #define FIL_PAGE_UNDO_LOG 2 /*!< Undo log page */
  345. #define FIL_PAGE_INODE 3 /*!< Index node */
  346. #define FIL_PAGE_IBUF_FREE_LIST 4 /*!< Insert buffer free list */
  347. /* File page types introduced in MySQL/InnoDB 5.1.7 */
  348. #define FIL_PAGE_TYPE_ALLOCATED 0 /*!< Freshly allocated page */
  349. #define FIL_PAGE_IBUF_BITMAP 5 /*!< Insert buffer bitmap */
  350. #define FIL_PAGE_TYPE_SYS 6 /*!< System page */
  351. #define FIL_PAGE_TYPE_TRX_SYS 7 /*!< Transaction system data */
  352. #define FIL_PAGE_TYPE_FSP_HDR 8 /*!< File space header */
  353. #define FIL_PAGE_TYPE_XDES 9 /*!< Extent descriptor page */
  354. #define FIL_PAGE_TYPE_BLOB 10 /*!< Uncompressed BLOB page */
  355. #define FIL_PAGE_TYPE_ZBLOB 11 /*!< First compressed BLOB page */
  356. #define FIL_PAGE_TYPE_ZBLOB2 12 /*!< Subsequent compressed BLOB page */
  357. #define FIL_PAGE_TYPE_UNKNOWN 13 /*!< In old tablespaces, garbage
  358. in FIL_PAGE_TYPE is replaced with this
  359. value when flushing pages. */
  360. /* File page types introduced in MySQL 5.7, not supported in MariaDB */
  361. //#define FIL_PAGE_COMPRESSED 14
  362. //#define FIL_PAGE_ENCRYPTED 15
  363. //#define FIL_PAGE_COMPRESSED_AND_ENCRYPTED 16
  364. //#define FIL_PAGE_ENCRYPTED_RTREE 17
  365. /** Used by i_s.cc to index into the text description. */
  366. #define FIL_PAGE_TYPE_LAST FIL_PAGE_TYPE_UNKNOWN
  367. /*!< Last page type */
  368. /* @} */
  369. /** macro to check whether the page type is index (Btree or Rtree) type */
  370. #define fil_page_type_is_index(page_type) \
  371. ((page_type) == FIL_PAGE_INDEX || (page_type) == FIL_PAGE_RTREE)
  372. /** Check whether the page is an index page (either a regular B-tree index
  373. or an R-tree index) */
  374. #define fil_page_index_page_check(page) \
  375. fil_page_type_is_index(fil_page_get_type(page))
  376. #ifndef UNIV_INNOCHECKSUM
  377. /** Enum values for encryption table option */
  378. enum fil_encryption_t {
  379. /** Encrypted if innodb_encrypt_tables=ON (srv_encrypt_tables) */
  380. FIL_ENCRYPTION_DEFAULT,
  381. /** Encrypted */
  382. FIL_ENCRYPTION_ON,
  383. /** Not encrypted */
  384. FIL_ENCRYPTION_OFF
  385. };
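/* Illustrative sketch, not part of this header: resolving the
per-table encryption option against the global setting, as the enum
comments above describe. The names enc_mode_t and wants_encryption()
are hypothetical, and the global innodb_encrypt_tables setting is
simplified to a boolean here. */

```cpp
#include <assert.h>

/* Local mirror of the three table options in the listing. */
enum enc_mode_t {
	ENC_DEFAULT,	/* follow the global setting */
	ENC_ON,		/* always encrypted */
	ENC_OFF		/* never encrypted */
};

/* @return whether a table with the given option would be encrypted */
static inline bool wants_encryption(enc_mode_t mode, bool encrypt_tables_on)
{
	switch (mode) {
	case ENC_ON:
		return true;
	case ENC_OFF:
		return false;
	case ENC_DEFAULT:
		break;
	}
	/* ENC_DEFAULT: defer to the global setting. */
	return encrypt_tables_on;
}
```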
  386. /** The number of fsyncs done to the log */
  387. extern ulint fil_n_log_flushes;
  388. /** Number of pending redo log flushes */
  389. extern ulint fil_n_pending_log_flushes;
  390. /** Number of pending tablespace flushes */
  391. extern ulint fil_n_pending_tablespace_flushes;
  392. /** Number of files currently open */
  393. extern ulint fil_n_file_opened;
  394. /** Look up a tablespace.
  395. The caller should hold an InnoDB table lock or a MDL that prevents
  396. the tablespace from being dropped during the operation,
  397. or the caller should be in single-threaded crash recovery mode
  398. (no user connections that could drop tablespaces).
  399. If this is not the case, fil_space_acquire() and fil_space_release()
  400. should be used instead.
  401. @param[in] id tablespace ID
  402. @return tablespace, or NULL if not found */
  403. fil_space_t*
  404. fil_space_get(
  405. ulint id)
  406. MY_ATTRIBUTE((warn_unused_result));
  407. /** The tablespace memory cache; also the totality of logs (the log
  408. data space) is stored here; below we talk about tablespaces, but also
  409. the ib_logfiles form a 'space' and it is handled here */
  410. struct fil_system_t {
  411. ib_mutex_t mutex; /*!< The mutex protecting the cache */
  412. hash_table_t* spaces; /*!< The hash table of spaces in the
  413. system; they are hashed on the space
  414. id */
  415. hash_table_t* name_hash; /*!< hash table based on the space
  416. name */
  417. UT_LIST_BASE_NODE_T(fil_node_t) LRU;
  418. /*!< base node for the LRU list of the
  419. most recently used open files with no
  420. pending i/o's; if we start an i/o on
  421. the file, we first remove it from this
  422. list, and return it to the start of
  423. the list when the i/o ends;
  424. log files and the system tablespace are
  425. not put to this list: they are opened
  426. after the startup, and kept open until
  427. shutdown */
  428. UT_LIST_BASE_NODE_T(fil_space_t) unflushed_spaces;
  429. /*!< base node for the list of those
  430. tablespaces whose files contain
  431. unflushed writes; those spaces have
  432. at least one file node where
  433. modification_counter > flush_counter */
  434. ulint n_open; /*!< number of files currently open */
  435. ulint max_n_open; /*!< n_open is not allowed to exceed
  436. this */
  437. int64_t modification_counter;/*!< when we write to a file we
  438. increment this by one */
  439. ulint max_assigned_id;/*!< maximum space id in the existing
  440. tables, or assigned during the time
  441. mysqld has been up; at an InnoDB
  442. startup we scan the data dictionary
  443. and set here the maximum of the
  444. space id's of the tables there */
  445. UT_LIST_BASE_NODE_T(fil_space_t) space_list;
  446. /*!< list of all file spaces */
  447. UT_LIST_BASE_NODE_T(fil_space_t) named_spaces;
  448. /*!< list of all file spaces
  449. for which a MLOG_FILE_NAME
  450. record has been written since
  451. the latest redo log checkpoint.
  452. Protected only by log_sys->mutex. */
  453. UT_LIST_BASE_NODE_T(fil_space_t) rotation_list;
  454. /*!< list of all file spaces needing
  455. key rotation.*/
  456. ibool space_id_reuse_warned;
  457. /*!< TRUE if fil_space_create()
  458. has issued a warning about
  459. potential space_id reuse */
  460. };
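/* Illustrative sketch, not part of this header: the comments on
unflushed_spaces and modification_counter describe a file node as
having unflushed writes while modification_counter > flush_counter.
The struct node_counters_t and the helper functions below are
hypothetical stand-ins; the real bookkeeping also involves the mutex
and the list handling, which are omitted here. */

```cpp
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two counters kept per file node. */
struct node_counters_t {
	int64_t	modification_counter;	/* stamp of the latest write */
	int64_t	flush_counter;		/* stamp covered by the latest flush */
};

/* @return whether the node has writes that no flush has covered */
static inline bool node_needs_flush(const node_counters_t& n)
{
	return n.modification_counter > n.flush_counter;
}

/* Record a write: increment the system-wide counter by one and stamp
the node with the new value. */
static inline void node_note_write(node_counters_t& n, int64_t& system_counter)
{
	n.modification_counter = ++system_counter;
}

/* Record a completed flush that covered all writes up to `upto`. */
static inline void node_note_flush(node_counters_t& n, int64_t upto)
{
	if (upto > n.flush_counter) {
		n.flush_counter = upto;
	}
}
```

/* In this sketch the flush records the counter value that was read
before the flush started, so a write that arrives during the flush
still leaves the node marked as unflushed. */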
  461. /** The tablespace memory cache. This variable is NULL before the module is
  462. initialized. */
  463. extern fil_system_t* fil_system;
  464. #include "fil0crypt.h"
  465. /** Returns the latch of a file space.
  466. @param[in] id space id
  467. @param[out] flags tablespace flags
  468. @return latch protecting storage allocation */
  469. rw_lock_t*
  470. fil_space_get_latch(
  471. ulint id,
  472. ulint* flags);
  473. /** Gets the type of a file space.
  474. @param[in] id tablespace identifier
  475. @return file type */
  476. fil_type_t
  477. fil_space_get_type(
  478. ulint id);
  479. /** Note that a tablespace has been imported.
  480. It is initially marked as FIL_TYPE_IMPORT so that no logging is
  481. done during the import process when the space ID is stamped to each page.
  482. Now we change it to FIL_TYPE_TABLESPACE to start redo and undo logging.
  483. NOTE: temporary tablespaces are never imported.
  484. @param[in] id tablespace identifier */
  485. void
  486. fil_space_set_imported(
  487. ulint id);
  488. /** Append a file to the chain of files of a space.
  489. @param[in] name file name of a file that is not open
  490. @param[in] size file size in entire database blocks
  491. @param[in,out] space tablespace from fil_space_create()
  492. @param[in] is_raw whether this is a raw device or partition
  493. @param[in] atomic_write true if atomic write could be enabled
  494. @param[in] max_pages maximum number of pages in file,
  495. ULINT_MAX means the file size is unlimited.
  496. @return pointer to the file name
  497. @retval NULL if error */
  498. char*
  499. fil_node_create(
  500. const char* name,
  501. ulint size,
  502. fil_space_t* space,
  503. bool is_raw,
  504. bool atomic_write,
  505. ulint max_pages = ULINT_MAX)
  506. MY_ATTRIBUTE((warn_unused_result));
  507. /** Create a space memory object and put it to the fil_system hash table.
  508. Error messages are issued to the server log.
  509. @param[in] name tablespace name
  510. @param[in] id tablespace identifier
  511. @param[in] flags tablespace flags
  512. @param[in] purpose tablespace purpose
  513. @param[in,out] crypt_data encryption information
  514. @param[in] mode encryption mode
  515. @return pointer to created tablespace, to be filled in with fil_node_create()
  516. @retval NULL on failure (such as when the same tablespace exists) */
  517. fil_space_t*
  518. fil_space_create(
  519. const char* name,
  520. ulint id,
  521. ulint flags,
  522. fil_type_t purpose,
  523. fil_space_crypt_t* crypt_data,
  524. fil_encryption_t mode = FIL_ENCRYPTION_DEFAULT)
  525. MY_ATTRIBUTE((warn_unused_result));
  526. /*******************************************************************//**
  527. Assigns a new space id for a new single-table tablespace. This works simply by
  528. incrementing the global counter. If 4 billion id's is not enough, we may need
  529. to recycle id's.
  530. @return true if assigned, false if not */
  531. bool
  532. fil_assign_new_space_id(
  533. /*====================*/
  534. ulint* space_id); /*!< in/out: space id */
  535. /** Frees a space object from the tablespace memory cache.
  536. Closes the files in the chain but does not delete them.
  537. There must not be any pending i/o's or flushes on the files.
  538. @param[in] id tablespace identifier
  539. @param[in] x_latched whether the caller holds X-mode space->latch
  540. @return true if success */
  541. bool
  542. fil_space_free(
  543. ulint id,
  544. bool x_latched);
  545. /** Returns the path from the first fil_node_t found with this space ID.
  546. The caller is responsible for freeing the memory allocated here for the
  547. value returned.
  548. @param[in] id Tablespace ID
  549. @return own: A copy of fil_node_t::name, NULL if space ID is zero
  550. or not found. */
  551. char*
  552. fil_space_get_first_path(
  553. ulint id);
  554. /** Set the recovered size of a tablespace in pages.
  555. @param id tablespace ID
  556. @param size recovered size in pages */
  557. UNIV_INTERN
  558. void
  559. fil_space_set_recv_size(ulint id, ulint size);
  560. /*******************************************************************//**
  561. Returns the size of the space in pages. The tablespace must be cached in the
  562. memory cache.
  563. @return space size, 0 if space not found */
  564. ulint
  565. fil_space_get_size(
  566. /*===============*/
  567. ulint id); /*!< in: space id */
  568. /*******************************************************************//**
  569. Returns the flags of the space. The tablespace must be cached
  570. in the memory cache.
  571. @return flags, ULINT_UNDEFINED if space not found */
  572. ulint
  573. fil_space_get_flags(
  574. /*================*/
  575. ulint id); /*!< in: space id */
  576. /** Open each fil_node_t of a named fil_space_t if not already open.
  577. @param[in] name Tablespace name
  578. @return true if all file nodes are opened. */
  579. bool
  580. fil_space_open(
  581. const char* name);
  582. /** Close each fil_node_t of a named fil_space_t if open.
  583. @param[in] name Tablespace name */
  584. void
  585. fil_space_close(
  586. const char* name);
  587. /** Returns the page size of the space and whether it is compressed or not.
  588. The tablespace must be cached in the memory cache.
  589. @param[in] id space id
  590. @param[out] found true if tablespace was found
  591. @return page size */
  592. const page_size_t
  593. fil_space_get_page_size(
  594. ulint id,
  595. bool* found);
  596. /****************************************************************//**
  597. Initializes the tablespace memory cache. */
  598. void
  599. fil_init(
  600. /*=====*/
  601. ulint hash_size, /*!< in: hash table size */
  602. ulint max_n_open); /*!< in: max number of open files */
  603. /*******************************************************************//**
  604. Frees the tablespace memory cache. */
  605. void
  606. fil_close(void);
  607. /*===========*/
  608. /*******************************************************************//**
  609. Opens all log files and system tablespace data files. They stay open until the
  610. database server shutdown. This should be called at a server startup after the
  611. space objects for the log and the system tablespace have been created. The
  612. purpose of this operation is to make sure we never run out of file descriptors
  613. if we need to read from the insert buffer or to write to the log. */
  614. void
  615. fil_open_log_and_system_tablespace_files(void);
  616. /*==========================================*/
  617. /*******************************************************************//**
  618. Closes all open files. There must not be any pending i/o's or unflushed
  619. modifications in the files. */
  620. void
  621. fil_close_all_files(void);
  622. /*=====================*/
  623. /*******************************************************************//**
  624. Closes the redo log files. There must not be any pending i/o's or
  625. unflushed modifications in the files. */
  626. void
  627. fil_close_log_files(
  628. /*================*/
  629. bool free); /*!< in: whether to free the memory object */
  630. /*******************************************************************//**
  631. Sets the max tablespace id counter if the given number is bigger than the
  632. previous value. */
  633. void
  634. fil_set_max_space_id_if_bigger(
  635. /*===========================*/
  636. ulint max_id);/*!< in: maximum known id */
  637. /** Write the flushed LSN to the page header of the first page in the
  638. system tablespace.
  639. @param[in] lsn flushed LSN
  640. @return DB_SUCCESS or error number */
  641. dberr_t
  642. fil_write_flushed_lsn(
  643. lsn_t lsn)
  644. MY_ATTRIBUTE((warn_unused_result));
  645. /** Acquire a tablespace when it could be dropped concurrently.
  646. Used by background threads that do not necessarily hold proper locks
  647. for concurrency control.
  648. @param[in] id tablespace ID
  649. @param[in] silent whether to silently ignore missing tablespaces
  650. @return the tablespace
  651. @retval NULL if missing or being deleted or truncated */
  652. UNIV_INTERN
  653. fil_space_t*
  654. fil_space_acquire_low(ulint id, bool silent)
  655. MY_ATTRIBUTE((warn_unused_result));
  656. /** Acquire a tablespace when it could be dropped concurrently.
  657. Used by background threads that do not necessarily hold proper locks
  658. for concurrency control.
  659. @param[in] id tablespace ID
  660. @return the tablespace
  661. @retval NULL if missing or being deleted or truncated */
  662. inline
  663. fil_space_t*
  664. fil_space_acquire(ulint id)
  665. {
  666. return (fil_space_acquire_low(id, false));
  667. }
  668. /** Acquire a tablespace that may not exist.
  669. Used by background threads that do not necessarily hold proper locks
  670. for concurrency control.
  671. @param[in] id tablespace ID
  672. @return the tablespace
  673. @retval NULL if missing or being deleted */
  674. inline
  675. fil_space_t*
  676. fil_space_acquire_silent(ulint id)
  677. {
  678. return (fil_space_acquire_low(id, true));
  679. }
  680. /** Release a tablespace acquired with fil_space_acquire().
  681. @param[in,out] space tablespace to release */
  682. void
  683. fil_space_release(fil_space_t* space);
  684. /** Acquire a tablespace for reading or writing a block,
  685. when it could be dropped concurrently.
  686. @param[in] id tablespace ID
  687. @return the tablespace
  688. @retval NULL if missing */
  689. fil_space_t*
  690. fil_space_acquire_for_io(ulint id);
  691. /** Release a tablespace acquired with fil_space_acquire_for_io().
  692. @param[in,out] space tablespace to release */
  693. void
  694. fil_space_release_for_io(fil_space_t* space);
  695. /** Return the next fil_space_t.
  696. Once started, the caller must keep calling this until it returns NULL.
  697. fil_space_acquire() and fil_space_release() are invoked here which
  698. blocks a concurrent operation from dropping the tablespace.
  699. @param[in,out] prev_space Pointer to the previous fil_space_t.
  700. If NULL, use the first fil_space_t on fil_system->space_list.
  701. @return pointer to the next fil_space_t.
  702. @retval NULL if this was the last */
  703. fil_space_t*
  704. fil_space_next(
  705. fil_space_t* prev_space)
  706. MY_ATTRIBUTE((warn_unused_result));
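/* Illustrative sketch, not part of this header: why a caller that
starts iterating with fil_space_next() must continue until NULL is
returned. Each step releases the previous tablespace and acquires the
next one, so stopping early would leave one reference held. The types
and functions below (toy_space_t, toy_space_next, toy_count_spaces)
are hypothetical and use a plain singly linked list without locking. */

```cpp
#include <assert.h>
#include <stddef.h>

/* Hypothetical tablespace with a reference count and a list link. */
struct toy_space_t {
	unsigned	refs;
	toy_space_t*	next;
};

/* Acquire the successor of prev (or the head if prev == NULL) and
release prev.
@return the acquired element, or NULL at the end of the list */
static inline toy_space_t* toy_space_next(toy_space_t* head, toy_space_t* prev)
{
	toy_space_t*	s = (prev == NULL) ? head : prev->next;

	if (s != NULL) {
		s->refs++;
	}
	if (prev != NULL) {
		prev->refs--;
	}
	return s;
}

/* Visit every element; the loop runs until NULL so that no reference
remains held afterwards. */
static inline unsigned toy_count_spaces(toy_space_t* head)
{
	unsigned	n = 0;

	for (toy_space_t* s = toy_space_next(head, NULL);
	     s != NULL;
	     s = toy_space_next(head, s)) {
		n++;
	}
	return n;
}
```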
  707. /** Return the next fil_space_t from key rotation list.
  708. Once started, the caller must keep calling this until it returns NULL.
  709. fil_space_acquire() and fil_space_release() are invoked here which
  710. blocks a concurrent operation from dropping the tablespace.
  711. @param[in,out] prev_space Pointer to the previous fil_space_t.
  712. If NULL, use the first fil_space_t on fil_system->rotation_list.
  713. @return pointer to the next fil_space_t.
  714. @retval NULL if this was the last*/
  715. fil_space_t*
  716. fil_space_keyrotate_next(
  717. fil_space_t* prev_space)
  718. MY_ATTRIBUTE((warn_unused_result));
  719. /** Wrapper with reference-counting for a fil_space_t. */
  720. class FilSpace
  721. {
  722. public:
  723. /** Default constructor: Use this when reference counting
  724. is done outside this wrapper. */
  725. FilSpace() : m_space(NULL) {}
  726. /** Constructor: Look up the tablespace and increment the
  727. reference count if found.
  728. @param[in] space_id tablespace ID
  729. @param[in] silent whether not to display errors */
  730. explicit FilSpace(ulint space_id, bool silent = false)
  731. : m_space(fil_space_acquire_low(space_id, silent)) {}
  732. /** Assignment operator: This assumes that fil_space_acquire()
  733. has already been done for the fil_space_t. The caller must
  734. assign NULL if it calls fil_space_release().
  735. @param[in] space tablespace to assign */
  736. class FilSpace& operator=(fil_space_t* space)
  737. {
  738. /* fil_space_acquire() must have been invoked. */
  739. ut_ad(space == NULL || space->n_pending_ops > 0);
  740. m_space = space;
  741. return(*this);
  742. }
  743. /** Destructor - Decrement the reference count if a fil_space_t
  744. is still assigned. */
  745. ~FilSpace()
  746. {
  747. if (m_space != NULL) {
  748. fil_space_release(m_space);
  749. }
  750. }
  751. /** Implicit type conversion
  752. @return the wrapped object */
  753. operator const fil_space_t*() const
  754. {
  755. return(m_space);
  756. }
  757. /** Explicit type conversion
  758. @return the wrapped object */
  759. const fil_space_t* operator()() const
  760. {
  761. return(m_space);
  762. }
  763. private:
  764. /** The wrapped pointer */
  765. fil_space_t* m_space;
  766. };
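/* Illustrative sketch, not part of this header: the scope-bound
reference counting that FilSpace provides, shown with toy types so
that it compiles on its own. toy_space_t, toy_acquire(), toy_release()
and ToySpaceGuard are hypothetical; the real wrapper looks the
tablespace up by ID, which this sketch replaces with a direct
pointer. */

```cpp
#include <assert.h>
#include <stddef.h>

/* Hypothetical tablespace with a pending-operations count. */
struct toy_space_t {
	unsigned	n_pending_ops;
	bool		stopping;	/* about to be dropped or truncated */
};

/* @return s with its count incremented, or NULL if it cannot be used */
static inline toy_space_t* toy_acquire(toy_space_t* s)
{
	if (s == NULL || s->stopping) {
		return NULL;
	}
	s->n_pending_ops++;
	return s;
}

/* Give back a reference obtained from toy_acquire(). */
static inline void toy_release(toy_space_t* s)
{
	assert(s->n_pending_ops > 0);
	s->n_pending_ops--;
}

/* Acquire in the constructor, release in the destructor. */
class ToySpaceGuard {
public:
	explicit ToySpaceGuard(toy_space_t* s) : m_space(toy_acquire(s)) {}

	~ToySpaceGuard()
	{
		if (m_space != NULL) {
			toy_release(m_space);
		}
	}

	/** @return the wrapped object, or NULL if acquisition failed */
	const toy_space_t* operator()() const { return m_space; }

private:
	/* Copying would release the same reference twice. */
	ToySpaceGuard(const ToySpaceGuard&);
	ToySpaceGuard& operator=(const ToySpaceGuard&);

	toy_space_t*	m_space;
};
```

/* The guard keeps the count raised only for the lifetime of the local
object, so every early return from the enclosing scope releases the
reference. */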
  767. /********************************************************//**
  768. Creates the database directory for a table if it does not exist yet. */
  769. void
  770. fil_create_directory_for_tablename(
  771. /*===============================*/
  772. const char* name); /*!< in: name in the standard
  773. 'databasename/tablename' format */
  774. /********************************************************//**
  775. Recreates table indexes by applying
  776. TRUNCATE log record during recovery.
  777. @return DB_SUCCESS or error code */
  778. dberr_t
  779. fil_recreate_table(
  780. /*===============*/
  781. ulint space_id, /*!< in: space id */
  782. ulint format_flags, /*!< in: page format */
  783. ulint flags, /*!< in: tablespace flags */
  784. const char* name, /*!< in: table name */
  785. truncate_t& truncate); /*!< in/out: The information of
  786. TRUNCATE log record */
  787. /********************************************************//**
  788. Recreates the tablespace and table indexes by applying
  789. TRUNCATE log record during recovery.
  790. @return DB_SUCCESS or error code */
  791. dberr_t
  792. fil_recreate_tablespace(
  793. /*====================*/
  794. ulint space_id, /*!< in: space id */
  795. ulint format_flags, /*!< in: page format */
  796. ulint flags, /*!< in: tablespace flags */
  797. const char* name, /*!< in: table name */
  798. truncate_t& truncate, /*!< in/out: The information of
  799. TRUNCATE log record */
  800. lsn_t recv_lsn); /*!< in: the end LSN of
  801. the log record */
  802. /** Replay a file rename operation if possible.
  803. @param[in] space_id tablespace identifier
  804. @param[in] first_page_no first page number in the file
  805. @param[in] name old file name
  806. @param[in] new_name new file name
  807. @return whether the operation was successfully applied
  808. (the name did not exist, or new_name did not exist and
  809. name was successfully renamed to new_name) */
  810. bool
  811. fil_op_replay_rename(
  812. ulint space_id,
  813. ulint first_page_no,
  814. const char* name,
  815. const char* new_name)
  816. MY_ATTRIBUTE((warn_unused_result));
  817. /** Determine whether a table can be accessed in operations that are
  818. not (necessarily) protected by meta-data locks.
  819. (Rollback would generally be protected, but rollback of
  820. FOREIGN KEY CASCADE/SET NULL is not protected by meta-data locks
  821. but only by InnoDB table locks, which may be broken by TRUNCATE TABLE.)
  822. @param[in] table persistent table
  823. @return whether the table is accessible */
  824. bool
  825. fil_table_accessible(const dict_table_t* table)
  826. MY_ATTRIBUTE((warn_unused_result, nonnull));
  827. /** Deletes an IBD tablespace, either general or single-table.
  828. The tablespace must be cached in the memory cache. This will delete the
  829. datafile, fil_space_t & fil_node_t entries from the fil_system_t cache.
  830. @param[in] id Tablespace id
  831. @param[in] buf_remove Specify the action to take on the pages
  832. for this table in the buffer pool.
  833. @return DB_SUCCESS or error code */
  834. dberr_t
  835. fil_delete_tablespace(
  836. ulint id,
  837. buf_remove_t buf_remove);
  838. /** Truncate the tablespace to needed size.
  839. @param[in] space_id id of tablespace to truncate
  840. @param[in] size_in_pages truncate size.
  841. @return true if truncate was successful. */
  842. bool
  843. fil_truncate_tablespace(
  844. ulint space_id,
  845. ulint size_in_pages);
  846. /*******************************************************************//**
  847. Prepare for truncating a single-table tablespace. The tablespace
  848. must be cached in the memory cache.
  849. 1) Check pending operations on a tablespace;
  850. 2) Remove all insert buffer entries for the tablespace;
  851. @return DB_SUCCESS or error */
  852. dberr_t
  853. fil_prepare_for_truncate(
  854. /*=====================*/
  855. ulint id); /*!< in: space id */
  856. /** Reinitialize the original tablespace header with the same space id
  857. for single tablespace
  858. @param[in] id space id of the tablespace
  859. @param[in] size size in blocks
  860. @param[in] trx Transaction covering truncate */
  861. void
  862. fil_reinit_space_header(
  863. ulint id,
  864. ulint size,
  865. trx_t* trx);
  866. /*******************************************************************//**
  867. Closes a single-table tablespace. The tablespace must be cached in the
  868. memory cache. Free all pages used by the tablespace.
  869. @return DB_SUCCESS or error */
  870. dberr_t
  871. fil_close_tablespace(
  872. /*=================*/
  873. trx_t* trx, /*!< in/out: Transaction covering the close */
  874. ulint id); /*!< in: space id */
  875. /*******************************************************************//**
  876. Discards a single-table tablespace. The tablespace must be cached in the
  877. memory cache. Discarding is like deleting a tablespace, but
  878. 1. We do not drop the table from the data dictionary;
  879. 2. We remove all insert buffer entries for the tablespace immediately;
  880. in DROP TABLE they are only removed gradually in the background;
  881. 3. When the user does IMPORT TABLESPACE, the tablespace will have the
  882. same id as it originally had.
  883. 4. Free all the pages in use by the tablespace if rename=true.
  884. @return DB_SUCCESS or error */
  885. dberr_t
  886. fil_discard_tablespace(
  887. /*===================*/
  888. ulint id) /*!< in: space id */
  889. MY_ATTRIBUTE((warn_unused_result));
  890. /** Test if a tablespace file can be renamed to a new filepath by checking
  891. that the old filepath exists and the new filepath does not exist.
  892. @param[in] space_id tablespace id
  893. @param[in] old_path old filepath
  894. @param[in] new_path new filepath
  895. @param[in] is_discarded whether the tablespace is discarded
  896. @return innodb error code */
  897. dberr_t
  898. fil_rename_tablespace_check(
  899. ulint space_id,
  900. const char* old_path,
  901. const char* new_path,
  902. bool is_discarded);
  903. /** Rename a single-table tablespace.
  904. The tablespace must exist in the memory cache.
  905. @param[in] id tablespace identifier
  906. @param[in] old_path old file name
  907. @param[in] new_name new table name in the
  908. databasename/tablename format
  909. @param[in] new_path_in new file name,
  910. or NULL if it is located in the normal data directory
  911. @return true if success */
  912. bool
  913. fil_rename_tablespace(
  914. ulint id,
  915. const char* old_path,
  916. const char* new_name,
  917. const char* new_path_in);
  918. /*******************************************************************//**
  919. Allocates and builds a file name from a path, a table or tablespace name
  920. and a suffix. The string must be freed by caller with ut_free().
  921. @param[in] path NULL or the directory path or the full path and filename.
  922. @param[in] name NULL if path is full, or Table/Tablespace name
  923. @param[in] suffix NULL or the file extension to use.
  924. @return own: file name */
  925. char*
  926. fil_make_filepath(
  927. const char* path,
  928. const char* name,
  929. ib_extention suffix,
  930. bool strip_name);
  931. /** Create a tablespace file.
  932. @param[in] space_id Tablespace ID
  933. @param[in] name Tablespace name in dbname/tablename format.
  934. @param[in] path Path and filename of the datafile to create.
  935. @param[in] flags Tablespace flags
  936. @param[in] size Initial size of the tablespace file in pages,
  937. must be >= FIL_IBD_FILE_INITIAL_SIZE
  938. @param[in] mode MariaDB encryption mode
  939. @param[in] key_id MariaDB encryption key_id
  940. @return DB_SUCCESS or error code */
  941. dberr_t
  942. fil_ibd_create(
  943. ulint space_id,
  944. const char* name,
  945. const char* path,
  946. ulint flags,
  947. ulint size,
  948. fil_encryption_t mode,
  949. uint32_t key_id)
  950. MY_ATTRIBUTE((nonnull(2), warn_unused_result));
  951. /** Try to adjust FSP_SPACE_FLAGS if they differ from the expectations.
  952. (Typically when upgrading from MariaDB 10.1.0..10.1.20.)
  953. @param[in] space_id tablespace ID
  954. @param[in] flags desired tablespace flags */
  955. UNIV_INTERN
  956. void
  957. fsp_flags_try_adjust(ulint space_id, ulint flags);
  958. /********************************************************************//**
  959. Tries to open a single-table tablespace and optionally checks the space id is
  960. right in it. If it does not succeed, prints an error message to the .err log. This
  961. function is used to open a tablespace when we start up mysqld, and also in
  962. IMPORT TABLESPACE.
  963. NOTE that we assume this operation is used either at the database startup
  964. or under the protection of the dictionary mutex, so that two users cannot
  965. race here. This operation does not leave the file associated with the
  966. tablespace open, but closes it after we have looked at the space id in it.
  967. If the validate boolean is set, we read the first page of the file and
  968. check that the space id in the file is what we expect. We assume that
  969. this function runs much faster if no check is made, since accessing the
  970. file inode probably is much faster (the OS caches them) than accessing
  971. the first page of the file. This boolean may be initially false, but if
  972. a remote tablespace is found it will be changed to true.
  973. If the fix_dict boolean is set, then it is safe to use an internal SQL
  974. statement to update the dictionary tables if they are incorrect.
  975. @param[in] validate true if we should validate the tablespace
  976. @param[in] fix_dict true if the dictionary is available to be fixed
  977. @param[in] purpose FIL_TYPE_TABLESPACE or FIL_TYPE_TEMPORARY
  978. @param[in] id tablespace ID
  979. @param[in] flags expected FSP_SPACE_FLAGS
  980. @param[in] tablename tablespace name of the datafile
  981. If file-per-table, it is the table name in the databasename/tablename format
  982. @param[in] path_in expected filepath, usually read from dictionary
  983. @return DB_SUCCESS or error code */
  984. dberr_t
  985. fil_ibd_open(
  986. bool validate,
  987. bool fix_dict,
  988. fil_type_t purpose,
  989. ulint id,
  990. ulint flags,
  991. const char* tablename,
  992. const char* path_in)
  993. MY_ATTRIBUTE((warn_unused_result));
  994. enum fil_load_status {
  995. /** The tablespace file(s) were found and valid. */
  996. FIL_LOAD_OK,
  997. /** The name no longer matches space_id */
  998. FIL_LOAD_ID_CHANGED,
  999. /** The file(s) were not found */
  1000. FIL_LOAD_NOT_FOUND,
  1001. /** The file(s) were not valid */
  1002. FIL_LOAD_INVALID
  1003. };
  1004. /** Open a single-file tablespace and add it to the InnoDB data structures.
  1005. @param[in] space_id tablespace ID
  1006. @param[in] filename path/to/databasename/tablename.ibd
  1007. @param[out] space the tablespace, or NULL on error
  1008. @return status of the operation */
  1009. enum fil_load_status
  1010. fil_ibd_load(
  1011. ulint space_id,
  1012. const char* filename,
  1013. fil_space_t*& space)
  1014. MY_ATTRIBUTE((warn_unused_result));
  1015. /***********************************************************************//**
  1016. A fault-tolerant function that tries to read the next file name in the
  1017. directory. We retry 100 times if os_file_readdir_next_file() returns -1. The
  1018. idea is to read as much good data as we can and jump over bad data.
  1019. @return 0 if ok, -1 if error even after the retries, 1 if at the end
  1020. of the directory */
  1021. int
  1022. fil_file_readdir_next_file(
  1023. /*=======================*/
  1024. dberr_t* err, /*!< out: this is set to DB_ERROR if an error
  1025. was encountered, otherwise not changed */
  1026. const char* dirname,/*!< in: directory name or path */
  1027. os_file_dir_t dir, /*!< in: directory stream */
  1028. os_file_stat_t* info); /*!< in/out: buffer where the
  1029. info is returned */
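The retry behaviour described for fil_file_readdir_next_file() can be sketched as below. This is a minimal illustration, not the InnoDB implementation: sketch_readdir_with_retry, the read_next callable and the had_error flag are hypothetical stand-ins for the real function, os_file_readdir_next_file() and the dberr_t out-parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical stand-in for fil_file_readdir_next_file().
// read_next plays the role of os_file_readdir_next_file() and is assumed
// to return 0 on success, 1 at the end of the directory, -1 on error.
inline int sketch_readdir_with_retry(
	const std::function<int()>&	read_next,
	bool*				had_error)
{
	assert(had_error != NULL);

	for (int i = 0; i < 100; i++) {
		const int ret = read_next();

		if (ret != -1) {
			// Either a good entry (0) or end of directory (1).
			return ret;
		}

		// Remember that an error was seen, then skip the bad entry.
		*had_error = true;
	}

	// Still failing after 100 attempts.
	return -1;
}
```

Note that the error flag stays set even when a later attempt succeeds, which matches the documented "otherwise not changed" contract of the err parameter.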
  1030. /*******************************************************************//**
  1031. Returns true if a matching tablespace exists in the InnoDB tablespace memory
  1032. cache. Note that if we have not done a crash recovery at the database startup,
  1033. there may be many tablespaces which are not yet in the memory cache.
  1034. @return true if a matching tablespace exists in the memory cache */
  1035. bool
  1036. fil_space_for_table_exists_in_mem(
  1037. /*==============================*/
  1038. ulint id, /*!< in: space id */
  1039. const char* name, /*!< in: table name in the standard
  1040. 'databasename/tablename' format */
  1041. bool print_error_if_does_not_exist,
  1042. /*!< in: print detailed error
  1043. information to the .err log if a
  1044. matching tablespace is not found from
  1045. memory */
  1046. bool adjust_space, /*!< in: whether to adjust space id
  1047. when find table space mismatch */
  1048. mem_heap_t* heap, /*!< in: heap memory */
  1049. table_id_t table_id, /*!< in: table id */
  1050. ulint table_flags); /*!< in: table flags */
  1051. /** Try to extend a tablespace if it is smaller than the specified size.
  1052. @param[in,out] space tablespace
  1053. @param[in] size desired size in pages
  1054. @return whether the tablespace is at least as big as requested */
  1055. bool
  1056. fil_space_extend(
  1057. fil_space_t* space,
  1058. ulint size);
  1059. /*******************************************************************//**
  1060. Tries to reserve free extents in a file space.
  1061. @return true if succeed */
  1062. bool
  1063. fil_space_reserve_free_extents(
  1064. /*===========================*/
  1065. ulint id, /*!< in: space id */
  1066. ulint n_free_now, /*!< in: number of free extents now */
  1067. ulint n_to_reserve); /*!< in: how many one wants to reserve */
  1068. /*******************************************************************//**
  1069. Releases free extents in a file space. */
  1070. void
  1071. fil_space_release_free_extents(
  1072. /*===========================*/
  1073. ulint id, /*!< in: space id */
  1074. ulint n_reserved); /*!< in: how many one reserved */
  1075. /*******************************************************************//**
  1076. Gets the number of reserved extents. If the database is silent, this number
  1077. should be zero. */
  1078. ulint
  1079. fil_space_get_n_reserved_extents(
  1080. /*=============================*/
  1081. ulint id); /*!< in: space id */
  1082. /** Reads or writes data. This operation could be asynchronous (aio).
  1083. @param[in] type IO context
  1084. @param[in] sync true if synchronous aio is desired
  1085. @param[in] page_id page id
  1086. @param[in] page_size page size
  1087. @param[in] byte_offset remainder of offset in bytes; in aio this
  1088. must be divisible by the OS block size
  1089. @param[in] len how many bytes to read or write; this must
  1090. not cross a file boundary; in aio this must
  1091. be a block size multiple
  1092. @param[in,out] buf buffer where to store read data or from where
  1093. to write; in aio this must be appropriately
  1094. aligned
  1095. @param[in] message message for aio handler if non-sync aio
  1096. used, else ignored
  1097. @return DB_SUCCESS, DB_TABLESPACE_DELETED or DB_TABLESPACE_TRUNCATED
  1098. if we are trying to do i/o on a tablespace which does not exist */
  1099. dberr_t
  1100. fil_io(
  1101. const IORequest& type,
  1102. bool sync,
  1103. const page_id_t& page_id,
  1104. const page_size_t& page_size,
  1105. ulint byte_offset,
  1106. ulint len,
  1107. void* buf,
  1108. void* message);
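The aio constraints listed for byte_offset, len and buf can be expressed as a small precondition check. The sketch below is an assumption-laden illustration: sketch_aio_args_ok is a hypothetical helper, and it assumes that "appropriately aligned" means aligned to the same block size that byte_offset and len must be multiples of.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: check the documented aio preconditions of fil_io().
// Assumption: the buffer must be aligned to block_size as well.
inline bool sketch_aio_args_ok(
	std::size_t	byte_offset,
	std::size_t	len,
	const void*	buf,
	std::size_t	block_size)
{
	assert(block_size > 0);

	return byte_offset % block_size == 0
		&& len % block_size == 0
		&& reinterpret_cast<std::uintptr_t>(buf) % block_size == 0;
}
```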
  1109. /**********************************************************************//**
  1110. Waits for an aio operation to complete. This function is used to write the
  1111. handler for completed requests. The aio array of pending requests is divided
  1112. into segments (see os0file.cc for more info). The thread specifies which
  1113. segment it wants to wait for. */
  1114. void
  1115. fil_aio_wait(
  1116. /*=========*/
  1117. ulint segment); /*!< in: the number of the segment in the aio
  1118. array to wait for */
  1119. /**********************************************************************//**
  1120. Flushes to disk possible writes cached by the OS. If the space does not exist
  1121. or is being dropped, does not do anything. */
  1122. void
  1123. fil_flush(
  1124. /*======*/
  1125. ulint space_id); /*!< in: file space id (this can be a group of
  1126. log files or a tablespace of the database) */
  1127. /** Flush a tablespace.
  1128. @param[in,out] space tablespace to flush */
  1129. void
  1130. fil_flush(fil_space_t* space);
  1131. /** Flush to disk the writes in file spaces of the given type
  1132. possibly cached by the OS.
  1133. @param[in] purpose FIL_TYPE_TABLESPACE or FIL_TYPE_LOG */
  1134. void
  1135. fil_flush_file_spaces(
  1136. fil_type_t purpose);
  1137. /******************************************************************//**
  1138. Checks the consistency of the tablespace cache.
  1139. @return true if ok */
  1140. bool
  1141. fil_validate(void);
  1142. /*==============*/
  1143. /********************************************************************//**
  1144. Returns true if file address is undefined.
  1145. @return true if undefined */
  1146. bool
  1147. fil_addr_is_null(
  1148. /*=============*/
  1149. fil_addr_t addr); /*!< in: address */
  1150. /********************************************************************//**
  1151. Get the predecessor of a file page.
  1152. @return FIL_PAGE_PREV */
  1153. ulint
  1154. fil_page_get_prev(
  1155. /*==============*/
  1156. const byte* page); /*!< in: file page */
  1157. /********************************************************************//**
  1158. Get the successor of a file page.
  1159. @return FIL_PAGE_NEXT */
  1160. ulint
  1161. fil_page_get_next(
  1162. /*==============*/
  1163. const byte* page); /*!< in: file page */
  1164. /*********************************************************************//**
  1165. Sets the file page type. */
  1166. void
  1167. fil_page_set_type(
  1168. /*==============*/
  1169. byte* page, /*!< in/out: file page */
  1170. ulint type); /*!< in: type */
  1171. /** Reset the page type.
  1172. Data files created before MySQL 5.1 may contain garbage in FIL_PAGE_TYPE.
  1173. In MySQL 3.23.53, only undo log pages and index pages were tagged.
  1174. Any other pages were written with uninitialized bytes in FIL_PAGE_TYPE.
  1175. @param[in] page_id page number
  1176. @param[in,out] page page with invalid FIL_PAGE_TYPE
  1177. @param[in] type expected page type
  1178. @param[in,out] mtr mini-transaction */
  1179. void
  1180. fil_page_reset_type(
  1181. const page_id_t& page_id,
  1182. byte* page,
  1183. ulint type,
  1184. mtr_t* mtr);
  1185. /** Get the file page type.
  1186. @param[in] page file page
  1187. @return page type */
  1188. inline
  1189. ulint
  1190. fil_page_get_type(
  1191. const byte* page)
  1192. {
  1193. return(mach_read_from_2(page + FIL_PAGE_TYPE));
  1194. }
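To make the read in fil_page_get_type() concrete, here is a self-contained sketch. The names prefixed with sketch_ are hypothetical. It assumes that FIL_PAGE_TYPE is byte offset 24 in the page header and that mach_read_from_2() reads a 2-byte big-endian integer; neither definition appears in this header.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: FIL_PAGE_TYPE is at byte offset 24 of the page header.
static const std::size_t SKETCH_FIL_PAGE_TYPE = 24;

// Stand-in for mach_read_from_2(): most significant byte first.
inline unsigned long sketch_read_from_2(const unsigned char* b)
{
	assert(b != NULL);
	return (static_cast<unsigned long>(b[0]) << 8)
		| static_cast<unsigned long>(b[1]);
}

// Stand-in for fil_page_get_type().
inline unsigned long sketch_page_get_type(const unsigned char* page)
{
	return sketch_read_from_2(page + SKETCH_FIL_PAGE_TYPE);
}
```

With bytes 0x45 and 0xBF stored at offsets 24 and 25, the sketch returns 0x45BF (17855 in decimal).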
  1195. /** Check (and if needed, reset) the page type.
  1196. Data files created before MySQL 5.1 may contain
  1197. garbage in the FIL_PAGE_TYPE field.
  1198. In MySQL 3.23.53, only undo log pages and index pages were tagged.
  1199. Any other pages were written with uninitialized bytes in FIL_PAGE_TYPE.
  1200. @param[in] page_id page number
  1201. @param[in,out] page page with possibly invalid FIL_PAGE_TYPE
  1202. @param[in] type expected page type
  1203. @param[in,out] mtr mini-transaction */
  1204. inline
  1205. void
  1206. fil_page_check_type(
  1207. const page_id_t& page_id,
  1208. byte* page,
  1209. ulint type,
  1210. mtr_t* mtr)
  1211. {
  1212. ulint page_type = fil_page_get_type(page);
  1213. if (page_type != type) {
  1214. fil_page_reset_type(page_id, page, type, mtr);
  1215. }
  1216. }
  1217. /** Check (and if needed, reset) the page type.
  1218. Data files created before MySQL 5.1 may contain
  1219. garbage in the FIL_PAGE_TYPE field.
  1220. In MySQL 3.23.53, only undo log pages and index pages were tagged.
  1221. Any other pages were written with uninitialized bytes in FIL_PAGE_TYPE.
  1222. @param[in,out] block block with possibly invalid FIL_PAGE_TYPE
  1223. @param[in] type expected page type
  1224. @param[in,out] mtr mini-transaction */
  1225. #define fil_block_check_type(block, type, mtr) \
  1226. fil_page_check_type(block->page.id, block->frame, type, mtr)
  1227. #ifdef UNIV_DEBUG
  1228. /** Increase the redo skipped count of a tablespace.
  1229. @param[in] id space id */
  1230. void
  1231. fil_space_inc_redo_skipped_count(
  1232. ulint id);
  1233. /** Decrease the redo skipped count of a tablespace.
  1234. @param[in] id space id */
  1235. void
  1236. fil_space_dec_redo_skipped_count(
  1237. ulint id);
  1238. #endif
  1239. /********************************************************************//**
  1240. Delete the tablespace file and any related files like .cfg.
  1241. This should not be called for temporary tables. */
  1242. void
  1243. fil_delete_file(
  1244. /*============*/
  1245. const char* path); /*!< in: filepath of the ibd tablespace */
  1246. /** Callback functor. */
  1247. struct PageCallback {
  1248. /** Default constructor */
  1249. PageCallback()
  1250. :
  1251. m_page_size(0, 0, false),
  1252. m_filepath() UNIV_NOTHROW {}
  1253. virtual ~PageCallback() UNIV_NOTHROW {}
  1254. /** Called for page 0 in the tablespace file at the start.
  1255. @param file_size size of the file in bytes
  1256. @param block contents of the first page in the tablespace file
  1257. @retval DB_SUCCESS or error code. */
  1258. virtual dberr_t init(
  1259. os_offset_t file_size,
  1260. const buf_block_t* block) UNIV_NOTHROW = 0;
  1261. /** Called for every page in the tablespace. If the page was not
  1262. updated then its state must be set to BUF_PAGE_NOT_USED. For
  1263. compressed tables the page descriptor memory will be at offset:
  1264. block->frame + UNIV_PAGE_SIZE;
  1265. @param offset physical offset within the file
  1266. @param block block read from file, note it is not from the buffer pool
  1267. @retval DB_SUCCESS or error code. */
  1268. virtual dberr_t operator()(
  1269. os_offset_t offset,
  1270. buf_block_t* block) UNIV_NOTHROW = 0;
  1271. /** Set the name of the physical file and the file handle that is used
  1272. to open it for the file that is being iterated over.
  1273. @param filename the name of the tablespace file
  1274. @param file OS file handle */
  1275. void set_file(const char* filename, pfs_os_file_t file) UNIV_NOTHROW
  1276. {
  1277. m_file = file;
  1278. m_filepath = filename;
  1279. }
  1280. /**
  1281. @return the space id of the tablespace */
  1282. virtual ulint get_space_id() const UNIV_NOTHROW = 0;
  1283. /**
  1284. @return the space flags of the tablespace being iterated over */
  1285. virtual ulint get_space_flags() const UNIV_NOTHROW = 0;
  1286. /** Get the tablespace page size.
  1287. @return the tablespace page size */
  1288. const page_size_t& get_page_size() const
  1289. {
  1290. return(m_page_size);
  1291. }
  1292. /** The tablespace page size. */
  1293. page_size_t m_page_size;
  1294. /** File handle to the tablespace */
  1295. pfs_os_file_t m_file;
  1296. /** Physical file path. */
  1297. const char* m_filepath;
  1298. protected:
  1299. // Disable copying
  1300. PageCallback(const PageCallback&);
  1301. PageCallback& operator=(const PageCallback&);
  1302. };
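The functor protocol above (init() once for page 0, then operator() per page) can be illustrated with simplified stand-in types. Everything in this sketch is hypothetical: SketchBlock, SketchCallback, CountingCallback and sketch_iterate() only mimic the shape of PageCallback and a driver such as fil_tablespace_iterate(). The real driver reads pages from the file in batches, and the sketch does not claim to reproduce its exact call order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::uint64_t sketch_offset_t;
enum sketch_err_t { SKETCH_SUCCESS, SKETCH_ERROR };

// Stand-in for buf_block_t: just the page bytes.
struct SketchBlock {
	std::vector<unsigned char> frame;
};

// Stand-in for PageCallback: same two-phase interface.
struct SketchCallback {
	virtual ~SketchCallback() {}
	virtual sketch_err_t init(
		sketch_offset_t file_size, const SketchBlock* first) = 0;
	virtual sketch_err_t operator()(
		sketch_offset_t offset, SketchBlock* block) = 0;
};

// Example callback that only counts the pages it is shown.
struct CountingCallback : SketchCallback {
	sketch_offset_t	file_size;
	sketch_offset_t	last_offset;
	std::size_t	n_pages;

	CountingCallback() : file_size(0), last_offset(0), n_pages(0) {}

	sketch_err_t init(sketch_offset_t size, const SketchBlock*)
	{
		file_size = size;
		return SKETCH_SUCCESS;
	}

	sketch_err_t operator()(sketch_offset_t offset, SketchBlock*)
	{
		last_offset = offset;
		++n_pages;
		return SKETCH_SUCCESS;
	}
};

// Hypothetical driver: call init() with the first page, then visit each
// page in order, stopping at the first error.
inline sketch_err_t sketch_iterate(
	std::vector<SketchBlock>&	pages,
	std::size_t			page_size,
	SketchCallback&			callback)
{
	assert(page_size > 0);

	if (pages.empty()) {
		return SKETCH_ERROR;
	}

	sketch_err_t err = callback.init(pages.size() * page_size, &pages[0]);

	for (std::size_t i = 0; err == SKETCH_SUCCESS && i < pages.size(); ++i) {
		err = callback(i * page_size, &pages[i]);
	}

	return err;
}
```

Keeping the per-page logic behind a virtual interface lets one iteration routine serve different page-rewriting tasks without knowing what each task does.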
  1303. /********************************************************************//**
  1304. Iterate over all the pages in the tablespace.
  1305. @param table the table definition in the server
  1306. @param n_io_buffers number of blocks to read and write together
  1307. @param callback functor that will do the page updates
  1308. @return DB_SUCCESS or error code */
  1309. dberr_t
  1310. fil_tablespace_iterate(
  1311. /*===================*/
  1312. dict_table_t* table,
  1313. ulint n_io_buffers,
  1314. PageCallback& callback)
  1315. MY_ATTRIBUTE((warn_unused_result));
  1316. /********************************************************************//**
  1317. Looks for a pre-existing fil_space_t with the given tablespace ID
  1318. and, if found, returns the name and filepath in newly allocated buffers that the caller must free.
  1319. @param[in] space_id The tablespace ID to search for.
  1320. @param[out] name Name of the tablespace found.
  1321. @param[out] filepath The filepath of the first datafile for the tablespace found.
  1322. @return true if tablespace is found, false if not. */
  1323. bool
  1324. fil_space_read_name_and_filepath(
  1325. ulint space_id,
  1326. char** name,
  1327. char** filepath);
  1328. /** Convert a file name to a tablespace name.
  1329. @param[in] filename directory/databasename/tablename.ibd
  1330. @return database/tablename string, to be freed with ut_free() */
  1331. char*
  1332. fil_path_to_space_name(
  1333. const char* filename);
  1334. /** Returns the space ID based on the tablespace name.
  1335. The tablespace must be found in the tablespace memory cache.
  1336. This call is made from external to this module, so the mutex is not owned.
  1337. @param[in] tablespace Tablespace name
  1338. @return space ID if tablespace found, ULINT_UNDEFINED if not found. */
  1339. ulint
  1340. fil_space_get_id_by_name(
  1341. const char* tablespace);
  1342. /**
  1343. Iterate over all the spaces in the space list and fetch the
  1344. tablespace names. It will return a copy of the name that must be
  1345. freed by the caller using: delete[].
  1346. @return DB_SUCCESS if all OK. */
  1347. dberr_t
  1348. fil_get_space_names(
  1349. /*================*/
  1350. space_name_list_t& space_name_list)
  1351. /*!< in/out: Vector for collecting the names. */
  1352. MY_ATTRIBUTE((warn_unused_result));
  1353. /** Generate redo log for swapping two .ibd files
  1354. @param[in] old_table old table
  1355. @param[in] new_table new table
  1356. @param[in] tmp_name temporary table name
  1357. @param[in,out] mtr mini-transaction
  1358. @return innodb error code */
  1359. dberr_t
  1360. fil_mtr_rename_log(
  1361. const dict_table_t* old_table,
  1362. const dict_table_t* new_table,
  1363. const char* tmp_name,
  1364. mtr_t* mtr)
  1365. MY_ATTRIBUTE((nonnull, warn_unused_result));
  1366. /** Acquire the fil_system mutex. */
  1367. #define fil_system_enter() mutex_enter(&fil_system->mutex)
  1368. /** Release the fil_system mutex. */
  1369. #define fil_system_exit() mutex_exit(&fil_system->mutex)
  1370. /*******************************************************************//**
  1371. Returns the table space by a given id, NULL if not found. */
  1372. fil_space_t*
  1373. fil_space_get_by_id(
  1374. /*================*/
  1375. ulint id); /*!< in: space id */
  1376. /*******************************************************************//**
  1377. Note that a tablespace has been modified by redo log.
  1378. @param[in,out] space tablespace */
  1379. void
  1380. fil_names_dirty(
  1381. fil_space_t* space);
  1382. /** Write MLOG_FILE_NAME records when a non-predefined persistent
  1383. tablespace was modified for the first time since the latest
  1384. fil_names_clear().
  1385. @param[in,out] space tablespace
  1386. @param[in,out] mtr mini-transaction */
  1387. void
  1388. fil_names_dirty_and_write(
  1389. fil_space_t* space,
  1390. mtr_t* mtr);
  1391. /** Write MLOG_FILE_NAME records if a persistent tablespace was modified
  1392. for the first time since the latest fil_names_clear().
  1393. @param[in,out] space tablespace
  1394. @param[in,out] mtr mini-transaction
  1395. @return whether any MLOG_FILE_NAME record was written */
  1396. inline MY_ATTRIBUTE((warn_unused_result))
  1397. bool
  1398. fil_names_write_if_was_clean(
  1399. fil_space_t* space,
  1400. mtr_t* mtr)
  1401. {
  1402. ut_ad(log_mutex_own());
  1403. if (space == NULL) {
  1404. return(false);
  1405. }
  1406. const bool was_clean = space->max_lsn == 0;
  1407. ut_ad(space->max_lsn <= log_sys->lsn);
  1408. space->max_lsn = log_sys->lsn;
  1409. if (was_clean) {
  1410. fil_names_dirty_and_write(space, mtr);
  1411. }
  1412. return(was_clean);
  1413. }
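The "write only on the first modification" logic of fil_names_write_if_was_clean() can be isolated in a small sketch. SketchSpace, sketch_lsn_t and sketch_write_if_was_clean() are hypothetical stand-ins. The current LSN is passed in as a parameter instead of being read from log_sys, and a counter replaces the call to fil_names_dirty_and_write().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::uint64_t sketch_lsn_t;

// Stand-in for the relevant part of fil_space_t.
struct SketchSpace {
	sketch_lsn_t	max_lsn;	// 0 means "clean since the last clear"
	int		names_written;	// counts simulated MLOG_FILE_NAME writes
};

// Stand-in for fil_names_write_if_was_clean().
inline bool sketch_write_if_was_clean(
	SketchSpace*	space,
	sketch_lsn_t	current_lsn)
{
	if (space == NULL) {
		return false;
	}

	const bool was_clean = space->max_lsn == 0;

	assert(space->max_lsn <= current_lsn);
	space->max_lsn = current_lsn;

	if (was_clean) {
		// Stand-in for fil_names_dirty_and_write(space, mtr).
		++space->names_written;
	}

	return was_clean;
}
```

Only the first call after the space was clean reports true; later calls just advance max_lsn.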
  1414. extern volatile bool recv_recovery_on;
  1415. /** During crash recovery, open a tablespace if it had not been opened
  1416. yet, to get valid size and flags.
  1417. @param[in,out] space tablespace */
  1418. inline
  1419. void
  1420. fil_space_open_if_needed(
  1421. fil_space_t* space)
  1422. {
  1423. ut_ad(recv_recovery_on);
  1424. if (space->size == 0) {
  1425. /* Initially, size and flags will be set to 0,
  1426. until the files are opened for the first time.
  1427. fil_space_get_size() will open the file
  1428. and adjust the size and flags. */
  1429. #ifdef UNIV_DEBUG
  1430. ulint size =
  1431. #endif /* UNIV_DEBUG */
  1432. fil_space_get_size(space->id);
  1433. ut_ad(size == space->size);
  1434. }
  1435. }
  1436. /** On a log checkpoint, reset fil_names_dirty_and_write() flags
  1437. and write out MLOG_FILE_NAME and MLOG_CHECKPOINT if needed.
  1438. @param[in] lsn checkpoint LSN
  1439. @param[in] do_write whether to always write MLOG_CHECKPOINT
  1440. @return whether anything was written to the redo log
  1441. @retval false if no flags were set and nothing written
  1442. @retval true if anything was written to the redo log */
  1443. bool
  1444. fil_names_clear(
  1445. lsn_t lsn,
  1446. bool do_write);
  1447. #ifdef UNIV_ENABLE_UNIT_TEST_MAKE_FILEPATH
  1448. void test_make_filepath();
  1449. #endif /* UNIV_ENABLE_UNIT_TEST_MAKE_FILEPATH */
  1450. /** Determine the block size of the data file.
  1451. @param[in] space tablespace
  1452. @param[in] offset page number
  1453. @return block size */
  1454. UNIV_INTERN
  1455. ulint
  1456. fil_space_get_block_size(const fil_space_t* space, unsigned offset);
  1457. #include "fil0fil.ic"
  1458. #endif /* UNIV_INNOCHECKSUM */
  1459. #endif /* fil0fil_h */