/******************************************************
Compressed page interface

(c) 2005 Innobase Oy

Created June 2005 by Marko Makela
*******************************************************/

#define THIS_MODULE
#include "page0zip.h"
#ifdef UNIV_NONINL
# include "page0zip.ic"
#endif
#undef THIS_MODULE
#include "page0page.h"
#include "mtr0log.h"
#include "ut0sort.h"
#include "dict0boot.h"
#include "dict0dict.h"
#include "btr0sea.h"
#include "btr0cur.h"
#include "page0types.h"
#include "lock0lock.h"
#include "log0recv.h"
#include "zlib.h"
#include "buf0lru.h"

/** Statistics on compression, indexed by page_zip_des_t::ssize - 1 */
UNIV_INTERN page_zip_stat_t page_zip_stat[PAGE_ZIP_NUM_SSIZE - 1];

/* Please refer to ../include/page0zip.ic for a description of the
compressed page format. */

/* The infimum and supremum records are omitted from the compressed page.
On compress, we compare that the records are there, and on uncompress we
restore the records. */
static const byte infimum_extra[] = {
	0x01,			/* info_bits=0, n_owned=1 */
	0x00, 0x02		/* heap_no=0, status=2 */
	/* ?, ? */		/* next=(first user rec, or supremum) */
};
static const byte infimum_data[] = {
	0x69, 0x6e, 0x66, 0x69,
	0x6d, 0x75, 0x6d, 0x00	/* "infimum\0" */
};
static const byte supremum_extra_data[] = {
	/* 0x0?, */		/* info_bits=0, n_owned=1..8 */
	0x00, 0x0b,		/* heap_no=1, status=3 */
	0x00, 0x00,		/* next=0 */
	0x73, 0x75, 0x70, 0x72,
	0x65, 0x6d, 0x75, 0x6d	/* "supremum" */
};

/** Assert that a block of memory is filled with zero bytes.
Compare at most sizeof(field_ref_zero) bytes. */
#define ASSERT_ZERO(b, s) \
	ut_ad(!memcmp(b, field_ref_zero, ut_min(s, sizeof field_ref_zero)))
/** Assert that a BLOB pointer is filled with zero bytes. */
#define ASSERT_ZERO_BLOB(b) \
	ut_ad(!memcmp(b, field_ref_zero, sizeof field_ref_zero))

/* Enable some extra debugging output.  This code can be enabled
independently of any UNIV_ debugging conditions. */
#if defined UNIV_DEBUG || defined UNIV_ZIP_DEBUG
# include <stdarg.h>
__attribute__((format (printf, 1, 2)))
/**************************************************************************
Report a failure to decompress or compress. */
static
int
page_zip_fail_func(
/*===============*/
				/* out: number of characters printed */
	const char*	fmt,	/* in: printf(3) format string */
	...)			/* in: arguments corresponding to fmt */
{
	int	res;
	va_list	ap;

	va_start(ap, fmt);
	res = vfprintf(stderr, fmt, ap);
	va_end(ap);

	return(res);
}
# define page_zip_fail(fmt_args) page_zip_fail_func fmt_args
#else /* UNIV_DEBUG || UNIV_ZIP_DEBUG */
# define page_zip_fail(fmt_args) /* empty */
#endif /* UNIV_DEBUG || UNIV_ZIP_DEBUG */
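/* Because page_zip_fail() expands to the bare function name, a caller must
wrap the whole printf() argument list in an extra pair of parentheses, as
in this sketch (the message text is only illustrative):

	page_zip_fail(("page_zip_fail example: slot %lu\n", (ulong) slot));

When neither UNIV_DEBUG nor UNIV_ZIP_DEBUG is defined, the macro expands
to nothing and the parenthesized argument list disappears with it. */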
/**************************************************************************
Determine the guaranteed free space on an empty page. */
UNIV_INTERN
ulint
page_zip_empty_size(
/*================*/
				/* out: minimum payload size on the page */
	ulint	n_fields,	/* in: number of columns in the index */
	ulint	zip_size)	/* in: compressed page size in bytes */
{
	lint	size = zip_size
		/* subtract the page header and the longest
		uncompressed data needed for one record */
		- (PAGE_DATA
		   + PAGE_ZIP_DIR_SLOT_SIZE
		   + DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN
		   + 1/* encoded heap_no==2 in page_zip_write_rec() */
		   + 1/* end of modification log */
		   - REC_N_NEW_EXTRA_BYTES/* omitted bytes */)
		/* subtract the space for page_zip_fields_encode() */
		- compressBound(2 * (n_fields + 1));
	return(size > 0 ? (ulint) size : 0);
}

/*****************************************************************
Gets the size of the compressed page trailer (the dense page directory),
including deleted records (the free list). */
UNIV_INLINE
ulint
page_zip_dir_size(
/*==============*/
					/* out: length of dense page
					directory, in bytes */
	const page_zip_des_t*	page_zip)	/* in: compressed page */
{
	/* Exclude the page infimum and supremum from the record count. */
	ulint	size = PAGE_ZIP_DIR_SLOT_SIZE
		* (page_dir_get_n_heap(page_zip->data)
		   - PAGE_HEAP_NO_USER_LOW);
	return(size);
}

/*****************************************************************
Gets the size of the compressed page trailer (the dense page directory),
only including user records (excluding the free list). */
UNIV_INLINE
ulint
page_zip_dir_user_size(
/*===================*/
					/* out: length of dense page
					directory comprising existing
					records, in bytes */
	const page_zip_des_t*	page_zip)	/* in: compressed page */
{
	ulint	size = PAGE_ZIP_DIR_SLOT_SIZE
		* page_get_n_recs(page_zip->data);
	ut_ad(size <= page_zip_dir_size(page_zip));
	return(size);
}

/*****************************************************************
Find the slot of the given record in the dense page directory. */
UNIV_INLINE
byte*
page_zip_dir_find_low(
/*==================*/
			/* out: dense directory slot,
			or NULL if record not found */
	byte*	slot,	/* in: start of records */
	byte*	end,	/* in: end of records */
	ulint	offset)	/* in: offset of user record */
{
	ut_ad(slot <= end);

	for (; slot < end; slot += PAGE_ZIP_DIR_SLOT_SIZE) {
		if ((mach_read_from_2(slot) & PAGE_ZIP_DIR_SLOT_MASK)
		    == offset) {
			return(slot);
		}
	}

	return(NULL);
}

/*****************************************************************
Find the slot of the given non-free record in the dense page directory. */
UNIV_INLINE
byte*
page_zip_dir_find(
/*==============*/
					/* out: dense directory slot,
					or NULL if record not found */
	page_zip_des_t*	page_zip,	/* in: compressed page */
	ulint		offset)		/* in: offset of user record */
{
	byte*	end = page_zip->data + page_zip_get_size(page_zip);

	ut_ad(page_zip_simple_validate(page_zip));

	return(page_zip_dir_find_low(end - page_zip_dir_user_size(page_zip),
				     end,
				     offset));
}

/*****************************************************************
Find the slot of the given free record in the dense page directory. */
UNIV_INLINE
byte*
page_zip_dir_find_free(
/*===================*/
					/* out: dense directory slot,
					or NULL if record not found */
	page_zip_des_t*	page_zip,	/* in: compressed page */
	ulint		offset)		/* in: offset of user record */
{
	byte*	end = page_zip->data + page_zip_get_size(page_zip);

	ut_ad(page_zip_simple_validate(page_zip));

	return(page_zip_dir_find_low(end - page_zip_dir_size(page_zip),
				     end - page_zip_dir_user_size(page_zip),
				     offset));
}

/*****************************************************************
Read a given slot in the dense page directory. */
UNIV_INLINE
ulint
page_zip_dir_get(
/*=============*/
					/* out: record offset
					on the uncompressed page,
					possibly ORed with
					PAGE_ZIP_DIR_SLOT_DEL or
					PAGE_ZIP_DIR_SLOT_OWNED */
	const page_zip_des_t*	page_zip,	/* in: compressed page */
	ulint			slot)		/* in: slot
						(0=first user record) */
{
	ut_ad(page_zip_simple_validate(page_zip));
	ut_ad(slot < page_zip_dir_size(page_zip) / PAGE_ZIP_DIR_SLOT_SIZE);
	return(mach_read_from_2(page_zip->data + page_zip_get_size(page_zip)
				- PAGE_ZIP_DIR_SLOT_SIZE * (slot + 1)));
}
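/* Taken together, the accessors above imply the following layout of the
dense directory at the very end of the compressed page (a sketch; slot 0
occupies the last PAGE_ZIP_DIR_SLOT_SIZE bytes of page_zip->data):

	| free-list slots | user record slots |
	^ end - page_zip_dir_size()           ^ end = data + page_zip_get_size()
	                  ^ end - page_zip_dir_user_size()

page_zip_dir_find() scans only the user record slots, while
page_zip_dir_find_free() scans only the free-list slots below them. */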
/**************************************************************************
Write a log record of compressing an index page. */
static
void
page_zip_compress_write_log(
/*========================*/
	const page_zip_des_t*	page_zip,/* in: compressed page */
	const page_t*		page,	/* in: uncompressed page */
	dict_index_t*		index,	/* in: index of the B-tree node */
	mtr_t*			mtr)	/* in: mini-transaction */
{
	byte*	log_ptr;
	ulint	trailer_size;

	log_ptr = mlog_open(mtr, 11 + 2 + 2);

	if (!log_ptr) {

		return;
	}

	/* Read the number of user records. */
	trailer_size = page_dir_get_n_heap(page_zip->data)
		- PAGE_HEAP_NO_USER_LOW;
	/* Multiply by the size of the data that is stored
	uncompressed per record. */
	if (!page_is_leaf(page)) {
		trailer_size *= PAGE_ZIP_DIR_SLOT_SIZE + REC_NODE_PTR_SIZE;
	} else if (dict_index_is_clust(index)) {
		trailer_size *= PAGE_ZIP_DIR_SLOT_SIZE
			+ DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN;
	} else {
		trailer_size *= PAGE_ZIP_DIR_SLOT_SIZE;
	}
	/* Add the space occupied by BLOB pointers. */
	trailer_size += page_zip->n_blobs * BTR_EXTERN_FIELD_REF_SIZE;
	ut_a(page_zip->m_end > PAGE_DATA);
#if FIL_PAGE_DATA > PAGE_DATA
# error "FIL_PAGE_DATA > PAGE_DATA"
#endif
	ut_a(page_zip->m_end + trailer_size <= page_zip_get_size(page_zip));

	log_ptr = mlog_write_initial_log_record_fast((page_t*) page,
						     MLOG_ZIP_PAGE_COMPRESS,
						     log_ptr, mtr);
	mach_write_to_2(log_ptr, page_zip->m_end - FIL_PAGE_TYPE);
	log_ptr += 2;
	mach_write_to_2(log_ptr, trailer_size);
	log_ptr += 2;
	mlog_close(mtr, log_ptr);

	/* Write FIL_PAGE_PREV and FIL_PAGE_NEXT */
	mlog_catenate_string(mtr, page_zip->data + FIL_PAGE_PREV, 4);
	mlog_catenate_string(mtr, page_zip->data + FIL_PAGE_NEXT, 4);
	/* Write most of the page header, the compressed stream and
	the modification log. */
	mlog_catenate_string(mtr, page_zip->data + FIL_PAGE_TYPE,
			     page_zip->m_end - FIL_PAGE_TYPE);
	/* Write the uncompressed trailer of the compressed page. */
	mlog_catenate_string(mtr, page_zip->data + page_zip_get_size(page_zip)
			     - trailer_size, trailer_size);
}
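/* The body of the MLOG_ZIP_PAGE_COMPRESS record written above is laid out
as follows (a sketch derived from the mach_write_to_2() and
mlog_catenate_string() calls):

	2 bytes: page_zip->m_end - FIL_PAGE_TYPE
	2 bytes: trailer_size
	4 bytes: FIL_PAGE_PREV
	4 bytes: FIL_PAGE_NEXT
	m_end - FIL_PAGE_TYPE bytes: page header from FIL_PAGE_TYPE onwards,
		the compressed stream, and the modification log
	trailer_size bytes: the uncompressed trailer of the page */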
/**********************************************************
Determine how many externally stored columns are contained
in existing records with smaller heap_no than rec. */
static
ulint
page_zip_get_n_prev_extern(
/*=======================*/
	const page_zip_des_t*	page_zip,/* in: dense page directory on
					compressed page */
	const rec_t*		rec,	/* in: compact physical record
					on a B-tree leaf page */
	dict_index_t*		index)	/* in: record descriptor */
{
	const page_t*	page	= page_align(rec);
	ulint		n_ext	= 0;
	ulint		i;
	ulint		left;
	ulint		heap_no;
	ulint		n_recs	= page_get_n_recs(page_zip->data);

	ut_ad(page_is_leaf(page));
	ut_ad(page_is_comp(page));
	ut_ad(dict_table_is_comp(index->table));
	ut_ad(dict_index_is_clust(index));

	heap_no = rec_get_heap_no_new(rec);
	ut_ad(heap_no >= PAGE_HEAP_NO_USER_LOW);
	left = heap_no - PAGE_HEAP_NO_USER_LOW;
	if (UNIV_UNLIKELY(!left)) {
		return(0);
	}

	for (i = 0; i < n_recs; i++) {
		const rec_t*	r = page + (page_zip_dir_get(page_zip, i)
					    & PAGE_ZIP_DIR_SLOT_MASK);

		if (rec_get_heap_no_new(r) < heap_no) {
			n_ext += rec_get_n_extern_new(r, index,
						      ULINT_UNDEFINED);
			if (!--left) {
				break;
			}
		}
	}

	return(n_ext);
}

/**************************************************************************
Encode the length of a fixed-length column. */
static
byte*
page_zip_fixed_field_encode(
/*========================*/
			/* out: buf + length of encoded val */
	byte*	buf,	/* in: pointer to buffer where to write */
	ulint	val)	/* in: value to write */
{
	ut_ad(val >= 2);

	if (UNIV_LIKELY(val < 126)) {
		/*
		0 = nullable variable field of at most 255 bytes length;
		1 = not null variable field of at most 255 bytes length;
		126 = nullable variable field with maximum length >255;
		127 = not null variable field with maximum length >255
		*/
		*buf++ = (byte) val;
	} else {
		*buf++ = (byte) (0x80 | val >> 8);
		*buf++ = (byte) val;
	}

	return(buf);
}
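/* A decoder matching the encoding above could look like the following
hypothetical sketch: a single byte for values below 126, otherwise two
bytes with the 0x80 flag marking the high half.

	ulint
	page_zip_fixed_field_decode(const byte** b)	-- hypothetical
	{
		ulint	val = *(*b)++;

		if (val & 0x80) {
			-- two-byte encoding: high bits, then low byte
			val = ((val & 0x7f) << 8) | *(*b)++;
		}

		return(val);
	}
*/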
/**************************************************************************
Write the index information for the compressed page. */
static
ulint
page_zip_fields_encode(
/*===================*/
				/* out: used size of buf */
	ulint		n,	/* in: number of fields to compress */
	dict_index_t*	index,	/* in: index comprising at least n fields */
	ulint		trx_id_pos,/* in: position of the trx_id column
				in the index, or ULINT_UNDEFINED if
				this is a non-leaf page */
	byte*		buf)	/* out: buffer of (n + 1) * 2 bytes */
{
	const byte*	buf_start	= buf;
	ulint		i;
	ulint		col;
	ulint		trx_id_col	= 0;
	/* sum of lengths of preceding non-nullable fixed fields, or 0 */
	ulint		fixed_sum	= 0;

	ut_ad(trx_id_pos == ULINT_UNDEFINED || trx_id_pos < n);

	for (i = col = 0; i < n; i++) {
		dict_field_t*	field = dict_index_get_nth_field(index, i);
		ulint		val;

		if (dict_field_get_col(field)->prtype & DATA_NOT_NULL) {
			val = 1; /* set the "not nullable" flag */
		} else {
			val = 0; /* nullable field */
		}

		if (!field->fixed_len) {
			/* variable-length field */
			const dict_col_t*	column
				= dict_field_get_col(field);

			if (UNIV_UNLIKELY(column->len > 255)
			    || UNIV_UNLIKELY(column->mtype == DATA_BLOB)) {
				val |= 0x7e; /* max > 255 bytes */
			}

			if (fixed_sum) {
				/* write out the length of any
				preceding non-nullable fields */
				buf = page_zip_fixed_field_encode(
					buf, fixed_sum << 1 | 1);
				fixed_sum = 0;
				col++;
			}

			*buf++ = (byte) val;
			col++;
		} else if (val) {
			/* fixed-length non-nullable field */

			if (fixed_sum && UNIV_UNLIKELY
			    (fixed_sum + field->fixed_len
			     > DICT_MAX_INDEX_COL_LEN)) {
				/* Write out the length of the
				preceding non-nullable fields,
				to avoid exceeding the maximum
				length of a fixed-length column. */
				buf = page_zip_fixed_field_encode(
					buf, fixed_sum << 1 | 1);
				fixed_sum = 0;
				col++;
			}

			if (i && UNIV_UNLIKELY(i == trx_id_pos)) {
				if (fixed_sum) {
					/* Write out the length of any
					preceding non-nullable fields,
					and start a new trx_id column. */
					buf = page_zip_fixed_field_encode(
						buf, fixed_sum << 1 | 1);
					col++;
				}

				trx_id_col = col;
				fixed_sum = field->fixed_len;
			} else {
				/* add to the sum */
				fixed_sum += field->fixed_len;
			}
		} else {
			/* fixed-length nullable field */

			if (fixed_sum) {
				/* write out the length of any
				preceding non-nullable fields */
				buf = page_zip_fixed_field_encode(
					buf, fixed_sum << 1 | 1);
				fixed_sum = 0;
				col++;
			}

			buf = page_zip_fixed_field_encode(
				buf, field->fixed_len << 1);
			col++;
		}
	}

	if (fixed_sum) {
		/* Write out the lengths of last fixed-length columns. */
		buf = page_zip_fixed_field_encode(buf, fixed_sum << 1 | 1);
	}

	if (trx_id_pos != ULINT_UNDEFINED) {
		/* Write out the position of the trx_id column */
		i = trx_id_col;
	} else {
		/* Write out the number of nullable fields */
		i = index->n_nullable;
	}

	if (i < 128) {
		*buf++ = (byte) i;
	} else {
		*buf++ = (byte) (0x80 | i >> 8);
		*buf++ = (byte) i;
	}

	ut_ad((ulint) (buf - buf_start) <= (n + 2) * 2);
	return((ulint) (buf - buf_start));
}
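/* As a worked example of the scheme above, consider an index of two
columns, a NOT NULL fixed-length 4-byte column followed by a nullable
variable-length column of at most 255 bytes, encoded for a non-leaf page
(trx_id_pos == ULINT_UNDEFINED, index->n_nullable == 1).  The function
would emit three bytes:

	0x09	fixed_sum=4 with the "not null" bit: 4 << 1 | 1
	0x00	nullable variable-length field of at most 255 bytes
	0x01	trailing count: the number of nullable fields */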
/**************************************************************************
Populate the dense page directory from the sparse directory. */
static
void
page_zip_dir_encode(
/*================*/
	const page_t*	page,	/* in: compact page */
	byte*		buf,	/* in: pointer to dense page directory[-1];
				out: dense directory on compressed page */
	const rec_t**	recs)	/* in: pointer to an array of 0, or NULL;
				out: dense page directory sorted by ascending
				address (and heap_no) */
{
	const byte*	rec;
	ulint		status;
	ulint		min_mark;
	ulint		heap_no;
	ulint		i;
	ulint		n_heap;
	ulint		offs;

	min_mark = 0;

	if (page_is_leaf(page)) {
		status = REC_STATUS_ORDINARY;
	} else {
		status = REC_STATUS_NODE_PTR;
		if (UNIV_UNLIKELY
		    (mach_read_from_4(page + FIL_PAGE_PREV) == FIL_NULL)) {
			min_mark = REC_INFO_MIN_REC_FLAG;
		}
	}

	n_heap = page_dir_get_n_heap(page);

	/* Traverse the list of stored records in the collation order,
	starting from the first user record. */

	rec = page + PAGE_NEW_INFIMUM;

	i = 0;

	for (;;) {
		ulint	info_bits;
		offs = rec_get_next_offs(rec, TRUE);
		if (UNIV_UNLIKELY(offs == PAGE_NEW_SUPREMUM)) {
			break;
		}
		rec = page + offs;
		heap_no = rec_get_heap_no_new(rec);
		ut_a(heap_no >= PAGE_HEAP_NO_USER_LOW);
		ut_a(heap_no < n_heap);
		ut_a(offs < UNIV_PAGE_SIZE - PAGE_DIR);
		ut_a(offs >= PAGE_ZIP_START);
#if PAGE_ZIP_DIR_SLOT_MASK & (PAGE_ZIP_DIR_SLOT_MASK + 1)
# error "PAGE_ZIP_DIR_SLOT_MASK is not 1 less than a power of 2"
#endif
#if PAGE_ZIP_DIR_SLOT_MASK < UNIV_PAGE_SIZE - 1
# error "PAGE_ZIP_DIR_SLOT_MASK < UNIV_PAGE_SIZE - 1"
#endif
		if (UNIV_UNLIKELY(rec_get_n_owned_new(rec))) {
			offs |= PAGE_ZIP_DIR_SLOT_OWNED;
		}

		info_bits = rec_get_info_bits(rec, TRUE);
		if (UNIV_UNLIKELY(info_bits & REC_INFO_DELETED_FLAG)) {
			info_bits &= ~REC_INFO_DELETED_FLAG;
			offs |= PAGE_ZIP_DIR_SLOT_DEL;
		}
		ut_a(info_bits == min_mark);
		/* Only the smallest user record can have
		REC_INFO_MIN_REC_FLAG set. */
		min_mark = 0;

		mach_write_to_2(buf - PAGE_ZIP_DIR_SLOT_SIZE * ++i, offs);

		if (UNIV_LIKELY_NULL(recs)) {
			/* Ensure that each heap_no occurs at most once. */
			ut_a(!recs[heap_no - PAGE_HEAP_NO_USER_LOW]);
			/* exclude infimum and supremum */
			recs[heap_no - PAGE_HEAP_NO_USER_LOW] = rec;
		}

		ut_a(rec_get_status(rec) == status);
	}

	offs = page_header_get_field(page, PAGE_FREE);

	/* Traverse the free list (of deleted records). */
	while (offs) {
		ut_ad(!(offs & ~PAGE_ZIP_DIR_SLOT_MASK));
		rec = page + offs;

		heap_no = rec_get_heap_no_new(rec);
		ut_a(heap_no >= PAGE_HEAP_NO_USER_LOW);
		ut_a(heap_no < n_heap);

		ut_a(!rec[-REC_N_NEW_EXTRA_BYTES]); /* info_bits and n_owned */
		ut_a(rec_get_status(rec) == status);

		mach_write_to_2(buf - PAGE_ZIP_DIR_SLOT_SIZE * ++i, offs);

		if (UNIV_LIKELY_NULL(recs)) {
			/* Ensure that each heap_no occurs at most once. */
			ut_a(!recs[heap_no - PAGE_HEAP_NO_USER_LOW]);
			/* exclude infimum and supremum */
			recs[heap_no - PAGE_HEAP_NO_USER_LOW] = rec;
		}

		offs = rec_get_next_offs(rec, TRUE);
	}

	/* Ensure that each heap no occurs at least once. */
	ut_a(i + PAGE_HEAP_NO_USER_LOW == n_heap);
}

/**************************************************************************
Allocate memory for zlib. */
static
void*
page_zip_malloc(
/*============*/
	void*	opaque,
	uInt	items,
	uInt	size)
{
	return(mem_heap_alloc(opaque, items * size));
}

/**************************************************************************
Deallocate memory for zlib. */
static
void
page_zip_free(
/*==========*/
	void*	opaque __attribute__((unused)),
	void*	address __attribute__((unused)))
{
}

/**************************************************************************
Configure the zlib allocator to use the given memory heap. */
UNIV_INTERN
void
page_zip_set_alloc(
/*===============*/
	void*		stream,		/* in/out: zlib stream */
	mem_heap_t*	heap)		/* in: memory heap to use */
{
	z_stream*	strm = stream;

	strm->zalloc = page_zip_malloc;
	strm->zfree = page_zip_free;
	strm->opaque = heap;
}
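/* A typical call sequence (a sketch; the heap size is illustrative) routes
all of zlib's internal allocations into an InnoDB memory heap.  This is why
page_zip_free() can be a no-op: the whole heap is discarded at once when
the stream is done.  The allocator must be set before deflateInit(),
because initialization already allocates.

	z_stream	strm;
	mem_heap_t*	heap = mem_heap_create(1024);

	page_zip_set_alloc(&strm, heap);
	deflateInit(&strm, Z_DEFAULT_COMPRESSION);
	... deflate(&strm, ...) ...
	deflateEnd(&strm);
	mem_heap_free(heap);
*/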
#if 0 || defined UNIV_DEBUG || defined UNIV_ZIP_DEBUG
# define PAGE_ZIP_COMPRESS_DBG
#endif

#ifdef PAGE_ZIP_COMPRESS_DBG
/* Set this variable in a debugger to enable
excessive logging in page_zip_compress(). */
UNIV_INTERN ibool	page_zip_compress_dbg;
/* Set this variable in a debugger to enable
binary logging of the data passed to deflate().
When this variable is nonzero, it will act
as a log file name generator. */
UNIV_INTERN unsigned	page_zip_compress_log;

/**************************************************************************
Wrapper for deflate().  Log the operation if page_zip_compress_dbg is set. */
static
ibool
page_zip_compress_deflate(
/*======================*/
	FILE*		logfile,/* in: log file, or NULL */
	z_streamp	strm,	/* in/out: compressed stream for deflate() */
	int		flush)	/* in: deflate() flushing method */
{
	int	status;
	if (UNIV_UNLIKELY(page_zip_compress_dbg)) {
		ut_print_buf(stderr, strm->next_in, strm->avail_in);
	}
	if (UNIV_LIKELY_NULL(logfile)) {
		fwrite(strm->next_in, 1, strm->avail_in, logfile);
	}
	status = deflate(strm, flush);
	if (UNIV_UNLIKELY(page_zip_compress_dbg)) {
		fprintf(stderr, " -> %d\n", status);
	}
	return(status);
}

/* Redefine deflate(). */
# undef deflate
# define deflate(strm, flush) page_zip_compress_deflate(logfile, strm, flush)
# define FILE_LOGFILE FILE* logfile,
# define LOGFILE logfile,
#else /* PAGE_ZIP_COMPRESS_DBG */
# define FILE_LOGFILE
# define LOGFILE
#endif /* PAGE_ZIP_COMPRESS_DBG */
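/* With the FILE_LOGFILE and LOGFILE macros above, the same function text
compiles both with and without the extra log-file parameter.  A call site
would look like this sketch:

	err = page_zip_compress_node_ptrs(LOGFILE
					  &c_stream, recs, n_dense,
					  index, storage, heap);

When PAGE_ZIP_COMPRESS_DBG is not defined, LOGFILE expands to nothing and
the call simply has one argument fewer.  Note the absence of a comma after
LOGFILE: the macro supplies its own trailing comma when it is nonempty. */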
  621. /**************************************************************************
  622. Compress the records of a node pointer page. */
  623. static
  624. int
  625. page_zip_compress_node_ptrs(
  626. /*========================*/
  627. /* out: Z_OK, or a zlib error code */
  628. FILE_LOGFILE
  629. z_stream* c_stream, /* in/out: compressed page stream */
  630. const rec_t** recs, /* in: dense page directory
  631. sorted by address */
  632. ulint n_dense, /* in: size of recs[] */
  633. dict_index_t* index, /* in: the index of the page */
  634. byte* storage, /* in: end of dense page directory */
  635. mem_heap_t* heap) /* in: temporary memory heap */
  636. {
  637. int err = Z_OK;
  638. ulint* offsets = NULL;
  639. do {
  640. const rec_t* rec = *recs++;
  641. offsets = rec_get_offsets(rec, index, offsets,
  642. ULINT_UNDEFINED, &heap);
  643. /* Only leaf nodes may contain externally stored columns. */
  644. ut_ad(!rec_offs_any_extern(offsets));
  645. UNIV_MEM_ASSERT_RW(rec, rec_offs_data_size(offsets));
  646. UNIV_MEM_ASSERT_RW(rec - rec_offs_extra_size(offsets),
  647. rec_offs_extra_size(offsets));
  648. /* Compress the extra bytes. */
  649. c_stream->avail_in = rec - REC_N_NEW_EXTRA_BYTES
  650. - c_stream->next_in;
  651. if (c_stream->avail_in) {
  652. err = deflate(c_stream, Z_NO_FLUSH);
  653. if (UNIV_UNLIKELY(err != Z_OK)) {
  654. break;
  655. }
  656. }
  657. ut_ad(!c_stream->avail_in);
  658. /* Compress the data bytes, except node_ptr. */
  659. c_stream->next_in = (byte*) rec;
  660. c_stream->avail_in = rec_offs_data_size(offsets)
  661. - REC_NODE_PTR_SIZE;
  662. ut_ad(c_stream->avail_in);
  663. err = deflate(c_stream, Z_NO_FLUSH);
  664. if (UNIV_UNLIKELY(err != Z_OK)) {
  665. break;
  666. }
  667. ut_ad(!c_stream->avail_in);
  668. memcpy(storage - REC_NODE_PTR_SIZE
  669. * (rec_get_heap_no_new(rec) - 1),
  670. c_stream->next_in, REC_NODE_PTR_SIZE);
  671. c_stream->next_in += REC_NODE_PTR_SIZE;
  672. } while (--n_dense);
  673. return(err);
  674. }
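The memcpy() above stores each node pointer uncompressed, in a slot counted backwards from the end of the dense page directory. A minimal sketch of that slot arithmetic (illustration only; REC_NODE_PTR_SIZE is 4 in rem0rec.h, and heap_no 2 is the first user record, infimum and supremum being 0 and 1):

/* Where the uncompressed node pointer of the record with the given
heap number lives; mirrors the arithmetic in
page_zip_compress_node_ptrs(). heap_no 2 occupies the 4 bytes just
below "storage"; higher heap numbers grow downwards. */
static unsigned char*
demo_node_ptr_slot(unsigned char* storage, unsigned heap_no)
{
	return(storage - 4 /* REC_NODE_PTR_SIZE */ * (heap_no - 1));
}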
  675. /**************************************************************************
  676. Compress the records of a leaf node of a secondary index. */
  677. static
  678. int
  679. page_zip_compress_sec(
  680. /*==================*/
  681. /* out: Z_OK, or a zlib error code */
  682. FILE_LOGFILE
  683. z_stream* c_stream, /* in/out: compressed page stream */
  684. const rec_t** recs, /* in: dense page directory
  685. sorted by address */
  686. ulint n_dense) /* in: size of recs[] */
  687. {
  688. int err = Z_OK;
  689. ut_ad(n_dense > 0);
  690. do {
  691. const rec_t* rec = *recs++;
  692. /* Compress everything up to this record. */
  693. c_stream->avail_in = rec - REC_N_NEW_EXTRA_BYTES
  694. - c_stream->next_in;
  695. if (UNIV_LIKELY(c_stream->avail_in)) {
  696. UNIV_MEM_ASSERT_RW(c_stream->next_in,
  697. c_stream->avail_in);
  698. err = deflate(c_stream, Z_NO_FLUSH);
  699. if (UNIV_UNLIKELY(err != Z_OK)) {
  700. break;
  701. }
  702. }
  703. ut_ad(!c_stream->avail_in);
  704. ut_ad(c_stream->next_in == rec - REC_N_NEW_EXTRA_BYTES);
  705. /* Skip the REC_N_NEW_EXTRA_BYTES. */
  706. c_stream->next_in = (byte*) rec;
  707. } while (--n_dense);
  708. return(err);
  709. }
  710. /**************************************************************************
  711. Compress a record of a leaf node of a clustered index that contains
  712. externally stored columns. */
  713. static
  714. int
  715. page_zip_compress_clust_ext(
  716. /*========================*/
  717. /* out: Z_OK, or a zlib error code */
  718. FILE_LOGFILE
  719. z_stream* c_stream, /* in/out: compressed page stream */
  720. const rec_t* rec, /* in: record */
  721. const ulint* offsets, /* in: rec_get_offsets(rec) */
722. ulint trx_id_col, /* in: position of DB_TRX_ID */
  723. byte* deleted, /* in: dense directory entry pointing
  724. to the head of the free list */
  725. byte* storage, /* in: end of dense page directory */
  726. byte** externs, /* in/out: pointer to the next
  727. available BLOB pointer */
  728. ulint* n_blobs) /* in/out: number of
  729. externally stored columns */
  730. {
  731. int err;
  732. ulint i;
  733. UNIV_MEM_ASSERT_RW(rec, rec_offs_data_size(offsets));
  734. UNIV_MEM_ASSERT_RW(rec - rec_offs_extra_size(offsets),
  735. rec_offs_extra_size(offsets));
  736. for (i = 0; i < rec_offs_n_fields(offsets); i++) {
  737. ulint len;
  738. const byte* src;
  739. if (UNIV_UNLIKELY(i == trx_id_col)) {
  740. ut_ad(!rec_offs_nth_extern(offsets, i));
  741. /* Store trx_id and roll_ptr
  742. in uncompressed form. */
  743. src = rec_get_nth_field(rec, offsets, i, &len);
  744. ut_ad(src + DATA_TRX_ID_LEN
  745. == rec_get_nth_field(rec, offsets,
  746. i + 1, &len));
  747. ut_ad(len == DATA_ROLL_PTR_LEN);
  748. /* Compress any preceding bytes. */
  749. c_stream->avail_in
  750. = src - c_stream->next_in;
  751. if (c_stream->avail_in) {
  752. err = deflate(c_stream, Z_NO_FLUSH);
  753. if (UNIV_UNLIKELY(err != Z_OK)) {
  754. return(err);
  755. }
  756. }
  757. ut_ad(!c_stream->avail_in);
  758. ut_ad(c_stream->next_in == src);
  759. memcpy(storage
  760. - (DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN)
  761. * (rec_get_heap_no_new(rec) - 1),
  762. c_stream->next_in,
  763. DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
  764. c_stream->next_in
  765. += DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN;
766. /* Also skip roll_ptr */
  767. i++;
  768. } else if (rec_offs_nth_extern(offsets, i)) {
  769. src = rec_get_nth_field(rec, offsets, i, &len);
  770. ut_ad(len >= BTR_EXTERN_FIELD_REF_SIZE);
  771. src += len - BTR_EXTERN_FIELD_REF_SIZE;
  772. c_stream->avail_in = src
  773. - c_stream->next_in;
  774. if (UNIV_LIKELY(c_stream->avail_in)) {
  775. err = deflate(c_stream, Z_NO_FLUSH);
  776. if (UNIV_UNLIKELY(err != Z_OK)) {
  777. return(err);
  778. }
  779. }
  780. ut_ad(!c_stream->avail_in);
  781. ut_ad(c_stream->next_in == src);
  782. /* Reserve space for the data at
  783. the end of the space reserved for
  784. the compressed data and the page
  785. modification log. */
  786. if (UNIV_UNLIKELY
  787. (c_stream->avail_out
  788. <= BTR_EXTERN_FIELD_REF_SIZE)) {
  789. /* out of space */
  790. return(Z_BUF_ERROR);
  791. }
  792. ut_ad(*externs == c_stream->next_out
  793. + c_stream->avail_out
  794. + 1/* end of modif. log */);
  795. c_stream->next_in
  796. += BTR_EXTERN_FIELD_REF_SIZE;
  797. /* Skip deleted records. */
  798. if (UNIV_LIKELY_NULL
  799. (page_zip_dir_find_low(
  800. storage, deleted,
  801. page_offset(rec)))) {
  802. continue;
  803. }
  804. (*n_blobs)++;
  805. c_stream->avail_out
  806. -= BTR_EXTERN_FIELD_REF_SIZE;
  807. *externs -= BTR_EXTERN_FIELD_REF_SIZE;
  808. /* Copy the BLOB pointer */
  809. memcpy(*externs, c_stream->next_in
  810. - BTR_EXTERN_FIELD_REF_SIZE,
  811. BTR_EXTERN_FIELD_REF_SIZE);
  812. }
  813. }
  814. return(Z_OK);
  815. }
  816. /**************************************************************************
  817. Compress the records of a leaf node of a clustered index. */
  818. static
  819. int
  820. page_zip_compress_clust(
  821. /*====================*/
  822. /* out: Z_OK, or a zlib error code */
  823. FILE_LOGFILE
  824. z_stream* c_stream, /* in/out: compressed page stream */
  825. const rec_t** recs, /* in: dense page directory
  826. sorted by address */
  827. ulint n_dense, /* in: size of recs[] */
  828. dict_index_t* index, /* in: the index of the page */
  829. ulint* n_blobs, /* in: 0; out: number of
  830. externally stored columns */
831. ulint trx_id_col, /* in: position of the trx_id column */
  832. byte* deleted, /* in: dense directory entry pointing
  833. to the head of the free list */
  834. byte* storage, /* in: end of dense page directory */
  835. mem_heap_t* heap) /* in: temporary memory heap */
  836. {
  837. int err = Z_OK;
  838. ulint* offsets = NULL;
  839. /* BTR_EXTERN_FIELD_REF storage */
  840. byte* externs = storage - n_dense
  841. * (DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
  842. ut_ad(*n_blobs == 0);
  843. do {
  844. const rec_t* rec = *recs++;
  845. offsets = rec_get_offsets(rec, index, offsets,
  846. ULINT_UNDEFINED, &heap);
  847. ut_ad(rec_offs_n_fields(offsets)
  848. == dict_index_get_n_fields(index));
  849. UNIV_MEM_ASSERT_RW(rec, rec_offs_data_size(offsets));
  850. UNIV_MEM_ASSERT_RW(rec - rec_offs_extra_size(offsets),
  851. rec_offs_extra_size(offsets));
  852. /* Compress the extra bytes. */
  853. c_stream->avail_in = rec - REC_N_NEW_EXTRA_BYTES
  854. - c_stream->next_in;
  855. if (c_stream->avail_in) {
  856. err = deflate(c_stream, Z_NO_FLUSH);
  857. if (UNIV_UNLIKELY(err != Z_OK)) {
  858. goto func_exit;
  859. }
  860. }
  861. ut_ad(!c_stream->avail_in);
  862. ut_ad(c_stream->next_in == rec - REC_N_NEW_EXTRA_BYTES);
  863. /* Compress the data bytes. */
  864. c_stream->next_in = (byte*) rec;
  865. /* Check if there are any externally stored columns.
  866. For each externally stored column, store the
  867. BTR_EXTERN_FIELD_REF separately. */
  868. if (UNIV_UNLIKELY(rec_offs_any_extern(offsets))) {
  869. ut_ad(dict_index_is_clust(index));
  870. err = page_zip_compress_clust_ext(
  871. LOGFILE
  872. c_stream, rec, offsets, trx_id_col,
  873. deleted, storage, &externs, n_blobs);
  874. if (UNIV_UNLIKELY(err != Z_OK)) {
  875. goto func_exit;
  876. }
  877. } else {
  878. ulint len;
  879. const byte* src;
  880. /* Store trx_id and roll_ptr in uncompressed form. */
  881. src = rec_get_nth_field(rec, offsets,
  882. trx_id_col, &len);
  883. ut_ad(src + DATA_TRX_ID_LEN
  884. == rec_get_nth_field(rec, offsets,
  885. trx_id_col + 1, &len));
  886. ut_ad(len == DATA_ROLL_PTR_LEN);
  887. UNIV_MEM_ASSERT_RW(rec, rec_offs_data_size(offsets));
  888. UNIV_MEM_ASSERT_RW(rec - rec_offs_extra_size(offsets),
  889. rec_offs_extra_size(offsets));
  890. /* Compress any preceding bytes. */
  891. c_stream->avail_in = src - c_stream->next_in;
  892. if (c_stream->avail_in) {
  893. err = deflate(c_stream, Z_NO_FLUSH);
  894. if (UNIV_UNLIKELY(err != Z_OK)) {
  895. return(err);
  896. }
  897. }
  898. ut_ad(!c_stream->avail_in);
  899. ut_ad(c_stream->next_in == src);
  900. memcpy(storage
  901. - (DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN)
  902. * (rec_get_heap_no_new(rec) - 1),
  903. c_stream->next_in,
  904. DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
  905. c_stream->next_in
  906. += DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN;
907. /* Also skip roll_ptr */
  908. ut_ad(trx_id_col + 1 < rec_offs_n_fields(offsets));
  909. }
  910. /* Compress the last bytes of the record. */
  911. c_stream->avail_in = rec + rec_offs_data_size(offsets)
  912. - c_stream->next_in;
  913. if (c_stream->avail_in) {
  914. err = deflate(c_stream, Z_NO_FLUSH);
  915. if (UNIV_UNLIKELY(err != Z_OK)) {
  916. goto func_exit;
  917. }
  918. }
  919. ut_ad(!c_stream->avail_in);
  920. } while (--n_dense);
  921. func_exit:
  922. return(err);
  923. }
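Putting the pieces together, the uncompressed trailer of a compressed clustered-index leaf page grows downwards from the end of the block. The sketch below restates the arithmetic of page_zip_compress() and page_zip_compress_clust() with the sizes spelled out (PAGE_ZIP_DIR_SLOT_SIZE = 2, DATA_TRX_ID_LEN = 6, DATA_ROLL_PTR_LEN = 7, BTR_EXTERN_FIELD_REF_SIZE = 20); it is an illustration, not code that is called anywhere.

static void
demo_clust_trailer_layout(unsigned char* buf_end, unsigned n_dense,
			  unsigned n_blobs)
{
	unsigned char* storage = buf_end - n_dense * 2;
	unsigned char* externs = storage - n_dense * (6 + 7);
	unsigned char* blob_low = externs - n_blobs * 20;

	/* [storage, buf_end): dense page directory, 2 bytes/record
	   [externs, storage): DB_TRX_ID + DB_ROLL_PTR, 13 bytes per
	   record, indexed by heap_no - 1
	   [blob_low, externs): BTR_EXTERN_FIELD_REF (BLOB pointers),
	   filled downwards as *externs is decremented */
	(void) blob_low;
}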
  924. /**************************************************************************
  925. Compress a page. */
  926. UNIV_INTERN
  927. ibool
  928. page_zip_compress(
  929. /*==============*/
  930. /* out: TRUE on success, FALSE on failure;
  931. page_zip will be left intact on failure. */
  932. page_zip_des_t* page_zip,/* in: size; out: data, n_blobs,
  933. m_start, m_end, m_nonempty */
  934. const page_t* page, /* in: uncompressed page */
  935. dict_index_t* index, /* in: index of the B-tree node */
  936. mtr_t* mtr) /* in: mini-transaction, or NULL */
  937. {
  938. z_stream c_stream;
  939. int err;
  940. ulint n_fields;/* number of index fields needed */
  941. byte* fields; /* index field information */
  942. byte* buf; /* compressed payload of the page */
  943. byte* buf_end;/* end of buf */
  944. ulint n_dense;
  945. ulint slot_size;/* amount of uncompressed bytes per record */
  946. const rec_t** recs; /* dense page directory, sorted by address */
  947. mem_heap_t* heap;
  948. ulint trx_id_col;
  949. ulint* offsets = NULL;
  950. ulint n_blobs = 0;
  951. byte* storage;/* storage of uncompressed columns */
  952. ullint usec = ut_time_us(NULL);
  953. #ifdef PAGE_ZIP_COMPRESS_DBG
  954. FILE* logfile = NULL;
955. #endif /* PAGE_ZIP_COMPRESS_DBG */
  956. ut_a(page_is_comp(page));
  957. ut_a(fil_page_get_type(page) == FIL_PAGE_INDEX);
  958. ut_ad(page_simple_validate_new((page_t*) page));
  959. ut_ad(page_zip_simple_validate(page_zip));
  960. UNIV_MEM_ASSERT_RW(page, UNIV_PAGE_SIZE);
  961. /* Check the data that will be omitted. */
  962. ut_a(!memcmp(page + (PAGE_NEW_INFIMUM - REC_N_NEW_EXTRA_BYTES),
  963. infimum_extra, sizeof infimum_extra));
  964. ut_a(!memcmp(page + PAGE_NEW_INFIMUM,
  965. infimum_data, sizeof infimum_data));
  966. ut_a(page[PAGE_NEW_SUPREMUM - REC_N_NEW_EXTRA_BYTES]
  967. /* info_bits == 0, n_owned <= max */
  968. <= PAGE_DIR_SLOT_MAX_N_OWNED);
  969. ut_a(!memcmp(page + (PAGE_NEW_SUPREMUM - REC_N_NEW_EXTRA_BYTES + 1),
  970. supremum_extra_data, sizeof supremum_extra_data));
  971. if (UNIV_UNLIKELY(!page_get_n_recs(page))) {
  972. ut_a(rec_get_next_offs(page + PAGE_NEW_INFIMUM, TRUE)
  973. == PAGE_NEW_SUPREMUM);
  974. }
  975. if (page_is_leaf(page)) {
  976. n_fields = dict_index_get_n_fields(index);
  977. } else {
  978. n_fields = dict_index_get_n_unique_in_tree(index);
  979. }
  980. /* The dense directory excludes the infimum and supremum records. */
  981. n_dense = page_dir_get_n_heap(page) - PAGE_HEAP_NO_USER_LOW;
  982. #ifdef PAGE_ZIP_COMPRESS_DBG
  983. if (UNIV_UNLIKELY(page_zip_compress_dbg)) {
  984. fprintf(stderr, "compress %p %p %lu %lu %lu\n",
  985. (void*) page_zip, (void*) page,
  986. page_is_leaf(page),
  987. n_fields, n_dense);
  988. }
  989. if (UNIV_UNLIKELY(page_zip_compress_log)) {
  990. /* Create a log file for every compression attempt. */
  991. char logfilename[9];
  992. ut_snprintf(logfilename, sizeof logfilename,
  993. "%08x", page_zip_compress_log++);
  994. logfile = fopen(logfilename, "wb");
  995. if (logfile) {
  996. /* Write the uncompressed page to the log. */
  997. fwrite(page, 1, UNIV_PAGE_SIZE, logfile);
  998. /* Record the compressed size as zero.
  999. This will be overwritten at successful exit. */
  1000. putc(0, logfile);
  1001. putc(0, logfile);
  1002. putc(0, logfile);
  1003. putc(0, logfile);
  1004. }
  1005. }
  1006. #endif /* PAGE_ZIP_COMPRESS_DBG */
  1007. page_zip_stat[page_zip->ssize - 1].compressed++;
  1008. if (UNIV_UNLIKELY(n_dense * PAGE_ZIP_DIR_SLOT_SIZE
  1009. >= page_zip_get_size(page_zip))) {
  1010. goto err_exit;
  1011. }
  1012. heap = mem_heap_create(page_zip_get_size(page_zip)
  1013. + n_fields * (2 + sizeof *offsets)
  1014. + n_dense * ((sizeof *recs)
  1015. - PAGE_ZIP_DIR_SLOT_SIZE)
  1016. + UNIV_PAGE_SIZE * 4
  1017. + (512 << MAX_MEM_LEVEL));
  1018. recs = mem_heap_zalloc(heap, n_dense * sizeof *recs);
  1019. fields = mem_heap_alloc(heap, (n_fields + 1) * 2);
  1020. buf = mem_heap_alloc(heap, page_zip_get_size(page_zip) - PAGE_DATA);
  1021. buf_end = buf + page_zip_get_size(page_zip) - PAGE_DATA;
  1022. /* Compress the data payload. */
  1023. page_zip_set_alloc(&c_stream, heap);
  1024. err = deflateInit2(&c_stream, Z_DEFAULT_COMPRESSION,
  1025. Z_DEFLATED, UNIV_PAGE_SIZE_SHIFT,
  1026. MAX_MEM_LEVEL, Z_DEFAULT_STRATEGY);
  1027. ut_a(err == Z_OK);
  1028. c_stream.next_out = buf;
  1029. /* Subtract the space reserved for uncompressed data. */
  1030. /* Page header and the end marker of the modification log */
  1031. c_stream.avail_out = buf_end - buf - 1;
  1032. /* Dense page directory and uncompressed columns, if any */
  1033. if (page_is_leaf(page)) {
  1034. if (dict_index_is_clust(index)) {
  1035. trx_id_col = dict_index_get_sys_col_pos(
  1036. index, DATA_TRX_ID);
  1037. ut_ad(trx_id_col > 0);
  1038. ut_ad(trx_id_col != ULINT_UNDEFINED);
  1039. slot_size = PAGE_ZIP_DIR_SLOT_SIZE
  1040. + DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN;
  1041. } else {
  1042. /* Signal the absence of trx_id
  1043. in page_zip_fields_encode() */
  1044. ut_ad(dict_index_get_sys_col_pos(index, DATA_TRX_ID)
  1045. == ULINT_UNDEFINED);
  1046. trx_id_col = 0;
  1047. slot_size = PAGE_ZIP_DIR_SLOT_SIZE;
  1048. }
  1049. } else {
  1050. slot_size = PAGE_ZIP_DIR_SLOT_SIZE + REC_NODE_PTR_SIZE;
  1051. trx_id_col = ULINT_UNDEFINED;
  1052. }
  1053. if (UNIV_UNLIKELY(c_stream.avail_out <= n_dense * slot_size
  1054. + 6/* sizeof(zlib header and footer) */)) {
  1055. goto zlib_error;
  1056. }
  1057. c_stream.avail_out -= n_dense * slot_size;
  1058. c_stream.avail_in = page_zip_fields_encode(n_fields, index,
  1059. trx_id_col, fields);
  1060. c_stream.next_in = fields;
  1061. if (UNIV_LIKELY(!trx_id_col)) {
  1062. trx_id_col = ULINT_UNDEFINED;
  1063. }
  1064. UNIV_MEM_ASSERT_RW(c_stream.next_in, c_stream.avail_in);
  1065. err = deflate(&c_stream, Z_FULL_FLUSH);
  1066. if (err != Z_OK) {
  1067. goto zlib_error;
  1068. }
  1069. ut_ad(!c_stream.avail_in);
  1070. page_zip_dir_encode(page, buf_end, recs);
  1071. c_stream.next_in = (byte*) page + PAGE_ZIP_START;
  1072. storage = buf_end - n_dense * PAGE_ZIP_DIR_SLOT_SIZE;
  1073. /* Compress the records in heap_no order. */
  1074. if (UNIV_UNLIKELY(!n_dense)) {
  1075. } else if (!page_is_leaf(page)) {
  1076. /* This is a node pointer page. */
  1077. err = page_zip_compress_node_ptrs(LOGFILE
  1078. &c_stream, recs, n_dense,
  1079. index, storage, heap);
  1080. if (UNIV_UNLIKELY(err != Z_OK)) {
  1081. goto zlib_error;
  1082. }
  1083. } else if (UNIV_LIKELY(trx_id_col == ULINT_UNDEFINED)) {
  1084. /* This is a leaf page in a secondary index. */
  1085. err = page_zip_compress_sec(LOGFILE
  1086. &c_stream, recs, n_dense);
  1087. if (UNIV_UNLIKELY(err != Z_OK)) {
  1088. goto zlib_error;
  1089. }
  1090. } else {
  1091. /* This is a leaf page in a clustered index. */
  1092. err = page_zip_compress_clust(LOGFILE
  1093. &c_stream, recs, n_dense,
  1094. index, &n_blobs, trx_id_col,
  1095. buf_end - PAGE_ZIP_DIR_SLOT_SIZE
  1096. * page_get_n_recs(page),
  1097. storage, heap);
  1098. if (UNIV_UNLIKELY(err != Z_OK)) {
  1099. goto zlib_error;
  1100. }
  1101. }
  1102. /* Finish the compression. */
  1103. ut_ad(!c_stream.avail_in);
  1104. /* Compress any trailing garbage, in case the last record was
  1105. allocated from an originally longer space on the free list,
  1106. or the data of the last record from page_zip_compress_sec(). */
  1107. c_stream.avail_in
  1108. = page_header_get_field(page, PAGE_HEAP_TOP)
  1109. - (c_stream.next_in - page);
  1110. ut_a(c_stream.avail_in <= UNIV_PAGE_SIZE - PAGE_ZIP_START - PAGE_DIR);
  1111. UNIV_MEM_ASSERT_RW(c_stream.next_in, c_stream.avail_in);
  1112. err = deflate(&c_stream, Z_FINISH);
  1113. if (UNIV_UNLIKELY(err != Z_STREAM_END)) {
  1114. zlib_error:
  1115. deflateEnd(&c_stream);
  1116. mem_heap_free(heap);
  1117. err_exit:
  1118. #ifdef PAGE_ZIP_COMPRESS_DBG
  1119. if (logfile) {
  1120. fclose(logfile);
  1121. }
  1122. #endif /* PAGE_ZIP_COMPRESS_DBG */
  1123. page_zip_stat[page_zip->ssize - 1].compressed_usec
  1124. += ut_time_us(NULL) - usec;
  1125. return(FALSE);
  1126. }
  1127. err = deflateEnd(&c_stream);
  1128. ut_a(err == Z_OK);
  1129. ut_ad(buf + c_stream.total_out == c_stream.next_out);
  1130. ut_ad((ulint) (storage - c_stream.next_out) >= c_stream.avail_out);
  1131. /* Valgrind believes that zlib does not initialize some bits
  1132. in the last 7 or 8 bytes of the stream. Make Valgrind happy. */
  1133. UNIV_MEM_VALID(buf, c_stream.total_out);
  1134. /* Zero out the area reserved for the modification log.
  1135. Space for the end marker of the modification log is not
  1136. included in avail_out. */
  1137. memset(c_stream.next_out, 0, c_stream.avail_out + 1/* end marker */);
  1138. #ifdef UNIV_DEBUG
  1139. page_zip->m_start =
  1140. #endif /* UNIV_DEBUG */
  1141. page_zip->m_end = PAGE_DATA + c_stream.total_out;
  1142. page_zip->m_nonempty = FALSE;
  1143. page_zip->n_blobs = n_blobs;
  1144. /* Copy those header fields that will not be written
  1145. in buf_flush_init_for_writing() */
  1146. memcpy(page_zip->data + FIL_PAGE_PREV, page + FIL_PAGE_PREV,
  1147. FIL_PAGE_LSN - FIL_PAGE_PREV);
  1148. memcpy(page_zip->data + FIL_PAGE_TYPE, page + FIL_PAGE_TYPE, 2);
  1149. memcpy(page_zip->data + FIL_PAGE_DATA, page + FIL_PAGE_DATA,
  1150. PAGE_DATA - FIL_PAGE_DATA);
  1151. /* Copy the rest of the compressed page */
  1152. memcpy(page_zip->data + PAGE_DATA, buf,
  1153. page_zip_get_size(page_zip) - PAGE_DATA);
  1154. mem_heap_free(heap);
  1155. #ifdef UNIV_ZIP_DEBUG
  1156. ut_a(page_zip_validate(page_zip, page));
  1157. #endif /* UNIV_ZIP_DEBUG */
  1158. if (mtr) {
  1159. page_zip_compress_write_log(page_zip, page, index, mtr);
  1160. }
  1161. UNIV_MEM_ASSERT_RW(page_zip->data, page_zip_get_size(page_zip));
  1162. #ifdef PAGE_ZIP_COMPRESS_DBG
  1163. if (logfile) {
  1164. /* Record the compressed size of the block. */
  1165. byte sz[4];
  1166. mach_write_to_4(sz, c_stream.total_out);
  1167. fseek(logfile, UNIV_PAGE_SIZE, SEEK_SET);
  1168. fwrite(sz, 1, sizeof sz, logfile);
  1169. fclose(logfile);
  1170. }
  1171. #endif /* PAGE_ZIP_COMPRESS_DBG */
  1172. {
  1173. page_zip_stat_t* zip_stat
  1174. = &page_zip_stat[page_zip->ssize - 1];
  1175. zip_stat->compressed_ok++;
  1176. zip_stat->compressed_usec += ut_time_us(NULL) - usec;
  1177. }
  1178. return(TRUE);
  1179. }
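A hedged usage sketch (the real call sites are in the B-tree code, e.g. page reorganization; the recovery action shown here is illustrative only):

	if (!page_zip_compress(page_zip, page, index, mtr)) {
		/* The data did not fit in page_zip_get_size(page_zip)
		bytes of compressed space. page_zip is left intact;
		the caller typically reorganizes or splits the page,
		or keeps operating on the uncompressed copy. */
	}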
  1180. /**************************************************************************
  1181. Compare two page directory entries. */
  1182. UNIV_INLINE
  1183. ibool
  1184. page_zip_dir_cmp(
  1185. /*=============*/
  1186. /* out: positive if rec1 > rec2 */
  1187. const rec_t* rec1, /* in: rec1 */
  1188. const rec_t* rec2) /* in: rec2 */
  1189. {
  1190. return(rec1 > rec2);
  1191. }
  1192. /**************************************************************************
  1193. Sort the dense page directory by address (heap_no). */
  1194. static
  1195. void
  1196. page_zip_dir_sort(
  1197. /*==============*/
  1198. rec_t** arr, /* in/out: dense page directory */
  1199. rec_t** aux_arr,/* in/out: work area */
  1200. ulint low, /* in: lower bound of the sorting area, inclusive */
  1201. ulint high) /* in: upper bound of the sorting area, exclusive */
  1202. {
  1203. UT_SORT_FUNCTION_BODY(page_zip_dir_sort, arr, aux_arr, low, high,
  1204. page_zip_dir_cmp);
  1205. }
  1206. /**************************************************************************
  1207. Deallocate the index information initialized by page_zip_fields_decode(). */
  1208. static
  1209. void
  1210. page_zip_fields_free(
  1211. /*=================*/
  1212. dict_index_t* index) /* in: dummy index to be freed */
  1213. {
  1214. if (index) {
  1215. dict_table_t* table = index->table;
  1216. mem_heap_free(index->heap);
  1217. mutex_free(&(table->autoinc_mutex));
  1218. mem_heap_free(table->heap);
  1219. }
  1220. }
  1221. /**************************************************************************
  1222. Read the index information for the compressed page. */
  1223. static
  1224. dict_index_t*
  1225. page_zip_fields_decode(
  1226. /*===================*/
  1227. /* out,own: dummy index describing the page,
  1228. or NULL on error */
  1229. const byte* buf, /* in: index information */
  1230. const byte* end, /* in: end of buf */
  1231. ulint* trx_id_col)/* in: NULL for non-leaf pages;
  1232. for leaf pages, pointer to where to store
  1233. the position of the trx_id column */
  1234. {
  1235. const byte* b;
  1236. ulint n;
  1237. ulint i;
  1238. ulint val;
  1239. dict_table_t* table;
  1240. dict_index_t* index;
  1241. /* Determine the number of fields. */
  1242. for (b = buf, n = 0; b < end; n++) {
  1243. if (*b++ & 0x80) {
  1244. b++; /* skip the second byte */
  1245. }
  1246. }
  1247. n--; /* n_nullable or trx_id */
  1248. if (UNIV_UNLIKELY(n > REC_MAX_N_FIELDS)) {
  1249. page_zip_fail(("page_zip_fields_decode: n = %lu\n",
  1250. (ulong) n));
  1251. return(NULL);
  1252. }
  1253. if (UNIV_UNLIKELY(b > end)) {
  1254. page_zip_fail(("page_zip_fields_decode: %p > %p\n",
  1255. (const void*) b, (const void*) end));
  1256. return(NULL);
  1257. }
  1258. table = dict_mem_table_create("ZIP_DUMMY", DICT_HDR_SPACE, n,
  1259. DICT_TF_COMPACT);
  1260. index = dict_mem_index_create("ZIP_DUMMY", "ZIP_DUMMY",
  1261. DICT_HDR_SPACE, 0, n);
  1262. index->table = table;
  1263. index->n_uniq = n;
  1264. /* avoid ut_ad(index->cached) in dict_index_get_n_unique_in_tree */
  1265. index->cached = TRUE;
  1266. /* Initialize the fields. */
  1267. for (b = buf, i = 0; i < n; i++) {
  1268. ulint mtype;
  1269. ulint len;
  1270. val = *b++;
  1271. if (UNIV_UNLIKELY(val & 0x80)) {
  1272. /* fixed length > 62 bytes */
  1273. val = (val & 0x7f) << 8 | *b++;
  1274. len = val >> 1;
  1275. mtype = DATA_FIXBINARY;
  1276. } else if (UNIV_UNLIKELY(val >= 126)) {
  1277. /* variable length with max > 255 bytes */
  1278. len = 0x7fff;
  1279. mtype = DATA_BINARY;
  1280. } else if (val <= 1) {
  1281. /* variable length with max <= 255 bytes */
  1282. len = 0;
  1283. mtype = DATA_BINARY;
  1284. } else {
  1285. /* fixed length < 62 bytes */
  1286. len = val >> 1;
  1287. mtype = DATA_FIXBINARY;
  1288. }
  1289. dict_mem_table_add_col(table, NULL, NULL, mtype,
  1290. val & 1 ? DATA_NOT_NULL : 0, len);
  1291. dict_index_add_col(index, table,
  1292. dict_table_get_nth_col(table, i), 0);
  1293. }
  1294. val = *b++;
  1295. if (UNIV_UNLIKELY(val & 0x80)) {
  1296. val = (val & 0x7f) << 8 | *b++;
  1297. }
  1298. /* Decode the position of the trx_id column. */
  1299. if (trx_id_col) {
  1300. if (!val) {
  1301. val = ULINT_UNDEFINED;
  1302. } else if (UNIV_UNLIKELY(val >= n)) {
  1303. page_zip_fields_free(index);
  1304. index = NULL;
  1305. } else {
  1306. index->type = DICT_CLUSTERED;
  1307. }
  1308. *trx_id_col = val;
  1309. } else {
  1310. /* Decode the number of nullable fields. */
  1311. if (UNIV_UNLIKELY(index->n_nullable > val)) {
  1312. page_zip_fields_free(index);
  1313. index = NULL;
  1314. } else {
  1315. index->n_nullable = val;
  1316. }
  1317. }
  1318. ut_ad(b == end);
  1319. return(index);
  1320. }
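A worked example of the encoding parsed above, on an assumed byte sequence (illustration only). The low bit of each value is the NOT NULL flag; a set 0x80 bit in the first byte selects the two-byte form; the final value encodes either the trx_id position (leaf pages) or n_nullable (node-pointer pages).

static const unsigned char demo_fields[] = {
	0x09,		/* 4 << 1 | 1: fixed length 4 bytes, NOT NULL */
	0x00,		/* variable length, at most 255 bytes, NULLable */
	0x81, 0x41,	/* two-byte form: val = (0x01 << 8) | 0x41
			   = 0x141; len = 0x141 >> 1 = 160 bytes fixed,
			   NOT NULL (fixed lengths > 62 take 2 bytes) */
	0x02		/* trailing value: on a clustered-index leaf
			   page, DB_TRX_ID is column 2 (0 would mean
			   no trx_id column); on node-pointer pages
			   this would be n_nullable */
};
/* The counting loop above sees 3 fields plus the trailing value. */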
  1321. /**************************************************************************
  1322. Populate the sparse page directory from the dense directory. */
  1323. static
  1324. ibool
  1325. page_zip_dir_decode(
  1326. /*================*/
  1327. /* out: TRUE on success,
  1328. FALSE on failure */
  1329. const page_zip_des_t* page_zip,/* in: dense page directory on
  1330. compressed page */
  1331. page_t* page, /* in: compact page with valid header;
  1332. out: trailer and sparse page directory
  1333. filled in */
  1334. rec_t** recs, /* out: dense page directory sorted by
  1335. ascending address (and heap_no) */
  1336. rec_t** recs_aux,/* in/out: scratch area */
  1337. ulint n_dense)/* in: number of user records, and
  1338. size of recs[] and recs_aux[] */
  1339. {
  1340. ulint i;
  1341. ulint n_recs;
  1342. byte* slot;
  1343. n_recs = page_get_n_recs(page);
  1344. if (UNIV_UNLIKELY(n_recs > n_dense)) {
  1345. page_zip_fail(("page_zip_dir_decode 1: %lu > %lu\n",
  1346. (ulong) n_recs, (ulong) n_dense));
  1347. return(FALSE);
  1348. }
  1349. /* Traverse the list of stored records in the sorting order,
  1350. starting from the first user record. */
  1351. slot = page + (UNIV_PAGE_SIZE - PAGE_DIR - PAGE_DIR_SLOT_SIZE);
  1352. UNIV_PREFETCH_RW(slot);
  1353. /* Zero out the page trailer. */
  1354. memset(slot + PAGE_DIR_SLOT_SIZE, 0, PAGE_DIR);
  1355. mach_write_to_2(slot, PAGE_NEW_INFIMUM);
  1356. slot -= PAGE_DIR_SLOT_SIZE;
  1357. UNIV_PREFETCH_RW(slot);
  1358. /* Initialize the sparse directory and copy the dense directory. */
  1359. for (i = 0; i < n_recs; i++) {
  1360. ulint offs = page_zip_dir_get(page_zip, i);
  1361. if (offs & PAGE_ZIP_DIR_SLOT_OWNED) {
  1362. mach_write_to_2(slot, offs & PAGE_ZIP_DIR_SLOT_MASK);
  1363. slot -= PAGE_DIR_SLOT_SIZE;
  1364. UNIV_PREFETCH_RW(slot);
  1365. }
  1366. if (UNIV_UNLIKELY((offs & PAGE_ZIP_DIR_SLOT_MASK)
  1367. < PAGE_ZIP_START + REC_N_NEW_EXTRA_BYTES)) {
  1368. page_zip_fail(("page_zip_dir_decode 2: %u %u %lx\n",
  1369. (unsigned) i, (unsigned) n_recs,
  1370. (ulong) offs));
  1371. return(FALSE);
  1372. }
  1373. recs[i] = page + (offs & PAGE_ZIP_DIR_SLOT_MASK);
  1374. }
  1375. mach_write_to_2(slot, PAGE_NEW_SUPREMUM);
  1376. {
  1377. const page_dir_slot_t* last_slot = page_dir_get_nth_slot(
  1378. page, page_dir_get_n_slots(page) - 1);
  1379. if (UNIV_UNLIKELY(slot != last_slot)) {
  1380. page_zip_fail(("page_zip_dir_decode 3: %p != %p\n",
  1381. (const void*) slot,
  1382. (const void*) last_slot));
  1383. return(FALSE);
  1384. }
  1385. }
  1386. /* Copy the rest of the dense directory. */
  1387. for (; i < n_dense; i++) {
  1388. ulint offs = page_zip_dir_get(page_zip, i);
  1389. if (UNIV_UNLIKELY(offs & ~PAGE_ZIP_DIR_SLOT_MASK)) {
  1390. page_zip_fail(("page_zip_dir_decode 4: %u %u %lx\n",
  1391. (unsigned) i, (unsigned) n_dense,
  1392. (ulong) offs));
  1393. return(FALSE);
  1394. }
  1395. recs[i] = page + offs;
  1396. }
  1397. if (UNIV_LIKELY(n_dense > 1)) {
  1398. page_zip_dir_sort(recs, recs_aux, 0, n_dense);
  1399. }
  1400. return(TRUE);
  1401. }
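For reference, the flag bits tested above live in the top bits of each 16-bit dense directory entry (values as defined in page0zip.ic, restated here as an assumption): PAGE_ZIP_DIR_SLOT_MASK = 0x3fff, PAGE_ZIP_DIR_SLOT_OWNED = 0x4000, PAGE_ZIP_DIR_SLOT_DEL = 0x8000. A small decoding example:

/* An assumed entry 0x4063: record at page offset 0x63, owning a
sparse directory slot, not on the free list. */
static void
demo_decode_dir_slot(void)
{
	unsigned offs = 0x4063;
	unsigned rec_offset = offs & 0x3fff;	/* 0x63 */
	int owns_slot = !!(offs & 0x4000);	/* 1 */
	int deleted = !!(offs & 0x8000);	/* 0 */

	(void) rec_offset; (void) owns_slot; (void) deleted;
}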
  1402. /**************************************************************************
  1403. Initialize the REC_N_NEW_EXTRA_BYTES of each record. */
  1404. static
  1405. ibool
  1406. page_zip_set_extra_bytes(
  1407. /*=====================*/
  1408. /* out: TRUE on success,
  1409. FALSE on failure */
  1410. const page_zip_des_t* page_zip,/* in: compressed page */
  1411. page_t* page, /* in/out: uncompressed page */
  1412. ulint info_bits)/* in: REC_INFO_MIN_REC_FLAG or 0 */
  1413. {
  1414. ulint n;
  1415. ulint i;
  1416. ulint n_owned = 1;
  1417. ulint offs;
  1418. rec_t* rec;
  1419. n = page_get_n_recs(page);
  1420. rec = page + PAGE_NEW_INFIMUM;
  1421. for (i = 0; i < n; i++) {
  1422. offs = page_zip_dir_get(page_zip, i);
  1423. if (UNIV_UNLIKELY(offs & PAGE_ZIP_DIR_SLOT_DEL)) {
  1424. info_bits |= REC_INFO_DELETED_FLAG;
  1425. }
  1426. if (UNIV_UNLIKELY(offs & PAGE_ZIP_DIR_SLOT_OWNED)) {
  1427. info_bits |= n_owned;
  1428. n_owned = 1;
  1429. } else {
  1430. n_owned++;
  1431. }
  1432. offs &= PAGE_ZIP_DIR_SLOT_MASK;
  1433. if (UNIV_UNLIKELY(offs < PAGE_ZIP_START
  1434. + REC_N_NEW_EXTRA_BYTES)) {
  1435. page_zip_fail(("page_zip_set_extra_bytes 1:"
  1436. " %u %u %lx\n",
  1437. (unsigned) i, (unsigned) n,
  1438. (ulong) offs));
  1439. return(FALSE);
  1440. }
  1441. rec_set_next_offs_new(rec, offs);
  1442. rec = page + offs;
  1443. rec[-REC_N_NEW_EXTRA_BYTES] = (byte) info_bits;
  1444. info_bits = 0;
  1445. }
  1446. /* Set the next pointer of the last user record. */
  1447. rec_set_next_offs_new(rec, PAGE_NEW_SUPREMUM);
  1448. /* Set n_owned of the supremum record. */
  1449. page[PAGE_NEW_SUPREMUM - REC_N_NEW_EXTRA_BYTES] = (byte) n_owned;
  1450. /* The dense directory excludes the infimum and supremum records. */
  1451. n = page_dir_get_n_heap(page) - PAGE_HEAP_NO_USER_LOW;
  1452. if (i >= n) {
  1453. if (UNIV_LIKELY(i == n)) {
  1454. return(TRUE);
  1455. }
  1456. page_zip_fail(("page_zip_set_extra_bytes 2: %u != %u\n",
  1457. (unsigned) i, (unsigned) n));
  1458. return(FALSE);
  1459. }
  1460. offs = page_zip_dir_get(page_zip, i);
  1461. /* Set the extra bytes of deleted records on the free list. */
  1462. for (;;) {
  1463. if (UNIV_UNLIKELY(!offs)
  1464. || UNIV_UNLIKELY(offs & ~PAGE_ZIP_DIR_SLOT_MASK)) {
  1465. page_zip_fail(("page_zip_set_extra_bytes 3: %lx\n",
  1466. (ulong) offs));
  1467. return(FALSE);
  1468. }
  1469. rec = page + offs;
  1470. rec[-REC_N_NEW_EXTRA_BYTES] = 0; /* info_bits and n_owned */
  1471. if (++i == n) {
  1472. break;
  1473. }
  1474. offs = page_zip_dir_get(page_zip, i);
  1475. rec_set_next_offs_new(rec, offs);
  1476. }
  1477. /* Terminate the free list. */
  1478. rec[-REC_N_NEW_EXTRA_BYTES] = 0; /* info_bits and n_owned */
  1479. rec_set_next_offs_new(rec, 0);
  1480. return(TRUE);
  1481. }
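The byte written at rec[-REC_N_NEW_EXTRA_BYTES] above packs the record's info bits (high nibble) and n_owned (low nibble). A small assumed example, with the flag values from rem0rec.h (REC_INFO_DELETED_FLAG = 0x20, REC_INFO_MIN_REC_FLAG = 0x10):

/* Extra-bytes value for a delete-marked record that owns 3 records
in its sparse directory slot: info_bits | n_owned. */
static const unsigned char demo_extra_byte = 0x20 | 3;	/* 0x23 */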
  1482. /**************************************************************************
  1483. Apply the modification log to a record containing externally stored
  1484. columns. Do not copy the fields that are stored separately. */
  1485. static
  1486. const byte*
  1487. page_zip_apply_log_ext(
  1488. /*===================*/
  1489. /* out: pointer to modification log,
  1490. or NULL on failure */
  1491. rec_t* rec, /* in/out: record */
  1492. const ulint* offsets, /* in: rec_get_offsets(rec) */
1493. ulint trx_id_col, /* in: position of DB_TRX_ID */
  1494. const byte* data, /* in: modification log */
  1495. const byte* end) /* in: end of modification log */
  1496. {
  1497. ulint i;
  1498. ulint len;
  1499. byte* next_out = rec;
  1500. /* Check if there are any externally stored columns.
  1501. For each externally stored column, skip the
  1502. BTR_EXTERN_FIELD_REF. */
  1503. for (i = 0; i < rec_offs_n_fields(offsets); i++) {
  1504. byte* dst;
  1505. if (UNIV_UNLIKELY(i == trx_id_col)) {
  1506. /* Skip trx_id and roll_ptr */
  1507. dst = rec_get_nth_field(rec, offsets,
  1508. i, &len);
  1509. if (UNIV_UNLIKELY(dst - next_out >= end - data)
  1510. || UNIV_UNLIKELY
  1511. (len < (DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN))
  1512. || rec_offs_nth_extern(offsets, i)) {
  1513. page_zip_fail(("page_zip_apply_log_ext:"
  1514. " trx_id len %lu,"
  1515. " %p - %p >= %p - %p\n",
  1516. (ulong) len,
  1517. (const void*) dst,
  1518. (const void*) next_out,
  1519. (const void*) end,
  1520. (const void*) data));
  1521. return(NULL);
  1522. }
  1523. memcpy(next_out, data, dst - next_out);
  1524. data += dst - next_out;
  1525. next_out = dst + (DATA_TRX_ID_LEN
  1526. + DATA_ROLL_PTR_LEN);
  1527. } else if (rec_offs_nth_extern(offsets, i)) {
  1528. dst = rec_get_nth_field(rec, offsets,
  1529. i, &len);
  1530. ut_ad(len
  1531. >= BTR_EXTERN_FIELD_REF_SIZE);
  1532. len += dst - next_out
  1533. - BTR_EXTERN_FIELD_REF_SIZE;
  1534. if (UNIV_UNLIKELY(data + len >= end)) {
  1535. page_zip_fail(("page_zip_apply_log_ext: "
  1536. "ext %p+%lu >= %p\n",
  1537. (const void*) data,
  1538. (ulong) len,
  1539. (const void*) end));
  1540. return(NULL);
  1541. }
  1542. memcpy(next_out, data, len);
  1543. data += len;
  1544. next_out += len
  1545. + BTR_EXTERN_FIELD_REF_SIZE;
  1546. }
  1547. }
  1548. /* Copy the last bytes of the record. */
  1549. len = rec_get_end(rec, offsets) - next_out;
  1550. if (UNIV_UNLIKELY(data + len >= end)) {
  1551. page_zip_fail(("page_zip_apply_log_ext: "
  1552. "last %p+%lu >= %p\n",
  1553. (const void*) data,
  1554. (ulong) len,
  1555. (const void*) end));
  1556. return(NULL);
  1557. }
  1558. memcpy(next_out, data, len);
  1559. data += len;
  1560. return(data);
  1561. }
  1562. /**************************************************************************
  1563. Apply the modification log to an uncompressed page.
  1564. Do not copy the fields that are stored separately. */
  1565. static
  1566. const byte*
  1567. page_zip_apply_log(
  1568. /*===============*/
  1569. /* out: pointer to end of modification log,
  1570. or NULL on failure */
  1571. const byte* data, /* in: modification log */
  1572. ulint size, /* in: maximum length of the log, in bytes */
  1573. rec_t** recs, /* in: dense page directory,
  1574. sorted by address (indexed by
  1575. heap_no - PAGE_HEAP_NO_USER_LOW) */
  1576. ulint n_dense,/* in: size of recs[] */
  1577. ulint trx_id_col,/* in: column number of trx_id in the index,
  1578. or ULINT_UNDEFINED if none */
  1579. ulint heap_status,
  1580. /* in: heap_no and status bits for
  1581. the next record to uncompress */
  1582. dict_index_t* index, /* in: index of the page */
  1583. ulint* offsets)/* in/out: work area for
  1584. rec_get_offsets_reverse() */
  1585. {
  1586. const byte* const end = data + size;
  1587. for (;;) {
  1588. ulint val;
  1589. rec_t* rec;
  1590. ulint len;
  1591. ulint hs;
  1592. val = *data++;
  1593. if (UNIV_UNLIKELY(!val)) {
  1594. return(data - 1);
  1595. }
  1596. if (val & 0x80) {
  1597. val = (val & 0x7f) << 8 | *data++;
  1598. if (UNIV_UNLIKELY(!val)) {
  1599. page_zip_fail(("page_zip_apply_log:"
  1600. " invalid val %x%x\n",
  1601. data[-2], data[-1]));
  1602. return(NULL);
  1603. }
  1604. }
  1605. if (UNIV_UNLIKELY(data >= end)) {
  1606. page_zip_fail(("page_zip_apply_log: %p >= %p\n",
  1607. (const void*) data,
  1608. (const void*) end));
  1609. return(NULL);
  1610. }
  1611. if (UNIV_UNLIKELY((val >> 1) > n_dense)) {
  1612. page_zip_fail(("page_zip_apply_log: %lu>>1 > %lu\n",
  1613. (ulong) val, (ulong) n_dense));
  1614. return(NULL);
  1615. }
  1616. /* Determine the heap number and status bits of the record. */
  1617. rec = recs[(val >> 1) - 1];
  1618. hs = ((val >> 1) + 1) << REC_HEAP_NO_SHIFT;
  1619. hs |= heap_status & ((1 << REC_HEAP_NO_SHIFT) - 1);
  1620. /* This may either be an old record that is being
  1621. overwritten (updated in place, or allocated from
  1622. the free list), or a new record, with the next
1623. available heap_no. */
  1624. if (UNIV_UNLIKELY(hs > heap_status)) {
  1625. page_zip_fail(("page_zip_apply_log: %lu > %lu\n",
  1626. (ulong) hs, (ulong) heap_status));
  1627. return(NULL);
  1628. } else if (hs == heap_status) {
  1629. /* A new record was allocated from the heap. */
  1630. if (UNIV_UNLIKELY(val & 1)) {
  1631. /* Only existing records may be cleared. */
  1632. page_zip_fail(("page_zip_apply_log:"
  1633. " attempting to create"
  1634. " deleted rec %lu\n",
  1635. (ulong) hs));
  1636. return(NULL);
  1637. }
  1638. heap_status += 1 << REC_HEAP_NO_SHIFT;
  1639. }
  1640. mach_write_to_2(rec - REC_NEW_HEAP_NO, hs);
  1641. if (val & 1) {
  1642. /* Clear the data bytes of the record. */
  1643. mem_heap_t* heap = NULL;
  1644. ulint* offs;
  1645. offs = rec_get_offsets(rec, index, offsets,
  1646. ULINT_UNDEFINED, &heap);
  1647. memset(rec, 0, rec_offs_data_size(offs));
  1648. if (UNIV_LIKELY_NULL(heap)) {
  1649. mem_heap_free(heap);
  1650. }
  1651. continue;
  1652. }
  1653. #if REC_STATUS_NODE_PTR != TRUE
  1654. # error "REC_STATUS_NODE_PTR != TRUE"
  1655. #endif
  1656. rec_get_offsets_reverse(data, index,
  1657. hs & REC_STATUS_NODE_PTR,
  1658. offsets);
  1659. rec_offs_make_valid(rec, index, offsets);
  1660. /* Copy the extra bytes (backwards). */
  1661. {
  1662. byte* start = rec_get_start(rec, offsets);
  1663. byte* b = rec - REC_N_NEW_EXTRA_BYTES;
  1664. while (b != start) {
  1665. *--b = *data++;
  1666. }
  1667. }
  1668. /* Copy the data bytes. */
  1669. if (UNIV_UNLIKELY(rec_offs_any_extern(offsets))) {
  1670. /* Non-leaf nodes should not contain any
  1671. externally stored columns. */
  1672. if (UNIV_UNLIKELY(hs & REC_STATUS_NODE_PTR)) {
  1673. page_zip_fail(("page_zip_apply_log: "
  1674. "%lu&REC_STATUS_NODE_PTR\n",
  1675. (ulong) hs));
  1676. return(NULL);
  1677. }
  1678. data = page_zip_apply_log_ext(
  1679. rec, offsets, trx_id_col, data, end);
  1680. if (UNIV_UNLIKELY(!data)) {
  1681. return(NULL);
  1682. }
  1683. } else if (UNIV_UNLIKELY(hs & REC_STATUS_NODE_PTR)) {
  1684. len = rec_offs_data_size(offsets)
  1685. - REC_NODE_PTR_SIZE;
  1686. /* Copy the data bytes, except node_ptr. */
  1687. if (UNIV_UNLIKELY(data + len >= end)) {
  1688. page_zip_fail(("page_zip_apply_log: "
  1689. "node_ptr %p+%lu >= %p\n",
  1690. (const void*) data,
  1691. (ulong) len,
  1692. (const void*) end));
  1693. return(NULL);
  1694. }
  1695. memcpy(rec, data, len);
  1696. data += len;
  1697. } else if (UNIV_LIKELY(trx_id_col == ULINT_UNDEFINED)) {
  1698. len = rec_offs_data_size(offsets);
  1699. /* Copy all data bytes of
  1700. a record in a secondary index. */
  1701. if (UNIV_UNLIKELY(data + len >= end)) {
  1702. page_zip_fail(("page_zip_apply_log: "
  1703. "sec %p+%lu >= %p\n",
  1704. (const void*) data,
  1705. (ulong) len,
  1706. (const void*) end));
  1707. return(NULL);
  1708. }
  1709. memcpy(rec, data, len);
  1710. data += len;
  1711. } else {
  1712. /* Skip DB_TRX_ID and DB_ROLL_PTR. */
  1713. ulint l = rec_get_nth_field_offs(offsets,
  1714. trx_id_col, &len);
  1715. byte* b;
  1716. if (UNIV_UNLIKELY(data + l >= end)
  1717. || UNIV_UNLIKELY(len < (DATA_TRX_ID_LEN
  1718. + DATA_ROLL_PTR_LEN))) {
  1719. page_zip_fail(("page_zip_apply_log: "
  1720. "trx_id %p+%lu >= %p\n",
  1721. (const void*) data,
  1722. (ulong) l,
  1723. (const void*) end));
  1724. return(NULL);
  1725. }
  1726. /* Copy any preceding data bytes. */
  1727. memcpy(rec, data, l);
  1728. data += l;
  1729. /* Copy any bytes following DB_TRX_ID, DB_ROLL_PTR. */
  1730. b = rec + l + (DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
  1731. len = rec_get_end(rec, offsets) - b;
  1732. if (UNIV_UNLIKELY(data + len >= end)) {
  1733. page_zip_fail(("page_zip_apply_log: "
  1734. "clust %p+%lu >= %p\n",
  1735. (const void*) data,
  1736. (ulong) len,
  1737. (const void*) end));
  1738. return(NULL);
  1739. }
  1740. memcpy(b, data, len);
  1741. data += len;
  1742. }
  1743. }
  1744. }
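A worked example of the log format parsed above, on assumed bytes. Each entry begins with a 1- or 2-byte value (a set 0x80 bit in the first byte selects the 2-byte form); val >> 1 selects the dense directory slot (val >> 1 equals heap_no - 1), and bit 0 distinguishes "clear the record" from "record contents follow":

static const unsigned char demo_mod_log[] = {
	0x04,	/* val = 4: val >> 1 = 2, heap_no 3 (dense slot 1);
		   bit 0 clear: the record's extra bytes (written
		   backwards) and data bytes follow */
	/* ... record bytes would follow here ... */
	0x05,	/* val = 5: heap_no 3 again; bit 0 set: zero out the
		   record's data bytes (a cleared, deleted record) */
	0x00	/* terminator: end of the modification log */
};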
  1745. /**************************************************************************
  1746. Decompress the records of a node pointer page. */
  1747. static
  1748. ibool
  1749. page_zip_decompress_node_ptrs(
  1750. /*==========================*/
  1751. /* out: TRUE on success,
  1752. FALSE on failure */
  1753. page_zip_des_t* page_zip, /* in/out: compressed page */
  1754. z_stream* d_stream, /* in/out: compressed page stream */
  1755. rec_t** recs, /* in: dense page directory
  1756. sorted by address */
  1757. ulint n_dense, /* in: size of recs[] */
  1758. dict_index_t* index, /* in: the index of the page */
  1759. ulint* offsets, /* in/out: temporary offsets */
  1760. mem_heap_t* heap) /* in: temporary memory heap */
  1761. {
  1762. ulint heap_status = REC_STATUS_NODE_PTR
  1763. | PAGE_HEAP_NO_USER_LOW << REC_HEAP_NO_SHIFT;
  1764. ulint slot;
  1765. const byte* storage;
  1766. /* Subtract the space reserved for uncompressed data. */
  1767. d_stream->avail_in -= n_dense
  1768. * (PAGE_ZIP_DIR_SLOT_SIZE + REC_NODE_PTR_SIZE);
  1769. /* Decompress the records in heap_no order. */
  1770. for (slot = 0; slot < n_dense; slot++) {
  1771. rec_t* rec = recs[slot];
  1772. d_stream->avail_out = rec - REC_N_NEW_EXTRA_BYTES
  1773. - d_stream->next_out;
  1774. ut_ad(d_stream->avail_out < UNIV_PAGE_SIZE
  1775. - PAGE_ZIP_START - PAGE_DIR);
  1776. switch (inflate(d_stream, Z_SYNC_FLUSH)) {
  1777. case Z_STREAM_END:
  1778. /* Apparently, n_dense has grown
  1779. since the time the page was last compressed. */
  1780. goto zlib_done;
  1781. case Z_OK:
  1782. case Z_BUF_ERROR:
  1783. if (!d_stream->avail_out) {
  1784. break;
  1785. }
  1786. /* fall through */
  1787. default:
  1788. page_zip_fail(("page_zip_decompress_node_ptrs:"
  1789. " 1 inflate(Z_SYNC_FLUSH)=%s\n",
  1790. d_stream->msg));
  1791. goto zlib_error;
  1792. }
  1793. ut_ad(d_stream->next_out == rec - REC_N_NEW_EXTRA_BYTES);
  1794. /* Prepare to decompress the data bytes. */
  1795. d_stream->next_out = rec;
  1796. /* Set heap_no and the status bits. */
  1797. mach_write_to_2(rec - REC_NEW_HEAP_NO, heap_status);
  1798. heap_status += 1 << REC_HEAP_NO_SHIFT;
  1799. /* Read the offsets. The status bits are needed here. */
  1800. offsets = rec_get_offsets(rec, index, offsets,
  1801. ULINT_UNDEFINED, &heap);
  1802. /* Non-leaf nodes should not have any externally
  1803. stored columns. */
  1804. ut_ad(!rec_offs_any_extern(offsets));
  1805. /* Decompress the data bytes, except node_ptr. */
  1806. d_stream->avail_out = rec_offs_data_size(offsets)
  1807. - REC_NODE_PTR_SIZE;
  1808. switch (inflate(d_stream, Z_SYNC_FLUSH)) {
  1809. case Z_STREAM_END:
  1810. goto zlib_done;
  1811. case Z_OK:
  1812. case Z_BUF_ERROR:
  1813. if (!d_stream->avail_out) {
  1814. break;
  1815. }
  1816. /* fall through */
  1817. default:
  1818. page_zip_fail(("page_zip_decompress_node_ptrs:"
  1819. " 2 inflate(Z_SYNC_FLUSH)=%s\n",
  1820. d_stream->msg));
  1821. goto zlib_error;
  1822. }
  1823. /* Clear the node pointer in case the record
  1824. will be deleted and the space will be reallocated
  1825. to a smaller record. */
  1826. memset(d_stream->next_out, 0, REC_NODE_PTR_SIZE);
  1827. d_stream->next_out += REC_NODE_PTR_SIZE;
  1828. ut_ad(d_stream->next_out == rec_get_end(rec, offsets));
  1829. }
  1830. /* Decompress any trailing garbage, in case the last record was
  1831. allocated from an originally longer space on the free list. */
  1832. d_stream->avail_out = page_header_get_field(page_zip->data,
  1833. PAGE_HEAP_TOP)
  1834. - page_offset(d_stream->next_out);
  1835. if (UNIV_UNLIKELY(d_stream->avail_out > UNIV_PAGE_SIZE
  1836. - PAGE_ZIP_START - PAGE_DIR)) {
  1837. page_zip_fail(("page_zip_decompress_node_ptrs:"
  1838. " avail_out = %u\n",
  1839. d_stream->avail_out));
  1840. goto zlib_error;
  1841. }
  1842. if (UNIV_UNLIKELY(inflate(d_stream, Z_FINISH) != Z_STREAM_END)) {
  1843. page_zip_fail(("page_zip_decompress_node_ptrs:"
  1844. " inflate(Z_FINISH)=%s\n",
  1845. d_stream->msg));
  1846. zlib_error:
  1847. inflateEnd(d_stream);
  1848. return(FALSE);
  1849. }
  1850. /* Note that d_stream->avail_out > 0 may hold here
  1851. if the modification log is nonempty. */
  1852. zlib_done:
  1853. if (UNIV_UNLIKELY(inflateEnd(d_stream) != Z_OK)) {
  1854. ut_error;
  1855. }
  1856. {
  1857. page_t* page = page_align(d_stream->next_out);
  1858. /* Clear the unused heap space on the uncompressed page. */
  1859. memset(d_stream->next_out, 0,
  1860. page_dir_get_nth_slot(page,
  1861. page_dir_get_n_slots(page) - 1)
  1862. - d_stream->next_out);
  1863. }
  1864. #ifdef UNIV_DEBUG
  1865. page_zip->m_start = PAGE_DATA + d_stream->total_in;
  1866. #endif /* UNIV_DEBUG */
  1867. /* Apply the modification log. */
  1868. {
  1869. const byte* mod_log_ptr;
  1870. mod_log_ptr = page_zip_apply_log(d_stream->next_in,
  1871. d_stream->avail_in + 1,
  1872. recs, n_dense,
  1873. ULINT_UNDEFINED, heap_status,
  1874. index, offsets);
  1875. if (UNIV_UNLIKELY(!mod_log_ptr)) {
  1876. return(FALSE);
  1877. }
  1878. page_zip->m_end = mod_log_ptr - page_zip->data;
  1879. page_zip->m_nonempty = mod_log_ptr != d_stream->next_in;
  1880. }
  1881. if (UNIV_UNLIKELY
  1882. (page_zip_get_trailer_len(page_zip,
  1883. dict_index_is_clust(index), NULL)
  1884. + page_zip->m_end >= page_zip_get_size(page_zip))) {
  1885. page_zip_fail(("page_zip_decompress_node_ptrs:"
  1886. " %lu + %lu >= %lu, %lu\n",
  1887. (ulong) page_zip_get_trailer_len(
  1888. page_zip, dict_index_is_clust(index),
  1889. NULL),
  1890. (ulong) page_zip->m_end,
  1891. (ulong) page_zip_get_size(page_zip),
  1892. (ulong) dict_index_is_clust(index)));
  1893. return(FALSE);
  1894. }
  1895. /* Restore the uncompressed columns in heap_no order. */
  1896. storage = page_zip->data + page_zip_get_size(page_zip)
  1897. - n_dense * PAGE_ZIP_DIR_SLOT_SIZE;
  1898. for (slot = 0; slot < n_dense; slot++) {
  1899. rec_t* rec = recs[slot];
  1900. offsets = rec_get_offsets(rec, index, offsets,
  1901. ULINT_UNDEFINED, &heap);
  1902. /* Non-leaf nodes should not have any externally
  1903. stored columns. */
  1904. ut_ad(!rec_offs_any_extern(offsets));
  1905. storage -= REC_NODE_PTR_SIZE;
  1906. memcpy(rec_get_end(rec, offsets) - REC_NODE_PTR_SIZE,
  1907. storage, REC_NODE_PTR_SIZE);
  1908. }
  1909. return(TRUE);
  1910. }
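For reference, the heap_status value threaded through the loop above packs the record status bits into the low REC_HEAP_NO_SHIFT bits and the heap number above them. Assuming the rem0rec.h constants REC_HEAP_NO_SHIFT = 3, REC_STATUS_NODE_PTR = 1 and PAGE_HEAP_NO_USER_LOW = 2, the first user record of a node-pointer page gets:

/* heap_status = REC_STATUS_NODE_PTR
		 | (PAGE_HEAP_NO_USER_LOW << REC_HEAP_NO_SHIFT)
	       = 1 | (2 << 3) = 0x11;
   each subsequent record adds 1 << REC_HEAP_NO_SHIFT = 8, which
   increments the heap number and leaves the status bits intact. */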
  1911. /**************************************************************************
  1912. Decompress the records of a leaf node of a secondary index. */
  1913. static
  1914. ibool
  1915. page_zip_decompress_sec(
  1916. /*====================*/
  1917. /* out: TRUE on success,
  1918. FALSE on failure */
  1919. page_zip_des_t* page_zip, /* in/out: compressed page */
  1920. z_stream* d_stream, /* in/out: compressed page stream */
  1921. rec_t** recs, /* in: dense page directory
  1922. sorted by address */
  1923. ulint n_dense, /* in: size of recs[] */
  1924. dict_index_t* index, /* in: the index of the page */
  1925. ulint* offsets) /* in/out: temporary offsets */
  1926. {
  1927. ulint heap_status = REC_STATUS_ORDINARY
  1928. | PAGE_HEAP_NO_USER_LOW << REC_HEAP_NO_SHIFT;
  1929. ulint slot;
  1930. ut_a(!dict_index_is_clust(index));
  1931. /* Subtract the space reserved for uncompressed data. */
  1932. d_stream->avail_in -= n_dense * PAGE_ZIP_DIR_SLOT_SIZE;
  1933. for (slot = 0; slot < n_dense; slot++) {
  1934. rec_t* rec = recs[slot];
  1935. /* Decompress everything up to this record. */
  1936. d_stream->avail_out = rec - REC_N_NEW_EXTRA_BYTES
  1937. - d_stream->next_out;
  1938. if (UNIV_LIKELY(d_stream->avail_out)) {
  1939. switch (inflate(d_stream, Z_SYNC_FLUSH)) {
  1940. case Z_STREAM_END:
  1941. /* Apparently, n_dense has grown
  1942. since the time the page was last compressed. */
  1943. goto zlib_done;
  1944. case Z_OK:
  1945. case Z_BUF_ERROR:
  1946. if (!d_stream->avail_out) {
  1947. break;
  1948. }
  1949. /* fall through */
  1950. default:
  1951. page_zip_fail(("page_zip_decompress_sec:"
  1952. " inflate(Z_SYNC_FLUSH)=%s\n",
  1953. d_stream->msg));
  1954. goto zlib_error;
  1955. }
  1956. }
  1957. ut_ad(d_stream->next_out == rec - REC_N_NEW_EXTRA_BYTES);
  1958. /* Skip the REC_N_NEW_EXTRA_BYTES. */
  1959. d_stream->next_out = rec;
  1960. /* Set heap_no and the status bits. */
  1961. mach_write_to_2(rec - REC_NEW_HEAP_NO, heap_status);
  1962. heap_status += 1 << REC_HEAP_NO_SHIFT;
  1963. }
  1964. /* Decompress the data of the last record and any trailing garbage,
  1965. in case the last record was allocated from an originally longer space
  1966. on the free list. */
  1967. d_stream->avail_out = page_header_get_field(page_zip->data,
  1968. PAGE_HEAP_TOP)
  1969. - page_offset(d_stream->next_out);
  1970. if (UNIV_UNLIKELY(d_stream->avail_out > UNIV_PAGE_SIZE
  1971. - PAGE_ZIP_START - PAGE_DIR)) {
  1972. page_zip_fail(("page_zip_decompress_sec:"
  1973. " avail_out = %u\n",
  1974. d_stream->avail_out));
  1975. goto zlib_error;
  1976. }
  1977. if (UNIV_UNLIKELY(inflate(d_stream, Z_FINISH) != Z_STREAM_END)) {
  1978. page_zip_fail(("page_zip_decompress_sec:"
  1979. " inflate(Z_FINISH)=%s\n",
  1980. d_stream->msg));
  1981. zlib_error:
  1982. inflateEnd(d_stream);
  1983. return(FALSE);
  1984. }
  1985. /* Note that d_stream->avail_out > 0 may hold here
  1986. if the modification log is nonempty. */
  1987. zlib_done:
  1988. if (UNIV_UNLIKELY(inflateEnd(d_stream) != Z_OK)) {
  1989. ut_error;
  1990. }
  1991. {
  1992. page_t* page = page_align(d_stream->next_out);
  1993. /* Clear the unused heap space on the uncompressed page. */
  1994. memset(d_stream->next_out, 0,
  1995. page_dir_get_nth_slot(page,
  1996. page_dir_get_n_slots(page) - 1)
  1997. - d_stream->next_out);
  1998. }
  1999. #ifdef UNIV_DEBUG
  2000. page_zip->m_start = PAGE_DATA + d_stream->total_in;
  2001. #endif /* UNIV_DEBUG */
  2002. /* Apply the modification log. */
  2003. {
  2004. const byte* mod_log_ptr;
  2005. mod_log_ptr = page_zip_apply_log(d_stream->next_in,
  2006. d_stream->avail_in + 1,
  2007. recs, n_dense,
  2008. ULINT_UNDEFINED, heap_status,
  2009. index, offsets);
  2010. if (UNIV_UNLIKELY(!mod_log_ptr)) {
  2011. return(FALSE);
  2012. }
  2013. page_zip->m_end = mod_log_ptr - page_zip->data;
  2014. page_zip->m_nonempty = mod_log_ptr != d_stream->next_in;
  2015. }
  2016. if (UNIV_UNLIKELY(page_zip_get_trailer_len(page_zip, FALSE, NULL)
  2017. + page_zip->m_end >= page_zip_get_size(page_zip))) {
  2018. page_zip_fail(("page_zip_decompress_sec: %lu + %lu >= %lu\n",
  2019. (ulong) page_zip_get_trailer_len(
  2020. page_zip, FALSE, NULL),
  2021. (ulong) page_zip->m_end,
  2022. (ulong) page_zip_get_size(page_zip)));
  2023. return(FALSE);
  2024. }
  2025. /* There are no uncompressed columns on leaf pages of
  2026. secondary indexes. */
  2027. return(TRUE);
  2028. }
  2029. /**************************************************************************
  2030. Decompress a record of a leaf node of a clustered index that contains
  2031. externally stored columns. */
  2032. static
  2033. ibool
  2034. page_zip_decompress_clust_ext(
  2035. /*==========================*/
2036. /* out: TRUE on success, FALSE on failure */
  2037. z_stream* d_stream, /* in/out: compressed page stream */
  2038. rec_t* rec, /* in/out: record */
  2039. const ulint* offsets, /* in: rec_get_offsets(rec) */
2040. ulint trx_id_col) /* in: position of DB_TRX_ID */
  2041. {
  2042. ulint i;
  2043. for (i = 0; i < rec_offs_n_fields(offsets); i++) {
  2044. ulint len;
  2045. byte* dst;
  2046. if (UNIV_UNLIKELY(i == trx_id_col)) {
  2047. /* Skip trx_id and roll_ptr */
  2048. dst = rec_get_nth_field(rec, offsets, i, &len);
  2049. if (UNIV_UNLIKELY(len < DATA_TRX_ID_LEN
  2050. + DATA_ROLL_PTR_LEN)) {
  2051. page_zip_fail(("page_zip_decompress_clust_ext:"
  2052. " len[%lu] = %lu\n",
  2053. (ulong) i, (ulong) len));
  2054. return(FALSE);
  2055. }
  2056. if (rec_offs_nth_extern(offsets, i)) {
  2057. page_zip_fail(("page_zip_decompress_clust_ext:"
  2058. " DB_TRX_ID at %lu is ext\n",
  2059. (ulong) i));
  2060. return(FALSE);
  2061. }
  2062. d_stream->avail_out = dst - d_stream->next_out;
  2063. switch (inflate(d_stream, Z_SYNC_FLUSH)) {
  2064. case Z_STREAM_END:
  2065. case Z_OK:
  2066. case Z_BUF_ERROR:
  2067. if (!d_stream->avail_out) {
  2068. break;
  2069. }
  2070. /* fall through */
  2071. default:
  2072. page_zip_fail(("page_zip_decompress_clust_ext:"
  2073. " 1 inflate(Z_SYNC_FLUSH)=%s\n",
  2074. d_stream->msg));
  2075. return(FALSE);
  2076. }
  2077. ut_ad(d_stream->next_out == dst);
  2078. /* Clear DB_TRX_ID and DB_ROLL_PTR in order to
  2079. avoid uninitialized bytes in case the record
  2080. is affected by page_zip_apply_log(). */
  2081. memset(dst, 0, DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
  2082. d_stream->next_out += DATA_TRX_ID_LEN
  2083. + DATA_ROLL_PTR_LEN;
  2084. } else if (rec_offs_nth_extern(offsets, i)) {
  2085. dst = rec_get_nth_field(rec, offsets, i, &len);
  2086. ut_ad(len >= BTR_EXTERN_FIELD_REF_SIZE);
  2087. dst += len - BTR_EXTERN_FIELD_REF_SIZE;
  2088. d_stream->avail_out = dst - d_stream->next_out;
  2089. switch (inflate(d_stream, Z_SYNC_FLUSH)) {
  2090. case Z_STREAM_END:
  2091. case Z_OK:
  2092. case Z_BUF_ERROR:
  2093. if (!d_stream->avail_out) {
  2094. break;
  2095. }
  2096. /* fall through */
  2097. default:
  2098. page_zip_fail(("page_zip_decompress_clust_ext:"
  2099. " 2 inflate(Z_SYNC_FLUSH)=%s\n",
  2100. d_stream->msg));
  2101. return(FALSE);
  2102. }
  2103. ut_ad(d_stream->next_out == dst);
  2104. /* Clear the BLOB pointer in case
  2105. the record will be deleted and the
  2106. space will not be reused. Note that
  2107. the final initialization of the BLOB
  2108. pointers (copying from "externs"
  2109. or clearing) will have to take place
  2110. only after the page modification log
  2111. has been applied. Otherwise, we
  2112. could end up with an uninitialized
  2113. BLOB pointer when a record is deleted,
  2114. reallocated and deleted. */
  2115. memset(d_stream->next_out, 0,
  2116. BTR_EXTERN_FIELD_REF_SIZE);
  2117. d_stream->next_out
  2118. += BTR_EXTERN_FIELD_REF_SIZE;
  2119. }
  2120. }
  2121. return(TRUE);
  2122. }
  2123. /**************************************************************************
2124. Decompress the records of a leaf node of a clustered index.
  2125. static
  2126. ibool
  2127. page_zip_decompress_clust(
  2128. /*======================*/
  2129. /* out: TRUE on success,
  2130. FALSE on failure */
  2131. page_zip_des_t* page_zip, /* in/out: compressed page */
  2132. z_stream* d_stream, /* in/out: compressed page stream */
  2133. rec_t** recs, /* in: dense page directory
  2134. sorted by address */
  2135. ulint n_dense, /* in: size of recs[] */
  2136. dict_index_t* index, /* in: the index of the page */
2137. ulint trx_id_col, /* in: position of the trx_id column */
  2138. ulint* offsets, /* in/out: temporary offsets */
  2139. mem_heap_t* heap) /* in: temporary memory heap */
  2140. {
  2141. int err;
  2142. ulint slot;
  2143. ulint heap_status = REC_STATUS_ORDINARY
  2144. | PAGE_HEAP_NO_USER_LOW << REC_HEAP_NO_SHIFT;
  2145. const byte* storage;
  2146. const byte* externs;
  2147. ut_a(dict_index_is_clust(index));
  2148. /* Subtract the space reserved for uncompressed data. */
  2149. d_stream->avail_in -= n_dense * (PAGE_ZIP_DIR_SLOT_SIZE
  2150. + DATA_TRX_ID_LEN
  2151. + DATA_ROLL_PTR_LEN);
  2152. /* Decompress the records in heap_no order. */
  2153. for (slot = 0; slot < n_dense; slot++) {
  2154. rec_t* rec = recs[slot];
  2155. d_stream->avail_out = rec - REC_N_NEW_EXTRA_BYTES
  2156. - d_stream->next_out;
  2157. ut_ad(d_stream->avail_out < UNIV_PAGE_SIZE
  2158. - PAGE_ZIP_START - PAGE_DIR);
  2159. err = inflate(d_stream, Z_SYNC_FLUSH);
  2160. switch (err) {
  2161. case Z_STREAM_END:
  2162. /* Apparently, n_dense has grown
  2163. since the time the page was last compressed. */
  2164. goto zlib_done;
  2165. case Z_OK:
  2166. case Z_BUF_ERROR:
  2167. if (UNIV_LIKELY(!d_stream->avail_out)) {
  2168. break;
  2169. }
  2170. /* fall through */
  2171. default:
  2172. page_zip_fail(("page_zip_decompress_clust:"
  2173. " 1 inflate(Z_SYNC_FLUSH)=%s\n",
  2174. d_stream->msg));
  2175. goto zlib_error;
  2176. }
  2177. ut_ad(d_stream->next_out == rec - REC_N_NEW_EXTRA_BYTES);
  2178. /* Prepare to decompress the data bytes. */
  2179. d_stream->next_out = rec;
  2180. /* Set heap_no and the status bits. */
  2181. mach_write_to_2(rec - REC_NEW_HEAP_NO, heap_status);
  2182. heap_status += 1 << REC_HEAP_NO_SHIFT;
  2183. /* Read the offsets. The status bits are needed here. */
  2184. offsets = rec_get_offsets(rec, index, offsets,
  2185. ULINT_UNDEFINED, &heap);
  2186. /* This is a leaf page in a clustered index. */
  2187. /* Check if there are any externally stored columns.
  2188. For each externally stored column, restore the
  2189. BTR_EXTERN_FIELD_REF separately. */
  2190. if (UNIV_UNLIKELY(rec_offs_any_extern(offsets))) {
  2191. if (UNIV_UNLIKELY
  2192. (!page_zip_decompress_clust_ext(
  2193. d_stream, rec, offsets, trx_id_col))) {
  2194. goto zlib_error;
  2195. }
  2196. } else {
  2197. /* Skip trx_id and roll_ptr */
  2198. ulint len;
  2199. byte* dst = rec_get_nth_field(rec, offsets,
  2200. trx_id_col, &len);
  2201. if (UNIV_UNLIKELY(len < DATA_TRX_ID_LEN
  2202. + DATA_ROLL_PTR_LEN)) {
  2203. page_zip_fail(("page_zip_decompress_clust:"
  2204. " len = %lu\n", (ulong) len));
  2205. goto zlib_error;
  2206. }
  2207. d_stream->avail_out = dst - d_stream->next_out;
  2208. switch (inflate(d_stream, Z_SYNC_FLUSH)) {
  2209. case Z_STREAM_END:
  2210. case Z_OK:
  2211. case Z_BUF_ERROR:
  2212. if (!d_stream->avail_out) {
  2213. break;
  2214. }
  2215. /* fall through */
  2216. default:
  2217. page_zip_fail(("page_zip_decompress_clust:"
  2218. " 2 inflate(Z_SYNC_FLUSH)=%s\n",
  2219. d_stream->msg));
  2220. goto zlib_error;
  2221. }
  2222. ut_ad(d_stream->next_out == dst);
  2223. /* Clear DB_TRX_ID and DB_ROLL_PTR in order to
  2224. avoid uninitialized bytes in case the record
  2225. is affected by page_zip_apply_log(). */
  2226. memset(dst, 0, DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
  2227. d_stream->next_out += DATA_TRX_ID_LEN
  2228. + DATA_ROLL_PTR_LEN;
  2229. }
  2230. /* Decompress the last bytes of the record. */
  2231. d_stream->avail_out = rec_get_end(rec, offsets)
  2232. - d_stream->next_out;
  2233. switch (inflate(d_stream, Z_SYNC_FLUSH)) {
  2234. case Z_STREAM_END:
  2235. case Z_OK:
  2236. case Z_BUF_ERROR:
  2237. if (!d_stream->avail_out) {
  2238. break;
  2239. }
  2240. /* fall through */
  2241. default:
  2242. page_zip_fail(("page_zip_decompress_clust:"
  2243. " 3 inflate(Z_SYNC_FLUSH)=%s\n",
  2244. d_stream->msg));
  2245. goto zlib_error;
  2246. }
  2247. }
  2248. /* Decompress any trailing garbage, in case the last record was
  2249. allocated from an originally longer space on the free list. */
  2250. d_stream->avail_out = page_header_get_field(page_zip->data,
  2251. PAGE_HEAP_TOP)
  2252. - page_offset(d_stream->next_out);
  2253. if (UNIV_UNLIKELY(d_stream->avail_out > UNIV_PAGE_SIZE
  2254. - PAGE_ZIP_START - PAGE_DIR)) {
  2255. page_zip_fail(("page_zip_decompress_clust:"
  2256. " avail_out = %u\n",
  2257. d_stream->avail_out));
  2258. goto zlib_error;
  2259. }
  2260. if (UNIV_UNLIKELY(inflate(d_stream, Z_FINISH) != Z_STREAM_END)) {
  2261. page_zip_fail(("page_zip_decompress_clust:"
  2262. " inflate(Z_FINISH)=%s\n",
  2263. d_stream->msg));
  2264. zlib_error:
  2265. inflateEnd(d_stream);
  2266. return(FALSE);
  2267. }
  2268. /* Note that d_stream->avail_out > 0 may hold here
  2269. if the modification log is nonempty. */
  2270. zlib_done:
  2271. if (UNIV_UNLIKELY(inflateEnd(d_stream) != Z_OK)) {
  2272. ut_error;
  2273. }
  2274. {
  2275. page_t* page = page_align(d_stream->next_out);
  2276. /* Clear the unused heap space on the uncompressed page. */
  2277. memset(d_stream->next_out, 0,
  2278. page_dir_get_nth_slot(page,
  2279. page_dir_get_n_slots(page) - 1)
  2280. - d_stream->next_out);
  2281. }
  2282. #ifdef UNIV_DEBUG
  2283. page_zip->m_start = PAGE_DATA + d_stream->total_in;
  2284. #endif /* UNIV_DEBUG */
  2285. /* Apply the modification log. */
  2286. {
  2287. const byte* mod_log_ptr;
  2288. mod_log_ptr = page_zip_apply_log(d_stream->next_in,
  2289. d_stream->avail_in + 1,
  2290. recs, n_dense,
  2291. trx_id_col, heap_status,
  2292. index, offsets);
  2293. if (UNIV_UNLIKELY(!mod_log_ptr)) {
  2294. return(FALSE);
  2295. }
  2296. page_zip->m_end = mod_log_ptr - page_zip->data;
  2297. page_zip->m_nonempty = mod_log_ptr != d_stream->next_in;
  2298. }
  2299. if (UNIV_UNLIKELY(page_zip_get_trailer_len(page_zip, TRUE, NULL)
  2300. + page_zip->m_end >= page_zip_get_size(page_zip))) {
  2301. page_zip_fail(("page_zip_decompress_clust: %lu + %lu >= %lu\n",
  2302. (ulong) page_zip_get_trailer_len(
  2303. page_zip, TRUE, NULL),
  2304. (ulong) page_zip->m_end,
  2305. (ulong) page_zip_get_size(page_zip)));
  2306. return(FALSE);
  2307. }
  2308. storage = page_zip->data + page_zip_get_size(page_zip)
  2309. - n_dense * PAGE_ZIP_DIR_SLOT_SIZE;
  2310. externs = storage - n_dense
  2311. * (DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
  2312. /* Restore the uncompressed columns in heap_no order. */
  2313. for (slot = 0; slot < n_dense; slot++) {
  2314. ulint i;
  2315. ulint len;
  2316. byte* dst;
  2317. rec_t* rec = recs[slot];
  2318. ibool exists = !page_zip_dir_find_free(
  2319. page_zip, page_offset(rec));
  2320. offsets = rec_get_offsets(rec, index, offsets,
  2321. ULINT_UNDEFINED, &heap);
  2322. dst = rec_get_nth_field(rec, offsets,
  2323. trx_id_col, &len);
  2324. ut_ad(len >= DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
  2325. storage -= DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN;
  2326. memcpy(dst, storage,
  2327. DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
  2328. /* Check if there are any externally stored
  2329. columns in this record. For each externally
  2330. stored column, restore or clear the
  2331. BTR_EXTERN_FIELD_REF. */
  2332. if (!rec_offs_any_extern(offsets)) {
  2333. continue;
  2334. }
  2335. for (i = 0; i < rec_offs_n_fields(offsets); i++) {
  2336. if (!rec_offs_nth_extern(offsets, i)) {
  2337. continue;
  2338. }
  2339. dst = rec_get_nth_field(rec, offsets, i, &len);
  2340. if (UNIV_UNLIKELY(len < BTR_EXTERN_FIELD_REF_SIZE)) {
  2341. page_zip_fail(("page_zip_decompress_clust:"
  2342. " %lu < 20\n",
  2343. (ulong) len));
  2344. return(FALSE);
  2345. }
  2346. dst += len - BTR_EXTERN_FIELD_REF_SIZE;
  2347. if (UNIV_LIKELY(exists)) {
  2348. /* Existing record:
  2349. restore the BLOB pointer */
  2350. externs -= BTR_EXTERN_FIELD_REF_SIZE;
  2351. if (UNIV_UNLIKELY
  2352. (externs < page_zip->data
  2353. + page_zip->m_end)) {
  2354. page_zip_fail(("page_zip_"
  2355. "decompress_clust: "
  2356. "%p < %p + %lu\n",
  2357. (const void*) externs,
  2358. (const void*)
  2359. page_zip->data,
  2360. (ulong)
  2361. page_zip->m_end));
  2362. return(FALSE);
  2363. }
  2364. memcpy(dst, externs,
  2365. BTR_EXTERN_FIELD_REF_SIZE);
  2366. page_zip->n_blobs++;
  2367. } else {
  2368. /* Deleted record:
  2369. clear the BLOB pointer */
  2370. memset(dst, 0,
  2371. BTR_EXTERN_FIELD_REF_SIZE);
  2372. }
  2373. }
  2374. }
  2375. return(TRUE);
  2376. }
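
/* A rough sketch of the "uncompressed trailer" that
page_zip_decompress_clust() reads above, growing downwards from the
end of page_zip->data:

	page_zip->data + page_zip_get_size(page_zip)
	-> dense page directory: n_dense slots of
	   PAGE_ZIP_DIR_SLOT_SIZE bytes each ("storage" above)
	-> DB_TRX_ID and DB_ROLL_PTR columns: n_dense entries of
	   DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN bytes each
	   ("externs" above)
	-> BTR_EXTERN_FIELD_REF (BLOB pointer) array, one
	   BTR_EXTERN_FIELD_REF_SIZE entry per stored BLOB column,
	   allocated downwards from "externs"

The restore loop above consumes "storage" and "externs" in heap_no
order, which is why both pointers are decremented before each
memcpy(). */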
/**************************************************************************
Decompress a page.  This function should tolerate errors on the compressed
page.  Instead of letting assertions fail, it will return FALSE if an
inconsistency is detected. */
UNIV_INTERN
ibool
page_zip_decompress(
/*================*/
				/* out: TRUE on success, FALSE on failure */
	page_zip_des_t*	page_zip,/* in: data, ssize;
				out: m_start, m_end, m_nonempty, n_blobs */
	page_t*		page)	/* out: uncompressed page, may be trashed */
{
	z_stream	d_stream;
	dict_index_t*	index	= NULL;
	rec_t**		recs;	/* dense page directory, sorted by address */
	ulint		n_dense;/* number of user records on the page */
	ulint		trx_id_col = ULINT_UNDEFINED;
	mem_heap_t*	heap;
	ulint*		offsets;
	ullint		usec = ut_time_us(NULL);

	ut_ad(page_zip_simple_validate(page_zip));
	UNIV_MEM_ASSERT_W(page, UNIV_PAGE_SIZE);
	UNIV_MEM_ASSERT_RW(page_zip->data, page_zip_get_size(page_zip));

	/* The dense directory excludes the infimum and supremum records. */
	n_dense = page_dir_get_n_heap(page_zip->data) - PAGE_HEAP_NO_USER_LOW;
	if (UNIV_UNLIKELY(n_dense * PAGE_ZIP_DIR_SLOT_SIZE
			  >= page_zip_get_size(page_zip))) {
		page_zip_fail(("page_zip_decompress 1: %lu %lu\n",
			       (ulong) n_dense,
			       (ulong) page_zip_get_size(page_zip)));
		return(FALSE);
	}

	heap = mem_heap_create(n_dense * (3 * sizeof *recs) + UNIV_PAGE_SIZE);
	recs = mem_heap_alloc(heap, n_dense * (2 * sizeof *recs));

#ifdef UNIV_ZIP_DEBUG
	/* Clear the page. */
	memset(page, 0x55, UNIV_PAGE_SIZE);
#endif /* UNIV_ZIP_DEBUG */
	UNIV_MEM_INVALID(page, UNIV_PAGE_SIZE);
	/* Copy the page header. */
	memcpy(page, page_zip->data, PAGE_DATA);

	/* Copy the page directory. */
	if (UNIV_UNLIKELY(!page_zip_dir_decode(page_zip, page, recs,
					       recs + n_dense, n_dense))) {
zlib_error:
		mem_heap_free(heap);
		return(FALSE);
	}

	/* Copy the infimum and supremum records. */
	memcpy(page + (PAGE_NEW_INFIMUM - REC_N_NEW_EXTRA_BYTES),
	       infimum_extra, sizeof infimum_extra);
	if (UNIV_UNLIKELY(!page_get_n_recs(page))) {
		rec_set_next_offs_new(page + PAGE_NEW_INFIMUM,
				      PAGE_NEW_SUPREMUM);
	} else {
		rec_set_next_offs_new(page + PAGE_NEW_INFIMUM,
				      page_zip_dir_get(page_zip, 0)
				      & PAGE_ZIP_DIR_SLOT_MASK);
	}
	memcpy(page + PAGE_NEW_INFIMUM, infimum_data, sizeof infimum_data);
	memcpy(page + (PAGE_NEW_SUPREMUM - REC_N_NEW_EXTRA_BYTES + 1),
	       supremum_extra_data, sizeof supremum_extra_data);

	page_zip_set_alloc(&d_stream, heap);

	if (UNIV_UNLIKELY(inflateInit2(&d_stream, UNIV_PAGE_SIZE_SHIFT)
			  != Z_OK)) {
		ut_error;
	}

	d_stream.next_in = page_zip->data + PAGE_DATA;
	/* Subtract the space reserved for
	the page header and the end marker of the modification log. */
	d_stream.avail_in = page_zip_get_size(page_zip) - (PAGE_DATA + 1);

	d_stream.next_out = page + PAGE_ZIP_START;
	d_stream.avail_out = UNIV_PAGE_SIZE - PAGE_ZIP_START;

	/* Decode the zlib header and the index information. */
	if (UNIV_UNLIKELY(inflate(&d_stream, Z_BLOCK) != Z_OK)) {

		page_zip_fail(("page_zip_decompress:"
			       " 1 inflate(Z_BLOCK)=%s\n", d_stream.msg));
		goto zlib_error;
	}

	if (UNIV_UNLIKELY(inflate(&d_stream, Z_BLOCK) != Z_OK)) {

		page_zip_fail(("page_zip_decompress:"
			       " 2 inflate(Z_BLOCK)=%s\n", d_stream.msg));
		goto zlib_error;
	}

	index = page_zip_fields_decode(
		page + PAGE_ZIP_START, d_stream.next_out,
		page_is_leaf(page) ? &trx_id_col : NULL);

	if (UNIV_UNLIKELY(!index)) {

		goto zlib_error;
	}

	/* Decompress the user records. */
	page_zip->n_blobs = 0;
	d_stream.next_out = page + PAGE_ZIP_START;

	{
		/* Pre-allocate the offsets for rec_get_offsets_reverse(). */
		ulint	n = 1 + 1/* node ptr */ + REC_OFFS_HEADER_SIZE
			+ dict_index_get_n_fields(index);
		offsets = mem_heap_alloc(heap, n * sizeof(ulint));
		*offsets = n;
	}

	/* Decompress the records in heap_no order. */
	if (!page_is_leaf(page)) {
		/* This is a node pointer page. */
		ulint	info_bits;

		if (UNIV_UNLIKELY
		    (!page_zip_decompress_node_ptrs(page_zip, &d_stream,
						    recs, n_dense, index,
						    offsets, heap))) {
			goto err_exit;
		}

		info_bits = mach_read_from_4(page + FIL_PAGE_PREV) == FIL_NULL
			? REC_INFO_MIN_REC_FLAG : 0;

		if (UNIV_UNLIKELY(!page_zip_set_extra_bytes(page_zip, page,
							    info_bits))) {
			goto err_exit;
		}
	} else if (UNIV_LIKELY(trx_id_col == ULINT_UNDEFINED)) {
		/* This is a leaf page in a secondary index. */
		if (UNIV_UNLIKELY(!page_zip_decompress_sec(page_zip, &d_stream,
							   recs, n_dense,
							   index, offsets))) {
			goto err_exit;
		}

		if (UNIV_UNLIKELY(!page_zip_set_extra_bytes(page_zip,
							    page, 0))) {
err_exit:
			page_zip_fields_free(index);
			mem_heap_free(heap);
			return(FALSE);
		}
	} else {
		/* This is a leaf page in a clustered index. */
		if (UNIV_UNLIKELY(!page_zip_decompress_clust(page_zip,
							     &d_stream, recs,
							     n_dense, index,
							     trx_id_col,
							     offsets, heap))) {
			goto err_exit;
		}

		if (UNIV_UNLIKELY(!page_zip_set_extra_bytes(page_zip,
							    page, 0))) {
			goto err_exit;
		}
	}

	ut_a(page_is_comp(page));
	UNIV_MEM_ASSERT_RW(page, UNIV_PAGE_SIZE);

	page_zip_fields_free(index);
	mem_heap_free(heap);

	{
		page_zip_stat_t*	zip_stat
			= &page_zip_stat[page_zip->ssize - 1];
		zip_stat->decompressed++;
		zip_stat->decompressed_usec += ut_time_us(NULL) - usec;
	}

	/* Update the stat counter for LRU policy. */
	buf_LRU_stat_inc_unzip();

	return(TRUE);
}
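
/* A minimal usage sketch (hypothetical caller, not from this file):
the destination frame must be UNIV_PAGE_SIZE aligned, as
page_zip_validate() below also assumes.

	byte*	buf	= ut_malloc(2 * UNIV_PAGE_SIZE);
	page_t*	frame	= ut_align(buf, UNIV_PAGE_SIZE);

	if (!page_zip_decompress(page_zip, frame)) {
		(... treat page_zip->data as corrupted ...)
	}

	ut_free(buf);
*/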
#ifdef UNIV_ZIP_DEBUG
/**************************************************************************
Dump a block of memory on the standard error stream. */
static
void
page_zip_hexdump_func(
/*==================*/
	const char*	name,	/* in: name of the data structure */
	const void*	buf,	/* in: data */
	ulint		size)	/* in: length of the data, in bytes */
{
	const byte*	s	= buf;
	ulint		addr;
	const ulint	width	= 32; /* bytes per line */

	fprintf(stderr, "%s:\n", name);

	for (addr = 0; addr < size; addr += width) {
		ulint	i;

		fprintf(stderr, "%04lx ", (ulong) addr);

		i = ut_min(width, size - addr);

		while (i--) {
			fprintf(stderr, "%02x", *s++);
		}

		putc('\n', stderr);
	}
}

#define page_zip_hexdump(buf, size) page_zip_hexdump_func(#buf, buf, size)
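
/* A sample of the output format produced by the loop above
(illustrative bytes only): the stringified name of the dumped
expression, then one line per 32 bytes, each starting with a
four-digit hexadecimal offset and followed by the bytes with no
separators:

	page_zip->data:
	0000 45bf0000000845bf...
	0020 0005a6c3...
*/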
/* Flag: make page_zip_validate() compare page headers only */
UNIV_INTERN ibool	page_zip_validate_header_only = FALSE;

/**************************************************************************
Check that the compressed and decompressed pages match. */
UNIV_INTERN
ibool
page_zip_validate(
/*==============*/
					/* out: TRUE if valid, FALSE if not */
	const page_zip_des_t*	page_zip,/* in: compressed page */
	const page_t*		page)	/* in: uncompressed page */
{
	page_zip_des_t	temp_page_zip;
	byte*		temp_page_buf;
	page_t*		temp_page;
	ibool		valid;

	if (memcmp(page_zip->data + FIL_PAGE_PREV, page + FIL_PAGE_PREV,
		   FIL_PAGE_LSN - FIL_PAGE_PREV)
	    || memcmp(page_zip->data + FIL_PAGE_TYPE, page + FIL_PAGE_TYPE, 2)
	    || memcmp(page_zip->data + FIL_PAGE_DATA, page + FIL_PAGE_DATA,
		      PAGE_DATA - FIL_PAGE_DATA)) {
		page_zip_fail(("page_zip_validate: page header\n"));
		page_zip_hexdump(page_zip, sizeof *page_zip);
		page_zip_hexdump(page_zip->data, page_zip_get_size(page_zip));
		page_zip_hexdump(page, UNIV_PAGE_SIZE);
		return(FALSE);
	}

	ut_a(page_is_comp(page));

	if (page_zip_validate_header_only) {
		return(TRUE);
	}

	/* page_zip_decompress() expects the uncompressed page to be
	UNIV_PAGE_SIZE aligned. */
	temp_page_buf = ut_malloc(2 * UNIV_PAGE_SIZE);
	temp_page = ut_align(temp_page_buf, UNIV_PAGE_SIZE);

#ifdef UNIV_DEBUG_VALGRIND
	/* Get detailed information on the valid bits in case the
	UNIV_MEM_ASSERT_RW() checks fail.  The v-bits of page[],
	page_zip->data[] or page_zip could be viewed at temp_page[] or
	temp_page_zip in a debugger when running valgrind --db-attach. */
	VALGRIND_GET_VBITS(page, temp_page, UNIV_PAGE_SIZE);
	UNIV_MEM_ASSERT_RW(page, UNIV_PAGE_SIZE);
	VALGRIND_GET_VBITS(page_zip, &temp_page_zip, sizeof temp_page_zip);
	UNIV_MEM_ASSERT_RW(page_zip, sizeof *page_zip);
	VALGRIND_GET_VBITS(page_zip->data, temp_page,
			   page_zip_get_size(page_zip));
	UNIV_MEM_ASSERT_RW(page_zip->data, page_zip_get_size(page_zip));
#endif /* UNIV_DEBUG_VALGRIND */

	temp_page_zip = *page_zip;
	valid = page_zip_decompress(&temp_page_zip, temp_page);
	if (!valid) {
		fputs("page_zip_validate(): failed to decompress\n", stderr);
		goto func_exit;
	}

	if (page_zip->n_blobs != temp_page_zip.n_blobs) {
		page_zip_fail(("page_zip_validate: n_blobs: %u!=%u\n",
			       page_zip->n_blobs, temp_page_zip.n_blobs));
		valid = FALSE;
	}
#ifdef UNIV_DEBUG
	if (page_zip->m_start != temp_page_zip.m_start) {
		page_zip_fail(("page_zip_validate: m_start: %u!=%u\n",
			       page_zip->m_start, temp_page_zip.m_start));
		valid = FALSE;
	}
#endif /* UNIV_DEBUG */
	if (page_zip->m_end != temp_page_zip.m_end) {
		page_zip_fail(("page_zip_validate: m_end: %u!=%u\n",
			       page_zip->m_end, temp_page_zip.m_end));
		valid = FALSE;
	}
	if (page_zip->m_nonempty != temp_page_zip.m_nonempty) {
		page_zip_fail(("page_zip_validate(): m_nonempty: %u!=%u\n",
			       page_zip->m_nonempty,
			       temp_page_zip.m_nonempty));
		valid = FALSE;
	}
	if (memcmp(page + PAGE_HEADER, temp_page + PAGE_HEADER,
		   UNIV_PAGE_SIZE - PAGE_HEADER - FIL_PAGE_DATA_END)) {
		page_zip_fail(("page_zip_validate: content\n"));
		valid = FALSE;
	}

func_exit:
	if (!valid) {
		page_zip_hexdump(page_zip, sizeof *page_zip);
		page_zip_hexdump(page_zip->data, page_zip_get_size(page_zip));
		page_zip_hexdump(page, UNIV_PAGE_SIZE);
		page_zip_hexdump(temp_page, UNIV_PAGE_SIZE);
	}
	ut_free(temp_page_buf);
	return(valid);
}
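
/* Typical use, as seen elsewhere in this file: callers wrap the
check in an assertion under UNIV_ZIP_DEBUG, e.g.

#ifdef UNIV_ZIP_DEBUG
	ut_a(page_zip_validate(page_zip, page));
#endif

Setting page_zip_validate_header_only in a debugger (see the flag
above) reduces the check to the header comparison, skipping the
costly page_zip_decompress() round trip. */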
#endif /* UNIV_ZIP_DEBUG */

#ifdef UNIV_DEBUG
static
ibool
page_zip_header_cmp(
/*================*/
					/* out: TRUE */
	const page_zip_des_t*	page_zip,/* in: compressed page */
	const byte*		page)	/* in: uncompressed page */
{
	ut_ad(!memcmp(page_zip->data + FIL_PAGE_PREV, page + FIL_PAGE_PREV,
		      FIL_PAGE_LSN - FIL_PAGE_PREV));
	ut_ad(!memcmp(page_zip->data + FIL_PAGE_TYPE, page + FIL_PAGE_TYPE,
		      2));
	ut_ad(!memcmp(page_zip->data + FIL_PAGE_DATA, page + FIL_PAGE_DATA,
		      PAGE_DATA - FIL_PAGE_DATA));

	return(TRUE);
}
#endif /* UNIV_DEBUG */

/**************************************************************************
Write a record on the compressed page that contains externally stored
columns.  The data must already have been written to the uncompressed page. */
static
byte*
page_zip_write_rec_ext(
/*===================*/
					/* out: end of modification log */
	page_zip_des_t*	page_zip,	/* in/out: compressed page */
	const page_t*	page,		/* in: page containing rec */
	const byte*	rec,		/* in: record being written */
	dict_index_t*	index,		/* in: record descriptor */
	const ulint*	offsets,	/* in: rec_get_offsets(rec, index) */
	ulint		create,		/* in: nonzero=insert, zero=update */
	ulint		trx_id_col,	/* in: position of DB_TRX_ID */
	ulint		heap_no,	/* in: heap number of rec */
	byte*		storage,	/* in: end of dense page directory */
	byte*		data)		/* in: end of modification log */
{
	const byte*	start	= rec;
	ulint		i;
	ulint		len;
	byte*		externs	= storage;
	ulint		n_ext	= rec_offs_n_extern(offsets);

	ut_ad(rec_offs_validate(rec, index, offsets));
	UNIV_MEM_ASSERT_RW(rec, rec_offs_data_size(offsets));
	UNIV_MEM_ASSERT_RW(rec - rec_offs_extra_size(offsets),
			   rec_offs_extra_size(offsets));

	externs -= (DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN)
		* (page_dir_get_n_heap(page) - PAGE_HEAP_NO_USER_LOW);

	/* Note that this will not take into account
	the BLOB columns of rec if create==TRUE. */
	ut_ad(data + rec_offs_data_size(offsets)
	      - (DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN)
	      - n_ext * BTR_EXTERN_FIELD_REF_SIZE
	      < externs - BTR_EXTERN_FIELD_REF_SIZE * page_zip->n_blobs);

	{
		ulint	blob_no = page_zip_get_n_prev_extern(
			page_zip, rec, index);
		byte*	ext_end = externs - page_zip->n_blobs
			* BTR_EXTERN_FIELD_REF_SIZE;
		ut_ad(blob_no <= page_zip->n_blobs);
		externs -= blob_no * BTR_EXTERN_FIELD_REF_SIZE;

		if (create) {
			page_zip->n_blobs += n_ext;
			ASSERT_ZERO_BLOB(ext_end - n_ext
					 * BTR_EXTERN_FIELD_REF_SIZE);
			memmove(ext_end - n_ext
				* BTR_EXTERN_FIELD_REF_SIZE,
				ext_end,
				externs - ext_end);
		}

		ut_a(blob_no + n_ext <= page_zip->n_blobs);
	}

	for (i = 0; i < rec_offs_n_fields(offsets); i++) {
		const byte*	src;

		if (UNIV_UNLIKELY(i == trx_id_col)) {
			ut_ad(!rec_offs_nth_extern(offsets,
						   i));
			ut_ad(!rec_offs_nth_extern(offsets,
						   i + 1));
			/* Locate trx_id and roll_ptr. */
			src = rec_get_nth_field(rec, offsets,
						i, &len);
			ut_ad(len == DATA_TRX_ID_LEN);
			ut_ad(src + DATA_TRX_ID_LEN
			      == rec_get_nth_field(
				      rec, offsets,
				      i + 1, &len));
			ut_ad(len == DATA_ROLL_PTR_LEN);

			/* Log the preceding fields. */
			ASSERT_ZERO(data, src - start);
			memcpy(data, start, src - start);
			data += src - start;
			start = src + (DATA_TRX_ID_LEN
				       + DATA_ROLL_PTR_LEN);

			/* Store trx_id and roll_ptr. */
			memcpy(storage - (DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN)
			       * (heap_no - 1),
			       src, DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
			i++; /* skip also roll_ptr */
		} else if (rec_offs_nth_extern(offsets, i)) {
			src = rec_get_nth_field(rec, offsets,
						i, &len);

			ut_ad(dict_index_is_clust(index));
			ut_ad(len
			      >= BTR_EXTERN_FIELD_REF_SIZE);
			src += len - BTR_EXTERN_FIELD_REF_SIZE;

			ASSERT_ZERO(data, src - start);
			memcpy(data, start, src - start);
			data += src - start;
			start = src + BTR_EXTERN_FIELD_REF_SIZE;

			/* Store the BLOB pointer. */
			externs -= BTR_EXTERN_FIELD_REF_SIZE;
			ut_ad(data < externs);
			memcpy(externs, src, BTR_EXTERN_FIELD_REF_SIZE);
		}
	}

	/* Log the last bytes of the record. */
	len = rec_offs_data_size(offsets) - (start - rec);

	ASSERT_ZERO(data, len);
	memcpy(data, start, len);
	data += len;

	return(data);
}
/**************************************************************************
Write an entire record on the compressed page.  The data must already
have been written to the uncompressed page. */
UNIV_INTERN
void
page_zip_write_rec(
/*===============*/
	page_zip_des_t*	page_zip,/* in/out: compressed page */
	const byte*	rec,	/* in: record being written */
	dict_index_t*	index,	/* in: the index the record belongs to */
	const ulint*	offsets,/* in: rec_get_offsets(rec, index) */
	ulint		create)	/* in: nonzero=insert, zero=update */
{
	const page_t*	page;
	byte*		data;
	byte*		storage;
	ulint		heap_no;
	byte*		slot;

	ut_ad(buf_frame_get_page_zip(rec) == page_zip);
	ut_ad(page_zip_simple_validate(page_zip));
	ut_ad(page_zip_get_size(page_zip)
	      > PAGE_DATA + page_zip_dir_size(page_zip));
	ut_ad(rec_offs_comp(offsets));
	ut_ad(rec_offs_validate(rec, index, offsets));

	ut_ad(page_zip->m_start >= PAGE_DATA);

	page = page_align(rec);

	ut_ad(page_zip_header_cmp(page_zip, page));
	ut_ad(page_simple_validate_new((page_t*) page));

	UNIV_MEM_ASSERT_RW(page_zip->data, page_zip_get_size(page_zip));
	UNIV_MEM_ASSERT_RW(rec, rec_offs_data_size(offsets));
	UNIV_MEM_ASSERT_RW(rec - rec_offs_extra_size(offsets),
			   rec_offs_extra_size(offsets));

	slot = page_zip_dir_find(page_zip, page_offset(rec));
	ut_a(slot);
	/* Copy the delete mark. */
	if (rec_get_deleted_flag(rec, TRUE)) {
		*slot |= PAGE_ZIP_DIR_SLOT_DEL >> 8;
	} else {
		*slot &= ~(PAGE_ZIP_DIR_SLOT_DEL >> 8);
	}

	ut_ad(rec_get_start((rec_t*) rec, offsets) >= page + PAGE_ZIP_START);
	ut_ad(rec_get_end((rec_t*) rec, offsets) <= page + UNIV_PAGE_SIZE
	      - PAGE_DIR - PAGE_DIR_SLOT_SIZE
	      * page_dir_get_n_slots(page));

	heap_no = rec_get_heap_no_new(rec);
	ut_ad(heap_no >= PAGE_HEAP_NO_USER_LOW); /* not infimum or supremum */
	ut_ad(heap_no < page_dir_get_n_heap(page));

	/* Append to the modification log. */
	data = page_zip->data + page_zip->m_end;
	ut_ad(!*data);

	/* Identify the record by writing its heap number - 1.
	0 is reserved to indicate the end of the modification log. */
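
	/* A worked example of this encoding (illustrative values):
	heap_no = 5 gives heap_no - 1 = 4 < 64, so a single byte
	(4 << 1) = 0x08 is written.  heap_no = 70 gives
	heap_no - 1 = 69, which needs the two-byte form below:
	0x80 | (69 >> 7) = 0x80, then (byte) (69 << 1) = 0x8a.
	The low bit is 0 here; in page_zip_clear_rec(), the same
	encoding is written with the low bit set to 1 to mark a
	cleared record. */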
	if (UNIV_UNLIKELY(heap_no - 1 >= 64)) {
		*data++ = (byte) (0x80 | (heap_no - 1) >> 7);
		ut_ad(!*data);
	}
	*data++ = (byte) ((heap_no - 1) << 1);
	ut_ad(!*data);

	{
		const byte*	start	= rec - rec_offs_extra_size(offsets);
		const byte*	b	= rec - REC_N_NEW_EXTRA_BYTES;

		/* Write the extra bytes backwards, so that
		rec_offs_extra_size() can be easily computed in
		page_zip_apply_log() by invoking
		rec_get_offsets_reverse(). */

		while (b != start) {
			*data++ = *--b;
			ut_ad(!*data);
		}
	}

	/* Write the data bytes.  Store the uncompressed bytes separately. */
	storage = page_zip->data + page_zip_get_size(page_zip)
		- (page_dir_get_n_heap(page) - PAGE_HEAP_NO_USER_LOW)
		* PAGE_ZIP_DIR_SLOT_SIZE;

	if (page_is_leaf(page)) {
		ulint		len;

		if (dict_index_is_clust(index)) {
			ulint		trx_id_col;

			trx_id_col = dict_index_get_sys_col_pos(index,
								DATA_TRX_ID);
			ut_ad(trx_id_col != ULINT_UNDEFINED);

			/* Store separately trx_id, roll_ptr and
			the BTR_EXTERN_FIELD_REF of each BLOB column. */
			if (rec_offs_any_extern(offsets)) {
				data = page_zip_write_rec_ext(
					page_zip, page,
					rec, index, offsets, create,
					trx_id_col, heap_no, storage, data);
			} else {
				/* Locate trx_id and roll_ptr. */
				const byte*	src
					= rec_get_nth_field(rec, offsets,
							    trx_id_col, &len);
				ut_ad(len == DATA_TRX_ID_LEN);
				ut_ad(src + DATA_TRX_ID_LEN
				      == rec_get_nth_field(
					      rec, offsets,
					      trx_id_col + 1, &len));
				ut_ad(len == DATA_ROLL_PTR_LEN);

				/* Log the preceding fields. */
				ASSERT_ZERO(data, src - rec);
				memcpy(data, rec, src - rec);
				data += src - rec;

				/* Store trx_id and roll_ptr. */
				memcpy(storage
				       - (DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN)
				       * (heap_no - 1),
				       src,
				       DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);

				src += DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN;

				/* Log the last bytes of the record. */
				len = rec_offs_data_size(offsets)
					- (src - rec);

				ASSERT_ZERO(data, len);
				memcpy(data, src, len);
				data += len;
			}
		} else {
			/* Leaf page of a secondary index:
			no externally stored columns */
			ut_ad(dict_index_get_sys_col_pos(index, DATA_TRX_ID)
			      == ULINT_UNDEFINED);
			ut_ad(!rec_offs_any_extern(offsets));

			/* Log the entire record. */
			len = rec_offs_data_size(offsets);

			ASSERT_ZERO(data, len);
			memcpy(data, rec, len);
			data += len;
		}
	} else {
		/* This is a node pointer page. */
		ulint	len;

		/* Non-leaf nodes should not have any externally
		stored columns. */
		ut_ad(!rec_offs_any_extern(offsets));

		/* Copy the data bytes, except node_ptr. */
		len = rec_offs_data_size(offsets) - REC_NODE_PTR_SIZE;
		ut_ad(data + len < storage - REC_NODE_PTR_SIZE
		      * (page_dir_get_n_heap(page) - PAGE_HEAP_NO_USER_LOW));
		ASSERT_ZERO(data, len);
		memcpy(data, rec, len);
		data += len;

		/* Copy the node pointer to the uncompressed area. */
		memcpy(storage - REC_NODE_PTR_SIZE
		       * (heap_no - 1),
		       rec + len,
		       REC_NODE_PTR_SIZE);
	}

	ut_a(!*data);
	ut_ad((ulint) (data - page_zip->data) < page_zip_get_size(page_zip));
	page_zip->m_end = data - page_zip->data;
	page_zip->m_nonempty = TRUE;

#ifdef UNIV_ZIP_DEBUG
	ut_a(page_zip_validate(page_zip, page_align(rec)));
#endif /* UNIV_ZIP_DEBUG */
}
/***************************************************************
Parses a log record of writing a BLOB pointer of a record. */
UNIV_INTERN
byte*
page_zip_parse_write_blob_ptr(
/*==========================*/
				/* out: end of log record or NULL */
	byte*		ptr,	/* in: redo log buffer */
	byte*		end_ptr,/* in: redo log buffer end */
	page_t*		page,	/* in/out: uncompressed page */
	page_zip_des_t*	page_zip)/* in/out: compressed page */
{
	ulint	offset;
	ulint	z_offset;

	ut_ad(!page == !page_zip);

	if (UNIV_UNLIKELY
	    (end_ptr < ptr + (2 + 2 + BTR_EXTERN_FIELD_REF_SIZE))) {

		return(NULL);
	}

	offset = mach_read_from_2(ptr);
	z_offset = mach_read_from_2(ptr + 2);

	if (UNIV_UNLIKELY(offset < PAGE_ZIP_START)
	    || UNIV_UNLIKELY(offset >= UNIV_PAGE_SIZE)
	    || UNIV_UNLIKELY(z_offset >= UNIV_PAGE_SIZE)) {
corrupt:
		recv_sys->found_corrupt_log = TRUE;

		return(NULL);
	}

	if (page) {
		if (UNIV_UNLIKELY(!page_zip)
		    || UNIV_UNLIKELY(!page_is_leaf(page))) {

			goto corrupt;
		}

#ifdef UNIV_ZIP_DEBUG
		ut_a(page_zip_validate(page_zip, page));
#endif /* UNIV_ZIP_DEBUG */

		memcpy(page + offset,
		       ptr + 4, BTR_EXTERN_FIELD_REF_SIZE);
		memcpy(page_zip->data + z_offset,
		       ptr + 4, BTR_EXTERN_FIELD_REF_SIZE);

#ifdef UNIV_ZIP_DEBUG
		ut_a(page_zip_validate(page_zip, page));
#endif /* UNIV_ZIP_DEBUG */
	}

	return(ptr + (2 + 2 + BTR_EXTERN_FIELD_REF_SIZE));
}

/**************************************************************************
Write a BLOB pointer of a record on the leaf page of a clustered index.
The information must already have been updated on the uncompressed page. */
UNIV_INTERN
void
page_zip_write_blob_ptr(
/*====================*/
	page_zip_des_t*	page_zip,/* in/out: compressed page */
	const byte*	rec,	/* in/out: record whose data is being
				written */
	dict_index_t*	index,	/* in: index of the page */
	const ulint*	offsets,/* in: rec_get_offsets(rec, index) */
	ulint		n,	/* in: column index */
	mtr_t*		mtr)	/* in: mini-transaction handle,
				or NULL if no logging is needed */
{
	const byte*	field;
	byte*		externs;
	const page_t*	page	= page_align(rec);
	ulint		blob_no;
	ulint		len;

	ut_ad(buf_frame_get_page_zip(rec) == page_zip);
	ut_ad(page_simple_validate_new((page_t*) page));
	ut_ad(page_zip_simple_validate(page_zip));
	ut_ad(page_zip_get_size(page_zip)
	      > PAGE_DATA + page_zip_dir_size(page_zip));
	ut_ad(rec_offs_comp(offsets));
	ut_ad(rec_offs_validate(rec, NULL, offsets));
	ut_ad(rec_offs_any_extern(offsets));
	ut_ad(rec_offs_nth_extern(offsets, n));

	ut_ad(page_zip->m_start >= PAGE_DATA);
	ut_ad(page_zip_header_cmp(page_zip, page));

	ut_ad(page_is_leaf(page));
	ut_ad(dict_index_is_clust(index));

	UNIV_MEM_ASSERT_RW(page_zip->data, page_zip_get_size(page_zip));
	UNIV_MEM_ASSERT_RW(rec, rec_offs_data_size(offsets));
	UNIV_MEM_ASSERT_RW(rec - rec_offs_extra_size(offsets),
			   rec_offs_extra_size(offsets));

	blob_no = page_zip_get_n_prev_extern(page_zip, rec, index)
		+ rec_get_n_extern_new(rec, index, n);
	ut_a(blob_no < page_zip->n_blobs);

	externs = page_zip->data + page_zip_get_size(page_zip)
		- (page_dir_get_n_heap(page) - PAGE_HEAP_NO_USER_LOW)
		* (PAGE_ZIP_DIR_SLOT_SIZE
		   + DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);

	field = rec_get_nth_field(rec, offsets, n, &len);

	externs -= (blob_no + 1) * BTR_EXTERN_FIELD_REF_SIZE;
	field += len - BTR_EXTERN_FIELD_REF_SIZE;

	memcpy(externs, field, BTR_EXTERN_FIELD_REF_SIZE);

#ifdef UNIV_ZIP_DEBUG
	ut_a(page_zip_validate(page_zip, page));
#endif /* UNIV_ZIP_DEBUG */

	if (mtr) {
		byte*	log_ptr	= mlog_open(
			mtr, 11 + 2 + 2 + BTR_EXTERN_FIELD_REF_SIZE);
		if (UNIV_UNLIKELY(!log_ptr)) {
			return;
		}

		log_ptr = mlog_write_initial_log_record_fast(
			(byte*) field, MLOG_ZIP_WRITE_BLOB_PTR, log_ptr, mtr);
		mach_write_to_2(log_ptr, page_offset(field));
		log_ptr += 2;
		mach_write_to_2(log_ptr, externs - page_zip->data);
		log_ptr += 2;
		memcpy(log_ptr, externs, BTR_EXTERN_FIELD_REF_SIZE);
		log_ptr += BTR_EXTERN_FIELD_REF_SIZE;
		mlog_close(mtr, log_ptr);
	}
}
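
/* A sketch of the MLOG_ZIP_WRITE_BLOB_PTR record body written above,
as consumed by page_zip_parse_write_blob_ptr() (the 11 passed to
mlog_open() is just an upper bound for the initial log record
header):

	2 bytes		page_offset(field): offset of the BLOB
			pointer on the uncompressed page
	2 bytes		externs - page_zip->data: offset of the copy
			in the compressed page trailer
	20 bytes	(BTR_EXTERN_FIELD_REF_SIZE) the BLOB pointer
*/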
/***************************************************************
Parses a log record of writing the node pointer of a record. */
UNIV_INTERN
byte*
page_zip_parse_write_node_ptr(
/*==========================*/
				/* out: end of log record or NULL */
	byte*		ptr,	/* in: redo log buffer */
	byte*		end_ptr,/* in: redo log buffer end */
	page_t*		page,	/* in/out: uncompressed page */
	page_zip_des_t*	page_zip)/* in/out: compressed page */
{
	ulint	offset;
	ulint	z_offset;

	ut_ad(!page == !page_zip);

	if (UNIV_UNLIKELY(end_ptr < ptr + (2 + 2 + REC_NODE_PTR_SIZE))) {

		return(NULL);
	}

	offset = mach_read_from_2(ptr);
	z_offset = mach_read_from_2(ptr + 2);

	if (UNIV_UNLIKELY(offset < PAGE_ZIP_START)
	    || UNIV_UNLIKELY(offset >= UNIV_PAGE_SIZE)
	    || UNIV_UNLIKELY(z_offset >= UNIV_PAGE_SIZE)) {
corrupt:
		recv_sys->found_corrupt_log = TRUE;

		return(NULL);
	}

	if (page) {
		byte*	storage_end;
		byte*	field;
		byte*	storage;
		ulint	heap_no;

		if (UNIV_UNLIKELY(!page_zip)
		    || UNIV_UNLIKELY(page_is_leaf(page))) {

			goto corrupt;
		}

#ifdef UNIV_ZIP_DEBUG
		ut_a(page_zip_validate(page_zip, page));
#endif /* UNIV_ZIP_DEBUG */

		field = page + offset;
		storage = page_zip->data + z_offset;

		storage_end = page_zip->data + page_zip_get_size(page_zip)
			- (page_dir_get_n_heap(page) - PAGE_HEAP_NO_USER_LOW)
			* PAGE_ZIP_DIR_SLOT_SIZE;

		heap_no = 1 + (storage_end - storage) / REC_NODE_PTR_SIZE;

		if (UNIV_UNLIKELY((storage_end - storage) % REC_NODE_PTR_SIZE)
		    || UNIV_UNLIKELY(heap_no < PAGE_HEAP_NO_USER_LOW)
		    || UNIV_UNLIKELY(heap_no >= page_dir_get_n_heap(page))) {

			goto corrupt;
		}

		memcpy(field, ptr + 4, REC_NODE_PTR_SIZE);
		memcpy(storage, ptr + 4, REC_NODE_PTR_SIZE);

#ifdef UNIV_ZIP_DEBUG
		ut_a(page_zip_validate(page_zip, page));
#endif /* UNIV_ZIP_DEBUG */
	}

	return(ptr + (2 + 2 + REC_NODE_PTR_SIZE));
}

/**************************************************************************
Write the node pointer of a record on a non-leaf compressed page. */
UNIV_INTERN
void
page_zip_write_node_ptr(
/*====================*/
	page_zip_des_t*	page_zip,/* in/out: compressed page */
	byte*		rec,	/* in/out: record */
	ulint		size,	/* in: data size of rec */
	ulint		ptr,	/* in: node pointer */
	mtr_t*		mtr)	/* in: mini-transaction, or NULL */
{
	byte*	field;
	byte*	storage;
	page_t*	page	= page_align(rec);

	ut_ad(buf_frame_get_page_zip(rec) == page_zip);
	ut_ad(page_simple_validate_new(page));
	ut_ad(page_zip_simple_validate(page_zip));
	ut_ad(page_zip_get_size(page_zip)
	      > PAGE_DATA + page_zip_dir_size(page_zip));
	ut_ad(page_rec_is_comp(rec));

	ut_ad(page_zip->m_start >= PAGE_DATA);
	ut_ad(page_zip_header_cmp(page_zip, page));

	ut_ad(!page_is_leaf(page));

	UNIV_MEM_ASSERT_RW(page_zip->data, page_zip_get_size(page_zip));
	UNIV_MEM_ASSERT_RW(rec, size);

	storage = page_zip->data + page_zip_get_size(page_zip)
		- (page_dir_get_n_heap(page) - PAGE_HEAP_NO_USER_LOW)
		* PAGE_ZIP_DIR_SLOT_SIZE
		- (rec_get_heap_no_new(rec) - 1) * REC_NODE_PTR_SIZE;
	field = rec + size - REC_NODE_PTR_SIZE;

#if defined UNIV_DEBUG || defined UNIV_ZIP_DEBUG
	ut_a(!memcmp(storage, field, REC_NODE_PTR_SIZE));
#endif /* UNIV_DEBUG || UNIV_ZIP_DEBUG */
#if REC_NODE_PTR_SIZE != 4
# error "REC_NODE_PTR_SIZE != 4"
#endif
	mach_write_to_4(field, ptr);
	memcpy(storage, field, REC_NODE_PTR_SIZE);

	if (mtr) {
		byte*	log_ptr	= mlog_open(mtr,
					    11 + 2 + 2 + REC_NODE_PTR_SIZE);
		if (UNIV_UNLIKELY(!log_ptr)) {
			return;
		}

		log_ptr = mlog_write_initial_log_record_fast(
			field, MLOG_ZIP_WRITE_NODE_PTR, log_ptr, mtr);
		mach_write_to_2(log_ptr, page_offset(field));
		log_ptr += 2;
		mach_write_to_2(log_ptr, storage - page_zip->data);
		log_ptr += 2;
		memcpy(log_ptr, field, REC_NODE_PTR_SIZE);
		log_ptr += REC_NODE_PTR_SIZE;
		mlog_close(mtr, log_ptr);
	}
}
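
/* A note on the parse function above: during recovery the heap
number is not in the log record; it is re-derived from z_offset.
Because node pointers are stored downwards from the end of the
dense directory at REC_NODE_PTR_SIZE bytes apiece,
heap_no = 1 + (storage_end - storage) / REC_NODE_PTR_SIZE.  For
example (illustrative numbers): with storage_end - storage = 8,
heap_no = 1 + 8 / 4 = 3, which the checks above then require to lie
in [PAGE_HEAP_NO_USER_LOW, page_dir_get_n_heap(page)). */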
/**************************************************************************
Write the trx_id and roll_ptr of a record on a B-tree leaf node page. */
UNIV_INTERN
void
page_zip_write_trx_id_and_roll_ptr(
/*===============================*/
	page_zip_des_t*	page_zip,/* in/out: compressed page */
	byte*		rec,	/* in/out: record */
	const ulint*	offsets,/* in: rec_get_offsets(rec, index) */
	ulint		trx_id_col,/* in: column number of TRX_ID in rec */
	dulint		trx_id,	/* in: transaction identifier */
	dulint		roll_ptr)/* in: roll_ptr */
{
	byte*	field;
	byte*	storage;
	page_t*	page	= page_align(rec);
	ulint	len;

	ut_ad(buf_frame_get_page_zip(rec) == page_zip);
	ut_ad(page_simple_validate_new(page));
	ut_ad(page_zip_simple_validate(page_zip));
	ut_ad(page_zip_get_size(page_zip)
	      > PAGE_DATA + page_zip_dir_size(page_zip));
	ut_ad(rec_offs_validate(rec, NULL, offsets));
	ut_ad(rec_offs_comp(offsets));

	ut_ad(page_zip->m_start >= PAGE_DATA);
	ut_ad(page_zip_header_cmp(page_zip, page));

	ut_ad(page_is_leaf(page));

	UNIV_MEM_ASSERT_RW(page_zip->data, page_zip_get_size(page_zip));

	storage = page_zip->data + page_zip_get_size(page_zip)
		- (page_dir_get_n_heap(page) - PAGE_HEAP_NO_USER_LOW)
		* PAGE_ZIP_DIR_SLOT_SIZE
		- (rec_get_heap_no_new(rec) - 1)
		* (DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);

#if DATA_TRX_ID + 1 != DATA_ROLL_PTR
# error "DATA_TRX_ID + 1 != DATA_ROLL_PTR"
#endif
	field = rec_get_nth_field(rec, offsets, trx_id_col, &len);
	ut_ad(len == DATA_TRX_ID_LEN);
	ut_ad(field + DATA_TRX_ID_LEN
	      == rec_get_nth_field(rec, offsets, trx_id_col + 1, &len));
	ut_ad(len == DATA_ROLL_PTR_LEN);
#if defined UNIV_DEBUG || defined UNIV_ZIP_DEBUG
	ut_a(!memcmp(storage, field, DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN));
#endif /* UNIV_DEBUG || UNIV_ZIP_DEBUG */
#if DATA_TRX_ID_LEN != 6
# error "DATA_TRX_ID_LEN != 6"
#endif
	mach_write_to_6(field, trx_id);
#if DATA_ROLL_PTR_LEN != 7
# error "DATA_ROLL_PTR_LEN != 7"
#endif
	mach_write_to_7(field + DATA_TRX_ID_LEN, roll_ptr);
	memcpy(storage, field, DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);

	UNIV_MEM_ASSERT_RW(rec, rec_offs_data_size(offsets));
	UNIV_MEM_ASSERT_RW(rec - rec_offs_extra_size(offsets),
			   rec_offs_extra_size(offsets));
	UNIV_MEM_ASSERT_RW(page_zip->data, page_zip_get_size(page_zip));
}
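
/* A worked example of the "storage" arithmetic above (illustrative
numbers): on an 8192-byte compressed page holding n_dense = 10 user
records (page_dir_get_n_heap() == 12), the record with heap_no = 5
keeps its DB_TRX_ID and DB_ROLL_PTR copy at

	storage = page_zip->data + 8192
		- 10 * PAGE_ZIP_DIR_SLOT_SIZE	(dense directory)
		- (5 - 1) * (6 + 7)		(4 earlier records)

i.e. 13 bytes per record, matching the DATA_TRX_ID_LEN and
DATA_ROLL_PTR_LEN compile-time checks above. */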
#ifdef UNIV_ZIP_DEBUG
/* Set this variable in a debugger to disable page_zip_clear_rec().
The only observable effect should be the compression ratio due to
deleted records not being zeroed out.  In rare cases, there can be
page_zip_validate() failures on the node_ptr, trx_id and roll_ptr
columns if the space is reallocated for a smaller record. */
UNIV_INTERN ibool	page_zip_clear_rec_disable;
#endif /* UNIV_ZIP_DEBUG */

/**************************************************************************
Clear an area on the uncompressed and compressed page, if possible. */
static
void
page_zip_clear_rec(
/*===============*/
	page_zip_des_t*	page_zip,/* in/out: compressed page */
	byte*		rec,	/* in: record to clear */
	dict_index_t*	index,	/* in: index of rec */
	const ulint*	offsets)/* in: rec_get_offsets(rec, index) */
{
	ulint	heap_no;
	page_t*	page	= page_align(rec);

	/* page_zip_validate() would fail here if a record
	containing externally stored columns is being deleted. */
	ut_ad(rec_offs_validate(rec, index, offsets));
	ut_ad(!page_zip_dir_find(page_zip, page_offset(rec)));
	ut_ad(page_zip_dir_find_free(page_zip, page_offset(rec)));
	ut_ad(page_zip_header_cmp(page_zip, page));

	heap_no = rec_get_heap_no_new(rec);
	ut_ad(heap_no >= PAGE_HEAP_NO_USER_LOW);

	UNIV_MEM_ASSERT_RW(page_zip->data, page_zip_get_size(page_zip));
	UNIV_MEM_ASSERT_RW(rec, rec_offs_data_size(offsets));
	UNIV_MEM_ASSERT_RW(rec - rec_offs_extra_size(offsets),
			   rec_offs_extra_size(offsets));

	if (
#ifdef UNIV_ZIP_DEBUG
	    !page_zip_clear_rec_disable &&
#endif /* UNIV_ZIP_DEBUG */
	    page_zip->m_end
	    + 1 + ((heap_no - 1) >= 64)/* size of the log entry */
	    + page_zip_get_trailer_len(page_zip,
				       dict_index_is_clust(index), NULL)
	    < page_zip_get_size(page_zip)) {
		byte*	data;

		/* Clear only the data bytes, because the allocator and
		the decompressor depend on the extra bytes. */
		memset(rec, 0, rec_offs_data_size(offsets));

		if (!page_is_leaf(page)) {
			/* Clear node_ptr on the compressed page. */
			byte*	storage	= page_zip->data
				+ page_zip_get_size(page_zip)
				- (page_dir_get_n_heap(page)
				   - PAGE_HEAP_NO_USER_LOW)
				* PAGE_ZIP_DIR_SLOT_SIZE;

			memset(storage - (heap_no - 1) * REC_NODE_PTR_SIZE,
			       0, REC_NODE_PTR_SIZE);
		} else if (dict_index_is_clust(index)) {
			/* Clear trx_id and roll_ptr on the compressed page. */
			byte*	storage	= page_zip->data
				+ page_zip_get_size(page_zip)
				- (page_dir_get_n_heap(page)
				   - PAGE_HEAP_NO_USER_LOW)
				* PAGE_ZIP_DIR_SLOT_SIZE;

			memset(storage - (heap_no - 1)
			       * (DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN),
			       0, DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
		}

		/* Log that the data was zeroed out. */
		data = page_zip->data + page_zip->m_end;
		ut_ad(!*data);
		if (UNIV_UNLIKELY(heap_no - 1 >= 64)) {
			*data++ = (byte) (0x80 | (heap_no - 1) >> 7);
			ut_ad(!*data);
		}
		*data++ = (byte) ((heap_no - 1) << 1 | 1);
		ut_ad(!*data);
		ut_ad((ulint) (data - page_zip->data)
		      < page_zip_get_size(page_zip));
		page_zip->m_end = data - page_zip->data;
		page_zip->m_nonempty = TRUE;
	} else if (page_is_leaf(page) && dict_index_is_clust(index)) {
		/* Do not clear the record, because there is not enough space
		to log the operation. */

		if (rec_offs_any_extern(offsets)) {
			ulint	i;

			for (i = rec_offs_n_fields(offsets); i--; ) {
				/* Clear all BLOB pointers in order to make
				page_zip_validate() pass. */
				if (rec_offs_nth_extern(offsets, i)) {
					ulint	len;
					byte*	field = rec_get_nth_field(
						rec, offsets, i, &len);
					memset(field + len
					       - BTR_EXTERN_FIELD_REF_SIZE,
					       0, BTR_EXTERN_FIELD_REF_SIZE);
				}
			}
		}
	}

#ifdef UNIV_ZIP_DEBUG
	ut_a(page_zip_validate(page_zip, page));
#endif /* UNIV_ZIP_DEBUG */
}

/**************************************************************************
Write the "deleted" flag of a record on a compressed page.  The flag must
already have been written on the uncompressed page. */
UNIV_INTERN
void
page_zip_rec_set_deleted(
/*=====================*/
	page_zip_des_t*	page_zip,/* in/out: compressed page */
	const byte*	rec,	/* in: record on the uncompressed page */
	ulint		flag)	/* in: the deleted flag (nonzero=TRUE) */
{
	byte*	slot = page_zip_dir_find(page_zip, page_offset(rec));
	ut_a(slot);
	UNIV_MEM_ASSERT_RW(page_zip->data, page_zip_get_size(page_zip));
	if (flag) {
		*slot |= (PAGE_ZIP_DIR_SLOT_DEL >> 8);
	} else {
		*slot &= ~(PAGE_ZIP_DIR_SLOT_DEL >> 8);
	}
#ifdef UNIV_ZIP_DEBUG
	ut_a(page_zip_validate(page_zip, page_align(rec)));
#endif /* UNIV_ZIP_DEBUG */
}

/**************************************************************************
Write the "owned" flag of a record on a compressed page.  The n_owned field
must already have been written on the uncompressed page. */
UNIV_INTERN
void
page_zip_rec_set_owned(
/*===================*/
	page_zip_des_t*	page_zip,/* in/out: compressed page */
	const byte*	rec,	/* in: record on the uncompressed page */
	ulint		flag)	/* in: the owned flag (nonzero=TRUE) */
{
	byte*	slot = page_zip_dir_find(page_zip, page_offset(rec));
	ut_a(slot);
	UNIV_MEM_ASSERT_RW(page_zip->data, page_zip_get_size(page_zip));
	if (flag) {
		*slot |= (PAGE_ZIP_DIR_SLOT_OWNED >> 8);
	} else {
		*slot &= ~(PAGE_ZIP_DIR_SLOT_OWNED >> 8);
	}
}

/**************************************************************************
Insert a record to the dense page directory. */
UNIV_INTERN
void
page_zip_dir_insert(
/*================*/
	page_zip_des_t*	page_zip,/* in/out: compressed page */
	const byte*	prev_rec,/* in: record after which to insert */
	const byte*	free_rec,/* in: record from which rec was
				allocated, or NULL */
	byte*		rec)	/* in: record to insert */
{
	ulint	n_dense;
	byte*	slot_rec;
	byte*	slot_free;

	ut_ad(prev_rec != rec);
	ut_ad(page_rec_get_next((rec_t*) prev_rec) == rec);
	ut_ad(page_zip_simple_validate(page_zip));

	UNIV_MEM_ASSERT_RW(page_zip->data, page_zip_get_size(page_zip));

	if (page_rec_is_infimum(prev_rec)) {
		/* Use the first slot. */
		slot_rec = page_zip->data + page_zip_get_size(page_zip);
	} else {
		byte*	end	= page_zip->data + page_zip_get_size(page_zip);
		byte*	start	= end - page_zip_dir_user_size(page_zip);

		if (UNIV_LIKELY(!free_rec)) {
			/* PAGE_N_RECS was already incremented
			in page_cur_insert_rec_zip(), but the
			dense directory slot at that position
			contains garbage.  Skip it. */
			start += PAGE_ZIP_DIR_SLOT_SIZE;
		}

		slot_rec = page_zip_dir_find_low(start, end,
						 page_offset(prev_rec));
		ut_a(slot_rec);
	}

	/* Read the old n_dense (n_heap may have been incremented). */
	n_dense = page_dir_get_n_heap(page_zip->data)
		- (PAGE_HEAP_NO_USER_LOW + 1);

	if (UNIV_LIKELY_NULL(free_rec)) {
		/* The record was allocated from the free list.
		Shift the dense directory only up to that slot.
		Note that in this case, n_dense is actually
		off by one, because page_cur_insert_rec_zip()
		did not increment n_heap. */
		ut_ad(rec_get_heap_no_new(rec) < n_dense + 1
		      + PAGE_HEAP_NO_USER_LOW);
		ut_ad(rec >= free_rec);
		slot_free = page_zip_dir_find(page_zip, page_offset(free_rec));
		ut_ad(slot_free);
		slot_free += PAGE_ZIP_DIR_SLOT_SIZE;
	} else {
		/* The record was allocated from the heap.
		Shift the entire dense directory. */
		ut_ad(rec_get_heap_no_new(rec) == n_dense
		      + PAGE_HEAP_NO_USER_LOW);

		/* Shift to the end of the dense page directory. */
		slot_free = page_zip->data + page_zip_get_size(page_zip)
			- PAGE_ZIP_DIR_SLOT_SIZE * n_dense;
	}

	/* Shift the dense directory to allocate place for rec. */
	memmove(slot_free - PAGE_ZIP_DIR_SLOT_SIZE, slot_free,
		slot_rec - slot_free);

	/* Write the entry for the inserted record.
	The "owned" and "deleted" flags must be zero. */
	mach_write_to_2(slot_rec - PAGE_ZIP_DIR_SLOT_SIZE, page_offset(rec));
}
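
/* A small sketch of the shift above (illustrative layout): the
dense directory grows downwards from the end of page_zip->data,
with slot 0 at the highest address holding the record that follows
the infimum.  With existing slots [r1 r2 r3] and a new record
inserted after the infimum, the memmove() moves all three entries
one PAGE_ZIP_DIR_SLOT_SIZE lower and mach_write_to_2() writes
page_offset(rec) into the freed topmost slot; the "owned" and
"deleted" flag bits of the new entry stay zero. */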
  3433. /**************************************************************************
  3434. Shift the dense page directory and the array of BLOB pointers
  3435. when a record is deleted. */
  3436. UNIV_INTERN
  3437. void
  3438. page_zip_dir_delete(
  3439. /*================*/
  3440. page_zip_des_t* page_zip,/* in/out: compressed page */
  3441. byte* rec, /* in: record to delete */
  3442. dict_index_t* index, /* in: index of rec */
  3443. const ulint* offsets,/* in: rec_get_offsets(rec) */
  3444. const byte* free) /* in: previous start of the free list */
  3445. {
  3446. byte* slot_rec;
  3447. byte* slot_free;
  3448. ulint n_ext;
  3449. page_t* page = page_align(rec);
  3450. ut_ad(rec_offs_validate(rec, index, offsets));
  3451. ut_ad(rec_offs_comp(offsets));
  3452. UNIV_MEM_ASSERT_RW(page_zip->data, page_zip_get_size(page_zip));
  3453. UNIV_MEM_ASSERT_RW(rec, rec_offs_data_size(offsets));
  3454. UNIV_MEM_ASSERT_RW(rec - rec_offs_extra_size(offsets),
  3455. rec_offs_extra_size(offsets));
  3456. slot_rec = page_zip_dir_find(page_zip, page_offset(rec));
  3457. ut_a(slot_rec);
  3458. /* This could not be done before page_zip_dir_find(). */
  3459. page_header_set_field(page, page_zip, PAGE_N_RECS,
  3460. (ulint)(page_get_n_recs(page) - 1));
  3461. if (UNIV_UNLIKELY(!free)) {
  3462. /* Make the last slot the start of the free list. */
  3463. slot_free = page_zip->data + page_zip_get_size(page_zip)
  3464. - PAGE_ZIP_DIR_SLOT_SIZE
  3465. * (page_dir_get_n_heap(page_zip->data)
  3466. - PAGE_HEAP_NO_USER_LOW);
  3467. } else {
  3468. slot_free = page_zip_dir_find_free(page_zip,
  3469. page_offset(free));
  3470. ut_a(slot_free < slot_rec);
  3471. /* Grow the free list by one slot by moving the start. */
  3472. slot_free += PAGE_ZIP_DIR_SLOT_SIZE;
  3473. }
  3474. if (UNIV_LIKELY(slot_rec > slot_free)) {
  3475. memmove(slot_free + PAGE_ZIP_DIR_SLOT_SIZE,
  3476. slot_free,
  3477. slot_rec - slot_free);
  3478. }
  3479. /* Write the entry for the deleted record.
  3480. The "owned" and "deleted" flags will be cleared. */
  3481. mach_write_to_2(slot_free, page_offset(rec));
  3482. if (!page_is_leaf(page) || !dict_index_is_clust(index)) {
  3483. ut_ad(!rec_offs_any_extern(offsets));
  3484. goto skip_blobs;
  3485. }
  3486. n_ext = rec_offs_n_extern(offsets);
  3487. if (UNIV_UNLIKELY(n_ext)) {
  3488. /* Shift and zero fill the array of BLOB pointers. */
  3489. ulint blob_no;
  3490. byte* externs;
  3491. byte* ext_end;
  3492. blob_no = page_zip_get_n_prev_extern(page_zip, rec, index);
  3493. ut_a(blob_no + n_ext <= page_zip->n_blobs);
  3494. externs = page_zip->data + page_zip_get_size(page_zip)
  3495. - (page_dir_get_n_heap(page) - PAGE_HEAP_NO_USER_LOW)
  3496. * (PAGE_ZIP_DIR_SLOT_SIZE
  3497. + DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
  3498. ext_end = externs - page_zip->n_blobs
  3499. * BTR_EXTERN_FIELD_REF_SIZE;
  3500. externs -= blob_no * BTR_EXTERN_FIELD_REF_SIZE;
  3501. page_zip->n_blobs -= n_ext;
  3502. /* Shift and zero fill the array. */
  3503. memmove(ext_end + n_ext * BTR_EXTERN_FIELD_REF_SIZE, ext_end,
  3504. (page_zip->n_blobs - blob_no)
  3505. * BTR_EXTERN_FIELD_REF_SIZE);
  3506. memset(ext_end, 0, n_ext * BTR_EXTERN_FIELD_REF_SIZE);
  3507. }
  3508. skip_blobs:
  3509. /* The compression algorithm expects info_bits and n_owned
  3510. to be 0 for deleted records. */
  3511. rec[-REC_N_NEW_EXTRA_BYTES] = 0; /* info_bits and n_owned */
  3512. page_zip_clear_rec(page_zip, rec, index, offsets);
  3513. }
  3514. /**************************************************************************
  3515. Add a slot to the dense page directory. */
  3516. UNIV_INTERN
  3517. void
  3518. page_zip_dir_add_slot(
  3519. /*==================*/
  3520. page_zip_des_t* page_zip, /* in/out: compressed page */
  3521. ulint is_clustered) /* in: nonzero for clustered index,
  3522. zero for others */
  3523. {
  3524. ulint n_dense;
  3525. byte* dir;
  3526. byte* stored;
  3527. ut_ad(page_is_comp(page_zip->data));
  3528. UNIV_MEM_ASSERT_RW(page_zip->data, page_zip_get_size(page_zip));
  3529. /* Read the old n_dense (n_heap has already been incremented). */
  3530. n_dense = page_dir_get_n_heap(page_zip->data)
  3531. - (PAGE_HEAP_NO_USER_LOW + 1);
  3532. dir = page_zip->data + page_zip_get_size(page_zip)
  3533. - PAGE_ZIP_DIR_SLOT_SIZE * n_dense;
  3534. if (!page_is_leaf(page_zip->data)) {
  3535. ut_ad(!page_zip->n_blobs);
  3536. stored = dir - n_dense * REC_NODE_PTR_SIZE;
  3537. } else if (UNIV_UNLIKELY(is_clustered)) {
  3538. /* Move the BLOB pointer array backwards to make space for the
  3539. roll_ptr and trx_id columns and the dense directory slot. */
  3540. byte* externs;
  3541. stored = dir - n_dense
  3542. * (DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
  3543. externs = stored
  3544. - page_zip->n_blobs * BTR_EXTERN_FIELD_REF_SIZE;
  3545. ASSERT_ZERO(externs
  3546. - (PAGE_ZIP_DIR_SLOT_SIZE
  3547. + DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN),
  3548. PAGE_ZIP_DIR_SLOT_SIZE
  3549. + DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
  3550. memmove(externs - (PAGE_ZIP_DIR_SLOT_SIZE
  3551. + DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN),
  3552. externs, stored - externs);
  3553. } else {
  3554. stored = dir
  3555. - page_zip->n_blobs * BTR_EXTERN_FIELD_REF_SIZE;
  3556. ASSERT_ZERO(stored - PAGE_ZIP_DIR_SLOT_SIZE,
  3557. PAGE_ZIP_DIR_SLOT_SIZE);
  3558. }
  3559. /* Move the uncompressed area backwards to make space
  3560. for one directory slot. */
  3561. memmove(stored - PAGE_ZIP_DIR_SLOT_SIZE, stored, dir - stored);
  3562. }

/***************************************************************
Parses a log record of writing to the header of a page. */
UNIV_INTERN
byte*
page_zip_parse_write_header(
/*========================*/
				/* out: end of log record or NULL */
	byte*		ptr,	/* in: redo log buffer */
	byte*		end_ptr,/* in: redo log buffer end */
	page_t*		page,	/* in/out: uncompressed page */
	page_zip_des_t*	page_zip)/* in/out: compressed page */
{
	ulint	offset;
	ulint	len;

	ut_ad(ptr && end_ptr);
	ut_ad(!page == !page_zip);

	if (UNIV_UNLIKELY(end_ptr < ptr + (1 + 1))) {

		return(NULL);
	}

	offset = (ulint) *ptr++;
	len = (ulint) *ptr++;

	if (UNIV_UNLIKELY(!len) || UNIV_UNLIKELY(offset + len >= PAGE_DATA)) {
corrupt:
		recv_sys->found_corrupt_log = TRUE;

		return(NULL);
	}

	if (UNIV_UNLIKELY(end_ptr < ptr + len)) {

		return(NULL);
	}

	if (page) {
		if (UNIV_UNLIKELY(!page_zip)) {

			goto corrupt;
		}
#ifdef UNIV_ZIP_DEBUG
		ut_a(page_zip_validate(page_zip, page));
#endif /* UNIV_ZIP_DEBUG */

		memcpy(page + offset, ptr, len);
		memcpy(page_zip->data + offset, ptr, len);

#ifdef UNIV_ZIP_DEBUG
		ut_a(page_zip_validate(page_zip, page));
#endif /* UNIV_ZIP_DEBUG */
	}

	return(ptr + len);
}
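
/* Hypothetical usage sketch: a recovery dispatcher invokes the parser
above for an MLOG_ZIP_WRITE_HEADER record roughly as follows.  The real
caller is recv_parse_or_apply_log_rec_body() in log0recv.c; the switch
context shown here is simplified for illustration. */
#if 0
	case MLOG_ZIP_WRITE_HEADER:
		ptr = page_zip_parse_write_header(ptr, end_ptr,
						  page, page_zip);
		break;
#endif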

/**************************************************************************
Write a log record of writing to the uncompressed header portion of a page. */
UNIV_INTERN
void
page_zip_write_header_log(
/*======================*/
	const byte*	data,	/* in: data on the uncompressed page */
	ulint		length,	/* in: length of the data */
	mtr_t*		mtr)	/* in: mini-transaction */
{
	byte*	log_ptr	= mlog_open(mtr, 11 + 1 + 1);
	ulint	offset	= page_offset(data);

	ut_ad(offset < PAGE_DATA);
	ut_ad(offset + length < PAGE_DATA);
#if PAGE_DATA > 255
# error "PAGE_DATA > 255"
#endif
	ut_ad(length < 256);

	/* If no logging is requested, we may return now */
	if (UNIV_UNLIKELY(!log_ptr)) {

		return;
	}

	log_ptr = mlog_write_initial_log_record_fast(
		(byte*) data, MLOG_ZIP_WRITE_HEADER, log_ptr, mtr);
	*log_ptr++ = (byte) offset;
	*log_ptr++ = (byte) length;
	mlog_close(mtr, log_ptr);

	mlog_catenate_string(mtr, data, length);
}
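
/* A minimal usage sketch (hedged; this mirrors what the inline
function page_zip_write_header() in page0zip.ic does): update a header
field on both the uncompressed and the compressed frame, then log the
write.  The variables page, src and mtr are assumed for illustration. */
#if 0
	memcpy(page + FIL_PAGE_PREV, src, 4);
	memcpy(page_zip->data + FIL_PAGE_PREV, src, 4);
	page_zip_write_header_log(page + FIL_PAGE_PREV, 4, mtr);
#endif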

/**************************************************************************
Reorganize and compress a page.  This is a low-level operation for
compressed pages, to be used when page_zip_compress() fails.
On success, a redo log entry MLOG_ZIP_PAGE_COMPRESS will be written.
The function btr_page_reorganize() should be preferred whenever possible.
IMPORTANT: if page_zip_reorganize() is invoked on a leaf page of a
non-clustered index, the caller must update the insert buffer free
bits in the same mini-transaction in such a way that the modification
will be redo-logged. */
UNIV_INTERN
ibool
page_zip_reorganize(
/*================*/
				/* out: TRUE on success, FALSE on failure;
				page and page_zip will be left intact
				on failure. */
	buf_block_t*	block,	/* in/out: page with compressed page;
				on the compressed page, in: size;
				out: data, n_blobs,
				m_start, m_end, m_nonempty */
	dict_index_t*	index,	/* in: index of the B-tree node */
	mtr_t*		mtr)	/* in: mini-transaction */
{
	page_zip_des_t*	page_zip	= buf_block_get_page_zip(block);
	page_t*		page		= buf_block_get_frame(block);
	buf_block_t*	temp_block;
	page_t*		temp_page;
	ulint		log_mode;

	ut_ad(mtr_memo_contains(mtr, block, MTR_MEMO_PAGE_X_FIX));
	ut_ad(page_is_comp(page));
	/* Note that page_zip_validate(page_zip, page) may fail here. */
	UNIV_MEM_ASSERT_RW(page, UNIV_PAGE_SIZE);
	UNIV_MEM_ASSERT_RW(page_zip->data, page_zip_get_size(page_zip));

	/* Disable logging */
	log_mode = mtr_set_log_mode(mtr, MTR_LOG_NONE);

	temp_block = buf_block_alloc(0);
	temp_page = temp_block->frame;

	btr_search_drop_page_hash_index(block);

	/* Copy the old page to temporary space */
	buf_frame_copy(temp_page, page);

	/* Recreate the page: note that global data on page (possible
	segment headers, next page-field, etc.) is preserved intact */

	page_create(block, mtr, TRUE);
	block->check_index_page_at_flush = TRUE;

	/* Copy the records from the temporary space to the recreated page;
	do not copy the lock bits yet */

	page_copy_rec_list_end_no_locks(block, temp_block,
					page_get_infimum_rec(temp_page),
					index, mtr);
	/* Copy max trx id to recreated page */
	page_set_max_trx_id(block, NULL, page_get_max_trx_id(temp_page));

	/* Restore logging. */
	mtr_set_log_mode(mtr, log_mode);

	if (UNIV_UNLIKELY(!page_zip_compress(page_zip, page, index, mtr))) {

		/* Restore the old page and exit. */
		buf_frame_copy(page, temp_page);

		buf_block_free(temp_block);
		return(FALSE);
	}

	lock_move_reorganize_page(block, temp_block);

	buf_block_free(temp_block);
	return(TRUE);
}
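
/* Hypothetical caller sketch: when a modification of a compressed page
fails because page_zip_compress() cannot fit the data, a caller may
attempt one reorganization before falling back to a page split.  This
is a simplification of what the B-tree code does; the control flow
below is illustrative, not the actual btr0cur.c logic. */
#if 0
	if (!page_zip_reorganize(block, index, mtr)) {
		/* Even the reorganized page does not compress:
		the page must be split. */
	} else {
		/* Retry the operation on the reorganized page. */
	}
#endif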

/**************************************************************************
Copy the records of a page byte for byte.  Do not copy the page header
or trailer, except those B-tree header fields that are directly
related to the storage of records. */
UNIV_INTERN
void
page_zip_copy_recs(
/*===============*/
	page_zip_des_t*		page_zip,	/* out: copy of src_zip
						(n_blobs, m_start, m_end,
						m_nonempty, data[0..size-1]) */
	page_t*			page,		/* out: copy of src */
	const page_zip_des_t*	src_zip,	/* in: compressed page */
	const page_t*		src,		/* in: page */
	dict_index_t*		index,		/* in: index of the B-tree */
	mtr_t*			mtr)		/* in: mini-transaction */
{
	ut_ad(mtr_memo_contains_page(mtr, page, MTR_MEMO_PAGE_X_FIX));
	ut_ad(mtr_memo_contains_page(mtr, (page_t*) src, MTR_MEMO_PAGE_X_FIX));
#ifdef UNIV_ZIP_DEBUG
	ut_a(page_zip_validate(src_zip, src));
#endif /* UNIV_ZIP_DEBUG */
	ut_a(page_zip_get_size(page_zip) == page_zip_get_size(src_zip));
	if (UNIV_UNLIKELY(src_zip->n_blobs)) {
		ut_a(page_is_leaf(src));
		ut_a(dict_index_is_clust(index));
	}

	UNIV_MEM_ASSERT_W(page, UNIV_PAGE_SIZE);
	UNIV_MEM_ASSERT_W(page_zip->data, page_zip_get_size(page_zip));
	UNIV_MEM_ASSERT_RW(src, UNIV_PAGE_SIZE);
	UNIV_MEM_ASSERT_RW(src_zip->data, page_zip_get_size(page_zip));

	/* Copy those B-tree page header fields that are related to
	the records stored in the page.  Do not copy the field
	PAGE_MAX_TRX_ID.  Skip the rest of the page header and
	trailer.  On the compressed page, there is no trailer. */
#if PAGE_MAX_TRX_ID + 8 != PAGE_HEADER_PRIV_END
# error "PAGE_MAX_TRX_ID + 8 != PAGE_HEADER_PRIV_END"
#endif
	memcpy(PAGE_HEADER + page, PAGE_HEADER + src,
	       PAGE_MAX_TRX_ID);
	memcpy(PAGE_DATA + page, PAGE_DATA + src,
	       UNIV_PAGE_SIZE - PAGE_DATA - FIL_PAGE_DATA_END);
	memcpy(PAGE_HEADER + page_zip->data, PAGE_HEADER + src_zip->data,
	       PAGE_MAX_TRX_ID);
	memcpy(PAGE_DATA + page_zip->data, PAGE_DATA + src_zip->data,
	       page_zip_get_size(page_zip) - PAGE_DATA);

	/* Copy all fields of src_zip to page_zip, except the pointer
	to the compressed data page. */
	{
		page_zip_t*	data = page_zip->data;
		memcpy(page_zip, src_zip, sizeof *page_zip);
		page_zip->data = data;
	}
	ut_ad(page_zip_get_trailer_len(page_zip,
				       dict_index_is_clust(index), NULL)
	      + page_zip->m_end < page_zip_get_size(page_zip));

	if (!page_is_leaf(src)
	    && UNIV_UNLIKELY(mach_read_from_4(src + FIL_PAGE_PREV) == FIL_NULL)
	    && UNIV_LIKELY(mach_read_from_4(page
					    + FIL_PAGE_PREV) != FIL_NULL)) {

		/* Clear the REC_INFO_MIN_REC_FLAG of the first user record. */
		ulint	offs = rec_get_next_offs(page + PAGE_NEW_INFIMUM,
						 TRUE);
		if (UNIV_LIKELY(offs != PAGE_NEW_SUPREMUM)) {
			rec_t*	rec = page + offs;
			ut_a(rec[-REC_N_NEW_EXTRA_BYTES]
			     & REC_INFO_MIN_REC_FLAG);
			rec[-REC_N_NEW_EXTRA_BYTES] &= ~ REC_INFO_MIN_REC_FLAG;
		}
	}

#ifdef UNIV_ZIP_DEBUG
	ut_a(page_zip_validate(page_zip, page));
#endif /* UNIV_ZIP_DEBUG */

	page_zip_compress_write_log(page_zip, page, index, mtr);
}
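
/* Hedged usage sketch: the B-tree code copies a whole compressed page
with this function, e.g. when raising the root or splitting a page.
Roughly (the new_* names are assumed for illustration): */
#if 0
	page_zip_copy_recs(new_page_zip, new_page,
			   page_zip, page, index, mtr);
#endif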

/**************************************************************************
Parses a log record of compressing an index page. */
UNIV_INTERN
byte*
page_zip_parse_compress(
/*====================*/
				/* out: end of log record or NULL */
	byte*		ptr,	/* in: buffer */
	byte*		end_ptr,/* in: buffer end */
	page_t*		page,	/* out: uncompressed page */
	page_zip_des_t*	page_zip)/* out: compressed page */
{
	ulint	size;
	ulint	trailer_size;

	ut_ad(ptr && end_ptr);
	ut_ad(!page == !page_zip);

	if (UNIV_UNLIKELY(ptr + (2 + 2) > end_ptr)) {

		return(NULL);
	}

	size = mach_read_from_2(ptr);
	ptr += 2;
	trailer_size = mach_read_from_2(ptr);
	ptr += 2;

	if (UNIV_UNLIKELY(ptr + 8 + size + trailer_size > end_ptr)) {

		return(NULL);
	}

	if (page) {
		if (UNIV_UNLIKELY(!page_zip)
		    || UNIV_UNLIKELY(page_zip_get_size(page_zip) < size)) {
corrupt:
			recv_sys->found_corrupt_log = TRUE;

			return(NULL);
		}

		memcpy(page_zip->data + FIL_PAGE_PREV, ptr, 4);
		memcpy(page_zip->data + FIL_PAGE_NEXT, ptr + 4, 4);
		memcpy(page_zip->data + FIL_PAGE_TYPE, ptr + 8, size);
		memset(page_zip->data + FIL_PAGE_TYPE + size, 0,
		       page_zip_get_size(page_zip) - trailer_size
		       - (FIL_PAGE_TYPE + size));
		memcpy(page_zip->data + page_zip_get_size(page_zip)
		       - trailer_size, ptr + 8 + size, trailer_size);

		if (UNIV_UNLIKELY(!page_zip_decompress(page_zip, page))) {

			goto corrupt;
		}
	}

	return(ptr + 8 + size + trailer_size);
}
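
/* Illustrative sketch of the MLOG_ZIP_PAGE_COMPRESS record body that
the parser above consumes; the writer side is
page_zip_compress_write_log().  The destination pointer ptr is assumed
to point just past the initial log record header. */
#if 0
	mach_write_to_2(ptr, size);		/* data from FIL_PAGE_TYPE */
	mach_write_to_2(ptr + 2, trailer_size);	/* dense directory, etc. */
	memcpy(ptr + 4, page_zip->data + FIL_PAGE_PREV, 4);
	memcpy(ptr + 8, page_zip->data + FIL_PAGE_NEXT, 4);
	memcpy(ptr + 12, page_zip->data + FIL_PAGE_TYPE, size);
	memcpy(ptr + 12 + size, page_zip->data
	       + page_zip_get_size(page_zip) - trailer_size, trailer_size);
#endif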

/**************************************************************************
Calculate the compressed page checksum. */
UNIV_INTERN
ulint
page_zip_calc_checksum(
/*===================*/
				/* out: page checksum */
	const void*	data,	/* in: compressed page */
	ulint		size)	/* in: size of compressed page */
{
	/* Exclude FIL_PAGE_SPACE_OR_CHKSUM, FIL_PAGE_LSN,
	and FIL_PAGE_FILE_FLUSH_LSN from the checksum. */

	const Bytef*	s	= data;
	uLong		adler;

	ut_ad(size > FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID);

	adler = adler32(0L, s + FIL_PAGE_OFFSET,
			FIL_PAGE_LSN - FIL_PAGE_OFFSET);
	adler = adler32(adler, s + FIL_PAGE_TYPE, 2);
	adler = adler32(adler, s + FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID,
			size - FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID);

	return((ulint) adler);
}
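
/**************************************************************************
Hedged sketch only: verify the stored checksum of a compressed page
against a freshly computed one.  No such helper exists in this file;
it merely demonstrates how page_zip_calc_checksum() above would be
used together with the FIL_PAGE_SPACE_OR_CHKSUM field. */
static ibool
page_zip_example_verify_checksum(
/*=============================*/
				/* out: TRUE if the checksum matches */
	const page_zip_des_t*	page_zip)	/* in: compressed page */
{
	ulint	stored	= mach_read_from_4(
		page_zip->data + FIL_PAGE_SPACE_OR_CHKSUM);

	return(stored == page_zip_calc_checksum(
		       page_zip->data, page_zip_get_size(page_zip)));
}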